How Data Preprocessing Impacts Machine Learning Models in Clinical Prediction

DEV Community·Carlos Peñalver Pérez·19 days ago

One of the ideas I wanted to explore in this project was simple: how much does data preprocessing really affect the performance of Machine Learning models?

In clinical prediction problems, this question becomes especially relevant. A model may achieve good overall accuracy, but still fail to detect the most important cases: patients at risk. For that reason, I wanted to focus not only on accuracy, but also on metrics such as recall, F1-score and the behaviour of the model on minority classes.

The datasets

For this project, I worked with three public clinical datasets:

Diabetes Dataset: used to predict diabetes from variables such as glucose, blood pressure, insulin, BMI and age.

Healthcare Stroke Dataset: focused on predicting stroke risk using demographic, clinical and lifestyle-related variables.

Thyroid Disease Dataset: related to thyroid disease detection using clinical, hormonal and categorical features.

Each dataset presented different challenges.…
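
To make the earlier point about accuracy versus recall concrete, here is a minimal sketch in plain Python. The labels are synthetic and invented purely for illustration (95 healthy patients, 5 at risk). They do not come from any of the three datasets, and the helper name `binary_metrics` is hypothetical, not something taken from the project.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never present or predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Synthetic, made-up labels: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A hypothetical model that finds only one of the five at-risk patients.
y_pred = [0] * 95 + [1] + [0] * 4

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the accuracy is 0.96 while the recall on the positive class is only 0.20, which is exactly the situation where accuracy alone would hide four missed at-risk patients out of five.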
