The goal of this project is to predict diabetes risk from recorded health metrics such as glucose, BMI, and insulin, and to compare Logistic Regression against tree-based models, along with other observations.
Random Forest and Gradient Boosting (the tree-based models) provide feature importance scores.
Generally, Glucose, BMI, and Age have the highest feature importance. The two models differ, however: the gradient boosting model gives glucose a surprisingly high importance score of approximately 0.4, compared to just 0.25 for the random forest model.
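As a rough illustration of how such a comparison can be set up, here is a minimal sketch using scikit-learn. It is not the project's actual code: the data is synthetic (generated with `make_classification`), the column names are only placeholders that mirror the Pima Indians dataset, and the helper `compare_importances` is a hypothetical name introduced for this example. The scores it prints will not match the figures quoted above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

# Assumption: placeholder column names mirroring the Pima Indians dataset.
# The data below is synthetic, so the names carry no real meaning here.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


def compare_importances(X, y, feature_names, seed=0):
    """Fit both tree-based models and return their importances per feature."""
    models = {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(random_state=seed),
    }
    result = {}
    for name, model in models.items():
        model.fit(X, y)
        # feature_importances_ is impurity-based and normalized to sum to 1.
        result[name] = {
            feature: float(score)
            for feature, score in zip(feature_names, model.feature_importances_)
        }
    return result


# Synthetic stand-in for the real dataset (8 features, binary target).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, n_redundant=0, random_state=0
)
importances = compare_importances(X, y, FEATURES)

# Print the three highest-ranked features for each model.
for name, scores in importances.items():
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    print(name, [(feature, round(score, 3)) for feature, score in ranked[:3]])
```

Because both models normalize their importances to sum to 1, the scores are comparable in scale, but the two algorithms build their trees differently, so it is not unusual for one of them to concentrate more weight on a single feature.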
Posted Aug 28, 2025
Compared models for diabetes prediction using the Pima Indians dataset.