Python Project – Predicting Medical Costs with Regression Models
Mohamed El Hamly
Data Scientist
Data Analyst
AI Model Developer
Python
scikit-learn
Overview
Project Goal: Build and compare several predictive models for medical insurance costs, using an ordinary least squares (OLS) regression as the baseline
Project Outcome: Analyzed a dataset of 1,338 entries to predict insurance costs. The polynomial regression model outperformed the others, achieving a test RMSE of 4762.74 and a test R² of 84.97%, meaning it explained roughly 85% of the variance in the held-out charges
Project Methods: Log-transformed the skewed insurance charges to bring them closer to a normal distribution, and selected predictors based on their observed relationships with charges. Evaluated the models with RMSE and R² and checked the OLS assumptions. Also fit support vector regression (SVR) and polynomial-feature models to improve accuracy over the baseline
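To make the methods above concrete, here is a minimal sketch of the log-transform, baseline fit, and RMSE/R² evaluation steps. It is not the project's actual code: the six age/charge pairs are made-up illustrative values rather than rows from the 1,338-entry dataset, the single-predictor closed-form fit stands in for the full OLS model, and the helper names (`rmse`, `r2_score`, `fit_simple_ols`) are hypothetical. It also assumes the metrics are computed after transforming predictions back to the original charge scale.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_simple_ols(x, y):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative data (NOT taken from the project's dataset).
ages = [19, 25, 33, 41, 52, 60]
charges = [1800.0, 2900.0, 4500.0, 6800.0, 10500.0, 14000.0]

# Log-transform the skewed target before fitting the baseline.
log_charges = [math.log(c) for c in charges]
slope, intercept = fit_simple_ols(ages, log_charges)

# Back-transform predictions so the metrics are in the original units.
predicted = [math.exp(intercept + slope * a) for a in ages]

print(f"RMSE: {rmse(charges, predicted):.2f}")
print(f"R^2:  {r2_score(charges, predicted):.4f}")
```

In a scikit-learn workflow, `sklearn.metrics.mean_squared_error` and `sklearn.metrics.r2_score` would typically replace the hand-written metric helpers, and the metrics would be computed on a held-out test split rather than on the training points as in this toy example.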