Heart Disease Prediction: Machine Learning with UCI Dataset
❤️ Heart Disease Prediction using Machine Learning
📌 Problem Statement
Can we predict whether a patient has heart disease using clinical features?
Early detection of heart disease can save lives, making this a critical real-world machine learning problem.

📊 Dataset
Source: UCI Heart Disease Dataset
Includes medical attributes such as:
Age
Cholesterol levels
Chest pain type
Maximum heart rate
Blood pressure

⚙️ Project Workflow
Data Exploration (EDA)
Data Preprocessing
Model Training
Model Evaluation
Hyperparameter Tuning
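
The workflow steps above could be wired together roughly as follows. This is a minimal sketch, not the notebook's actual code: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a stand-in for the UCI data (13 numeric features and a binary target are assumptions), and the grid of `C` values is purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption: 13 numeric features, binary target).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=42
)

# Hold out a test set before any fitting so the final evaluation stays honest.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Preprocessing and model share one pipeline, so the scaler is fit on training folds only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hyperparameter tuning: search over the regularization strength C (illustrative grid).
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Evaluate the tuned pipeline once on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print("best C:", search.best_params_["clf__C"])
print("test accuracy:", round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline means each cross-validation fold is scaled using only its own training portion, which avoids leaking test information into tuning.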

🤖 Models Used
Logistic Regression
K-Nearest Neighbors (KNN)
Random Forest Classifier
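
One way to compare the three model families listed above is to score each with the same cross-validation splits. The sketch below assumes scikit-learn and a synthetic dataset in place of the UCI data; the hyperparameters (`n_neighbors=5`, `n_estimators=100`) are library defaults or placeholders, not the values used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Distance- and gradient-based models get feature scaling; tree ensembles do not need it.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy per model.
cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_scores[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```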

📈 Model Performance
Metric | Score (%)
Accuracy | 73.47%
Precision | 83.00%
Recall | 74.95%
F1 Score | 73.36%
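
For reference, the four metrics can be computed from binary confusion-matrix counts as sketched below. The counts are hypothetical and are not taken from the project. The source does not say which averaging was used for its scores; with class-averaged scores (for example scikit-learn's `average="macro"` or `average="weighted"`), the reported F1 need not equal the harmonic mean of the reported precision and recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for binary confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: of the patients flagged positive, how many truly are positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the truly positive patients, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=30, fp=10, fn=20, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```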

🔍 Key Insights
Logistic Regression performed best after tuning
Cross-validation revealed a drop in performance, highlighting generalization challenges
High precision suggests that when the model predicts heart disease, it is usually correct; recall measures how many actual cases it finds
Real-world medical datasets are complex and rarely achieve extremely high accuracy

🧠 Lessons Learned
Accuracy alone is not enough: precision and recall matter more in healthcare
Overfitting can give misleading results without proper validation
Simpler models (like Logistic Regression) can outperform complex ones
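
The overfitting point can be illustrated by comparing accuracy on the training data with cross-validated accuracy. This is a sketch under assumptions: it uses scikit-learn, a synthetic dataset with some label noise (`flip_y=0.1`), and a Random Forest with placeholder settings rather than the project's data or tuned models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in with 10% label noise (assumption).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, flip_y=0.1, random_state=1
)

# Accuracy measured on the same rows the model was fit on (optimistic).
forest = RandomForestClassifier(n_estimators=100, random_state=1)
forest.fit(X, y)
train_accuracy = forest.score(X, y)

# Accuracy measured on folds the model never saw during fitting (more realistic).
cv_accuracy = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=1), X, y, cv=5
).mean()

print(f"training accuracy:        {train_accuracy:.3f}")
print(f"cross-validated accuracy: {cv_accuracy:.3f}")
```

A large gap between the two numbers is a sign that the training score alone would overstate how well the model generalizes.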

🛠 Tools & Libraries
Python
Pandas
NumPy
Scikit-learn
Matplotlib
Seaborn

🚀 Future Improvements
Feature engineering
Try advanced models (XGBoost, LightGBM)
Improve recall (important for medical predictions)
Deploy model as a web app
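
One common way to improve recall, among the items above, is to lower the decision threshold applied to predicted probabilities, accepting more false positives in exchange for fewer missed cases. The sketch below is plain Python; the labels and probabilities are made-up toy values, and `precision_recall_at` is a hypothetical helper, not part of the project.

```python
def precision_recall_at(y_true, y_prob, threshold):
    """Return (precision, recall) when predicting positive for prob >= threshold."""
    predicted = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, predicted) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, predicted) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy values for illustration only: 4 patients with disease, 4 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.35, 0.45, 0.2, 0.1, 0.05]

precision_default, recall_default = precision_recall_at(y_true, y_prob, 0.5)
precision_lowered, recall_lowered = precision_recall_at(y_true, y_prob, 0.3)

print(f"threshold 0.5: precision={precision_default:.2f}, recall={recall_default:.2f}")
print(f"threshold 0.3: precision={precision_lowered:.2f}, recall={recall_lowered:.2f}")
```

In this toy case, lowering the threshold from 0.5 to 0.3 raises recall from 0.50 to 1.00 while precision falls from 1.00 to 0.80, which is the trade-off to weigh in a medical setting.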

📎 Project Notebook
Check the full implementation in the Jupyter Notebook.

🙌 Acknowledgements
UCI Machine Learning Repository
🔬 Feature Importance (Logistic Regression)
The model coefficients reveal which features most influence heart disease prediction.
🔺 Features Increasing Risk
ca (number of major vessels) → Strongest predictor
oldpeak (ST depression) → Indicates heart stress during exercise
exang (exercise-induced angina) → Associated with higher risk
restecg abnormalities → Signals irregular heart activity
🔻 Features Decreasing Risk
Certain chest pain types (non-anginal)
Some dataset-specific patterns
Gender-related differences (model-specific behavior)
💡 Insight
The model heavily relies on cardiovascular stress indicators and blood flow patterns, which aligns with real-world medical understanding of heart disease.
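
A coefficient ranking like the one described above could be produced along these lines. This is a hedged sketch: it assumes scikit-learn, fits on synthetic data, and uses generic placeholder names (`feature_0`, ...) rather than the real columns. Features are standardized first so that coefficient magnitudes are comparable across features.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data; feature names are placeholders (assumption).
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=7
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# Standardize so each coefficient reflects the effect of a one-standard-deviation change.
X_scaled = StandardScaler().fit_transform(X)
model = LogisticRegression(max_iter=1000).fit(X_scaled, y)

# coef_ has shape (1, n_features) for a binary problem.
coefficients = model.coef_[0]

# Rank by absolute size; the sign shows the direction of the effect on the positive class.
ranked = sorted(
    zip(feature_names, coefficients), key=lambda pair: abs(pair[1]), reverse=True
)
for name, coef in ranked:
    direction = "raises" if coef > 0 else "lowers"
    print(f"{name}: {coef:+.3f} ({direction} predicted risk)")
```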