Chhavi Verma
Project Overview:
The Disease Prediction project uses machine learning models to predict the likelihood of heart disease in patients. It includes a web application where users enter patient data and receive a heart disease risk prediction.
Dataset: https://www.kaggle.com/ronitf/heart-disease-uci
Models used:
KNN Classifier
Decision Tree
Random Forest
Dataset:
Source: UCI Heart Disease Dataset on Kaggle
Entries: 303
Columns: 14 (13 health indicators plus the target label)
Skills:
Machine Learning: Developing and evaluating models.
Web Development: Creating user-friendly interfaces for model deployment.
Tools:
Python: For data analysis and machine learning using libraries like Pandas, Scikit-Learn, and Seaborn.
Jupyter Notebook: For exploratory data analysis and prototyping models.
Flask/Django: For developing the web application.
HTML/CSS/JavaScript: For front-end development.
AWS/Azure: For cloud deployment.
GitHub: For version control and project sharing.
Exploratory Data Analysis:
Data Exploration: Analyzed the dataset to understand the distribution of features and target classes.
Visualization: Used Seaborn to visualize correlations and distributions.
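The exploration steps above could look roughly like the sketch below. It is illustrative and not the project's notebook code: the helper names summarize and plot_correlations are hypothetical, the demo values are made up, and the column name "target" is assumed to match the Kaggle CSV. The Seaborn import sits inside the plotting helper so the summary runs without it.

```python
import pandas as pd


def summarize(df, target="target"):
    """Return class proportions and each feature's correlation with the target."""
    class_balance = df[target].value_counts(normalize=True).sort_index()
    corr = df.select_dtypes("number").corr()[target].drop(target)
    # Order features by absolute correlation, strongest first
    order = corr.abs().sort_values(ascending=False).index
    return class_balance, corr.reindex(order)


def plot_correlations(df):
    """Draw a correlation heatmap (requires seaborn and matplotlib)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.heatmap(df.select_dtypes("number").corr(), annot=True, fmt=".2f", cmap="coolwarm")
    plt.tight_layout()
    plt.show()


# Made-up rows standing in for the real dataset (assumption, not project data)
demo = pd.DataFrame(
    {
        "age": [40, 50, 60, 70, 45, 65],
        "chol": [200, 240, 210, 260, 220, 250],
        "target": [0, 1, 0, 1, 0, 1],
    }
)
balance, corr_with_target = summarize(demo)
print(balance)
print(corr_with_target)
```

With the real data, the frame would come from something like pd.read_csv on the downloaded Kaggle file, and plot_correlations(df) would give the heatmap.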
Feature Selection and Preprocessing:
Feature Engineering: Identified and selected the most relevant features.
Data Processing: Scaled numerical features and converted categorical variables into dummy variables.
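A minimal sketch of the scaling and dummy-variable step described above. The split into categorical and numeric columns is an assumption based on the Kaggle CSV's column names and is not stated in this write-up. The preprocess helper is hypothetical, and StandardScaler is one common choice of scaler, not necessarily the one used here.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Assumed column grouping for the Kaggle heart-disease CSV
CATEGORICAL = ["sex", "cp", "fbs", "restecg", "exang", "slope", "ca", "thal"]
NUMERIC = ["age", "trestbps", "chol", "thalach", "oldpeak"]


def preprocess(df, categorical=CATEGORICAL, numeric=NUMERIC):
    """One-hot encode the categorical columns and standardize the numeric ones."""
    out = pd.get_dummies(df, columns=list(categorical))
    scaler = StandardScaler()
    out[list(numeric)] = scaler.fit_transform(out[list(numeric)])
    return out, scaler


# Made-up rows with a subset of columns, just to show the transformation
demo = pd.DataFrame({"age": [40, 50, 60], "cp": [0, 1, 2], "target": [0, 1, 0]})
processed, fitted_scaler = preprocess(demo, categorical=["cp"], numeric=["age"])
print(processed)
```

Returning the fitted scaler lets the same transformation be applied to new patient data at prediction time. Fitting it on the full dataset before cross-validation can leak information between folds, so wrapping the scaler in a scikit-learn Pipeline is the safer option when evaluating models.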
Machine Learning Models:
K Neighbors Classifier (K=12): Achieved a mean accuracy of approximately 85%.
Random Forest Classifier: Achieved a mean accuracy of approximately 82%.
Decision Tree Classifier: Achieved a mean accuracy of approximately 73%.
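One way such a comparison might be run is with cross-validated accuracy, sketched below. Only K=12 comes from this write-up. The fold count, random seed, number of trees and the compare_models helper are assumptions. The demo uses synthetic data from make_classification, so its scores say nothing about the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, cv=10, seed=0):
    """Return the mean cross-validated accuracy for each of the three classifiers."""
    models = {
        "K Neighbors (K=12)": KNeighborsClassifier(n_neighbors=12),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Synthetic stand-in with the same shape as the dataset (303 rows, 13 features)
X_demo, y_demo = make_classification(n_samples=303, n_features=13, random_state=0)
scores = compare_models(X_demo, y_demo)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

On the real data, X would be the preprocessed feature columns and y the target column.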
Results:
K Neighbors Classifier: Best performance with mean accuracy ≈ 85%.
Random Forest Classifier: Mean accuracy ≈ 82%.
Decision Tree Classifier: Mean accuracy ≈ 73%.
Next Steps:
Model Optimization: Further fine-tuning and hyperparameter optimization to improve model performance.
Extended Analysis: Exploring additional classifiers and advanced techniques for better accuracy.
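As an illustration of the hyperparameter optimization mentioned under Next Steps, the sketch below grid-searches the number of neighbors for KNN. The search range, the weights options, the fold count and the tune_knn helper are all assumptions, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def tune_knn(X, y, k_values=range(1, 21), cv=5):
    """Grid-search K and the weighting scheme, scaling inside each CV fold."""
    pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])
    grid = {
        "knn__n_neighbors": list(k_values),
        "knn__weights": ["uniform", "distance"],
    }
    search = GridSearchCV(pipe, grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    return search


# Synthetic stand-in data (assumption, not the heart disease dataset)
X_demo, y_demo = make_classification(n_samples=303, n_features=13, random_state=0)
search = tune_knn(X_demo, y_demo)
print(search.best_params_)
print(round(search.best_score_, 3))
```

Putting the scaler inside the pipeline means it is refit on each training fold, which avoids leaking validation data into the scaling step.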
Lessons Learned:
Feature Engineering: Crucial for enhancing model performance.
Model Comparison: Importance of evaluating multiple classifiers and tuning their hyperparameters.
Check out the full project on GitHub
#MachineLearning #DataScience #HeartDiseasePrediction #AI #GitHub