This project focuses on analyzing Life Expectancy data to understand the key factors that influence how long people live across different countries and years. Using Python and data science techniques, the project performs data cleaning, exploratory data analysis (EDA), feature understanding, and insights generation.
The goal is to:
Explore global life expectancy trends
Identify important socio-economic and health-related factors
Practice real-world data analysis using Python
This project is suitable for a Data Analyst / Data Science portfolio.
Project Structure
Life-Expectancy-Project/
│── life_expectancy_improved.ipynb   # Main Jupyter Notebook
│── Life Expectancy Data.csv         # Dataset used in the project
│── README.md                        # Project documentation
Dataset Information
Dataset Name: Life Expectancy Data
Format: CSV
Rows: Multiple country-year observations
Key Features Include:
Country
Year
Life expectancy
Adult Mortality
GDP
Schooling
Alcohol consumption
BMI
Health expenditure
The dataset contains missing values and real-world inconsistencies, making it ideal for practical data analysis.
Tools & Technologies Used
Programming Language: Python
Environment: Jupyter Notebook
Libraries:
pandas – data manipulation
numpy – numerical operations
matplotlib & seaborn – data visualization
scikit-learn – preprocessing and analysis
Project Workflow
1. Data Loading
Imported the dataset using pandas
Inspected shape, columns, and data types
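A minimal sketch of the loading step. The helper name `load_dataset`, the `Status` column, and the sample rows are assumptions made for illustration (the rows are invented, not taken from the real file). Stripping whitespace from column names is a defensive step in case the CSV headers are padded.

```python
import io

import pandas as pd

# Tiny made-up sample standing in for "Life Expectancy Data.csv".
# The values are illustrative only, not taken from the real dataset.
SAMPLE_CSV = """Country,Year,Status,Life expectancy ,Adult Mortality,GDP,Schooling
Country A,2014,Developing,61.5,250,600.0,9.5
Country A,2015,Developing,62.0,245,,9.8
Country B,2014,Developed,80.1,70,41000.0,16.2
Country B,2015,Developed,80.4,68,42500.0,16.4
"""


def load_dataset(source):
    """Read the CSV and strip stray whitespace from column names."""
    df = pd.read_csv(source)
    df.columns = df.columns.str.strip()
    return df


# read_csv accepts a path or a file-like object, so the same helper
# works with the real file: load_dataset("Life Expectancy Data.csv")
df = load_dataset(io.StringIO(SAMPLE_CSV))

print(df.shape)             # number of rows and columns
print(df.columns.tolist())  # column names after stripping
print(df.dtypes)            # inferred data type per column
```

In the notebook, `df.info()` and `df.head()` give a similar overview in one call each.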
2. Data Cleaning
Handled missing values
Removed or corrected inconsistent data
Converted data types where necessary
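The cleaning steps above can be sketched as follows. The imputation strategy shown here (per-country median, then overall median as a fallback) is an assumption, since the README does not say which method the notebook uses. The sample rows are invented for illustration.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only (not from the real dataset).
raw = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B"],
    "Year": ["2013", "2014", "2015", "2014", "2015"],  # stored as text
    "Life expectancy": [61.0, np.nan, 62.0, 80.1, 80.4],
    "GDP": [580.0, 600.0, np.nan, np.nan, np.nan],
})


def clean(df):
    """Drop duplicate country-year rows, fix types, fill missing values."""
    out = df.drop_duplicates(subset=["Country", "Year"]).copy()

    # Convert Year from text to integer.
    out["Year"] = out["Year"].astype(int)

    numeric_cols = out.select_dtypes(include="number").columns.drop("Year")

    # First fill gaps with the median of the same country ...
    country_median = out.groupby("Country")[numeric_cols].transform("median")
    out[numeric_cols] = out[numeric_cols].fillna(country_median)

    # ... then fall back to the overall median where a country has no data.
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())
    return out


cleaned = clean(raw)
print(cleaned)
```

Filling with a median keeps every row but can flatten real differences between countries. Dropping rows with `dropna` is the simpler alternative when only a few values are missing.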
3. Exploratory Data Analysis (EDA)
Analyzed life expectancy trends over years
Compared developed vs developing countries
Studied correlations between life expectancy and key factors
Visualized distributions and relationships using plots
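A rough sketch of the tables behind these EDA steps, using pandas only. The `Status` column name and the sample values are assumptions made for illustration.

```python
import pandas as pd

# Made-up rows for illustration only (not from the real dataset).
df = pd.DataFrame({
    "Country": ["A", "A", "B", "B"],
    "Year": [2014, 2015, 2014, 2015],
    "Status": ["Developing", "Developing", "Developed", "Developed"],
    "Life expectancy": [61.5, 62.0, 80.1, 80.4],
    "Adult Mortality": [250, 245, 70, 68],
    "Schooling": [9.5, 9.8, 16.2, 16.4],
})

# Trend over years: mean life expectancy per year.
yearly_trend = df.groupby("Year")["Life expectancy"].mean()

# Developed vs developing: one column per status, one row per year.
by_status = (
    df.groupby(["Status", "Year"])["Life expectancy"]
    .mean()
    .unstack("Status")
)

# Correlation of each numeric factor with life expectancy.
numeric = df.select_dtypes(include="number").drop(columns="Year")
correlations = (
    numeric.corr()["Life expectancy"]
    .drop("Life expectancy")
    .sort_values()
)

print(yearly_trend)
print(by_status)
print(correlations)
```

These tables feed directly into plots: `yearly_trend.plot()` draws the trend line with matplotlib, and `seaborn.heatmap(numeric.corr(), annot=True)` draws a correlation heatmap.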
4. Insights & Observations
Higher schooling and GDP are strongly associated with higher life expectancy
Adult mortality shows a strong negative correlation with life expectancy
Developed countries generally have higher and more stable life expectancy trends
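One way such observations could be checked numerically is sketched below. Measuring "stability" as the standard deviation of year-over-year change is an assumption chosen for this example, not necessarily the notebook's method. The `Status` column and the sample values are invented for illustration.

```python
import pandas as pd

# Made-up rows for illustration only (not from the real dataset).
df = pd.DataFrame({
    "Country": ["A"] * 3 + ["B"] * 3,
    "Status": ["Developing"] * 3 + ["Developed"] * 3,
    "Year": [2013, 2014, 2015] * 2,
    "Life expectancy": [60.0, 63.0, 61.0, 80.0, 80.2, 80.4],
    "Adult Mortality": [260, 240, 252, 72, 70, 68],
})

# Pearson correlation between adult mortality and life expectancy.
r_mortality = df["Life expectancy"].corr(df["Adult Mortality"])

# Year-over-year change within each country.
df = df.sort_values(["Country", "Year"])
df["yearly_change"] = df.groupby("Country")["Life expectancy"].diff()

# Spread of those changes per status group: lower means a steadier trend.
volatility = df.groupby("Status")["yearly_change"].std()

print(round(r_mortality, 3))
print(volatility)
```

A correlation near -1 or +1 supports calling a relationship "strong". Correlation alone does not show that one factor causes the other.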
Key Visualizations
Life expectancy trends over time
Correlation heatmaps
Distribution plots of major variables
Country-wise comparisons
Future Improvements
Build a machine learning model to predict life expectancy
Perform feature importance analysis
Add interactive dashboards (Power BI / Tableau)
Apply advanced statistical analysis
Author
Gurkirat Bains, Aspiring Data Scientist
If you like this project
Give it a ⭐ on GitHub — it helps a lot and keeps me motivated!
Posted Feb 25, 2026