COVID-19 Infection & Fatality Risk Modeling in New York State

Chaya Chaipitakporn


📌 Project Type: Academic Research – Environmental & Epidemiological Modeling
📍 Published in: Science of the Total Environment (STOTEN)
👨‍🔬 Role: Co-Author – Stepwise Regression & Data Support

🧩 Project Summary

This study examined how demographics and air quality influenced COVID-19 infection and fatality rates across New York State counties during the pandemic's first wave. Infection and death rates were highest near NYC, while the fatality rate (deaths per infection) was paradoxically higher in rural areas.
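To make the distinction between the three outcome measures concrete, here is a minimal sketch. The two counties and all of their counts are hypothetical numbers chosen only for illustration, and the per-100,000 normalization is an assumption, not necessarily the one used in the paper.

```python
def per_100k(count, population):
    """Rate per 100,000 residents (assumed normalization)."""
    return 1e5 * count / population


def case_fatality_ratio(deaths, cases):
    """Deaths per confirmed infection."""
    return deaths / cases


# Hypothetical counties: the figures below are made up for illustration.
counties = {
    "urban_example": {"population": 1_000_000, "cases": 20_000, "deaths": 600},
    "rural_example": {"population": 50_000, "cases": 200, "deaths": 12},
}

summary = {}
for name, c in counties.items():
    summary[name] = {
        "infection_rate": per_100k(c["cases"], c["population"]),
        "death_rate": per_100k(c["deaths"], c["population"]),
        "cfr": case_fatality_ratio(c["deaths"], c["cases"]),
    }

for name, row in summary.items():
    print(name, row)
```

In this made-up example the urban county has the higher infection and death rates per capita, while the rural county has the higher deaths-per-infection ratio, which is the pattern the summary describes.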

🧪 My Technical Contributions

✅ Stepwise Regression Modeling
Applied forward selection and backward elimination techniques to identify statistically significant predictors (demographic & environmental) for COVID-19 infection and fatality.
Helped determine which features most improved model accuracy (e.g., PM2.5, population age, distance to epicenter).
✅ Data Wrangling & Cleaning
Merged multiple datasets (census, pollution, and epidemiological data) at the county level.
Preprocessed variables for model readiness: normalization, missing value handling, and encoding.
✅ Feature Impact Interpretation
Analyzed how variable inclusion/exclusion altered regression accuracy and output.
Supported result validation to ensure models aligned with observed cluster behaviors.
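As a rough illustration of the forward-selection half of the stepwise approach described above, here is a minimal NumPy sketch. It is not the study's code: the data are synthetic, the feature names (`pm25`, `pct_over_65`, `distance_to_nyc`, `noise`) are hypothetical, and AIC is used as the inclusion criterion as an assumption (p-value thresholds are another common choice).

```python
import numpy as np


def ols_aic(X, y, cols):
    """Fit OLS with an intercept on the given columns; return AIC up to a constant."""
    n = len(y)
    design = np.column_stack([np.ones(n), X[:, np.asarray(cols, dtype=int)]])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]


def forward_select(X, y, names):
    """Greedily add the feature that lowers AIC most; stop when nothing improves it."""
    selected, remaining = [], list(range(X.shape[1]))
    best_aic = ols_aic(X, y, selected)  # intercept-only baseline
    while remaining:
        aic, j = min((ols_aic(X, y, selected + [j]), j) for j in remaining)
        if aic >= best_aic:
            break
        best_aic = aic
        selected.append(j)
        remaining.remove(j)
    return [names[j] for j in selected]


# Synthetic, standardized county-level features (62 rows, one per NYS county).
rng = np.random.default_rng(0)
names = ["pm25", "pct_over_65", "distance_to_nyc", "noise"]
X = rng.standard_normal((62, len(names)))
# Made-up outcome: driven by pm25 and distance_to_nyc only.
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.5 * rng.standard_normal(62)

selected = forward_select(X, y, names)
print(selected)
```

Backward elimination follows the same pattern in reverse: start from the full feature set and repeatedly drop the feature whose removal improves the criterion most.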

🔧 Techniques Used

🔍 Key Insights from the Study

PM2.5 and distance to NYC were major predictors of infection spread
Fatality was more associated with elderly population and long-term pollution exposure
Spatial and demographic segmentation is crucial for targeted public health response
Model interpretability helped explain why certain rural areas had high fatality despite low infection

✅ Relevance to Freelance Work

This project shows my ability to:
Select and justify data modeling techniques
Build and interpret multivariate regression models
Understand feature importance & business impact
Handle real-world public datasets at scale
Applicable to:
Health analytics, churn modeling, marketing attribution, and KPI driver analysis

🛠️ Tools & Skills Demonstrated

Python • pandas • stepwise regression (forward/backward) • multivariate analysis • data cleaning • clustering analysis • scientific writing

Posted Apr 18, 2025

Modeled COVID-19 risks in NY using regression and data analysis.
