Data Engineering & Analytics for Advanced Insights

Andrew Chauzov

Objective: Enhance the quality, granularity, and applicability of football player data for diverse modeling and analytical purposes.
Performed correlation analysis and feature engineering to select the 30 most influential features for each player position.
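As a rough illustration of per-position feature selection, the sketch below ranks features by the absolute Pearson correlation with a target score and keeps the top k for each position. The column names (`position`, `score`), the function names, and the choice of Pearson correlation are assumptions made for the example, not details taken from the project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def top_features_by_position(rows, target="score", position_key="position", k=30):
    """Return {position: [feature, ...]} with the k features most correlated with `target`.

    `rows` is a list of dicts sharing the same keys; every key other than
    `target` and `position_key` is treated as a numeric feature (assumption).
    """
    by_position = {}
    for row in rows:
        by_position.setdefault(row[position_key], []).append(row)

    selected = {}
    for position, group in by_position.items():
        features = [c for c in group[0] if c not in (target, position_key)]
        scores = [r[target] for r in group]
        # Rank by absolute correlation so strong negative relationships also count.
        ranked = sorted(
            features,
            key=lambda f: abs(pearson([r[f] for r in group], scores)),
            reverse=True,
        )
        selected[position] = ranked[:k]
    return selected
```

Ranking within each position separately lets, for example, goalkeepers and forwards end up with different feature sets.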
Developed a four-phase data engineering pipeline comprising initial cleaning, feature selection, multi-level imputation, and feature scaling based on score correlations.
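One plausible reading of "multi-level imputation" is a fallback hierarchy: fill a missing value with the player's own mean, then the position mean, then the global mean. The sketch below implements that reading only; the level names (`player_id`, `position`) and the use of means are assumptions, and the project's actual imputation rules may differ.

```python
from statistics import mean


def impute_multilevel(rows, feature, levels=("player_id", "position")):
    """Fill None values in `feature` using the most specific group mean available.

    Levels are tried in order (most specific first); if no group at any level
    has observed data, the global mean is used. Returns new row dicts.
    """
    # Precompute the mean of observed values for every group at every level.
    group_means = []
    for level in levels:
        buckets = {}
        for row in rows:
            if row[feature] is not None:
                buckets.setdefault(row[level], []).append(row[feature])
        group_means.append({key: mean(values) for key, values in buckets.items()})

    observed = [row[feature] for row in rows if row[feature] is not None]
    global_mean = mean(observed) if observed else None

    filled = []
    for row in rows:
        value = row[feature]
        if value is None:
            for level, means in zip(levels, group_means):
                if row[level] in means:
                    value = means[row[level]]
                    break
            else:
                # No level had data for this row's groups.
                value = global_mean
        filled.append({**row, feature: value})
    return filled
```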
Implemented biannual data segmentation to integrate player data from leagues with different schedules, deployed the pipeline on a GCP server with results stored in BigQuery, and reprocessed three large data sources simultaneously in under one day.
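Biannual segmentation can be sketched as bucketing every record into a half-year period, so that leagues running on a calendar-year schedule and leagues running autumn to spring share the same time axis. The boundaries used here (January to June, July to December), the `match_date` key, and the function names are assumptions for illustration.

```python
from datetime import date


def half_year_label(d):
    """Map a date to a half-year label such as '2023-H1' (Jan-Jun) or '2023-H2' (Jul-Dec)."""
    half = 1 if d.month <= 6 else 2
    return f"{d.year}-H{half}"


def segment_biannually(records, date_key="match_date"):
    """Group records (dicts holding a `date` under `date_key`) by half-year label."""
    segments = {}
    for record in records:
        segments.setdefault(half_year_label(record[date_key]), []).append(record)
    return segments
```

Records from different leagues that fall in the same half-year land in the same segment, regardless of where each league's season starts or ends.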
Outcome: Enhanced player data quality, improving model quality by approximately 75% and streamlining the generation of time series features.

Posted Jan 14, 2024

Improved football player data quality and boosted model quality by approximately 75% with an efficient data engineering pipeline.

DTW-Based Hierarchical Clustering for FMCG Sales Time Series
ML Development & Engineering for Football Team-Player Matching
ML Development & Engineering for Child Speech Defect Detection
ML Development & Engineering for Hong Kong Horse Races