Data Engineering & Analytics for Advanced Insights
Andrew Chauzov
Data Scientist
Data Analyst
Data Engineer
Google BigQuery
Google Cloud Platform
Python
Objective: Improve the quality, granularity, and applicability of football player data for diverse modeling and analytical purposes.
Performed correlation analysis and feature engineering to identify and select the 30 most influential features for each player position.
Developed a four-phase data engineering pipeline comprising initial cleaning, feature selection, multi-level imputation, and feature scaling based on score correlations.
Implemented biannual data segmentation to integrate player data from leagues with different schedules. Deployed the pipeline on a GCP server with results stored in BigQuery, reprocessing three large data sources simultaneously in under one day.
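As an illustration of the feature selection and multi-level imputation steps described above, here is a minimal pandas sketch. It is not the project's actual code: the function names, the column names (`position`, `league`, `score`), the use of absolute Pearson correlation as the ranking criterion, and the choice of group medians for imputation are all assumptions made for the example.

```python
import pandas as pd


def top_k_features(df, target, group_col, k=30):
    """Per group, rank numeric features by absolute correlation with `target`.

    Hypothetical helper: assumes `group_col` is non-numeric (e.g. a position
    label) and that `target` is a numeric score column.
    """
    selected = {}
    for group, part in df.groupby(group_col):
        numeric = part.select_dtypes("number").drop(columns=[target])
        # Constant columns yield NaN correlations and are dropped.
        corr = numeric.corrwith(part[target]).abs().dropna()
        selected[group] = corr.sort_values(ascending=False).head(k).index.tolist()
    return selected


def multi_level_impute(df, cols, levels):
    """Fill gaps from the narrowest grouping first, then broader ones.

    `levels` is a list of grouping keys ordered from most to least specific;
    anything still missing afterwards falls back to the global median.
    """
    out = df.copy()
    for keys in levels:
        group_median = out.groupby(keys)[cols].transform("median")
        out[cols] = out[cols].fillna(group_median)
    out[cols] = out[cols].fillna(out[cols].median())
    return out
```

A call such as `top_k_features(players, "score", "position", k=30)` would return one feature list per position, and `multi_level_impute(players, cols, [["position", "league"], ["position"]])` would try position-and-league medians before falling back to position-only and then global medians.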
Outcome: Enhanced player data quality, improving model quality by approximately 75% and streamlining the generation of time-series features.
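The biannual segmentation mentioned above can be sketched as follows. This is only one possible reading: it assumes "biannual" means calendar half-years (January to June, July to December), and the column names `match_date` and `player_id` are hypothetical. Placing every league on the same half-year grid is what makes records from different season calendars comparable and gives time-series features a common index.

```python
import pandas as pd


def add_half_year(df, date_col="match_date"):
    """Label each row with a calendar half-year such as '2023-H1'."""
    out = df.copy()
    dates = pd.to_datetime(out[date_col])
    # Assumption: H1 covers January-June, H2 covers July-December.
    half = (dates.dt.month > 6).astype(int) + 1
    out["period"] = dates.dt.year.astype(str) + "-H" + half.astype(str)
    return out


def aggregate_by_period(df, value_cols, player_col="player_id"):
    """Average the chosen metrics per player and half-year period."""
    return df.groupby([player_col, "period"], as_index=False)[value_cols].mean()
```

Calling `aggregate_by_period(add_half_year(matches), ["goals"])` would produce one row per player and half-year, regardless of which league the underlying matches came from.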