This project is an end-to-end ETL pipeline that ingests retail datasets from Kaggle, validates and cleans them, performs feature engineering, and loads the structured results into a SQL Server database. It also maintains a data quality observability layer that tracks dataset health across runs.
The pipeline is designed to simulate real-world production workflows used in retail analytics and data engineering systems.
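
To make the stages concrete, here is a minimal sketch of the flow described above, not the project's actual code. The column names (`order_id`, `quantity`, `unit_price`), the sample rows, the `sales` table, and every function name are hypothetical, and an in-memory SQLite database stands in for SQL Server so the example runs with the standard library alone.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical sample rows; the real pipeline ingests retail datasets from Kaggle.
RAW_ROWS = [
    {"order_id": "1", "quantity": "2", "unit_price": "9.50"},
    {"order_id": "2", "quantity": "", "unit_price": "4.00"},    # missing quantity
    {"order_id": "3", "quantity": "1", "unit_price": "-3.00"},  # negative price
    {"order_id": "4", "quantity": "3", "unit_price": "2.00"},
]


def validate_and_clean(rows):
    """Keep rows that parse cleanly and have quantity > 0 and unit_price >= 0."""
    clean = []
    for row in rows:
        try:
            order_id = int(row["order_id"])
            quantity = int(row["quantity"])
            unit_price = float(row["unit_price"])
        except (KeyError, ValueError):
            continue  # reject rows with missing or unparseable fields
        if quantity <= 0 or unit_price < 0:
            continue  # reject rows that fail the assumed range checks
        clean.append(
            {"order_id": order_id, "quantity": quantity, "unit_price": unit_price}
        )
    return clean


def add_features(rows):
    """Feature engineering: derive a line total for each row."""
    return [
        {**r, "line_total": round(r["quantity"] * r["unit_price"], 2)} for r in rows
    ]


def quality_report(raw_rows, clean_rows):
    """Summarize one run so dataset health can be compared across runs."""
    total, kept = len(raw_rows), len(clean_rows)
    return {
        "run_at": datetime.now(timezone.utc).isoformat(),
        "rows_in": total,
        "rows_kept": kept,
        "rows_rejected": total - kept,
        "reject_rate": (total - kept) / total if total else 0.0,
    }


def load(conn, rows):
    """Load cleaned rows into the (hypothetical) sales table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "order_id INTEGER PRIMARY KEY, quantity INTEGER, "
        "unit_price REAL, line_total REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (:order_id, :quantity, :unit_price, :line_total)",
        rows,
    )
    conn.commit()


def run_pipeline(raw_rows, conn):
    """Validate, clean, engineer features, load, and return the quality report."""
    clean = add_features(validate_and_clean(raw_rows))
    load(conn, clean)
    return quality_report(raw_rows, clean)
```

Calling `run_pipeline(RAW_ROWS, sqlite3.connect(":memory:"))` loads the two valid rows and returns a report with `rows_rejected` equal to 2. Against a real SQL Server instance, the connection and the SQL dialect would need to change to match whichever driver the project uses.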