Developed and deployed Azure Data Factory pipelines to ingest, transform, and load complex datasets into the data lake (ADLS Gen2) via REST APIs.
Implemented real-time monitoring tools to ensure seamless data flow and identify bottlenecks.
Explored feature extraction approaches to support the development of an anomaly detection model.
Built a proof of concept with Apache Spark and Great Expectations to demonstrate the feasibility of a data validation framework.
Worked with stakeholders to understand their needs and translated their feedback into actionable requirements.
Standardized documentation procedures across projects, facilitating knowledge transfer for future developers.