Built and maintained config-driven data pipeline frameworks (Scala, Python) for efficient data ingestion, transformation, and loading into the catalog zone, ensuring adherence to data lake governance standards.
Automated big data pipelines (Python, SQL, Hive, Presto, Spark) with Kubernetes, Docker, and Airflow to enable self-service reporting.
Defined data lake standards and evaluated emerging tools to improve platform capabilities.
Mentored junior engineers, fostered a collaborative team culture, and kept projects on track through regular knowledge sharing.
Translated stakeholder business requirements into technical solutions, including reusable frameworks and platforms.
Enabled a 25-person data lake support team to independently operate 4,000+ pipelines across markets.