Developed data pipelines using Scala Spark and PySpark on GCP to ingest, transform, and load terabytes of data for near real-time analytics.
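A minimal PySpark sketch of such an ingest-transform-load flow; the GCS bucket paths and field names (`event_time`, `event_date`) are hypothetical placeholders, not the actual pipeline:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-ingest").getOrCreate()

# Ingest raw JSON events from a GCS landing bucket (paths are hypothetical).
raw = spark.read.json("gs://example-landing-bucket/events/*.json")

# Transform: parse timestamps, drop malformed rows, derive a partition column.
events = (
    raw.withColumn("event_ts", F.to_timestamp("event_time"))
       .filter(F.col("event_ts").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load into a partitioned dataset for downstream near-real-time queries.
(events.write
       .mode("append")
       .partitionBy("event_date")
       .parquet("gs://example-curated-bucket/events/"))
```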
Created data models to organize and structure pipeline data for analytics use.
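One way such a data model can be expressed in Spark is an explicit schema; the `transaction_schema` fields below are illustrative assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import (
    StructType, StructField, StringType, DoubleType, TimestampType
)

spark = SparkSession.builder.getOrCreate()

# Explicit schema acting as a lightweight data model: names, types, and
# nullability are enforced at read time instead of being inferred.
transaction_schema = StructType([
    StructField("transaction_id", StringType(), False),
    StructField("account_id", StringType(), False),
    StructField("amount", DoubleType(), True),
    StructField("event_ts", TimestampType(), True),
])

txns = spark.read.schema(transaction_schema).json("gs://example-landing-bucket/txns/")
```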
Improved Spark job performance by 250% by optimizing resource allocation, refining query execution plans, and leveraging built-in Spark features, yielding notable reductions in processing time and resource usage.
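A sketch of representative tuning levers (adaptive query execution, shuffle parallelism, executor sizing, broadcast joins); the specific values and table paths are illustrative, not the actual production settings:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Tuning knobs: adaptive execution, right-sized shuffle parallelism,
# and explicit executor resources (values are illustrative).
spark = (
    SparkSession.builder
    .appName("tuned-job")
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.shuffle.partitions", "400")
    .config("spark.executor.memory", "8g")
    .config("spark.executor.cores", "4")
    .getOrCreate()
)

facts = spark.read.parquet("gs://example-curated-bucket/events/")
dims = spark.read.parquet("gs://example-curated-bucket/dim_accounts/")

# Broadcasting the small dimension table avoids shuffling the large side.
joined = facts.join(F.broadcast(dims), "account_id")
```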
Collaborated with the data science team to integrate machine learning models into data pipelines for fraud detection.
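A sketch of scoring a pre-trained Spark ML pipeline inside a batch job; the model path, input data, and the `prediction == 1.0` fraud convention are assumptions:

```python
from pyspark.ml import PipelineModel
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Load a pre-trained Spark ML pipeline and score incoming transactions;
# rows predicted as class 1.0 are treated as suspected fraud here.
model = PipelineModel.load("gs://example-models/fraud_detector")

txns = spark.read.parquet("gs://example-curated-bucket/transactions/")
scored = model.transform(txns)

suspicious = scored.filter(scored.prediction == 1.0)
suspicious.write.mode("append").parquet("gs://example-curated-bucket/fraud_alerts/")
```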
Implemented robust data validation checks, cleansing procedures, and quality control measures to ensure data accuracy and integrity.
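A minimal example of the kind of validation gate this describes, assuming hypothetical column names and thresholds:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
txns = spark.read.parquet("gs://example-curated-bucket/transactions/")

# Quality gates: required fields present, amounts in a sane range, no duplicates.
validated = (
    txns.filter(F.col("transaction_id").isNotNull())
        .filter(F.col("amount").between(0, 1_000_000))
        .dropDuplicates(["transaction_id"])
)

# Quarantine anything the checks rejected for later inspection.
rejected = txns.subtract(validated)
rejected.write.mode("append").parquet("gs://example-curated-bucket/quarantine/")
```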
Developed efficient SQL queries and integrated table data into UI dashboards for real-time visualization and interactive exploration.
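A small Spark SQL sketch of a dashboard-feeding aggregation; the table and column names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.read.parquet("gs://example-curated-bucket/events/").createOrReplaceTempView("events")

# Pre-aggregated hourly counts feeding a dashboard tile.
hourly = spark.sql("""
    SELECT date_trunc('HOUR', event_ts) AS event_hour,
           event_type,
           COUNT(*)                     AS event_count
    FROM events
    GROUP BY 1, 2
    ORDER BY 1 DESC
""")
hourly.write.mode("overwrite").parquet("gs://example-curated-bucket/dashboard/hourly_events/")
```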
Utilized Airflow DAGs to efficiently manage and orchestrate complex data pipelines.
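A skeletal Airflow DAG showing the orchestration pattern; the task commands are placeholders rather than the real job invocations:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Minimal DAG wiring ingest -> validate -> load as sequential tasks.
with DAG(
    dag_id="hourly_events_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    ingest = BashOperator(task_id="ingest", bash_command="echo ingest")
    validate = BashOperator(task_id="validate", bash_command="echo validate")
    load = BashOperator(task_id="load", bash_command="echo load")

    ingest >> validate >> load
```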
Migrated applications from the on-premises Hadoop ecosystem to GCP, saving $4,500 per month and improving performance by 2.5x.
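Because the GCS connector resolves `gs://` URIs directly, much of such a migration reduces to re-pointing storage paths while the job logic stays unchanged; a simplified illustration with hypothetical paths:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Before migration: reading from the on-premises HDFS cluster.
# df = spark.read.parquet("hdfs://namenode:8020/data/events/")

# After migration: the GCS connector resolves gs:// URIs directly,
# so only the storage URI moves.
df = spark.read.parquet("gs://example-curated-bucket/events/")
```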