Optimizing Data Warehousing for Fortune 500 Companies

Ankit B

Data Modelling Analyst
Data Scientist
Data Engineer
Google Cloud Platform

• Authenticate with OAuth and call REST APIs to retrieve client data in JSON format.

• Develop Python scripts that extract data and organize it in cloud storage for incremental and full loads, using cloud infrastructure and Hadoop.

• Manage the codebase on GitHub for version control and maintain documentation in Confluence.

• Set up Fivetran connections to run ETL processes from storage buckets into the cloud data warehouse.

• Collaborate on data transformations within Snowflake, optimizing query performance for analytics.

• Use dbt for data modeling and transformations in both Dev and Prod environments.
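
The incremental and full load split mentioned above can be illustrated with a minimal sketch. It assumes each record carries an `updated_at` timestamp and that the previous run's high-water mark is stored somewhere; the function names, the record shape, and the storage path layout are hypothetical and are not taken from the actual pipeline.

```python
from datetime import datetime, timezone


def select_for_load(records, mode, watermark=None):
    """Pick the records to write for a 'full' or 'incremental' load.

    Returns (selected_records, new_watermark). The watermark is the latest
    `updated_at` value loaded so far (an assumed convention).
    """
    if mode not in ("full", "incremental"):
        raise ValueError(f"unknown load mode: {mode!r}")
    if mode == "full" or watermark is None:
        # A full load, or a first incremental run, takes everything.
        selected = list(records)
    else:
        # An incremental load takes only records changed since the last run.
        selected = [r for r in records if r["updated_at"] > watermark]
    # Keep the old watermark when nothing new was selected.
    new_watermark = max((r["updated_at"] for r in selected), default=watermark)
    return selected, new_watermark


def storage_key(dataset, mode, load_date):
    """Build a hypothetical object path that keeps load types separate."""
    return f"{dataset}/{mode}/load_date={load_date:%Y-%m-%d}/data.json"
```

Keeping full and incremental output under separate prefixes is one possible layout; it lets a downstream loader reprocess a full snapshot without touching the incremental files.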

● Designed, built, and launched new ingestion pipelines for the Enterprise Data Warehouse and mentored others on efficient Python ETL implementation, data quality checks, and unit tests, reducing job failures and deployment failures by 50%.

● Defined and managed SLAs for all data sets in allocated areas of ownership, providing technical documentation, data flow diagrams, and version control, and standardizing intake processes.

● Worked with product and finance teams to develop models attributing spend to ad performance, clicked bookings, and exposed bookings, supporting advertiser billing reports and campaign performance reporting.

● Participated in on-call rotations, responded to ad hoc data requests, and conducted analysis to provide valuable insights to partner groups, including sales, product, operations, and finance.
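
As an illustration of the data quality checks mentioned above, here is a minimal sketch of a row-level validator that a pipeline could run before loading a batch. The column names (`booking_id`, `spend`) and the function name are hypothetical; the actual checks used in the pipelines are not described here.

```python
def run_quality_checks(rows, required, unique_key):
    """Return a list of human-readable failures for a batch of dict rows.

    Checks two common rules: required columns must not be null, and the
    unique key must not repeat within the batch.
    """
    failures = []
    seen = set()
    for i, row in enumerate(rows):
        # Rule 1: required columns must be present and non-null.
        for col in required:
            if row.get(col) is None:
                failures.append(f"row {i}: {col} is null")
        # Rule 2: the unique key must not repeat (null keys are skipped here
        # because rule 1 already reports them when the key is required).
        key = row.get(unique_key)
        if key is not None:
            if key in seen:
                failures.append(f"row {i}: duplicate {unique_key}={key!r}")
            seen.add(key)
    return failures
```

A job could fail fast when the returned list is non-empty, and the same function is easy to cover with unit tests because it takes plain dictionaries and has no side effects.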

• Developed and implemented a scalable marketing data warehouse, dashboards, and data insights using in-house data ingestion and transformation systems built primarily on SQL Server, a cloud platform, and a cloud data warehouse.

• Orchestrated big data pipelines (Hive, Presto, Spark) using Kubernetes, Docker, and Airflow, and set up continuous integration with Terraform, Jenkins, and Git, enabling self-service reporting.

• Worked closely with business and data science teams to productionize their statistical ads models, such as first-touch, last-touch, and multi-touch attribution, for Salesforce Campaign Performance Reports.

• Ensured that source control, documentation, unit testing, and quality assurance processes were established, implemented, and followed to maintain high data integrity and data governance.

• Took end-to-end ownership of data integration and reporting solutions for billing, invoicing, Salesforce, and finance teams, implementing SOX and GDPR compliance and other operational workflows for BI projects.

• Defined and managed SLAs for data sets in allocated areas of ownership and implemented Correction of Errors (CoE) actions based on business requirements.
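
The first-touch, last-touch, and multi-touch attribution models mentioned above can be sketched as follows. This is a simplified illustration, not the production models: the multi-touch case is assumed to be a linear (equal-share) split, which is only one of several possible weightings, and the channel names and function name are hypothetical.

```python
from collections import defaultdict


def attribute(touchpoints, revenue, model="linear"):
    """Split `revenue` across an ordered list of channel touchpoints.

    model: 'first'  -> all credit to the first touch
           'last'   -> all credit to the last touch
           'linear' -> equal credit to every touch (assumed multi-touch rule)
    Returns a dict mapping channel name to credited revenue.
    """
    if not touchpoints:
        return {}
    credit = defaultdict(float)
    if model == "first":
        credit[touchpoints[0]] += revenue
    elif model == "last":
        credit[touchpoints[-1]] += revenue
    elif model == "linear":
        share = revenue / len(touchpoints)
        for channel in touchpoints:
            # A channel that appears twice in the path earns two shares.
            credit[channel] += share
    else:
        raise ValueError(f"unknown attribution model: {model!r}")
    return dict(credit)
```

For a path such as `["email", "search", "email", "display"]`, the linear rule credits email with half the revenue because it accounts for two of the four touches, while the first-touch and last-touch rules give everything to email and display respectively.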

