Data Engineering Pipeline Development
Grant Fraley
Starting at $100/hr
About this service
What's included
Automated Data Pipelines
I will provide code that sources data from APIs or external systems into a data lake (e.g., AWS S3 or Google Cloud Storage), loads the relevant data into a warehouse (e.g., BigQuery, Redshift, or Snowflake), and performs the transformations and analytics needed to create production-ready tables. Pipelines may be written in SQL and Python and automated with workflow tools such as dbt, Airflow, or AWS Step Functions. These automated workflows can be triggered on a schedule or by events such as new data arriving, as in the sketch below.
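As a rough illustration of this pattern (not a specific deliverable), here is a minimal sketch of a daily pipeline as an Airflow DAG, assuming Airflow 2.x, requests, and boto3; the API endpoint, bucket, and table names are placeholders.

```python
# A minimal sketch of the pattern described above, assuming Airflow 2.x,
# requests, and boto3. The API endpoint, bucket, and table names are
# placeholders, not a specific client deliverable.
import json
from datetime import datetime

import boto3
import requests
from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def api_to_warehouse():
    @task
    def extract_to_lake(ds=None):
        # Pull one day of records from the source API into the data lake.
        resp = requests.get(
            "https://api.example.com/orders", params={"date": ds}, timeout=30
        )
        resp.raise_for_status()
        key = f"raw/orders/{ds}.json"
        boto3.client("s3").put_object(
            Bucket="example-data-lake", Key=key, Body=json.dumps(resp.json())
        )
        return key

    @task
    def load_to_warehouse(key: str):
        # In practice this step runs a COPY (Redshift/Snowflake) or a
        # BigQuery load job against the file written above.
        print(f"would load s3://example-data-lake/{key} into raw.orders")

    load_to_warehouse(extract_to_lake())


api_to_warehouse()
```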
Data Warehouse
Based on a review of your business requirements, I'll apply data engineering best practices to design a data model that answers a wide range of questions and supports ad-hoc analytic queries. The infrastructure and schemas are created through code so that they are repeatable and maintainable, and I can work within your existing database systems and cloud accounts or create new, secure environments.
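As one illustration of schema-as-code, the sketch below creates a dataset and a fact table idempotently with the google-cloud-bigquery client; the project, dataset, and column names are hypothetical.

```python
# A minimal sketch of schema-as-code, assuming BigQuery and the
# google-cloud-bigquery client; project, dataset, and columns are
# hypothetical. The same approach works via DDL on Redshift or Snowflake.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

# Create the dataset idempotently so the script is safe to re-run.
client.create_dataset("analytics", exists_ok=True)

# One fact table of a star schema; dimension tables follow the same pattern.
schema = [
    bigquery.SchemaField("order_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("customer_key", "INT64"),
    bigquery.SchemaField("order_date", "DATE"),
    bigquery.SchemaField("revenue", "NUMERIC"),
]
table = bigquery.Table("example-project.analytics.fct_orders", schema=schema)
client.create_table(table, exists_ok=True)
```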
Data Lake
Drawing on my experience designing petabyte-scale data lakes, I can work with you to design a cloud storage layout and choose file formats that make your data easy to test and to load into your data warehouse.
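For example, a date-partitioned Parquet layout like the sketch below (using pandas with the pyarrow engine; the path and columns are illustrative) lets the warehouse load, and tests read, one partition at a time.

```python
# A minimal sketch of a date-partitioned Parquet layout, assuming pandas
# with the pyarrow engine; the path and columns are illustrative. The same
# layout applies under s3:// or gs:// prefixes in a cloud data lake.
import pandas as pd

events = pd.DataFrame(
    {
        "event_date": ["2024-01-01", "2024-01-01", "2024-01-02"],
        "user_id": [101, 102, 101],
        "amount": [9.50, 12.00, 3.25],
    }
)

# Writes data-lake/events/event_date=2024-01-01/part-*.parquet, etc., so
# downstream jobs can target a single day's files.
events.to_parquet("data-lake/events/", partition_cols=["event_date"], engine="pyarrow")
```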