Customer Data Hub-Hadoop: Express

Dil Gurung

Created and updated ETL scripts for the Customer Data Hub (CDH). The project, named Loyalty Relaunch, tracked customers' loyalty status.
The client wanted a customer loyalty data pipeline added to their ETL architecture for reporting purposes.
Data sources drop customer-related files on an SFTP server. A Bash ETL process reads the files, loads the data into the Hive stage area, applies the necessary transformations (data cleansing, repair, lookup, deduplication, etc.), and then populates the gold and smith layers of Hive on Google Cloud in turn. A script then exports the output file to an SFTP server, from which it is read and loaded into Teradata. Operational reports were built on the Teradata warehouse.
Handled Jira tickets (bug fixes) during development.
Ran batches in Control-M to test the ETL process.
Technologies used: BigQuery, Teradata, Scala, Jira, Bash Scripting.
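
The stage-area transformations described above (cleansing, lookup, deduplication) can be illustrated with a minimal Python sketch. This is not the project's actual code, which ran as Bash and Hive jobs; the field names (`customer_id`, `email`, `tier_code`, `updated_at`), the `LOYALTY_TIERS` lookup table, and the "latest record wins" deduplication rule are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical lookup table mapping tier codes to loyalty tier names.
LOYALTY_TIERS: Dict[str, str] = {"1": "Base", "2": "Silver", "3": "Gold"}


def cleanse(record: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Normalize column names and trim whitespace; treat missing values as empty."""
    return {key.strip().lower(): (value or "").strip() for key, value in record.items()}


def transform(
    records: Iterable[Dict[str, Optional[str]]], tiers: Dict[str, str]
) -> List[Dict[str, str]]:
    """Cleanse rows, resolve the loyalty tier, and keep one row per customer."""
    latest: Dict[str, Dict[str, str]] = {}
    for raw in records:
        rec = cleanse(raw)
        customer_id = rec.get("customer_id", "")
        if not customer_id:
            continue  # assumed rule: rows without a key are dropped
        rec["email"] = rec.get("email", "").lower()
        # Lookup step: unknown codes fall back to a placeholder value.
        rec["loyalty_tier"] = tiers.get(rec.get("tier_code", ""), "Unknown")
        # Deduplication step: keep the most recently updated row per customer.
        # ISO-8601 date strings compare correctly as plain strings.
        previous = latest.get(customer_id)
        if previous is None or rec.get("updated_at", "") >= previous.get("updated_at", ""):
            latest[customer_id] = rec
    return sorted(latest.values(), key=lambda r: r["customer_id"])
```

In a Hive-based pipeline the same steps would more likely be written as SQL over the stage tables; the sketch only shows the order of operations on in-memory rows.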
Posted Nov 21, 2021

Clients

Express
