The project involved building AWS pipelines to load data from SAP HANA or Oracle into Redshift, using Python and
PySpark to process the data as DataFrames.
Ensured proper data population and data cleansing.
Designed pipelines based on the given requirements.
Performed performance tuning, scheduled pipelines, and developed Glue jobs and Lambda scripts.
Maintained code versioning through a CI/CD pipeline.
Posted Nov 6, 2023
Led the creation of AWS pipelines to load data from Oracle, PostgreSQL, and SAP HANA
databases into Redshift, using Python and PySpark scripts.