Data Engineer

João Paulo Albuquerque

Data Modelling Analyst
Data Engineer
AWS
Python
SQL
Developed data pipelines on Azure Databricks, following a medallion architecture in a Data Lakehouse. The pipelines were orchestrated by Databricks Workflows and written mainly in PySpark and Spark SQL, improving the efficiency of data processing and analysis for a brokerage services company.
Led the design and development of data pipelines supporting OpenFinance, integrating sources such as SQL Server, Oracle DB, NoSQL databases, and APIs using Airflow. The project ensured 100% regulatory compliance, contributed to a transparent financial ecosystem, and cut costs by 30% relative to the initial budget. Enhanced data accessibility for over one million customers.
Participated in the migration of Alteryx ETL pipelines to Python, SQL, and Spark. Responsible for creating internal libraries and APIs, querying the Data Lake via Athena, orchestrating jobs in Airflow, and storing data in Parquet files. This strategic shift saved R$ 120k and made the data lake more robust and mature while reducing processing time. After the migration, the pipelines took 10 minutes to run.