Engineering Analytics Project

Hassan Nadeem

Product Data Analyst

Backend Engineer

Data Engineer

AWS

PostgreSQL

Python

Worked on an internal product that started as a prototype for providing analytics on the performance of software engineers and their teams. For this MVP we used GitHub's data and wrote an importer routine that periodically fetched data and built analytics for engineering teams.
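As a rough illustration of the kind of metric such an importer could produce, here is a minimal Python sketch that computes the median time to merge per author. It is not the project's actual code. It assumes records shaped like GitHub REST API pull request objects (`user.login`, `created_at`, `merged_at`), and the function names and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def parse_ts(value):
    # GitHub returns ISO 8601 timestamps in UTC, e.g. "2023-04-01T10:00:00Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def median_hours_to_merge(pull_requests):
    """Return {author login: median hours from open to merge}."""
    durations = defaultdict(list)
    for pr in pull_requests:
        # Skip pull requests that were never merged.
        if not pr.get("merged_at"):
            continue
        opened = parse_ts(pr["created_at"])
        merged = parse_ts(pr["merged_at"])
        hours = (merged - opened).total_seconds() / 3600
        durations[pr["user"]["login"]].append(hours)
    return {login: median(hours) for login, hours in durations.items()}


# Hypothetical sample data standing in for a fetched batch.
sample_prs = [
    {"user": {"login": "alice"},
     "created_at": "2023-04-01T10:00:00Z", "merged_at": "2023-04-01T14:00:00Z"},
    {"user": {"login": "alice"},
     "created_at": "2023-04-02T09:00:00Z", "merged_at": "2023-04-02T11:00:00Z"},
    {"user": {"login": "bob"},
     "created_at": "2023-04-03T08:00:00Z", "merged_at": None},
]

result = median_hours_to_merge(sample_prs)
```

In a periodic importer, a function like this would run after each fetch, with the results written to the analytics store.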

Later, this evolved into a full-fledged data engineering project. We used Databricks to build data pipelines implementing the medallion (Bronze, Silver, Gold) architecture with Apache Spark, and I was involved in the development of all three layers. Bronze is the layer where data landed "as-is" from all external sources. Silver evolved from the Bronze layer by joining different entities together to provide an "enterprise view". Gold is the layer where we stored "consumption-ready" data in specific databases; its purpose is to provide de-normalized, read-optimized data ready for reporting. To leverage the Gold layer further, we indexed it in Elasticsearch and created a data service for querying this analytics data. This powered some of the core features of the company’s platform. We built these data pipelines for organizational data as well as external data sources.
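The flow between the three layers can be sketched in a few lines. This is a simplified plain-Python stand-in, not the Spark pipelines that ran on Databricks. All table contents, field names, and function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Bronze: records stored "as-is" from each source (hypothetical samples).
bronze_prs = [
    {"id": 1, "author": "alice", "merged": True},
    {"id": 2, "author": "bob", "merged": False},
    {"id": 3, "author": "alice", "merged": True},
]
bronze_members = [
    {"login": "alice", "team": "platform"},
    {"login": "bob", "team": "payments"},
]


def build_silver(prs, members):
    """Join pull requests with team membership into one unified view."""
    team_by_login = {m["login"]: m["team"] for m in members}
    return [
        {
            "pr_id": pr["id"],
            "author": pr["author"],
            # Authors without a membership record fall back to "unknown".
            "team": team_by_login.get(pr["author"], "unknown"),
            "merged": pr["merged"],
        }
        for pr in prs
    ]


def build_gold(silver_rows):
    """Aggregate into de-normalized, read-optimized rows, one per team."""
    totals = defaultdict(lambda: {"prs_opened": 0, "prs_merged": 0})
    for row in silver_rows:
        totals[row["team"]]["prs_opened"] += 1
        totals[row["team"]]["prs_merged"] += int(row["merged"])
    return [{"team": team, **counts} for team, counts in sorted(totals.items())]


silver = build_silver(bronze_prs, bronze_members)
gold = build_gold(silver)
```

Each Gold row is flat and self-contained, which is what makes it suitable for indexing as a single document in a search store such as Elasticsearch.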

Posted Apr 27, 2023

As one of the main developers, I worked on this project from start to finish: writing code and tests, and managing AWS, data pipelines, and data workflow jobs.
