Project: Datahub Deployment and Metadata Automation
Scope:
- Implement and deploy Datahub on Google Cloud Platform (GCP)
- Develop a metadata intake and event automation system
- Create a notification system for metadata events
- Set up infrastructure, testing, and documentation
Description:
The deliverable for this project was an automated deployment of Datahub, a metadata platform, on GCP, instrumented for metadata intake and event processing. The solution includes a multi-tenant setup, automated metadata event handling, and a notification system for users. The project uses Google Kubernetes Engine (GKE), Datahub's Python SDK with its REST Emitter, n8n for workflow automation, and Twilio for notifications.
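To make the intake path concrete, the following is a minimal sketch of how an intake-form submission could be pushed to Datahub through the Python SDK's REST Emitter. It assumes one GMS endpoint per tenant, which this summary does not specify; the tenant names, URLs, and the `emit_intake` helper are hypothetical and only illustrate the idea.

```python
from typing import Dict, Optional

# Hypothetical tenant-to-GMS mapping; the real endpoints are not stated here.
TENANT_GMS: Dict[str, str] = {
    "tenant-a": "http://datahub-gms.tenant-a.svc:8080",
    "tenant-b": "http://datahub-gms.tenant-b.svc:8080",
}


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN in Datahub's standard URN layout."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def gms_url_for(tenant: str) -> str:
    """Resolve the tenant's GMS endpoint, rejecting unknown tenants."""
    try:
        return TENANT_GMS[tenant]
    except KeyError:
        raise ValueError(f"unknown tenant: {tenant}") from None


def emit_intake(tenant: str, platform: str, name: str,
                description: str, token: Optional[str] = None) -> str:
    """Emit a dataset description for one tenant and return the URN."""
    # Imported lazily so the helpers above work without acryl-datahub installed.
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import DatasetPropertiesClass

    urn = dataset_urn(platform, name)
    emitter = DatahubRestEmitter(gms_server=gms_url_for(tenant), token=token)
    emitter.emit(
        MetadataChangeProposalWrapper(
            entityUrn=urn,
            aspect=DatasetPropertiesClass(
                description=description,
                # Tagging the tenant as a custom property is an assumption.
                customProperties={"tenant": tenant},
            ),
        )
    )
    return urn
```

A form handler would call something like `emit_intake("tenant-a", "bigquery", "proj.sales.orders", "Daily order facts")`. Keeping the URN and endpoint helpers separate from the emit call makes the tenant routing testable without a running Datahub instance.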
Key Components:
- GCP Infrastructure Setup
  - Configure GCP project, Kubernetes cluster, and CI/CD pipeline
- Datahub Deployment
  - Deploy multi-tenant Datahub using Helm charts on GCP Kubernetes
- Metadata Intake and Event Automation
  - Develop multi-tenant intake form and implement REST Emitter for metadata events
- Event Reaction and Notification System
  - Set up n8n workflows and Twilio integration for event processing and notification
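As an illustration of the event reaction step, the sketch below shows one way a metadata event could be handed to an n8n workflow through an HTTP webhook, with n8n then taking care of the Twilio message. The webhook URL, the event field names (`tenant`, `event_type`, `urn`), and both helper functions are assumptions made for this example, not details of the delivered system.

```python
import json
import urllib.request
from typing import Any, Dict

# Hypothetical n8n webhook URL; the real workflow endpoint is not stated here.
N8N_WEBHOOK_URL = "https://n8n.example.com/webhook/metadata-event"


def format_sms(event: Dict[str, Any]) -> str:
    """Render a short notification text from an event (field names assumed)."""
    return f"[{event['tenant']}] {event['event_type']}: {event['urn']}"


def build_webhook_request(url: str, event: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request that carries the event and its SMS text."""
    payload = dict(event, message=format_sms(event))
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify(event: Dict[str, Any], url: str = N8N_WEBHOOK_URL) -> int:
    """Send the event to the webhook and return the HTTP status code."""
    with urllib.request.urlopen(build_webhook_request(url, event), timeout=10) as resp:
        return resp.status
```

Building the request separately from sending it keeps the payload format easy to check without network access; the workflow on the n8n side would read `message` and pass it to its Twilio step.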
Outcomes:
- Fully functional Datahub deployment on GCP
- Automated metadata intake and event processing system
- User-friendly notification system for metadata events
- Comprehensive documentation and knowledge transfer to the client's team
The project delivered a scalable solution for metadata management and automation, improving data governance and visibility across the organization.