YouTube Analytics ETL Automation

Milcah Mbithi

📊 YouTube Analytics ETL using Airflow & Python
This project automates the extraction, transformation, and loading (ETL) of YouTube channel data using Python and Apache Airflow. It fetches video statistics from Alex The Analyst's channel and stores them in a PostgreSQL database.
šŸ› ļø Tech Stack
Python
YouTube Data API v3
Apache Airflow (workflow orchestration)
pandas (data processing)
PostgreSQL (data storage)
dotenv (environment variable management)
šŸ“ Project Structure
ā”œā”€ā”€ youtube_extract.py # Extracts video data from YouTube ā”œā”€ā”€ youtube_transform.py # Transforms the data ā”œā”€ā”€ youtube_load.py # Saves data to PostgreSQL ā”œā”€ā”€ dags/youtube_etl_dag.py # Airflow DAG definition ā”œā”€ā”€ .env # Environment variables ā”œā”€ā”€ alex_videos.csv # Raw extracted data ā”œā”€ā”€ youtube_alex_data_transformed.csv # Cleaned/transformed data ā”œā”€ā”€ README.md # Project documentation
šŸ”How It Works
Extract Uses YouTube API to fetch video metadata from a channel playlist.
Transform Processes timestamps, cleans, and structures data.
Load Saves the cleaned dataset to a database.
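The Extract and Transform steps above could look roughly like the sketch below. It is not the project's actual code: the function names `extract_video_ids` and `transform_video`, the injected `fetch_page` callable, and the output column names are all hypothetical, and the transform uses only the standard library where the project uses pandas. The response fields (`nextPageToken`, `contentDetails.videoId`, `snippet.publishedAt`, `statistics.viewCount`) follow the YouTube Data API v3 response format.

```python
from datetime import datetime, timezone
from typing import Callable, Iterator

# Public endpoint for listing the items of a playlist (YouTube Data API v3).
API_URL = "https://www.googleapis.com/youtube/v3/playlistItems"


def extract_video_ids(fetch_page: Callable[[dict], dict], playlist_id: str) -> Iterator[str]:
    """Yield every video ID in a playlist, following nextPageToken.

    fetch_page is a hypothetical callable that takes the query parameters
    and returns the decoded JSON response for one page.
    """
    params = {"part": "contentDetails", "playlistId": playlist_id, "maxResults": 50}
    while True:
        page = fetch_page(params)
        for item in page.get("items", []):
            yield item["contentDetails"]["videoId"]
        token = page.get("nextPageToken")
        if not token:
            break  # last page reached
        params = {**params, "pageToken": token}


def transform_video(item: dict) -> dict:
    """Flatten one videos.list item into a row with typed values."""
    stats = item.get("statistics", {})
    published = datetime.strptime(
        item["snippet"]["publishedAt"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "video_id": item["id"],
        "title": item["snippet"]["title"].strip(),
        "published_at": published,
        # The API returns counts as strings; some may be missing (e.g. hidden likes).
        "views": int(stats.get("viewCount", 0)),
        "likes": int(stats.get("likeCount", 0)),
        "comments": int(stats.get("commentCount", 0)),
    }
```

In a real run, `fetch_page` might wrap an HTTP GET against `API_URL` with the API key from `.env` added to the parameters. Passing it in as an argument keeps the pagination logic testable without network access or quota usage.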
✅ Use Cases
YouTube performance analytics
Content strategy reporting
Automated creator dashboards
📖 Article
Step-by-step write-up coming soon on
Posted Jun 15, 2025

An ETL pipeline that analyses a YouTube channel, using Airflow, the YouTube Data API v3, and Python to extract, process, and load data; insights are visualised in a Grafana dashboard.

Timeline

Apr 15, 2025 - Apr 21, 2025