Real-time Stock Data Streaming with Kafka and Cassandra

Milcah Mbithi

📈 Stock Data Extraction using Apache Kafka, Cassandra & Confluent
This project demonstrates how to extract and stream real-time stock market data using Apache Kafka, process it with Python, and persist it in Apache Cassandra. It leverages Confluent Platform to simplify Kafka setup and management.
🛠️ Tech Stack
Python
Apache Kafka (for real-time data streaming)
Confluent Platform (for easier Kafka management)
Apache Cassandra (NoSQL database for storing stock data)
kafka-python (Python client library for Kafka)
JSON (data format)
📌 Project Structure
├── kafka_producer.py    # Sends stock data to the Kafka topic
├── kafka_consumer.py    # Consumes stock data and inserts it into Cassandra
├── README.md            # Project documentation

How It Works
1. Producer
Reads stock data extracted from the Polygon.io API
Publishes each record to the Kafka topic stock_prices
2. Kafka (via Confluent Platform)
Acts as the message broker between producer and consumer
3. Consumer
Subscribes to the stock_prices topic
Parses stock records and inserts them into Apache Cassandra
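The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the record fields (symbol, timestamp, close, volume), the Cassandra table and column names, the keyspace name stocks, the broker address localhost:9092, and the use of the cassandra-driver package are all assumptions made for the example. Only the topic name stock_prices and the use of kafka-python with JSON come from this README.

```python
import json
from typing import Any, Dict, Iterable, Tuple

TOPIC = "stock_prices"  # topic name from the README


def serialize_record(record: Dict[str, Any]) -> bytes:
    """Encode one stock record as UTF-8 JSON for Kafka."""
    return json.dumps(record).encode("utf-8")


def deserialize_record(raw: bytes) -> Dict[str, Any]:
    """Decode a Kafka message value back into a dict."""
    return json.loads(raw.decode("utf-8"))


# Hypothetical table and columns; the real schema may differ.
INSERT_CQL = (
    "INSERT INTO stock_prices (symbol, ts, close, volume) "
    "VALUES (%s, %s, %s, %s)"
)


def to_row(record: Dict[str, Any]) -> Tuple[str, int, float, int]:
    """Map a parsed record onto the (assumed) table columns."""
    return (
        record["symbol"],
        int(record["timestamp"]),
        float(record["close"]),
        int(record["volume"]),
    )


def publish(records: Iterable[Dict[str, Any]],
            bootstrap_servers: str = "localhost:9092") -> None:
    """Producer side: send each record to the topic."""
    from kafka import KafkaProducer  # kafka-python, imported lazily

    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=serialize_record,
    )
    for record in records:
        producer.send(TOPIC, value=record)
    producer.flush()  # block until buffered messages are delivered
    producer.close()


def consume_into_cassandra(bootstrap_servers: str = "localhost:9092",
                           contact_points: Tuple[str, ...] = ("127.0.0.1",),
                           keyspace: str = "stocks") -> None:
    """Consumer side: read records and insert them into Cassandra."""
    from kafka import KafkaConsumer          # kafka-python
    from cassandra.cluster import Cluster    # cassandra-driver (assumed)

    session = Cluster(list(contact_points)).connect(keyspace)
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=bootstrap_servers,
        value_deserializer=deserialize_record,
        auto_offset_reset="earliest",
    )
    for message in consumer:  # runs until interrupted
        session.execute(INSERT_CQL, to_row(message.value))
```

The serialization and row-mapping helpers are kept separate from the Kafka and Cassandra calls so they can be checked without a running broker or database. With both services up, publish(...) would run in kafka_producer.py and consume_into_cassandra() in kafka_consumer.py.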
✅ Use Cases
Real-time stock price dashboards
Historical stock data warehousing
Real-time analytics with Kafka + Cassandra
Financial ML model pipelines
📖 Article
Want a step-by-step walkthrough? Check out the full write-up on https://dev.to/milcah03/stock-data-extraction-using-apache-kafka-59g0
Posted Jun 15, 2025

A real-time ETL pipeline built with Python, Kafka (Confluent), and Cassandra: it pulls stock data from the Polygon.io API, streams it through Kafka, processes it, and stores it in Cassandra.

Timeline

Apr 1, 2025 - Apr 5, 2025