In this project, I built a live product recommender system that suggests products based on customer preferences, using the Amazon product dataset. The dataset can be downloaded from here.
Technologies Used
Apache Spark
Apache Kafka
Flask
Dask
MongoDB
Architecture
Producer (Flask)
The Flask application serves as the producer. It collects reviewerID and productIDs from the user through an HTML form. These inputs are sent to the consumer for processing.
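As a rough illustration of the producer step, the sketch below turns the two form fields into a message body. It assumes that productIDs arrive as one comma-separated string and that the message is encoded as JSON; neither detail is stated above, and build_message is a hypothetical helper name. In a Flask view the two inputs would typically be read from request.form, and the resulting bytes would be handed to whichever client delivers data to the consumer (Kafka is listed among the technologies, so a Kafka producer is a plausible choice).

```python
import json


def build_message(reviewer_id: str, product_ids_raw: str) -> bytes:
    """Turn raw form fields into a JSON-encoded message body.

    Assumption: product IDs are entered as a comma-separated string.
    """
    reviewer_id = reviewer_id.strip()
    # Split on commas, trim whitespace, and drop empty entries.
    product_ids = [p.strip() for p in product_ids_raw.split(",") if p.strip()]

    if not reviewer_id or not product_ids:
        raise ValueError("reviewerID and at least one productID are required")

    payload = {"reviewerID": reviewer_id, "productIDs": product_ids}
    return json.dumps(payload).encode("utf-8")
```

Validating the form input before sending keeps malformed requests from ever reaching the consumer.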
Consumer (Apache Spark)
The consumer processes the data received from the producer. It uses a machine learning model to generate product recommendations. The recommendations are saved in a MongoDB database.
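The consumer step could look roughly like the sketch below. It assumes the incoming message is a JSON object with reviewerID and productIDs keys, and it takes the model as a plain callable, because the actual model and its interface are not described here. handle_message, dummy_recommend, and the document field names are all hypothetical. The returned dictionary is the kind of document that could be stored with PyMongo's collection.insert_one.

```python
import json
from datetime import datetime, timezone
from typing import Callable, List


def handle_message(
    raw: bytes,
    recommend: Callable[[str, List[str]], List[str]],
) -> dict:
    """Decode one request, run the model, and build the document to store."""
    request = json.loads(raw.decode("utf-8"))
    reviewer_id = request["reviewerID"]
    product_ids = request["productIDs"]

    # The real recommendation model would be called here.
    recommendations = recommend(reviewer_id, product_ids)

    return {
        "reviewerID": reviewer_id,
        "productIDs": product_ids,
        "recommendations": recommendations,
        # Assumed field: a timestamp makes it easy to find the latest entry.
        "created_at": datetime.now(timezone.utc),
    }


def dummy_recommend(reviewer_id: str, product_ids: List[str]) -> List[str]:
    """Placeholder model, for illustration only."""
    return [f"similar-to-{pid}" for pid in product_ids]
```

Passing the model in as a function keeps the message handling testable without loading Spark or a trained model.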
Data Flow
User Input: The user enters reviewerID and productIDs on an HTML page.
Send to Consumer: Flask sends these inputs to the Apache Spark consumer.
Model Processing: The consumer processes the inputs using a recommendation model.
Save to MongoDB: The generated list of recommended products is saved in MongoDB.
Fetch Results: After a 10-second delay, Flask fetches the latest entry from MongoDB.
Display Results: The results are displayed on a result.html page.
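The flow above is asynchronous: the producer hands off the request, the consumer works on it independently, and the producer reads the result back after a fixed wait. The sketch below imitates that pattern with in-memory stand-ins so it runs without any services. A queue.Queue replaces the channel between Flask and the consumer, and a plain list replaces the MongoDB collection; both are assumptions made for illustration, as are the function names.

```python
import queue
import threading
import time
from typing import Callable, List, Optional


def consumer_loop(
    transport: queue.Queue,
    store: list,
    recommend: Callable[[str, List[str]], List[str]],
) -> None:
    """Process requests until a None sentinel arrives."""
    while True:
        request = transport.get()
        if request is None:
            break
        # Model processing, then "save to MongoDB" (here: append to a list).
        store.append({
            "reviewerID": request["reviewerID"],
            "recommendations": recommend(
                request["reviewerID"], request["productIDs"]
            ),
        })


def submit_and_fetch(
    transport: queue.Queue,
    store: list,
    reviewer_id: str,
    product_ids: List[str],
    delay_seconds: float = 10.0,
) -> Optional[dict]:
    """Send one request, wait a fixed delay, then read the latest entry."""
    transport.put({"reviewerID": reviewer_id, "productIDs": product_ids})
    time.sleep(delay_seconds)  # the 10-second wait from the Fetch Results step
    return store[-1] if store else None
```

To try it, start consumer_loop in a threading.Thread, call submit_and_fetch with a short delay, and put None on the queue to stop the consumer. Note that reading "the latest entry" after a fixed delay returns whatever was stored last, so with several concurrent users the result may belong to a different request.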