Sentiment Analysis of IMDB Movie Reviews by Ciro

Case Study: Sentiment Analysis of Movie Reviews (IMDB)

Overview

Implemented an end-to-end machine learning and NLP pipeline that classifies 50,000 IMDB movie reviews as positive or negative. The solution combines TF-IDF vectorization with a Logistic Regression classifier, persists predictions in PostgreSQL, and reviews the prediction distribution and confidence levels in Power BI dashboards.

Challenge

Raw text reviews are noisy and unstructured. The challenge was to clean and transform this dataset into reliable features, build a robust classifier, and integrate predictions into a relational database and BI platform for executive analysis.

Approach

Text Cleaning (NLP): Removed HTML tags, stopwords, and punctuation for high‑quality input.
Vectorization: Applied TF-IDF to convert the cleaned text into numerical features, weighting each term by how distinctive it is across reviews.
Modeling: Trained a Logistic Regression classifier for binary sentiment prediction.
Persistence: Stored predictions securely in PostgreSQL (resultados_prediccion table).
Visualization: Built Power BI dashboards to analyze distribution and confidence levels.
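
The cleaning, vectorization, and modeling steps above can be sketched as follows. This is a minimal illustration that assumes scikit-learn, not the project's actual code: the function names, the four toy reviews, and the max_iter setting are all hypothetical, and stopword removal is delegated to the vectorizer's built-in English list.

```python
import html
import re
import string

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

TAG_RE = re.compile(r"<[^>]+>")
# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_review(text: str) -> str:
    """Strip HTML tags and punctuation, lowercase, and collapse whitespace."""
    text = TAG_RE.sub(" ", text)          # e.g. <br /> tags inside reviews
    text = html.unescape(text)            # decode entities such as &amp;
    text = text.lower().translate(PUNCT_TABLE)
    return " ".join(text.split())


def build_model():
    """TF-IDF features feeding a Logistic Regression classifier."""
    return make_pipeline(
        # stop_words="english" removes stopwords after tokenization.
        TfidfVectorizer(preprocessor=clean_review, stop_words="english"),
        LogisticRegression(max_iter=1000),  # max_iter is an assumed setting
    )


# Toy data for illustration only (1 = positive, 0 = negative).
reviews = [
    "A wonderful, moving film.<br />Great acting!",
    "Great story and wonderful cast.",
    "Terrible plot. Boring and awful.",
    "Awful, boring waste of time.<br /><br />Terrible.",
]
labels = [1, 1, 0, 0]

model = build_model().fit(reviews, labels)

# predict_proba returns [P(negative), P(positive)] for each input review.
proba = model.predict_proba(["wonderful acting, great film"])[0]
confidence = proba.max()
```

Wrapping both steps in one pipeline keeps the cleaning and vocabulary identical at training and prediction time, which matters once predictions are written to a database and compared later.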

Solution

Delivered a scalable sentiment analysis workflow that integrates ML outputs into business systems. The model's predictions were balanced (≈49% positive / 51% negative), with high confidence scores and few ambiguous cases.
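
One way to check balance and ambiguity is to summarize the predicted probabilities directly. The sketch below is hypothetical: the summarize function, the 0.5 decision threshold, and the 0.4 to 0.6 "ambiguous" band are assumptions chosen for illustration, not values reported by the project.

```python
def summarize(probabilities, low=0.4, high=0.6):
    """Summarize P(positive) scores into class shares and an ambiguity share.

    `low` and `high` bound the band treated as ambiguous (assumed values).
    """
    if not probabilities:
        raise ValueError("no predictions to summarize")
    n = len(probabilities)
    positive = sum(1 for p in probabilities if p >= 0.5)
    ambiguous = sum(1 for p in probabilities if low < p < high)
    return {
        "positive_share": positive / n,
        "negative_share": (n - positive) / n,
        "ambiguous_share": ambiguous / n,
    }


# Made-up scores, not project results.
example = summarize([0.9, 0.1, 0.55, 0.8])
```

The same three shares map naturally onto dashboard measures: a class-distribution chart and a count of predictions falling inside the ambiguous band.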

Impact

Accuracy & Balance: Predicted classes closely matched the dataset's class distribution, a sign that the model is not skewed toward either class.
Confidence: Most predictions fell in high‑certainty ranges, reducing ambiguity.
Integration: Results persisted in PostgreSQL and visualized in Power BI for executive decision‑making.
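
The persistence step could look roughly like the sketch below. Only the table name resultados_prediccion comes from this write-up; the column names, the save_predictions helper, and the sample rows are hypothetical. The standard-library sqlite3 module stands in for PostgreSQL so the example runs without a server. With a PostgreSQL driver such as psycopg2, the same function should work with that driver's connection and the default "%s" placeholder.

```python
import sqlite3

# Hypothetical schema: only the table name appears in the case study.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS resultados_prediccion (
    review_id INTEGER PRIMARY KEY,
    sentiment TEXT NOT NULL,
    confidence REAL NOT NULL
)
"""


def save_predictions(conn, rows, placeholder="%s"):
    """Insert (review_id, sentiment, confidence) rows with a parameterized query.

    `placeholder` is the driver's parameter marker: "%s" for psycopg2,
    "?" for sqlite3. Values are always passed separately from the SQL text.
    """
    marks = ", ".join([placeholder] * 3)
    sql = (
        "INSERT INTO resultados_prediccion (review_id, sentiment, confidence) "
        f"VALUES ({marks})"
    )
    cur = conn.cursor()
    cur.executemany(sql, rows)
    conn.commit()
    cur.close()


# In-memory SQLite database used purely as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(CREATE_SQL)
save_predictions(
    conn,
    [(1, "positive", 0.97), (2, "negative", 0.88)],  # made-up rows
    placeholder="?",
)
```

Storing the confidence score next to the label lets the BI layer filter or bucket predictions without re-running the model.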

Posted Jan 2, 2026

Implemented a sentiment analysis pipeline for 50,000 IMDB reviews with ML and NLP.