Subhradip Roy
This project covers feature engineering and designing a deep learning model to predict ratings based on reviews. We use NLP tools for feature extraction and for preparing the data for the deep learning models.

The analysis aimed at a nuanced understanding
of customer sentiments expressed in hotel reviews. The objective was to uncover valuable insights that could inform strategic decisions and enhance customer satisfaction.

Using VADER for sentiment analysis
and Gensim's summarization.keywords module, I extracted the top keywords to provide a foundation for intuitive data visualizations. This initial phase laid the groundwork for a comprehensive analysis of customer sentiments.
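
As a rough illustration of the keyword step, the sketch below ranks words by raw frequency using only the standard library. It is a simplified stand-in, not Gensim's graph-based keywords function, and the sample reviews, the STOP_WORDS set, and the top_keywords helper are all made up for the example. Note that the gensim.summarization module was removed in Gensim 4.0, so the original approach needs an older Gensim release.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for the example.
STOP_WORDS = {"the", "was", "and", "a", "but", "in", "very", "is", "to"}

def top_keywords(texts, n=3):
    """Return the n most frequent non-stop-words across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]

# Invented sample reviews, not taken from the project's dataset.
reviews = [
    "The room was clean and the staff was friendly",
    "Clean room but noisy street",
    "Friendly staff, very clean room",
]
print(top_keywords(reviews))
```

The resulting list can feed a bar chart or word cloud directly, which is the kind of visualization this phase was meant to support.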

Text preprocessing was carried out with NLTK, incorporating lemmatization to standardize words and ensure consistency. The Keras Tokenizer class was employed to vectorize the text corpus, preparing the data for the subsequent deep learning model.

Comparing sentiment
and rating, people who gave 5-star ratings show the highest positive sentiment, whereas reviews with lower ratings show mixed emotions, which may be related to sarcasm.

There is a relationship
between ratings and sentiments: for ratings from 3 to 5, most of the review sentiments are positive.

The most common term
used in all three sentiments was "hotel room". This is fairly obvious, and it points hotel managers to the rooms as the place to focus if they want a better rating from customers.

I used NLTK
for natural language processing and to remove common words and stop words in order to enhance model performance.

I then applied lemmatization
to convert words to their base form, followed by text joining, which turns the comma-separated lemmatized words back into a single string. I then used PorterStemmer to improve the performance metric.
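
The cleaning steps described above (stop-word removal, word normalization, and joining the tokens back into a string) can be sketched without any external dependency. This is only an outline under stated assumptions: the STOP_WORDS set is a tiny hypothetical list rather than NLTK's stop-word corpus, and strip_plural is a toy normalizer standing in for a real lemmatizer or stemmer.

```python
import re

# Hypothetical stop-word list; NLTK's stopwords corpus is far larger.
STOP_WORDS = {"the", "was", "were", "and", "a", "an", "in", "of", "to", "is"}

def preprocess(text, normalize=lambda word: word):
    """Lowercase, tokenize, drop stop words, normalize, and re-join."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    # `normalize` is the plug-in point for a lemmatizer or stemmer.
    normalized = [normalize(t) for t in kept]
    # Text joining: turn the token list back into one string.
    return " ".join(normalized)

def strip_plural(word):
    """Toy normalizer for the demo only; not a real lemmatizer."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

print(preprocess("The rooms were clean and the beds comfortable", strip_plural))
```

With NLTK installed, WordNetLemmatizer().lemmatize or PorterStemmer().stem could be passed as the normalize argument instead; the lemmatizer additionally needs the WordNet corpus to be downloaded first.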

I used the Keras Tokenizer class to vectorize the text corpus, and finally employed the texts_to_sequences method, which converts the tokens of the text corpus into sequences of integers.
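
To show what this vectorization produces, here is a dependency-free imitation of the idea behind the Keras Tokenizer: words are indexed by descending frequency starting at 1, and each text becomes a list of those indices. The fit_word_index helper is a hypothetical name for this sketch (the real class exposes fit_on_texts and texts_to_sequences), and the punctuation handling is simplified compared with Keras.

```python
import re
from collections import Counter

def fit_word_index(texts):
    """Map each word to an integer, most frequent word first, starting at 1."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    ranked = [word for word, _ in counts.most_common()]
    return {word: i for i, word in enumerate(ranked, start=1)}

def texts_to_sequences(texts, word_index):
    """Replace each known word by its index; unknown words are skipped."""
    return [
        [word_index[w] for w in re.findall(r"[a-z']+", text.lower()) if w in word_index]
        for text in texts
    ]

# Invented two-review corpus for the demo.
corpus = ["great room great view", "room small"]
word_index = fit_word_index(corpus)
print(word_index)
print(texts_to_sequences(["great small pool"], word_index))
```

The word "pool" never appeared in the fitted corpus, so it is dropped from the output sequence. Index 0 is left unused, which is what allows it to serve as a padding value when sequences are later brought to a common length.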

I built a Long Short-Term Memory (LSTM) architecture for sentiment prediction in hotel reviews. This deep learning model was fine-tuned and validated, with a focus on visualizing its performance metrics. The resulting model, saved as "BiLSTM.h5", serves as a readily accessible resource for replication and testing, which keeps future analyses easy to run. Here's a glimpse of the model architecture.
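
The exact layer stack is not spelled out in the text, so the following is only a guess at what a model of this kind might look like in Keras. The bidirectional wrapper is inferred from the "BiLSTM.h5" file name, the loss and metric match the ones tracked in this project, and every size (vocab_size, embed_dim, lstm_units, num_classes, the dropout rate) is an assumption, as is the build_bilstm_classifier name.

```python
def build_bilstm_classifier(vocab_size=10000, embed_dim=64,
                            lstm_units=64, num_classes=5):
    """Assemble a small bidirectional LSTM text classifier (all sizes assumed)."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    from tensorflow.keras import layers, models

    model = models.Sequential([
        # Turns each integer token id into a dense vector.
        layers.Embedding(input_dim=vocab_size, output_dim=embed_dim),
        # Reads the sequence in both directions and returns one summary vector.
        layers.Bidirectional(layers.LSTM(lstm_units)),
        # Assumed regularization; the original model may not use dropout.
        layers.Dropout(0.5),
        # One probability per class (for example, one per star rating).
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

With integer labels in the range 0 to num_classes - 1, a model like this can be trained with model.fit on padded integer sequences and written to disk with model.save("BiLSTM.h5").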

I plotted accuracy and sparse categorical cross-entropy for both the training and validation sets, and evaluated performance using a classification report. The trained model has been saved as "BiLSTM.h5" for easy replication and testing.

The "BiLSTM.h5" model stands as a testament to the project's repeatability and sets the stage for ongoing analyses and enhancements.
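
To make the two tracked metrics concrete, the sketch below computes sparse categorical cross-entropy and accuracy by hand for a few made-up predictions. "Sparse" means the true labels are plain integer class ids rather than one-hot vectors. Keras reports both values automatically during training; the numbers here are invented and are not results from this project.

```python
import math

def sparse_categorical_crossentropy(y_true, y_prob):
    """Mean negative log-probability assigned to the true class."""
    losses = [-math.log(probs[label]) for label, probs in zip(y_true, y_prob)]
    return sum(losses) / len(losses)

def accuracy(y_true, y_prob):
    """Share of samples whose highest-probability class is the true class."""
    predicted = [max(range(len(probs)), key=probs.__getitem__) for probs in y_prob]
    hits = sum(p == t for p, t in zip(predicted, y_true))
    return hits / len(y_true)

# Made-up predictions for three reviews over three classes.
y_true = [0, 1, 0]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]
print(round(sparse_categorical_crossentropy(y_true, y_prob), 4))
print(round(accuracy(y_true, y_prob), 4))
```

The third sample is misclassified and receives a low probability for its true class, so it contributes the largest share of the loss. This is why the loss curve can keep moving even when accuracy stays flat.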