This project focuses on Exploratory Data Analysis (EDA) and Feature Engineering using the Zomato dataset to gain insights into restaurant trends, pricing, ratings, and customer preferences. Through data cleaning, visualization, and feature transformation, we prepare the dataset for predictive modeling and deeper analysis.
📂 Dataset
The dataset contains information about restaurants listed on Zomato, including:
Restaurant name
Location
Ratings
Cost for two
Cuisine types
Online delivery availability
Reviews
📊 Exploratory Data Analysis (EDA)
The following steps are performed to analyze the dataset:
Data Cleaning: Handling missing values, duplicates, and inconsistencies.
Data Visualization: Plotting distributions and trends in ratings, cost, and cuisines.
Outlier Detection: Identifying anomalies in price, ratings, etc.
Correlations: Exploring relationships between variables.
Geospatial Analysis: Studying restaurant density by location.
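The steps above can be sketched on a small toy table. This is a minimal illustration, not the notebook's actual code: the column names (name, location, rating, cost_for_two), the sample values, and the helper functions clean and iqr_outliers are all assumptions made for the example. It fills missing ratings with the median and flags cost outliers with the common 1.5 x IQR rule, which is one possible choice among several.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the Zomato dataset; names and values are hypothetical.
raw = pd.DataFrame({
    "name": ["A", "B", "B", "C", "D", "E", "F"],
    "location": ["Indiranagar", "Koramangala", "Koramangala",
                 "Indiranagar", "HSR", "HSR", "HSR"],
    "rating": [4.1, 3.8, 3.8, np.nan, 4.5, 3.2, 4.0],
    "cost_for_two": [600, 800, 800, 500, 700, 400, 5000],
})


def clean(df):
    """Drop exact duplicate rows and fill missing ratings with the median."""
    out = df.drop_duplicates().copy()
    out["rating"] = out["rating"].fillna(out["rating"].median())
    return out.reset_index(drop=True)


def iqr_outliers(series, k=1.5):
    """Return a boolean mask of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


cleaned = clean(raw)

# Outlier detection on price.
outlier_mask = iqr_outliers(cleaned["cost_for_two"])

# Correlation between cost and rating, computed without the flagged rows.
inliers = cleaned[~outlier_mask]
cost_rating_corr = inliers["cost_for_two"].corr(inliers["rating"])

# A simple stand-in for restaurant density: listings per location.
density = cleaned["location"].value_counts()

print(cleaned[outlier_mask])
print(density)
```

On real data, the same mask can be used either to drop the flagged rows or to inspect them by hand before deciding, since a very expensive restaurant is not necessarily a data error.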
🏗️ Feature Engineering
Key feature engineering techniques include:
Encoding categorical variables (e.g., one-hot encoding for cuisines).
Creating new features (e.g., cost-to-rating ratio, sentiment analysis on reviews).
Feature scaling (e.g., normalization for cost variables).
Handling skewness in numerical features.
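As a rough sketch of the techniques listed above, the example below builds a few engineered columns with pandas and NumPy. The column names (cuisines, rating, cost_for_two), the comma-separated cuisine format, and the sample values are assumptions for illustration. Min-max normalization and a log1p transform are used here as one common way to scale and de-skew cost; the project itself may use different transforms. Sentiment analysis on reviews is left out because it depends on the NLP library chosen.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the real dataset's column names may differ.
df = pd.DataFrame({
    "cuisines": ["North Indian, Chinese", "Italian", "Chinese", "North Indian"],
    "rating": [4.0, 4.5, 3.5, 4.2],
    "cost_for_two": [800, 1500, 400, 600],
})

# One-hot encode cuisines. A restaurant can list several cuisines,
# so split on ", " and emit one indicator column per cuisine.
cuisine_dummies = df["cuisines"].str.get_dummies(sep=", ").add_prefix("cuisine_")

features = pd.concat([df, cuisine_dummies], axis=1)

# New feature: cost-to-rating ratio (cost per rating point).
features["cost_to_rating"] = features["cost_for_two"] / features["rating"]

# Feature scaling: min-max normalization of cost into [0, 1].
cost = features["cost_for_two"]
features["cost_scaled"] = (cost - cost.min()) / (cost.max() - cost.min())

# Skewness handling: log1p compresses the long right tail of cost.
features["cost_log"] = np.log1p(cost)

print(features.drop(columns=["cuisines"]))
```

If ratings can be zero or missing in the real data, the ratio needs a guard (for example, replacing zeros with NaN first) to avoid infinite values.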
🛠️ Technologies Used
Python
Pandas, NumPy
Matplotlib, Seaborn
Scikit-learn
NLP (for review sentiment analysis, if applicable)
📜 Installation & Usage
Clone the repository:
Install dependencies:
Run the Jupyter Notebook:
Open EDA_Zomato.ipynb and follow the analysis.
📈 Results & Insights
Identified factors affecting restaurant ratings.
Analyzed cost distributions across different cuisines and locations.
Created new features that can enhance predictive modeling.
📬 Contributions
Contributions are welcome! Feel free to fork the repository and submit pull requests.
📄 License
This project is licensed under the MIT License.
Posted Feb 12, 2025