data_cleaning.py: Preprocesses the raw data, handling missing values and outliers.
exploratory_data_analysis.py: Performs initial data visualization and statistical analysis to understand the dataset's characteristics.
feature_engineering.py: Creates new features and transforms existing ones to improve model performance.
feature_exploratory_data_analysis.py: Analyzes the engineered features, providing insights into their relationships and potential impact on the target variable.
refined_model.py: Trains the XGBoost model using the preprocessed and engineered features.
streamlit_app.py: Provides a user-friendly interface for interacting with the trained model and visualizing results.

[...] data/ directory of the project before running the scripts.

calendar.csv: Contains availability and pricing information
listings.csv: Detailed information about each Airbnb listing
reviews.csv: User reviews for the listings

[...] listings.csv file as it contained the most relevant information for price prediction.

[...] data/raw/ directory (see Data section)
pip install -r requirements.txt
python src/data_cleaning.py
python src/feature_engineering.py
python src/refined_model.py
streamlit run streamlit_app.py
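The three python commands above have to run in order, since each script depends on the output of the previous one. As a rough sketch (not part of the original project), they could be chained from a single Python entry point that stops at the first failure. The names PIPELINE_SCRIPTS, build_commands and run_pipeline are hypothetical, and the Streamlit app is assumed to be launched separately afterwards.

```python
import subprocess
import sys

# Script order copied from the commands above. These paths are assumed to be
# relative to the project root, where the commands are normally run.
PIPELINE_SCRIPTS = [
    "src/data_cleaning.py",
    "src/feature_engineering.py",
    "src/refined_model.py",
]


def build_commands(python=sys.executable, scripts=PIPELINE_SCRIPTS):
    """Return one argv list per pipeline step, in execution order."""
    return [[python, script] for script in scripts]


def run_pipeline(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure.

    `runner` is injectable so the function can be exercised without
    actually launching the scripts.
    """
    for argv in commands:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so later steps never run after a failed one.
        runner(argv, check=True)
```

Calling run_pipeline(build_commands()) from the project root would run cleaning, feature engineering and model training back to back; streamlit run streamlit_app.py is then started by hand once the model has been trained.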
Posted Aug 6, 2024
This project aims to predict Airbnb listing prices in New York City using data from July 25, 2024. The goal is to help new Airbnb hosts set their prices.