Inventory Demand Forecasting using Machine Learning in R

Nitin Arora

Business Analyst
Data Analyst
Python
R
The dataset comes from the Mexican multinational company Grupo Bimbo. Its delivery chain operates in countries across the Americas, Europe, Asia, and Africa, and its annual sales volume is about 15 billion dollars. Grupo Bimbo delivers fresh bakery products to 1 million stores along its 45,000 routes across Mexico. There are five datasets available.
 
train.csv: It contains the data available for training, about 74 million entries over 11 features.
test.csv: It contains the data available for testing.
cliente_tabla.csv: It contains the data of about 930,000 clients over 2 features.
producto_tabla.csv: It contains the data for 2,592 products.
town_state.csv: It contains the data for 790 towns.

Tech Stack

Language: R
Libraries: dplyr, ggplot2, caTools, xgboost, data.table, Matrix, lightgbm, gbm, caret, zoo, DataCombine, e1071
 

Approach

1. Exploratory data analysis (EDA)
Data visualization
Inference about features
Feature engineering
 
2. Data cleaning (outlier/missing values/categorical)
Missing value detection
Replacing special characters in data
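
The two cleaning steps above might look like the following in base R. This is only a minimal sketch on an invented product table: the column name NombreProducto is assumed to match the product file, the rows are made up, and the choice of which characters count as "special" is an assumption for illustration.

```r
# Toy stand-in for a product table (rows are invented for illustration).
products <- data.frame(
  Producto_ID    = c(1L, 2L, 3L),
  NombreProducto = c("Pan Blanco 640g #123", "Donas (6p) 105g", NA),
  stringsAsFactors = FALSE
)

# Missing value detection: count NAs in every column.
na_counts <- colSums(is.na(products))
print(na_counts)

# Replace special characters: keep letters, digits, and spaces only,
# then collapse repeated whitespace. NA entries stay NA.
clean_name <- gsub("[^[:alnum:] ]", " ", products$NombreProducto)
clean_name <- trimws(gsub("\\s+", " ", clean_name))
products$NombreProducto_clean <- clean_name
print(products)
```

How to treat the detected missing values (drop, impute, or flag) depends on the column and is left open here.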
 
3. Feature Engineering
Adding lag columns for target variable
Adding moving average columns for target variable
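
As a rough illustration of the lag and moving average columns listed above, the sketch below builds them per client-product pair using base R only, so it runs without extra packages. The data frame, the column names (Cliente_ID, Producto_ID, Semana, Demanda), the helper functions lag_k and lagged_ma, and the window sizes are all assumptions made for this example; the project itself may have used dplyr, zoo, or DataCombine for the same purpose.

```r
# Toy weekly demand for two client-product pairs (values invented).
sales <- data.frame(
  Cliente_ID  = rep(c(1L, 2L), each = 5),
  Producto_ID = rep(c(10L, 20L), each = 5),
  Semana      = rep(3:7, times = 2),
  Demanda     = c(4, 6, 5, 7, 8, 1, 0, 2, 3, 1)
)
# Lags only make sense when rows are in week order within each group.
sales <- sales[order(sales$Cliente_ID, sales$Producto_ID, sales$Semana), ]

# Hypothetical helper: shift a vector down by k positions, padding with NA.
lag_k <- function(x, k) c(rep(NA_real_, k), x)[seq_along(x)]

# Hypothetical helper: mean of the previous n values, excluding the current
# one, so the feature does not leak the week being predicted.
lagged_ma <- function(x, n) {
  vapply(seq_along(x), function(i) {
    if (i <= n) NA_real_ else mean(x[(i - n):(i - 1)])
  }, numeric(1))
}

# Compute each feature separately within every client-product group.
grp <- interaction(sales$Cliente_ID, sales$Producto_ID, drop = TRUE)
sales$lag_1 <- ave(sales$Demanda, grp, FUN = function(x) lag_k(x, 1))
sales$lag_2 <- ave(sales$Demanda, grp, FUN = function(x) lag_k(x, 2))
sales$ma_3  <- ave(sales$Demanda, grp, FUN = function(x) lagged_ma(x, 3))
print(sales)
```

The first rows of each group are NA because no earlier weeks exist for them; tree-based models such as XGBoost can accept these directly, while other models need the rows dropped or filled.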
 
4. Model building on training data
XGBoost
GBM
SVM
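
To make the model building step more concrete, here is a small sketch of fitting XGBoost on lag features with a time-based hold-out. Everything in it is an assumption for illustration: the data is synthetic, the feature names lag_1 and ma_3 are hypothetical, the split by week and the hyperparameter values are arbitrary, and it is not the configuration used in the project. It requires the xgboost package to be installed.

```r
library(xgboost)

set.seed(42)
n <- 500

# Synthetic stand-in for the engineered training table.
toy <- data.frame(
  Semana = sample(3:9, n, replace = TRUE),
  lag_1  = rpois(n, 5),
  ma_3   = runif(n, 0, 10)
)
toy$Demanda <- pmax(0, round(0.6 * toy$lag_1 + 0.4 * toy$ma_3 + rnorm(n)))

# Hold out the latest weeks for validation instead of a random split,
# so the model is always scored on weeks it has not seen.
is_train <- toy$Semana <= 7
features <- c("lag_1", "ma_3")

dtrain <- xgb.DMatrix(data = as.matrix(toy[is_train, features]),
                      label = toy$Demanda[is_train])
dvalid <- xgb.DMatrix(data = as.matrix(toy[!is_train, features]),
                      label = toy$Demanda[!is_train])

# Illustrative hyperparameters only; real values would come from tuning.
params <- list(objective = "reg:squarederror", eta = 0.1, max_depth = 4)
model <- xgb.train(params = params, data = dtrain, nrounds = 50, verbose = 0)

# Score the held-out weeks with RMSE.
pred <- predict(model, dvalid)
valid_rmse <- sqrt(mean((toy$Demanda[!is_train] - pred)^2))
print(valid_rmse)
```

The same train and validation split could be reused for the GBM and SVM models so that their RMSE values are directly comparable.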
 
5. Model validation
RMSE
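
RMSE is the square root of the mean squared difference between actual and predicted values, so it is expressed in the same units as demand. A minimal sketch in base R, with a hypothetical helper name rmse and invented numbers:

```r
# Root mean squared error between two numeric vectors of equal length.
rmse <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  sqrt(mean((actual - predicted)^2))
}

actual    <- c(3, 5, 2, 7)
predicted <- c(2, 5, 4, 7)

# Errors are 1, 0, -2, 0; mean squared error is 1.25; RMSE is about 1.118.
score <- rmse(actual, predicted)
print(score)
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones.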
 
6. Conclusion

Project Takeaways

Understanding the Business context and objective
Data Cleaning
Inference about data
Feature Engineering
Importing the dataset and importing libraries
Train and test split for model validation
Significance of Train Test split
Different Evaluation parameters and their meaning
Understanding Bagging and Boosting
Understanding XGBoost
Hyperparameter Tuning
Understanding GBM and light GBM
Understanding SVM
Building XGBoost model
Building GBM
Building SVM model

2022
