Predictive Modeling for Bank Customer Churn

Josip Novak

Data Scientist
Statistician
R
A snapshot from the project report.
This project, conducted in R, uses a dataset of 10,000 customers of a bank with branches in France, Germany, and Spain. I engineered features, conducted exploratory data analysis (EDA), and selected predictors of customer churn using a random forest algorithm. To address the class imbalance in the outcome variable, I combined undersampling and oversampling using the ROSE algorithm. I then trained a blending ensemble for binary classification, comprising a random forest, a C5.0 tree, and a support vector machine with a Gaussian radial basis function kernel. To speed up computation, I parallelized the procedure. The ensemble predicted churn well, with accuracy, sensitivity, specificity, and AUC all above 0.95 on the test dataset. The full report is available on the portfolio website: https://jnova92.github.io/
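The balancing and blending steps described above can be sketched roughly as follows. This is a minimal illustration, not the code from the report: the data are synthetic, every object name (`toy`, `parts`, `base_probs`, `meta`) is hypothetical, the logistic-regression meta-learner is an assumption about how the blend might be combined, and feature selection and parallelization are left out.

```r
# Sketch only: synthetic data and all object names are hypothetical.
library(ROSE)          # ROSE(): smoothed-bootstrap resampling
library(randomForest)  # random forest
library(C50)           # C5.0 decision tree
library(kernlab)       # ksvm() with the Gaussian RBF kernel

set.seed(42)

# Synthetic stand-in for the bank data, with churners as the minority class.
n <- 1500
toy <- data.frame(
  age     = rnorm(n, mean = 40, sd = 10),
  balance = rnorm(n, mean = 60000, sd = 20000),
  tenure  = runif(n, min = 0, max = 10)
)
p_churn <- plogis(-2 + 0.08 * (toy$age - 40) - 0.2 * (toy$tenure - 5))
toy$churn <- factor(ifelse(runif(n) < p_churn, "yes", "no"),
                    levels = c("no", "yes"))

# Three-way split: train base learners, fit the blender, evaluate.
idx   <- sample(rep(c("train", "blend", "test"), times = c(900, 300, 300)))
parts <- split(toy, idx)

# Balance the training part only; the blend and test parts stay untouched.
balanced <- ROSE(churn ~ ., data = parts$train,
                 N = nrow(parts$train), p = 0.5, seed = 1)$data
balanced$churn <- factor(balanced$churn, levels = c("no", "yes"))

# Base learners fitted on the balanced training data.
rf  <- randomForest(churn ~ ., data = balanced, ntree = 200)
c5  <- C5.0(churn ~ ., data = balanced)
svm <- ksvm(churn ~ ., data = balanced, kernel = "rbfdot", prob.model = TRUE)

# Predicted churn probability from each base learner.
base_probs <- function(newdata) {
  data.frame(
    rf  = predict(rf,  newdata, type = "prob")[, "yes"],
    c5  = predict(c5,  newdata, type = "prob")[, "yes"],
    svm = predict(svm, newdata, type = "probabilities")[, "yes"]
  )
}

# Blender (assumed here to be logistic regression) fitted on held-out predictions.
blend_x <- base_probs(parts$blend)
blend_x$churn <- parts$blend$churn
meta <- glm(churn ~ rf + c5 + svm, data = blend_x, family = binomial)

# Evaluate the ensemble on the untouched test part.
test_p    <- predict(meta, newdata = base_probs(parts$test), type = "response")
test_pred <- factor(ifelse(test_p > 0.5, "yes", "no"), levels = c("no", "yes"))
print(table(predicted = test_pred, actual = parts$test$churn))
```

Resampling only the training part keeps the blend and test parts at their original class ratio, so the reported metrics reflect the imbalance a deployed model would face.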