Loan Approval Prediction System

Ifenna Daniel

PROJECT TITLE: LOAN APPROVAL PREDICTION SYSTEM

INTRODUCTION

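The goal of this project was to build a machine learning model that predicts loan approval from applicant attributes such as income, credit score, loan amount, debt-to-income (DTI) ratio, and employment status.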

Key Steps:

1. Data Loading: Load the dataset with the features income, credit score, loan amount, DTI ratio, employment status, and loan purpose.
2. Data Preprocessing:
   - Handle missing values.
   - Encode categorical variables (e.g., employment status).
   - Scale numerical features.
   - Select the key factors impacting approval.
3. Model Training:
   - Split the data into train/test sets (80/20).
   - Train a Random Forest classifier.
4. Model Evaluation: Evaluate the model on the test set using accuracy, Kappa, sensitivity, and specificity.
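
The 80/20 split in step 3 can be sketched in plain Python as follows. This is a minimal illustration, not the code used in the project: the function name `split_train_test`, the fixed seed, and the stand-in rows are all assumptions.

```python
import random

def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical stand-in rows; the real dataset has 24,000 applications.
rows = list(range(100))
train, test = split_train_test(rows)
print(len(train), len(test))
```

Shuffling before slicing keeps the two sets from inheriting any ordering present in the source file.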

About Data

The dataset contains loan application records: applicant attributes (Text, Income, Credit_score, Loan_amount, DTI_Ratio, and employment_status) together with the approval status of each loan request. The dataset had no missing values and no duplicate rows.

Encoding categorical variables

The Employment_Status variable was encoded as 1 for Employed and 0 for Unemployed, and the Approval variable as 1 for Approved and 0 for Rejected.
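
The 1/0 encoding described above could look like the sketch below. The raw label spellings in the two dictionaries and the helper name `encode` are assumptions; the actual values stored in the dataset may differ.

```python
# Assumed raw labels; the spellings in the real dataset may differ.
EMPLOYMENT_CODES = {"employed": 1, "unemployed": 0}
APPROVAL_CODES = {"approved": 1, "rejected": 0}

def encode(value, codes):
    """Map a raw label to its 0/1 code, ignoring case and outer whitespace."""
    key = value.strip().lower()
    if key not in codes:
        raise ValueError(f"unexpected label: {value!r}")
    return codes[key]

# Hypothetical record for illustration only.
record = {"Employment_Status": "Employed", "Approval": "Rejected"}
encoded = {
    "Employment_Status": encode(record["Employment_Status"], EMPLOYMENT_CODES),
    "Approval": encode(record["Approval"], APPROVAL_CODES),
}
print(encoded)
```

Raising on an unknown label, rather than silently mapping it to 0, makes data-entry variants visible before they reach the model.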

Visualizing the distribution of categorical features (Approval and Employment_Status)

[Figure: Approval Distribution]
Approval counts: 0 (Rejected): 20067, 1 (Approved): 3933

[Figure: Employment_Status Distribution]
Employment_Status counts: 0 (Unemployed): 12007, 1 (Employed): 11993
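
The Approval counts translate directly into class shares. The short calculation below uses only the counts reported above; the variable names are illustrative.

```python
# Approval counts as reported: 0 = Rejected, 1 = Approved.
approval_counts = {0: 20067, 1: 3933}

total = sum(approval_counts.values())
shares = {label: count / total for label, count in approval_counts.items()}
for label, share in shares.items():
    print(f"class {label}: {share:.1%}")
```

With 24,000 rows in total, this gives roughly 83.6% for class 0 and 16.4% for class 1, which is the imbalance addressed in the next section.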

Balancing the Dataset

The Approval variable was heavily imbalanced, which could bias the machine learning model toward the majority class. To address this, I applied Random Over-Sampling Examples (ROSE) to balance the dataset. Before balancing, 83.6% of instances belonged to class 0 and 16.4% to class 1. After applying ROSE, the distribution was 50.6% for class 0 and 49.4% for class 1.
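
The resampling idea can be sketched as below. This is a simplified stand-in, not ROSE itself: it picks a class at random (probability 0.5 for the minority class) and then draws an existing row from that class with replacement, whereas ROSE additionally perturbs each drawn row to create synthetic examples. The function name, parameters, and toy rows are assumptions.

```python
import random

def balanced_resample(rows, label_of, n_out, p_minority=0.5, seed=0):
    """Draw n_out rows: pick a class at random, then a row from that class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    if len(by_class) != 2:
        raise ValueError("this sketch handles exactly two classes")
    # Order the two labels by class size: smaller class first.
    minority, majority = sorted(by_class, key=lambda k: len(by_class[k]))
    resampled = []
    for _ in range(n_out):
        chosen = minority if rng.random() < p_minority else majority
        resampled.append(rng.choice(by_class[chosen]))  # with replacement
    return resampled

# Hypothetical toy rows (id, label) mirroring the 83.6% / 16.4% skew.
rows = [(i, 0) for i in range(836)] + [(i, 1) for i in range(164)]
balanced = balanced_resample(rows, label_of=lambda r: r[1], n_out=1000)
share_class_1 = sum(label for _, label in balanced) / len(balanced)
print(f"class 1 share after resampling: {share_class_1:.1%}")
```

Because the class of each new row is drawn at random, the result lands near 50/50 rather than exactly on it, which is consistent with the 50.6% / 49.4% split reported above.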

Building Model

The dataset was split into training and testing sets, with the Text feature excluded. Before splitting and sampling, the numeric features were scaled to make them suitable for modeling. The Random Forest classifier was then trained on the training set and used to predict Approval for the test data.
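
The text does not say which scaling method was used. As one common possibility, the sketch below standardizes a numeric column to z-scores (mean 0, standard deviation 1); the choice of method, the function name, and the sample incomes are all assumptions.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]   # constant column: nothing to scale
    return [(x - mu) / sigma for x in values]

# Hypothetical incomes for illustration only.
incomes = [30000, 45000, 60000, 75000, 90000]
scaled = standardize(incomes)
print([round(z, 3) for z in scaled])
```

Tree-based models such as Random Forest are generally insensitive to feature scale, so this step matters more if other model families are compared later.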

Performance of Model

Accuracy: 0.9533 (95.33%)
P-Value [Acc > NIR]: < 2.2e-16
Kappa: 0.9067 (90.67%)
Sensitivity: 0.9316 (93.16%)
Specificity: 0.9756 (97.56%)

The model demonstrates high accuracy, sensitivity, and specificity, indicating strong performance in classifying both positive and negative classes.
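
For reference, accuracy, sensitivity, and specificity all derive from the four cells of a confusion matrix. The sketch below shows the formulas; the counts passed in are hypothetical and are NOT the project's confusion matrix, which is not given here. Which class counts as "positive" is also an assumption.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # share of actual positives caught
        "specificity": tn / (tn + fp),   # share of actual negatives caught
    }

# Hypothetical counts for illustration, not the project's results.
metrics = classification_metrics(tp=90, fn=10, fp=5, tn=95)
print(metrics)
```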

Analysis

After balancing the classes, the classification model achieved high performance metrics. The P-value [Acc > NIR] is below 0.05 (< 2.2e-16), meaning the accuracy is significantly better than the no-information rate. The accuracy achieved is 95%. The Kappa value is 90.67%, indicating strong agreement between predicted and actual classes.
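
The reported Kappa can be sanity-checked against the reported accuracy. Cohen's Kappa is (observed agreement - chance agreement) / (1 - chance agreement). The sketch below assumes a chance agreement of 0.5, which is reasonable only because the classes are roughly balanced after ROSE; the function name is illustrative.

```python
def cohen_kappa(observed_agreement, expected_agreement):
    """Cohen's kappa: agreement beyond chance, scaled to a maximum of 1."""
    return (observed_agreement - expected_agreement) / (1 - expected_agreement)

accuracy = 0.9533          # observed agreement, from the reported results
chance_agreement = 0.5     # assumption: classes roughly balanced after ROSE
kappa = cohen_kappa(accuracy, chance_agreement)
print(round(kappa, 4))
```

Under that assumption the estimate comes out at about 0.9066, in line with the reported 0.9067, so the two figures are mutually consistent.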

Conclusion

This strong performance suggests the model is both highly accurate and reliable for predicting loan approvals.

Posted May 21, 2025

Developed an ML model to predict loan approval with 95% accuracy.

Timeline

Apr 22, 2025 - May 23, 2025

Clients

Free
