Incident Response Fairness Analysis

Mohsen M

Data Scientist
ML Engineer
Dash Plotly
Matplotlib
Python

This project focuses on analyzing the fairness of incident response processes within an IT company. The analysis is based on an event log extracted from the audit system of an instance of the ServiceNow™ platform used by the organization.

Dataset

The dataset provides insights into the incident management process and includes the following details:
Number of instances: 141,712 events (24,918 incidents)
Number of attributes: 36 attributes
The data was anonymized for privacy, and information from a relational database was used to enrich the event log.

Data Cleaning

The data cleaning process involved several steps, including:
Handling missing values in date columns
Dropping columns with many "?" values
Removing rows with "?" values in specific columns
Extracting numeric parts from text columns
Filtering rows based on the 'incident_state' attribute
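The cleaning steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`opened_at`, `resolved_at`, `problem_id`), the `"Resolver 17"` value format, the date format, the 50% drop threshold, and the choice to keep only `"Closed"` incidents are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def clean_incident_log(df: pd.DataFrame, max_unknown_ratio: float = 0.5) -> pd.DataFrame:
    """Apply the listed cleaning steps to a raw event log (hypothetical helper)."""
    # Treat the "?" placeholder as a missing value everywhere.
    df = df.replace("?", np.nan)

    # Handle missing values in date columns: parse them, then drop rows
    # whose dates are missing or unparseable (assumed date format).
    date_cols = ["opened_at", "resolved_at"]
    for col in date_cols:
        df[col] = pd.to_datetime(df[col], format="%d/%m/%Y %H:%M", errors="coerce")
    df = df.dropna(subset=date_cols)

    # Drop columns where most values were "?" (the threshold is an assumption).
    unknown_ratio = df.isna().mean()
    df = df.drop(columns=unknown_ratio[unknown_ratio > max_unknown_ratio].index)

    # Remove rows that still have unknowns in the columns the analysis needs.
    df = df.dropna(subset=["assigned_to"])

    # Extract the numeric part of text identifiers, e.g. "Resolver 17" -> 17.
    resolver_id = pd.to_numeric(df["assigned_to"].str.extract(r"(\d+)", expand=False))
    df = df.assign(assigned_to=resolver_id)

    # Filter on 'incident_state' (keeping closed incidents is an assumption).
    df = df[df["incident_state"] == "Closed"]
    return df.reset_index(drop=True)


# Tiny made-up log to exercise the function.
raw = pd.DataFrame({
    "number": ["INC001", "INC002", "INC003", "INC004"],
    "incident_state": ["Closed", "Closed", "Active", "Closed"],
    "assigned_to": ["Resolver 17", "?", "Resolver 5", "Resolver 89"],
    "opened_at": ["29/02/2016 01:16", "29/02/2016 04:40", "01/03/2016 09:00", "02/03/2016 10:30"],
    "resolved_at": ["29/02/2016 11:29", "01/03/2016 09:52", "?", "03/03/2016 08:00"],
    "problem_id": ["?", "?", "?", "Problem ID 2"],
})
cleaned = clean_incident_log(raw)
print(cleaned)
```

The order of the steps matters in this sketch: rows with missing dates are dropped before the per-column unknown ratio is computed, so the ratio reflects only rows that could be used later.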

Exploratory Data Analysis

The exploratory data analysis phase involved visualizing various aspects of the dataset, such as incident priorities, impact, urgency, and resolution times.
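As a rough illustration of this kind of exploration, the sketch below derives a resolution time in hours and summarizes it per priority with Matplotlib. The toy data, the priority labels, and the column names `opened_at`, `resolved_at`, and `resolution_hours` are assumptions for the example, not values taken from the project.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up incidents, one row per incident.
events = pd.DataFrame({
    "priority": ["3 - Moderate", "3 - Moderate", "2 - High", "1 - Critical", "3 - Moderate"],
    "opened_at": pd.to_datetime([
        "2016-02-29 01:16", "2016-02-29 04:40", "2016-03-01 09:00",
        "2016-03-02 10:30", "2016-03-02 12:00",
    ]),
    "resolved_at": pd.to_datetime([
        "2016-02-29 11:16", "2016-03-01 04:40", "2016-03-01 15:00",
        "2016-03-02 12:30", "2016-03-02 16:00",
    ]),
})

# Resolution time in hours, derived from the two timestamps.
elapsed = events["resolved_at"] - events["opened_at"]
events["resolution_hours"] = elapsed.dt.total_seconds() / 3600

# Two simple summaries: incident volume and typical resolution time per priority.
priority_counts = events["priority"].value_counts().sort_index()
median_hours = events.groupby("priority")["resolution_hours"].median()

fig, (ax_counts, ax_time) = plt.subplots(1, 2, figsize=(10, 4))
priority_counts.plot.bar(ax=ax_counts, title="Incidents per priority")
median_hours.plot.bar(ax=ax_time, title="Median resolution time (hours)")
fig.tight_layout()
```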

Fairness Analysis

The fairness analysis used the chi-square test of independence, a statistical method for determining whether two categorical variables are significantly associated. The analysis revealed significant associations between the 'assigned_to' attribute and several factors: reassignment count, knowledge, priority confirmation, and resolution time. A significant association does not by itself establish unfairness, but these findings point to potential unfairness in the assignment or decision-making processes and highlight areas for further investigation and improvement.
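A test of this kind can be sketched with `scipy.stats.chi2_contingency` on a contingency table built by `pandas.crosstab`. The helper name `chi_square_association`, the 0.05 significance level, and the toy data are assumptions for illustration. A continuous attribute such as resolution time would first need to be binned into categories (for example with `pandas.cut`) before it fits this test.

```python
import pandas as pd
from scipy.stats import chi2_contingency


def chi_square_association(df, group_col, outcome_col, alpha=0.05):
    """Chi-square test of independence between two categorical columns."""
    # Rows: groups (e.g. resolvers); columns: outcome categories.
    table = pd.crosstab(df[group_col], df[outcome_col])
    chi2, p_value, dof, _expected = chi2_contingency(table)
    return {
        "chi2": float(chi2),
        "p_value": float(p_value),
        "dof": int(dof),
        "significant": bool(p_value < alpha),
    }


# Made-up data: three resolvers with different rates of the 'knowledge' flag.
rows = (
    [("Resolver 1", True)] * 25 + [("Resolver 1", False)] * 5
    + [("Resolver 2", True)] * 5 + [("Resolver 2", False)] * 25
    + [("Resolver 3", True)] * 15 + [("Resolver 3", False)] * 15
)
toy = pd.DataFrame(rows, columns=["assigned_to", "knowledge"])

result = chi_square_association(toy, "assigned_to", "knowledge")
print(result)
```

With many distinct resolvers, some cells of the contingency table may have very small expected counts, in which case the chi-square approximation becomes less reliable; grouping rare resolvers is one possible workaround.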

Machine Learning

The project also involved a machine learning component to further explore the fairness of incident handling. Key steps included:
Calculating mean values for specific attributes ('assigned_to', 'reassignment_count', and 'resolution_time')
Selecting relevant columns for fairness analysis
Defining fairness thresholds based on mean values
Creating a binary target variable 'fairness' based on the defined criteria
Preprocessing the data using a ColumnTransformer for one-hot encoding and preserving numerical features
Training a RandomForestClassifier model on the training data
Evaluating the model's performance on the test data
The trained model achieved an accuracy of 86.67% on the test dataset.
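The steps above might look roughly like the following scikit-learn sketch. It runs on synthetic data, and the labeling rule (an incident counts as "fair" when both its reassignment count and its resolution time are at or below their means), the feature list, the `priority` column, and all hyperparameters are assumptions; the project's exact criteria are not stated here, so the accuracy printed by this sketch is unrelated to the reported 86.67%.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the cleaned incident table.
rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "assigned_to": rng.integers(1, 20, size=n),
    "priority": rng.choice(["1 - Critical", "2 - High", "3 - Moderate", "4 - Low"], size=n),
    "reassignment_count": rng.poisson(1.5, size=n),
    "resolution_time": rng.exponential(48.0, size=n),  # hours
})

# Fairness thresholds based on mean values (assumed rule, see lead-in).
reassign_mean = data["reassignment_count"].mean()
resolution_mean = data["resolution_time"].mean()
is_fair = (data["reassignment_count"] <= reassign_mean) & (data["resolution_time"] <= resolution_mean)
data["fairness"] = is_fair.astype(int)

features = ["assigned_to", "priority", "reassignment_count", "resolution_time"]
X_train, X_test, y_train, y_test = train_test_split(
    data[features], data["fairness"],
    test_size=0.2, random_state=42, stratify=data["fairness"],
)

# One-hot encode the categorical column; pass numeric columns through unchanged.
preprocess = ColumnTransformer(
    transformers=[("onehot", OneHotEncoder(handle_unknown="ignore"), ["priority"])],
    remainder="passthrough",
)
model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=42)),
])
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2%}")
```

In this sketch the label is computed directly from two of the input features, so a high score mainly shows that the forest recovered the thresholding rule. Wrapping the preprocessing and the classifier in one `Pipeline` keeps the encoder fitted on the training split only.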

Streamlit App Demo

A recording of the Streamlit app is provided in demo.mp4.

Kaggle Link

For further exploration and analysis, you can access the dataset and related notebooks on Kaggle.