Institutional Performance Analysis in Board Exams 2023 by Maria Saif

This repository contains a project that analyzes institutional performance in the 2023 BISE FSc exams using Exploratory Data Analysis (EDA) and K-Means clustering. The aim is to gain insight into grade distributions and passing percentages, and to group institutions by their performance.

Dataset

The dataset used in this project includes details on institutional performance, such as:
Institutional_Code: Unique ID for each institution.
Appeared: Number of students who appeared for the exams.
Passed: Number of students who passed.
Pass%: Percentage of students who passed from each institution.
Grades (A+, A, B, C, D, E): Counts of students achieving each grade.
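
Because Pass% is derived from Appeared and Passed, the three columns can be cross-checked before any analysis. The following is a minimal sketch on made-up rows, not the notebook's code: the helper name `check_pass_percentage` and the 0.5-point tolerance are illustrative assumptions, and the column names are taken from the list above.

```python
import pandas as pd

# Toy rows with made-up numbers; the column names follow the dataset description.
sample = pd.DataFrame({
    "Institutional_Code": [101, 102, 103],
    "Appeared": [200, 50, 120],
    "Passed": [150, 45, 60],
    "Pass%": [75.0, 90.0, 55.0],  # the last value is deliberately inconsistent
})

def check_pass_percentage(df, tolerance=0.5):
    """Return rows whose stored Pass% differs from Passed / Appeared * 100.

    `tolerance` (in percentage points) is a hypothetical allowance for rounding.
    Rows with Appeared == 0 produce NaN or inf and are not flagged by this check.
    """
    recomputed = df["Passed"] / df["Appeared"] * 100
    mismatch = (df["Pass%"] - recomputed).abs() > tolerance
    return df.loc[mismatch, ["Institutional_Code", "Appeared", "Passed", "Pass%"]]

inconsistent = check_pass_percentage(sample)
print(inconsistent)
```

Rows returned by the helper are candidates for manual review rather than automatic correction, since the source of the discrepancy is not known from the data alone.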

Files

Institutional Result of Board Exams.csv: The dataset file with institutional performance data.
analysis.ipynb: Jupyter Notebook file where EDA and clustering analyses are conducted.

Analysis Overview

1. Exploratory Data Analysis (EDA)

This part of the project examines the distribution of grades and passing percentages and handles missing values.
Steps:
Loading and Displaying Data: Load the dataset and display basic information.
Summary Statistics: Generate summary statistics for an overview of each column.
Grade Distribution Visualization: Plot a bar chart showing the number of students achieving each grade.
Passing Percentage Distribution: Plot a histogram to visualize passing percentages across institutions.
Handling Missing Values: Identify missing values and fill them with the mean of each numeric column so that later steps work on complete data.
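
The summary-statistics and passing-percentage histogram steps could look roughly like the sketch below. It runs on a small made-up frame rather than the real CSV, and the function name `plot_pass_histogram`, the 10-point bin width, and the fixed 0 to 100 range are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_pass_histogram(df, column="Pass%", bins=10):
    """Draw a histogram of passing percentages and return (figure, bin counts)."""
    # Drop missing values first; NaN entries would otherwise break the binning.
    values = df[column].dropna()
    fig, ax = plt.subplots(figsize=(10, 6))
    counts, edges, _ = ax.hist(values, bins=bins, range=(0, 100), edgecolor="black")
    ax.set_title("Distribution of Passing Percentages")
    ax.set_xlabel("Passing Percentage")
    ax.set_ylabel("Number of Institutions")
    return fig, counts

# Made-up values standing in for the real Pass% column.
toy = pd.DataFrame({"Pass%": [35.0, 62.5, 68.0, 71.0, 88.0, None, 95.0]})

print(toy["Pass%"].describe())  # count, mean, std, min, quartiles, max
fig, counts = plot_pass_histogram(toy)
# In a notebook the figure renders automatically; in a script, call plt.show().
```

Fixing the range to 0 to 100 keeps bin edges on round percentages, which makes histograms from different subsets directly comparable.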

2. Clustering Analysis

This part of the project applies K-Means clustering to group institutions by the number of students who appeared and the passing percentage, which helps identify institutions with similar performance patterns.
Steps:
Data Scaling: Standardize Pass% and Appeared with StandardScaler (zero mean, unit variance) so that neither feature dominates the distances K-Means computes.
K-Means Clustering: Apply K-Means with 3 clusters to group institutions.
Cluster Visualization: Plot clusters on a scatter plot with passing percentage and student appearance counts.
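
To illustrate why the scaling step matters: Appeared can run into the hundreds or thousands while Pass% stays between 0 and 100, so unscaled distances would be driven almost entirely by Appeared. The sketch below uses made-up numbers to show that StandardScaler applies (x - mean) / std per column, leaving each feature with mean 0 and standard deviation 1.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up (Pass%, Appeared) pairs; the second column has a much larger spread.
X = np.array([
    [40.0, 30.0],
    [60.0, 300.0],
    [80.0, 1200.0],
    [90.0, 70.0],
])

X_scaled = StandardScaler().fit_transform(X)

# The same transformation written out by hand (population standard deviation).
manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_scaled.mean(axis=0).round(6))  # each column is centered on 0
print(X_scaled.std(axis=0).round(6))   # each column has unit spread
```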

Installation and Setup

To run this project, you need Python and the following libraries:
pip install pandas matplotlib seaborn scikit-learn notebook

Usage

Clone this repository:
git clone https://github.com/maria-saif20/Exploratory-and-Predictive-Analysis-of-BISE-FSC.git
Load the Dataset: Place Institutional Result of Board Exams.csv in the project directory.
Run the Analysis: Open analysis.ipynb in Jupyter Notebook and execute each cell to perform EDA and clustering analysis.
Interpret Results:
Review the summary statistics, visualizations, and clustering output for insights on institutional performance.

Code Highlights

EDA Code

Loading Data
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('Institutional Result of Board Exams.csv')
print(data.head())

Grade Distribution Visualization
grades = ['Grade A+', 'A', 'B', 'C', 'D', 'E']
grade_counts = data[grades].sum()

plt.figure(figsize=(10, 6))
sns.barplot(x=grade_counts.index, y=grade_counts.values, palette='viridis')
plt.title('Total Students Achieving Each Grade')
plt.xlabel('Grade')
plt.ylabel('Number of Students')
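plt.show()

Missing Values Handling
# Count missing values per column before imputation
print(data.isnull().sum())
# Fill numeric columns with their means; numeric_only=True skips text columns,
# which would otherwise raise a TypeError in pandas 2.x
data.fillna(data.mean(numeric_only=True), inplace=True)
# Confirm that the numeric columns no longer contain missing values
print(data.isnull().sum())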
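
One caveat with mean imputation is that only numeric columns have a mean; in recent pandas versions, calling `mean()` on a frame that contains text columns raises an error unless `numeric_only=True` is passed. The sketch below shows the behavior on a made-up frame. The text-valued `Institutional_Code` is an assumption for illustration; the real column may well be numeric.

```python
import pandas as pd

# Made-up frame with one text column and gaps in the numeric columns.
toy = pd.DataFrame({
    "Institutional_Code": ["A01", "A02", "A03"],  # hypothetical text IDs
    "Appeared": [100.0, None, 60.0],
    "Pass%": [80.0, 70.0, None],
})

# Means are computed for numeric columns only; the text column is skipped.
numeric_means = toy.mean(numeric_only=True)

# fillna with a Series fills each column with the value under its own name.
filled = toy.fillna(numeric_means)
print(filled)
```

Returning a new frame instead of using `inplace=True` keeps the original values available for comparison, which can be useful when judging how much the imputation changed the data.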

Clustering Code

Scaling Data and K-Means
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

scaler = StandardScaler()
X_scaled = scaler.fit_transform(data[['Pass%', 'Appeared']])

# n_init is set explicitly because its default differs between scikit-learn versions
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(X_scaled)

data['Cluster'] = kmeans.labels_
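
The choice of 3 clusters can be sanity-checked by comparing silhouette scores across several values of k. This is an optional check that is not described in the project itself; the sketch below runs on synthetic points standing in for the scaled (Pass%, Appeared) features, and the group centers and noise levels are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for (Pass%, Appeared): three loose groups, not real exam data.
centers = np.array([[40.0, 100.0], [70.0, 600.0], [90.0, 150.0]])
X = np.vstack([c + rng.normal(scale=[4.0, 30.0], size=(30, 2)) for c in centers])
X_scaled = StandardScaler().fit_transform(X)

# Fit K-Means for a range of k and record the mean silhouette score of each fit.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("Highest silhouette at k =", best_k)
```

On the real data, replacing the synthetic `X_scaled` with the array produced by the scaling step would show whether k=3 is supported or whether another value separates the institutions more cleanly.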
Cluster Visualization
plt.figure(figsize=(10, 6))
sns.scatterplot(x='Pass%', y='Appeared', hue='Cluster', data=data, palette='viridis')
plt.title('K-Means Clustering of Institutions')
plt.xlabel('Passing Percentage')
plt.ylabel('Number of Students Appeared')
plt.legend(title='Cluster')
plt.show()

Results

Grade Distribution: Visualizes the total number of students achieving each grade.
Passing Percentage Distribution: Shows passing rates among institutions.
Cluster Analysis: Groups institutions with similar performance for comparative insights.
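
To turn cluster labels into comparative insights, each cluster can be summarized by its size and average features. The sketch below assumes a frame that already has the `Cluster` column added by the clustering step; the rows and the output column names (`institutions`, `mean_appeared`, `mean_pass_pct`) are made up for illustration.

```python
import pandas as pd

# Toy labelled rows; in the notebook, `data` would already carry a 'Cluster' column.
labelled = pd.DataFrame({
    "Appeared": [100, 120, 600, 640, 150],
    "Pass%": [40.0, 44.0, 70.0, 74.0, 90.0],
    "Cluster": [0, 0, 1, 1, 2],
})

# One row per cluster: how many institutions it holds and their average profile.
profile = labelled.groupby("Cluster").agg(
    institutions=("Appeared", "size"),
    mean_appeared=("Appeared", "mean"),
    mean_pass_pct=("Pass%", "mean"),
)
print(profile)
```

Note that K-Means cluster numbers are arbitrary labels, so the profile table (not the label itself) is what identifies a cluster as, for example, large institutions with moderate pass rates.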

Contributing

Contributions are welcome! To contribute:
Fork this repository.
Create a new branch (git checkout -b feature-branch).
Make your changes and commit (git commit -am 'Add new feature').
Push to the branch (git push origin feature-branch).
Create a new Pull Request.

License

This project is licensed under the MIT License.
Posted Jul 21, 2025

Analyzed institutional performance in BISE FSc exams using EDA and K-Means Clustering.

Timeline

Jan 1, 2023 - Dec 31, 2023