Electoral Fraud Detection Using Machine Learning

Isola Emmanuel

Executive Summary

This geospatial analysis investigates voting pattern anomalies across polling units in Anambra State during the 2023 Nigerian presidential election. Using DBSCAN clustering of polling-unit coordinates and within-cluster z-scores, the study identified significant outliers in vote counts for four major political parties: All Progressives Congress (APC), Labour Party (LP), People’s Democratic Party (PDP), and New Nigeria People’s Party (NNPP).
Key Findings:
All four parties exhibited notable outliers, primarily concentrated in Cluster 0
Significant variations in voting patterns were detected across multiple polling units
Geospatial and statistical analysis revealed potential electoral irregularities

1. Introduction

1.1 Research Objective

The primary objective of this study was to systematically identify and analyze outliers in voting patterns across Anambra State’s polling units, providing insights into potential electoral anomalies.

1.2 Scope

Geographic Focus: Anambra State, Nigeria
Election: 2023 Presidential Election
Parties Analyzed: APC, LP, PDP, NNPP

2. Methodology

2.1 Data Acquisition

Initially, multiple approaches were attempted to obtain geospatial data:
Google Cloud API (unsuccessful due to payment processing issues)
Geopy library (encountered connection timeout errors)
Manual data collection from cvr.inecnigeria.org (impractical given time constraints)
Final Solution: A pre-compiled CSV file from a data analysis channel, containing longitude and latitude for all polling units, was used instead.

2.2 Analytical Approach

2.2.1 Clustering Methodology

Tools: Python with pandas, NumPy, and scikit-learn (the DBSCAN class from sklearn.cluster)
Clustering Algorithm: DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
Initial Radius: 1 km (resulting in 278 clusters)
Refined Radius: 3 km (reduced to 23 distinct clusters)
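
The clustering step above can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: it assumes the haversine metric was used so that the radius can be expressed in kilometres, and the function name cluster_polling_units, the min_samples value, and the sample coordinates are all hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, used to convert km to radians


def cluster_polling_units(lat_lon_deg, radius_km, min_samples=2):
    """Label each (latitude, longitude) pair with a DBSCAN cluster id.

    min_samples=2 is an illustrative assumption; points with no neighbour
    inside radius_km are labelled -1 (noise).
    """
    coords_rad = np.radians(np.asarray(lat_lon_deg, dtype=float))
    model = DBSCAN(
        eps=radius_km / EARTH_RADIUS_KM,  # haversine distances are in radians
        min_samples=min_samples,
        metric="haversine",
        algorithm="ball_tree",
    )
    return model.fit_predict(coords_rad)


# Made-up coordinates (degrees), not taken from the study's dataset.
sample_points = [
    (6.2104, 7.0741), (6.2150, 7.0780), (6.2060, 7.0700),  # first tight group
    (6.1413, 6.8029), (6.1450, 6.8060),                    # second tight group
    (5.9000, 7.3000),                                      # isolated point
]
labels = cluster_polling_units(sample_points, radius_km=3.0)
```

With a 3 km radius the two tight groups receive separate cluster labels and the isolated point is marked as noise. Passing a different radius_km (for example 1.0) is how the effect of the radius on the number of clusters could be explored.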

2.2.2 Outlier Detection: Z-Score Calculation

Method: Standardized z-score calculation for each polling unit within clusters
Formula:
z-score = (polling unit votes - cluster mean) / cluster standard deviation
Special Handling: When a cluster's standard deviation is zero, the z-score is set to zero to avoid division by zero.
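
A minimal pandas sketch of this calculation is shown below. The function name cluster_z_scores and the column names are hypothetical, and the use of the sample standard deviation (pandas' default, ddof=1) is an assumption, since the report does not say which variant was used.

```python
import pandas as pd


def cluster_z_scores(df, votes_col, cluster_col="cluster"):
    """Return each row's z-score relative to its own cluster.

    Assumes the sample standard deviation (ddof=1). Clusters with zero
    or undefined standard deviation get a z-score of 0.
    """
    grouped = df.groupby(cluster_col)[votes_col]
    mean = grouped.transform("mean")
    std = grouped.transform("std")
    z = (df[votes_col] - mean) / std
    # Zero std (identical votes) or NaN std (single-unit cluster) -> 0.0
    return z.where(std > 0, 0.0)


# Made-up vote counts for illustration only.
example = pd.DataFrame({
    "cluster": [0, 0, 0, 1, 1, 2],
    "votes": [10, 20, 30, 5, 5, 7],
})
example["z_score"] = cluster_z_scores(example, "votes")
```

In this toy frame, cluster 1 has identical vote counts and cluster 2 has a single polling unit, so both fall under the special handling and receive a z-score of zero.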

3. Detailed Findings

3.1 All Progressives Congress (APC)

Top Outliers in Cluster 0:
Ezinifite III (z-score: 37.2)
Awka II (z-score: 24.2)
Akwa (z-score: 15.7)
Cluster Mean Z-Score: -1.67453

3.2 Labour Party (LP)

Top Outliers:
Clusters 7 and 0 showed the most significant variations
Ama-Okpogba (z-score: 5.9)
Ama-Enugu II (z-score: 5.7)
Ihionuu/Nsugbe Street Area (z-score: 5.5)
Cluster Mean Z-Scores:
Cluster 7: 1.22507
Cluster 0: -1.00391

3.3 People’s Democratic Party (PDP)

Top Outliers in Cluster 0:
Amamkpu V Hall (z-score: 36.2)
Pioneer Primary School III (z-score: 22.6)
Amaudo Umuowele (z-score: 15.4)
Cluster Mean Z-Score: 0.070937

3.4 New Nigeria People’s Party (NNPP)

Top Outliers in Cluster 0:
Uruagu II (z-score: 36.1)
Ukpo I (z-score: 20.7)
Akwaeze (z-score: 16.6)
Cluster Mean Z-Score: 3.4518

4. Interpretation and Implications

4.1 Key Observations

All parties exhibited significant outliers predominantly in Cluster 0
Potential factors influencing outliers:
Differential campaign strategies
Varying candidate popularity
Localized political dynamics
Possible electoral irregularities

4.2 Methodological Insights

The geospatial and z-score analysis demonstrated effectiveness in:
Detecting voting pattern anomalies
Identifying statistically significant deviations
Providing a quantitative framework for electoral analysis
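
As an illustration of how extreme deviations might be shortlisted once z-scores are available, the sketch below ranks polling units by absolute z-score. The function name flag_outliers, the column names, the |z| > 3 cutoff, and the sample values are all assumptions made for the example; the report does not state which threshold was applied.

```python
import pandas as pd


def flag_outliers(df, z_col="z_score", threshold=3.0, top_n=3):
    """Return up to top_n rows whose |z| exceeds threshold, largest first.

    threshold=3.0 is a common rule of thumb, used here as an assumption.
    """
    flagged = df[df[z_col].abs() > threshold]
    order = flagged[z_col].abs().sort_values(ascending=False).index
    return flagged.loc[order].head(top_n)


# Made-up polling units and z-scores for illustration only.
scores = pd.DataFrame({
    "polling_unit": ["A", "B", "C", "D", "E"],
    "z_score": [0.4, -3.5, 7.2, 2.9, 4.1],
})
top = flag_outliers(scores)
```

Ranking on the absolute value keeps unusually low counts (large negative z-scores) in view alongside unusually high ones.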

5. Recommendations

Enhance electoral monitoring techniques
Develop more sophisticated geospatial analysis tools
Conduct further investigations into polling units with extreme z-scores
Implement transparent data collection and verification processes

6. Limitations

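Analysis relies on a pre-compiled, third-party dataset of polling-unit coordinates
Geospatial data could not be obtained directly (geocoding services and manual collection were not viable), so coordinates come from a secondary source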
Potential unaccounted local contextual factors

7. Conclusion

This analysis underscores the importance of advanced statistical techniques in electoral research. By leveraging geospatial analysis and z-score calculations, we can identify and scrutinize voting pattern anomalies, contributing to more transparent and accountable electoral processes.

Posted Nov 28, 2024

Democracy by the numbers: Tracking the untold narratives of political representation through cutting-edge data analysis.
