Automated Attendance Sheet Processing

Khalid Faiz

Automation Engineer
ML Engineer
Software Engineer
NLTK
OpenCV
Python
Terre des hommes

Project Details:

Objective:

The goal was to automate the tedious and error-prone process of splitting and renaming scanned PDF attendance sheets, ensuring that each file was correctly named with the attendee's name and the course they attended.

Tools and Technologies Used:

PyPDF2: For reading and manipulating PDF files.
pdf2image: To convert PDF pages into images for further processing.
Keras-OCR: A deep learning-based optical character recognition (OCR) tool to extract text from images, specifically the attendee's name and course title.
OpenCV: For image preprocessing tasks, including rotation correction and noise reduction, which enhanced the accuracy of text extraction.
NumPy: Used for efficient array manipulation and image processing tasks.
NLTK (Natural Language Toolkit): For fuzzy string matching to improve the accuracy of name and course recognition.
Poppler-Utils: A system dependency of pdf2image, which relies on it to render PDF pages as images.
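
The split-and-rename step could be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes one attendance sheet per PDF page and PyPDF2 2.0 or later. The function names `safe_filename` and `split_and_rename`, the `Name - Course.pdf` naming pattern, and the `extract_fields` callback (a stand-in for the pdf2image, OpenCV, and Keras-OCR stage) are all hypothetical.

```python
import re
from pathlib import Path
from typing import Callable, List, Tuple


def safe_filename(name: str, course: str) -> str:
    """Build a filesystem-safe file name such as 'Jane Doe - First Aid.pdf'.

    The 'Name - Course.pdf' pattern is an assumption for illustration.
    """
    def clean(text: str) -> str:
        # Drop characters that are invalid in file names on common systems,
        # then collapse runs of whitespace into single spaces.
        text = re.sub(r'[\\/:*?"<>|]', "", text)
        return re.sub(r"\s+", " ", text).strip()

    return f"{clean(name)} - {clean(course)}.pdf"


def split_and_rename(
    pdf_path: str,
    out_dir: str,
    extract_fields: Callable[[int], Tuple[str, str]],
) -> List[Path]:
    """Write each page of pdf_path to its own PDF named after its attendee.

    extract_fields is a hypothetical callback that takes a zero-based page
    index and returns (attendee_name, course_title), e.g. from an OCR step.
    """
    # Imported here so safe_filename stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(pdf_path)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for index, page in enumerate(reader.pages):
        name, course = extract_fields(index)
        writer = PdfWriter()
        writer.add_page(page)  # copy the single page into a new document
        target = out / safe_filename(name, course)
        with open(target, "wb") as handle:
            writer.write(handle)
        written.append(target)
    return written
```

Note that two pages resolving to the same name and course would overwrite each other in this sketch; a real pipeline would likely need a suffix or a review step for such collisions.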

Accuracy and Performance:

The automated process made handling attendance sheets faster and more accurate. Keras-OCR, combined with OpenCV preprocessing (rotation correction and noise reduction), extracted attendee names and course titles from scanned documents with high precision. Fuzzy matching then reduced the impact of the remaining recognition errors, even when scan quality varied.
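
One way fuzzy matching can absorb OCR misreads is to compare the extracted text against a list of expected names and keep the closest one. The sketch below is an assumption about how that might look, not the project's confirmed method. It uses NLTK's `edit_distance` when NLTK is installed and a small pure-Python equivalent otherwise. The `best_match` helper, the `max_ratio` threshold, and the sample names are all made up for illustration.

```python
from typing import Iterable, Optional

try:
    from nltk.metrics.distance import edit_distance
except ImportError:
    def edit_distance(s1: str, s2: str) -> int:
        """Plain Levenshtein distance, used only if NLTK is unavailable."""
        previous = list(range(len(s2) + 1))
        for i, c1 in enumerate(s1, start=1):
            current = [i]
            for j, c2 in enumerate(s2, start=1):
                current.append(min(
                    previous[j] + 1,               # deletion
                    current[j - 1] + 1,            # insertion
                    previous[j - 1] + (c1 != c2),  # substitution
                ))
            previous = current
        return previous[-1]


def best_match(
    ocr_text: str,
    candidates: Iterable[str],
    max_ratio: float = 0.3,
) -> Optional[str]:
    """Return the candidate closest to ocr_text, or None if none is close.

    max_ratio is an assumed threshold: the edit distance divided by the
    length of the longer string must not exceed it.
    """
    query = ocr_text.strip().lower()
    best, best_ratio = None, None
    for candidate in candidates:
        target = candidate.strip().lower()
        longest = max(len(query), len(target)) or 1
        ratio = edit_distance(query, target) / longest
        if best_ratio is None or ratio < best_ratio:
            best, best_ratio = candidate, ratio
    if best_ratio is not None and best_ratio <= max_ratio:
        return best
    return None


# Made-up roster; "Arnina" mimics a common OCR confusion of "m" with "rn".
roster = ["Amina Yusuf", "Omar Haddad", "Lina Khoury"]
matched = best_match("Arnina Yusuf", roster)
```

Returning `None` for poor matches, rather than always picking the nearest name, leaves room to flag unreadable sheets for manual review instead of silently misnaming them.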

Impact:

By automating the entire process, the manual effort required to manage attendance records was drastically reduced, allowing the outreach team to focus more on their core activities. This solution also minimized errors in document management, ensuring that each attendee's record was correctly documented.

This project showcases my ability to identify and solve workflow inefficiencies using technology, particularly in automating repetitive tasks and ensuring data accuracy.
