Arabic Manuscript Detection using Active learning

Ikram Aissiou

Data Scientist

Data Visualizer

Data Analyst

Jupyter

Python

TensorFlow

I developed a system to detect and interpret Arabic manuscripts, using a large dataset of historical documents. The project automated the classification and interpretation of handwritten Arabic texts by combining Optical Character Recognition (OCR) with natural language processing (NLP) techniques. Using active learning, I reduced the amount of labeled data required and allowed the model to keep improving as more data was introduced. This made it practical to handle the complex and varied handwriting styles common in Arabic manuscripts. I also applied deep learning techniques to improve text recognition accuracy and semantic understanding, which enabled more detailed analysis of the manuscripts. The system helped digitize, classify, and extract information from historical Arabic texts, contributing to the preservation and accessibility of these documents for future academic and cultural research.
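As an illustration of how active learning can cut labeling effort, here is a minimal sketch of the query step in pool-based active learning. It assumes entropy-based uncertainty sampling, which is one common strategy; the project description does not state which query strategy was actually used. The function names (`entropy`, `select_for_labeling`) and the example probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(pool_probs, batch_size):
    """Return indices of the `batch_size` most uncertain unlabeled samples.

    `pool_probs` holds one list of class probabilities per unlabeled sample,
    e.g. the softmax outputs of the current model (an assumption here).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,  # highest entropy (least confident) first
    )
    return ranked[:batch_size]


# Hypothetical model outputs for four unlabeled manuscript images, three classes
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low labeling value
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

# Indices of the samples that would be sent to a human annotator next
to_label = select_for_labeling(pool_probs, batch_size=2)
```

In a full loop, the selected samples would be labeled, added to the training set, and the model retrained before the next selection round, so annotation effort goes to the samples the model is least sure about.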

Key Technologies:

Python (Machine Learning, NLP)

Active Learning Algorithms

TensorFlow (Deep Learning)

Optical Character Recognition (OCR) for Arabic Text

Large-Scale Data Processing and Annotation

Outcomes:

Efficient detection and classification of Arabic manuscripts with limited labeled data

Improved accuracy in recognizing and understanding historical Arabic texts

Automated extraction of key information, aiding the preservation and study of cultural heritage

Enhanced ability to digitize large archives of Arabic manuscripts for research and educational purposes

Posted Sep 15, 2024

Developed a system for classifying and interpreting Arabic manuscripts using OCR, NLP, and deep learning, improving accuracy and aiding in digitization.
