Arabic Manuscript Detection Using Active Learning

Ikram Aissiou

Data Scientist
Data Visualizer
Data Analyst
Jupyter
Python
TensorFlow
I developed a system to detect and interpret Arabic manuscripts, built on a large dataset of historical documents. The project automated the classification and interpretation of handwritten Arabic text by combining Optical Character Recognition (OCR) with natural language processing (NLP). Using active learning, I reduced the amount of labeled data required and let the model keep improving as more data was introduced. This made it practical to handle the complex and varied handwriting styles common in Arabic manuscripts. I also applied deep learning to improve text recognition accuracy and semantic understanding, which enabled detailed analysis of the manuscripts. The system helped digitize, classify, and extract information from historical Arabic texts, supporting their preservation and accessibility for future academic and cultural research.
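To illustrate how active learning can cut labeling effort, here is a minimal sketch of uncertainty sampling, one common query strategy. The description above does not state which strategy the project used, so this is an assumption for illustration only. The function names, the probability values, and the batch size are hypothetical; in practice the probabilities would come from the model's softmax output on unlabeled manuscript images.

```python
import numpy as np


def entropy(probs):
    """Predictive entropy per sample for an (n_samples, n_classes) array."""
    # Clip to avoid log(0) on fully confident predictions
    p = np.clip(probs, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=1)


def select_for_labeling(probs, batch_size):
    """Return indices of the batch_size samples the model is least sure about."""
    scores = entropy(probs)
    # argsort is ascending, so reverse it to put the highest entropy first
    return np.argsort(scores)[::-1][:batch_size]


# Hypothetical class probabilities for four unlabeled samples, three classes
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, so most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])

# Indices of the two samples to send to a human annotator next
picked = select_for_labeling(probs, batch_size=2)
print(picked)
```

Only the selected samples are sent for annotation; the model is then retrained on the enlarged labeled set and the selection step repeats. Entropy is one of several possible uncertainty scores; margin or least-confidence scoring would slot into the same loop.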
Key Technologies:
Python (Machine Learning, NLP)
Active Learning Algorithms
TensorFlow (Deep Learning)
Optical Character Recognition (OCR) for Arabic Text
Large-Scale Data Processing and Annotation
Outcomes:
Efficient detection and classification of Arabic manuscripts with limited labeled data
Improved accuracy in recognizing and understanding historical Arabic texts
Automated extraction of key information, aiding the preservation and study of cultural heritage
Greater capacity to digitize large archives of Arabic manuscripts for research and educational purposes