Ahsin Shabbir
To consistently categorize millions of pages of documents every day, the client hired me to build an AI solution. The client is in the mortgage industry and regularly receives 200-page documents, thousands of which are processed daily. At this scale, the need for AI becomes clear. Before I was hired, the work was performed entirely by humans. With the solution I provided, the AI handles 70% of the work with 90% accuracy. Because the automatic sorting reduced the headcount needed, this saved the company an astonishing $8 million per year.
The data consists of documents in PDF format. I used Python to perform Optical Character Recognition (OCR) on the PDF documents to extract the text. An example dataset is shown below.
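The OCR step itself can be sketched in a few lines of Python. This is a minimal illustration, not the client code: it assumes the third-party packages pdf2image and pytesseract (which in turn need Poppler and the Tesseract engine installed), and the function names extract_page_text and normalize_whitespace are hypothetical.

```python
def extract_page_text(pdf_path, dpi=300):
    """Return one OCR'd text string per page of the PDF.

    Assumption: pdf2image and pytesseract are installed. They are imported
    here, inside the function, so the rest of the module loads without them.
    """
    from pdf2image import convert_from_path
    import pytesseract

    # Render every PDF page to an image, then run OCR on each image.
    page_images = convert_from_path(pdf_path, dpi=dpi)
    return [pytesseract.image_to_string(image) for image in page_images]


def normalize_whitespace(text):
    """Collapse the line breaks and runs of spaces that OCR output contains."""
    return " ".join(text.split())


def pdf_to_text(pdf_path):
    """Hypothetical helper: OCR a whole PDF into one cleaned string."""
    pages = extract_page_text(pdf_path)
    return " ".join(normalize_whitespace(page) for page in pages)
```

Rendering at a higher dpi generally gives the OCR engine more detail to work with, at the cost of slower processing, which matters when thousands of 200-page documents arrive daily.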
The solution I developed is an AI model that reads the document content and outputs the category it believes the document belongs to. I sourced the training data from the client by working with business analysts and senior managers, then organized the documents into a training dataset and an evaluation dataset. On the evaluation data, the model performs with 90% accuracy. I set up the Python training and deployment code so that the client can continue to use the model long-term without any new code work. To retrain the model, the client simply runs the script I wrote, specifying where the training and evaluation data are located and the minimum acceptable accuracy for deployment; the script takes care of the entire process. I trained the client's business analysts on how to use the script.
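The retraining workflow described above (train, evaluate, deploy only if accuracy is acceptable) can be sketched as follows. This is an illustration under stated assumptions, not the delivered script: a small Naive Bayes classifier stands in for the actual model, which is not specified here, and every function name and sample document is hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model (a stand-in for the real model)."""
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"word_counts": word_counts,
            "label_counts": Counter(labels),
            "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for token in tokenize(text):
            # Add-one smoothing so unseen words do not zero out a class.
            score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs, labels):
    correct = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return correct / len(labels)


def retrain_and_gate(train, evaluation, min_accuracy):
    """Train on `train`, score on `evaluation`, and report whether the
    model meets the minimum accuracy required for deployment."""
    model = train_naive_bayes(*train)
    acc = accuracy(model, *evaluation)
    return model, acc, acc >= min_accuracy


# Hypothetical sample data, for illustration only.
train = (["promissory note borrower promises to pay",
          "appraisal report property value estimate",
          "note borrower lender interest rate",
          "appraisal of property market value"],
         ["note", "appraisal", "note", "appraisal"])
evaluation = (["borrower promises to pay lender",
               "estimate of property value"],
              ["note", "appraisal"])

model, acc, deploy = retrain_and_gate(train, evaluation, min_accuracy=0.9)
```

Keeping the accuracy threshold as a parameter means a retrained model that falls short is simply not deployed, so the people running the script never need to touch the code.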
As you can see, using a machine learning model for document categorization yields a massive cost saving. Hiring me to develop the AI is a one-time investment; after that, you have total ownership and can retrain the model yourself.
1) Exploratory Data Analysis
2) Data Preprocessing
3) Feature Engineering
4) One Hot Encoding
5) Base Models
6) Model Tuning
7) Natural Language Processing
8) Comparison of Final Models
9) Reporting
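Of the steps listed above, one-hot encoding is the simplest to show in code. The sketch below is a plain-Python illustration of the idea only; the function name one_hot_encode and the sample categories are hypothetical and not taken from the client project.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector with a single 1.

    Returns the sorted category list (the column order) and one row
    per input value.
    """
    categories = sorted(set(values))
    column = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column[value]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical document categories, for illustration only.
categories, rows = one_hot_encode(["note", "appraisal", "note"])
```

Each row has exactly one 1, in the column of its category, so the encoding implies no ordering between categories.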
Work with me https://contra.com/ashabbi00_ursnb1o3?utm_campaign=social_sharing&utm_medium=independent_share&utm_source=copy_link