The project was to develop a smart ML/AI-driven tool that would ease the submission process in the insurance industry.
Objective
The main goal was to create an NER (named entity recognition) model that would extract names from e-mails and tag them with specific entities.
Procedure
1. Annotation was the most important step (apart from reading the e-mails into Python via Tika and the other text pre-processing). Names were tagged with the correct entities using regex.
2. After that, the entire corpus was used to train a word2vec model with 100 dimensions (a size settled on after several iterations).
3. The word2vec embeddings were then fed to a BLSTM model with 3 normal layers and 1 bi-directional layer. The BLSTM model was trained successfully. For this problem statement we focused more on recall than on precision, because we wanted to capture as many names as possible, even at the cost of some incorrect tags.
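The regex annotation step and the recall-over-precision trade-off can be illustrated with a minimal sketch. The cue words ("Insured", "Broker"), the entity labels, the BIO tagging scheme, and the helper names `annotate` and `precision_recall` are all assumptions made for illustration; the project's actual patterns and labels are not described here.

```python
import re

# Hypothetical rule: a capitalised name that follows a cue such as "Insured:".
# The cues and labels are illustrative assumptions, not the project's real rules.
CUE_PATTERN = re.compile(
    r"\b(?P<cue>Insured|Broker)\s*:\s*(?P<name>[A-Z][a-z]+(?: [A-Z][a-z]+)*)"
)
LABELS = {"Insured": "INSURED", "Broker": "BROKER"}

# Simple tokenizer: runs of word characters, or single punctuation marks.
TOKEN = re.compile(r"\w+|[^\w\s]")


def annotate(text):
    """Return (token, tag) pairs in BIO format, tagging names found by regex."""
    # Character spans of every matched name, with the label implied by its cue.
    spans = [
        (m.start("name"), m.end("name"), LABELS[m.group("cue")])
        for m in CUE_PATTERN.finditer(text)
    ]
    tagged = []
    for tok in TOKEN.finditer(text):
        tag = "O"
        for start, end, label in spans:
            if start <= tok.start() < end:
                # First token of a span gets "B-", the rest get "I-".
                prefix = "B" if tok.start() == start else "I"
                tag = f"{prefix}-{label}"
                break
        tagged.append((tok.group(), tag))
    return tagged


def precision_recall(gold, pred):
    """Token-level precision and recall over entity (non-'O') tags."""
    true_pos = sum(1 for g, p in zip(gold, pred) if p != "O" and g == p)
    pred_pos = sum(1 for p in pred if p != "O")
    gold_pos = sum(1 for g in gold if g != "O")
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    return precision, recall
```

As a usage example, `annotate("Insured: John Smith. Broker: Jane Doe")` tags "John" as `B-INSURED` and "Smith" as `I-INSURED`. A prediction that finds every gold name but also tags one extra token scores full recall with reduced precision, which is the kind of error a recall-focused model accepts.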
Results
We successfully built and deployed our NER model via Flask.
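A Flask deployment of this kind might look roughly like the sketch below. The `/ner` route, the JSON request shape, and the `predict_entities` placeholder are assumptions for illustration; the placeholder stands in for the trained BLSTM model and simply tags every token as "O".

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_entities(text):
    """Hypothetical stand-in for the trained model: tags every token as 'O'."""
    return [{"token": tok, "tag": "O"} for tok in text.split()]


@app.route("/ner", methods=["POST"])
def ner():
    # Accept a JSON body such as {"text": "..."}; tolerate missing or bad JSON.
    payload = request.get_json(silent=True)
    text = payload.get("text", "") if isinstance(payload, dict) else ""
    if not text:
        return jsonify({"error": "field 'text' is required"}), 400
    return jsonify({"entities": predict_entities(text)})
```

During development the app could be served with Flask's built-in server (`flask run`), and a client would POST the e-mail text to `/ner` to get back one tag per token.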
2018