o Patent filed for pAIges – an intelligent document extraction product based
on AI/ML.
o Designed and implemented MLOps for pAIges using Azure ML Studio. Built data
storage, pre-processing, hyper-parameter search (Bayesian and grid), model
training, model selection, versioning, and model deployment (as a REST API).
o Designed and implemented pAIges as a cloud-based (private cloud),
multi-tenant, subscription-based product with customizable, customer-specific
post-extraction and data storage components. Implemented data security at
every architecture component level, assessed by external auditors through
vulnerability and penetration testing. Established data security governance
and processes within the team, making it ready for ISO 27001 assessment.
o Designed and implemented an event-driven architecture for pAIges extraction
and metering using Kafka and Azure serverless components.
o Achieved high throughput and scale at optimized cost using Docker, Azure
Kubernetes, and serverless architecture, e.g. extraction of 500 documents per
hour at a cost of 1 Re/page.
o Implemented extraction of signatures, seals/stamps, and tick marks in pAIges
using a computer vision object detection model, with 99.9% accuracy.
o Performed GIS mapping of buildings, roads, and trees using the Segment
Anything model with 90% accuracy. Fine-tuned the model using a semantic
segmentation technique. Implemented a tiling technique that splits the image
into smaller tiles, applies the model to each tile, and joins the tiles back
together with exact tracing of the masks from each split.
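
The tile, predict, and stitch idea from the last bullet can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation. It assumes non-overlapping tiles, and it uses a hypothetical `brightness_mask` function as a stand-in for the segmentation model; `iter_tiles`, `segment_by_tiles`, and `tile_size` are also illustrative names.

```python
import numpy as np


def iter_tiles(height, width, tile_size):
    """Yield (top, left, bottom, right) boxes that cover the image exactly once."""
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            # Edge tiles are clipped so they never extend past the image.
            yield top, left, min(top + tile_size, height), min(left + tile_size, width)


def segment_by_tiles(image, predict_mask, tile_size=512):
    """Run predict_mask on each tile and paste each tile mask back at its offset."""
    height, width = image.shape[:2]
    full_mask = np.zeros((height, width), dtype=bool)
    for top, left, bottom, right in iter_tiles(height, width, tile_size):
        tile = image[top:bottom, left:right]
        tile_mask = np.asarray(predict_mask(tile), dtype=bool)
        if tile_mask.shape != tile.shape[:2]:
            raise ValueError("predict_mask must return a mask matching the tile size")
        # Writing at the tile's own offset keeps mask pixels aligned with the source.
        full_mask[top:bottom, left:right] = tile_mask
    return full_mask


def brightness_mask(tile):
    """Hypothetical stand-in for the model: marks pixels brighter than mid-gray."""
    return tile.mean(axis=-1) > 127


# Synthetic RGB image whose sides are not multiples of the tile size.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(700, 900, 3), dtype=np.uint8)
tiled = segment_by_tiles(image, brightness_mask, tile_size=256)
```

With a per-pixel predictor like this stand-in, the stitched mask matches the mask computed on the whole image. A real segmentation model depends on context near tile borders, so overlapping tiles or seam handling would likely be needed; that part is not shown here.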