Convert unstructured data to structured

Hariharamoorthy Theriappan

ML Engineer
Software Architect
Software Engineer
Azure
Azure Functions
Python
Tao Automation
o    Patent filed for pAIges – an intelligent document extraction product based on AI/ML.
o    Designed and implemented MLOps for pAIges using Azure ML Studio. Built data storage, pre-processing, hyper-parameter search (Bayesian and grid), model training, model selection, versioning, and model deployment (as a REST API).
o    Designed and implemented pAIges as a cloud-based (private cloud), multi-tenant, subscription-model product with customizable, customer-specific post-extraction and data storage components. Implemented data security at each architecture component level, assessed by external auditors through vulnerability and penetration testing. Established data security governance and processes within the team, making it ready for ISO 27001 assessment.
o    Designed and implemented an event-driven architecture for pAIges extraction and metering using Kafka and Azure serverless components.
o    Achieved high throughput and scale at optimized cost using Docker, Azure Kubernetes, and a serverless architecture, e.g. extraction of 500 documents in an hour at a cost of 1 Re/page.
o    Implemented extraction of signatures, seals/stamps, and tick marks in pAIges using an object detection computer vision (CV) model, with 99.9% accuracy.
o    Performed GIS mapping of buildings, roads, and trees using the Segment Anything model with 90% accuracy. Fine-tuned the model using a semantic segmentation technique. Implemented a technique that splits the image into smaller tiles, applies the model to each tile, and joins the tiles back together with exact tracing of the masks from each split.
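
The tile-based approach in the GIS mapping item can be illustrated with a minimal sketch. This is not the project's actual code: the function names (split_into_tiles, segment_tiled, threshold_segmenter) are hypothetical, the tiles are assumed to be non-overlapping, and a simple brightness threshold stands in for the Segment Anything model so the example runs without model weights.

```python
import numpy as np

def split_into_tiles(image, tile_size):
    """Yield (row, col, tile) for non-overlapping tiles.

    Tiles on the right and bottom edges may be smaller than tile_size.
    """
    height, width = image.shape[:2]
    for row in range(0, height, tile_size):
        for col in range(0, width, tile_size):
            yield row, col, image[row:row + tile_size, col:col + tile_size]

def segment_tiled(image, tile_size, segment_tile):
    """Run segment_tile on each tile and stitch the masks back together.

    segment_tile is any callable that takes a tile and returns a boolean
    mask with the same height and width as that tile.
    """
    height, width = image.shape[:2]
    full_mask = np.zeros((height, width), dtype=bool)
    for row, col, tile in split_into_tiles(image, tile_size):
        tile_mask = segment_tile(tile)
        # Write each tile mask at its original offset in the full image.
        full_mask[row:row + tile.shape[0], col:col + tile.shape[1]] = tile_mask
    return full_mask

def threshold_segmenter(tile):
    """Hypothetical stand-in for a real segmentation model (assumption)."""
    return tile > 128

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(10, 13))  # small grayscale test image
    mask = segment_tiled(image, tile_size=4, segment_tile=threshold_segmenter)
    print(mask.shape)
```

Because each mask is written back at the offset of the tile it came from, the stitched mask lines up pixel for pixel with the original image. A real model may produce artifacts at tile borders, which is one reason overlapping tiles are often used in practice; this sketch does not handle that case.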