Upload any invoice — PDF, image, or email — and instantly extract vendor, totals, line items, and dates. Powered by GPT-4o and Azure AI.
The Story
Invoice processing is one of the most tedious tasks in any business. I built this open-source tool to eliminate it: upload a PDF, image, or forwarded email, and the app extracts every critical field in seconds using a structured GPT-4o agent backed by regex fallbacks and MIME parsing.
The system handles messy, real-world invoices: scanned PDFs, low-quality images, multi-page documents, and inconsistent layouts. Azure OpenAI's structured output mode constrains every response to a fixed JSON schema, so the extracted data arrives in a consistent, ready-to-use shape with no malformed fields and no reformatting.
My Role
Full-Stack Developer
AI / Prompt Engineer
Product Designer
How I Built It
01
Document Ingestion & MIME Parsing
The pipeline starts before AI even touches the file. A MIME parser identifies the input type (PDF, image, or email attachment) and routes it to the correct pre-processor. PDFs are converted to page images, emails are stripped of HTML and their content is extracted, and images are normalized for consistent OCR quality.
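The routing step can be illustrated with a minimal sketch. It assumes detection by magic bytes with a file-extension fallback; the pre-processor names (`pdf_to_page_images`, `email_to_text`, `normalize_image`) are hypothetical labels, not the project's actual function names.

```python
import mimetypes

# Hypothetical pre-processor labels; the project's real routing table is not shown.
ROUTES = {
    "application/pdf": "pdf_to_page_images",
    "message/rfc822": "email_to_text",
}


def detect_mime(filename: str, head: bytes) -> str:
    """Guess a MIME type from leading magic bytes, then from the extension."""
    if head.startswith(b"%PDF-"):
        return "application/pdf"
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if head.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if filename.lower().endswith(".eml"):
        return "message/rfc822"
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


def route(filename: str, head: bytes) -> str:
    """Return the label of the pre-processor that should handle this input."""
    mime = detect_mime(filename, head)
    if mime.startswith("image/"):
        return "normalize_image"
    try:
        return ROUTES[mime]
    except KeyError:
        raise ValueError(f"unsupported input type: {mime}") from None
```

Checking magic bytes before the extension means a mislabeled upload (for example, a JPEG saved as `scan.bin`) still reaches the image pre-processor.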
02
GPT-4o Structured Output Agent
The pre-processed content is sent to an Azure OpenAI agent using Structured Output mode, which forces the model to return JSON that conforms to a defined schema with specific fields: vendor, invoice number, date, line items, subtotal, tax, and total. This rules out malformed or free-form responses and makes downstream parsing deterministic.
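As a rough sketch of what such a schema might look like: the top-level fields come from the list above, while the snake_case key names and the line-item sub-fields (`description`, `quantity`, `unit_price`) are assumptions. The `RESPONSE_FORMAT` dictionary follows the `json_schema` response format used for structured outputs; the API call itself is omitted, and `parse_invoice` and `totals_consistent` are hypothetical helpers.

```python
import json
from dataclasses import dataclass

# Line-item sub-fields are assumed for illustration.
LINE_ITEM_SCHEMA = {
    "type": "object",
    "properties": {
        "description": {"type": "string"},
        "quantity": {"type": "number"},
        "unit_price": {"type": "number"},
    },
    "required": ["description", "quantity", "unit_price"],
    "additionalProperties": False,
}

INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "invoice_number": {"type": "string"},
        "date": {"type": "string"},
        "line_items": {"type": "array", "items": LINE_ITEM_SCHEMA},
        "subtotal": {"type": "number"},
        "tax": {"type": "number"},
        "total": {"type": "number"},
    },
    "required": ["vendor", "invoice_number", "date", "line_items",
                 "subtotal", "tax", "total"],
    "additionalProperties": False,
}

# Shape of the response_format argument for a structured-output request.
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {"name": "invoice", "strict": True, "schema": INVOICE_SCHEMA},
}


@dataclass
class Invoice:
    vendor: str
    invoice_number: str
    date: str
    line_items: list
    subtotal: float
    tax: float
    total: float


def parse_invoice(raw: str) -> Invoice:
    """Load the model's JSON reply and reject it if required keys are missing."""
    data = json.loads(raw)
    missing = [key for key in INVOICE_SCHEMA["required"] if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return Invoice(**data)


def totals_consistent(invoice: Invoice, tolerance: float = 0.01) -> bool:
    """Check that subtotal + tax matches total within a small tolerance."""
    return abs(invoice.subtotal + invoice.tax - invoice.total) <= tolerance
```

The schema fixes the shape of the reply, not the correctness of its values, so an arithmetic check like `totals_consistent` is one cheap way to flag a suspicious extraction for review.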
03
Regex Fallback Layer
For fields the AI misses or is uncertain about, a regex fallback layer runs pattern matching for common invoice formats — dates, currency amounts, tax ID patterns, and PO numbers. This hybrid AI + rule-based approach achieves near-perfect extraction accuracy across diverse invoice layouts.
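A fallback of this kind could be sketched as follows. The patterns are illustrative assumptions covering only a few common formats (ISO and slash-separated dates, a labeled total, a labeled PO number); the project's actual patterns are not shown, and `fill_missing` is a hypothetical helper name.

```python
import re

# Illustrative patterns only; real invoices need a broader set.
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{4})\b")
# "\btotal\b" does not match inside "Subtotal" because there is no word boundary.
TOTAL_RE = re.compile(r"(?i)\btotal\b[^\d\n]*?([\d,]+\.\d{2})")
# Requires ":" or "#" after the label so words like "Portland" are not matched.
PO_RE = re.compile(
    r"(?i)\bP\.?O\.?(?:\s*(?:number|no\.?|#))?\s*[:#]\s*([A-Z0-9-]{4,})"
)

# Field name -> (pattern, converter applied to the captured text).
FALLBACKS = {
    "date": (DATE_RE, str),
    "total": (TOTAL_RE, lambda s: float(s.replace(",", ""))),
    "po_number": (PO_RE, str),
}


def fill_missing(extracted: dict, text: str) -> dict:
    """Fill only the fields the model left empty; never overwrite its values."""
    result = dict(extracted)
    for field, (pattern, convert) in FALLBACKS.items():
        if result.get(field) not in (None, ""):
            continue  # the model already supplied this field
        match = pattern.search(text)
        if match:
            result[field] = convert(match.group(1))
    return result
```

Because the regex layer only fills gaps, the model's output stays the primary source and the rules act as a safety net.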
04
Auth, Storage & Deployment
Secure cookie sessions handle auth with no external OAuth dependencies. Extracted results are stored in Supabase PostgreSQL with user-scoped access. The app is deployed on Render, with an environment that mirrors Azure App Service, keeping both options open for enterprise deployment.
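One common way to implement cookie sessions without an external provider is to sign the cookie value with an HMAC. The sketch below assumes that approach; the project's actual session mechanism is not described, and `sign_session` and `verify_session` are hypothetical helper names. A production version would also add an expiry timestamp and set the `Secure` and `HttpOnly` cookie attributes.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def sign_session(payload: dict, secret: bytes) -> str:
    """Encode the payload and append an HMAC-SHA256 signature."""
    body = base64.urlsafe_b64encode(
        json.dumps(payload, sort_keys=True).encode("utf-8")
    )
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return f"{body.decode('ascii')}.{signature}"


def verify_session(cookie: str, secret: bytes) -> Optional[dict]:
    """Return the payload if the signature is valid, otherwise None."""
    body, separator, signature = cookie.rpartition(".")
    if not separator:
        return None
    expected = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(body))
```

The signature lets the server detect a tampered cookie without storing session state, though the payload itself is only encoded, not encrypted.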