Solution & Approach: I built a fully automated invoice processing system on self-hosted
n8n that handles 10,000 pages monthly with 98% accuracy. The system monitors a
Google Drive folder for new invoices. When clients upload documents (including multi-page PDFs),
Azure Document Intelligence extracts text via OCR at about $5 monthly for this volume. Then
GPT-4o mini processes the extracted text through battle-tested prompts that handle edge cases such as vendor names spanning two lines or optional tax fields. The AI extracts 70+ data fields per invoice. Processing takes about one minute per document, comparable to manual entry speed but with no manual effort. All extracted data lands in
Google Sheets with proper formatting (numbers as numbers, dates as dates). The accounting software ingests the sheet directly and flags anomalies automatically. A human reviewer touches only the flagged items and provides feedback, which we use to iterate on the prompt. Built-in
Stripe integration handles usage-based billing per page processed.
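
To illustrate the "numbers as numbers, dates as dates" step and the idea of flagging items for review, here is a minimal Python sketch. It is not the production n8n workflow. The field names (`vendor`, `total`, `tax`, `invoice_date`, `due_date`), the accepted date formats, and the helper `normalize_invoice` are all hypothetical, chosen only for illustration. It takes the raw strings a model might return, coerces them to typed values, and collects anything it cannot parse as a flag for the human reviewer.

```python
from datetime import date, datetime
from decimal import Decimal, InvalidOperation
from typing import Optional

# Hypothetical field names; the real system extracts 70+ fields.
NUMERIC_FIELDS = {"subtotal", "tax", "total"}
DATE_FIELDS = {"invoice_date", "due_date"}
OPTIONAL_FIELDS = {"tax"}  # optional fields may be empty without a flag
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # assumed input formats


def parse_number(raw: str) -> Decimal:
    """Strip a currency symbol and thousands separators, e.g. '$1,234.50'."""
    cleaned = raw.replace(",", "").replace("$", "").strip()
    return Decimal(cleaned)


def parse_date(raw: str) -> date:
    """Try each assumed format in turn; raise ValueError if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def normalize_invoice(fields: dict[str, Optional[str]]) -> tuple[dict, list[str]]:
    """Return (typed row, list of flags for human review)."""
    row: dict = {}
    flags: list[str] = []
    for name, raw in fields.items():
        if raw is None or raw.strip() == "":
            if name not in OPTIONAL_FIELDS:
                flags.append(f"{name}: missing")
            row[name] = None
            continue
        try:
            if name in NUMERIC_FIELDS:
                row[name] = parse_number(raw)
            elif name in DATE_FIELDS:
                row[name] = parse_date(raw)
            else:
                # Collapse line breaks, e.g. a vendor name spanning two lines.
                row[name] = " ".join(raw.split())
        except (InvalidOperation, ValueError):
            row[name] = raw  # keep the raw text so the reviewer can see it
            flags.append(f"{name}: could not parse {raw!r}")
    return row, flags
```

Calling `normalize_invoice({"vendor": "Acme\n Corp", "total": "$1,234.50", "due_date": "soon"})` would yield a row with a single-line vendor name and a `Decimal` total, plus one flag for the unparseable due date. In a setup like the one described, only rows with a non-empty flag list would need a reviewer's attention.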