AuraExtract — Intelligent Invoice & Receipt Data Extractor
The extraction engine uses regex pattern matching built for real-world invoice layouts: column-per-line PDF formats, inline tabular formats, and plain text documents. It detects 10 fields automatically and parses up to 20 line items per invoice.
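As a rough illustration of this kind of regex-driven extraction, here is a minimal sketch. The field names, the patterns, and the helper functions (`extract_fields`, `extract_line_items`) are assumptions made for this example and are not taken from the project. Only three fields and the inline tabular layout are covered; the 20-item cap mirrors the limit stated above.

```python
import re
from decimal import Decimal

# Hypothetical patterns for three summary fields; the project's own
# patterns and its full list of 10 fields are not shown in the listing.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number|#)\s*[:\-]?\s*([A-Z0-9][A-Z0-9\-]*)", re.I
    ),
    "invoice_date": re.compile(
        r"date\s*[:\-]?\s*(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{2,4})", re.I
    ),
    # Anchored to the start of a line so that "Subtotal" is not picked up.
    "total": re.compile(
        r"^\s*(?:grand\s+)?total\s*[:\-]?\s*[$€£]?\s*([\d,]+\.\d{2})", re.I | re.M
    ),
}

# Inline tabular layout: description, quantity, unit price, amount on one line.
LINE_ITEM = re.compile(
    r"^[ \t]*(?P<description>.+?)[ \t]+(?P<qty>\d+)[ \t]+"
    r"[$€£]?(?P<unit_price>[\d,]+\.\d{2})[ \t]+"
    r"[$€£]?(?P<amount>[\d,]+\.\d{2})[ \t]*$",
    re.M,
)

MAX_LINE_ITEMS = 20  # cap stated in the project description


def to_decimal(raw):
    """Convert a string such as '1,250.50' to Decimal('1250.50')."""
    return Decimal(raw.replace(",", ""))


def extract_fields(text):
    """Return a dict of field name to matched string, or None if not found."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


def extract_line_items(text):
    """Return at most MAX_LINE_ITEMS parsed line items, in document order."""
    items = []
    for match in LINE_ITEM.finditer(text):
        items.append({
            "description": match.group("description"),
            "quantity": int(match.group("qty")),
            "unit_price": to_decimal(match.group("unit_price")),
            "amount": to_decimal(match.group("amount")),
        })
        if len(items) == MAX_LINE_ITEMS:
            break
    return items
```

Column-per-line PDF layouts, where each cell lands on its own line, would need a separate pattern or a pre-pass that regroups lines; that case is left out of this sketch.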
Supports PDF, TXT, and DOCX formats. Includes a raw text preview panel so users can verify exactly what the engine is reading. The CSV export includes both the summary fields and the full line-items table, ready to open directly in Excel.
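One way such a combined export could be laid out is a summary block, a blank row, then the line-items table. The sketch below uses only the standard `csv` module; the function name `export_csv`, the column names, and the two-section layout are assumptions for illustration, not the project's actual output format.

```python
import csv
import io

# Hypothetical column order for the line-items table.
ITEM_COLUMNS = ["description", "quantity", "unit_price", "amount"]


def export_csv(fields, line_items):
    """Build CSV text: summary fields first, a blank row, then line items."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)

    # Summary section: one "Field, Value" row per extracted field.
    writer.writerow(["Field", "Value"])
    for name, value in fields.items():
        writer.writerow([name, "" if value is None else value])

    # A blank row separates the two sections when opened in a spreadsheet.
    writer.writerow([])

    # Line-items section with its own header row.
    writer.writerow(ITEM_COLUMNS)
    for item in line_items:
        writer.writerow([item.get(column, "") for column in ITEM_COLUMNS])

    return buffer.getvalue()
```

When saving the result to disk for Excel, writing the file with `encoding="utf-8-sig"` and `newline=""` helps Excel recognize UTF-8 text and avoids doubled line breaks on Windows.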
Pure Python. Zero external dependencies beyond pypdf for PDF reading.
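To show how all three formats can be read with pypdf as the only third-party package, here is a minimal sketch. Reading DOCX through the standard-library `zipfile` and `xml.etree.ElementTree` modules is an assumption about how such a tool might avoid extra dependencies; the listing does not say how the project does it. The function names (`read_pdf`, `read_docx`, `read_txt`, `load_text`) are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used by elements inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_pdf(path):
    """Extract text from every page of a PDF using pypdf."""
    # Imported lazily so TXT and DOCX reading work without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def read_docx(path):
    """Extract paragraph text from a DOCX file using only the stdlib."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        paragraphs.append("".join(node.text or "" for node in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)


def read_txt(path):
    """Read a plain text file, replacing undecodable bytes."""
    return Path(path).read_text(encoding="utf-8", errors="replace")


READERS = {".pdf": read_pdf, ".docx": read_docx, ".txt": read_txt}


def load_text(path):
    """Dispatch on file extension and return the document's raw text."""
    suffix = Path(path).suffix.lower()
    try:
        reader = READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix}") from None
    return reader(path)
```

The string returned by `load_text` is also what a raw text preview would display, so the user sees exactly the input the regex patterns run against.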
Posted Mar 9, 2026