I build a reliable PDF → structured JSON pipeline for extracting data from official and complex documents (multi-page PDFs, tables, mixed formatting, mixed languages). The focus is on accuracy and zero hallucination: every output is validated against a schema, and every extracted value is traced back to the place in the source document it came from.
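
To make "schema-validated and traced back to the source" concrete, here is a minimal sketch of what such a check could look like. It is an illustration under stated assumptions, not the actual pipeline: the `Field` record, the `SCHEMA` mapping, the `validate` function, and the invoice field names are all hypothetical, and page text is assumed to be already extracted into a list of strings (one per page).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One extracted value plus the evidence for it (hypothetical structure)."""
    value: object
    page: int   # 1-based page number the value was read from
    quote: str  # verbatim snippet from that page supporting the value


# Hypothetical schema: field name -> expected Python type.
SCHEMA = {"invoice_number": str, "total": float}


def validate(record: dict[str, Field], pages: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    errors = []
    for name, expected in SCHEMA.items():
        field = record.get(name)
        if field is None:
            errors.append(f"{name}: missing")
            continue
        # Schema check: the value must have the declared type.
        if not isinstance(field.value, expected):
            errors.append(f"{name}: expected {expected.__name__}")
        # Traceability check: the quote must appear verbatim on the cited page.
        if not 1 <= field.page <= len(pages):
            errors.append(f"{name}: page {field.page} out of range")
        elif not field.quote or field.quote not in pages[field.page - 1]:
            errors.append(f"{name}: quote not found on page {field.page}")
    # Reject anything the schema does not know about.
    for name in sorted(record.keys() - SCHEMA.keys()):
        errors.append(f"{name}: not in schema")
    return errors


if __name__ == "__main__":
    pages = ["Invoice No: INV-001", "Total due: 42.50 EUR"]
    record = {
        "invoice_number": Field("INV-001", 1, "INV-001"),
        "total": Field(42.5, 2, "42.50"),
    }
    print(validate(record, pages))
```

The design idea in this sketch is that a value is only accepted when it both fits the schema and can be matched to a verbatim snippet on a specific page, so anything the extractor invented fails the traceability check instead of reaching the output.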