Automated Supplier Sourcing Tool by Pedro Santanna

Automated Supplier Sourcing Tool

Pedro Santanna

Organic Supplier Search Engine

Built a full-stack search tool that turns hundreds of PDF certification documents, listing thousands of products, into an instant, searchable supplier directory.

The Problem

A food & beverage company needed to find EU-organic certified suppliers for specific ingredients such as acai, jabuticaba, and mango. The issue? Hundreds of PDF certificates were scattered across a government directory and written in several different languages. Reviewing them manually took weeks per sourcing round.

What I Built

An automated pipeline + search interface that reduced supplier discovery from weeks to minutes.
The system works in two parts:
1. Data Pipeline (fully automated)
Pulls supplier metadata from a public certification API
Downloads PDF certificates via headless browser scraping
Parses product listings from multi-language PDFs (handles Portuguese, French, Spanish)
Loads everything into a database with full-text search indexing
One command. ~10 minutes. Hundreds of suppliers, thousands of products — indexed and searchable by the user.
2. Search Interface
Real-time search with instant results as you type
Filter by country and product category
Results grouped by supplier with matched products highlighted
Direct links to original certification documents for verification
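The grouping and highlighting behavior of the search interface can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `ProductHit` row shape, the function names, and the one-certificate-per-supplier simplification are all assumptions, and square brackets stand in for whatever markup the real UI uses to highlight matches.

```typescript
// Hypothetical row shape: one matched product per row, as a search query might return it.
interface ProductHit {
  supplierId: string;
  supplierName: string;
  country: string;
  product: string;
  certificateUrl: string; // assumption: one certificate link per supplier
}

interface SupplierResult {
  supplierId: string;
  supplierName: string;
  country: string;
  certificateUrl: string;
  products: string[];
}

// Group flat hits by supplier, keeping suppliers in first-seen order.
function groupBySupplier(hits: ProductHit[]): SupplierResult[] {
  const groups = new Map<string, SupplierResult>();
  for (const hit of hits) {
    let group = groups.get(hit.supplierId);
    if (!group) {
      group = {
        supplierId: hit.supplierId,
        supplierName: hit.supplierName,
        country: hit.country,
        certificateUrl: hit.certificateUrl,
        products: [],
      };
      groups.set(hit.supplierId, group);
    }
    group.products.push(hit.product);
  }
  return Array.from(groups.values());
}

// Wrap case-insensitive matches of the query in [brackets] as a stand-in for UI highlighting.
function highlight(text: string, query: string): string {
  const q = query.trim();
  if (q === "") return text;
  // Escape regex metacharacters so user input is matched literally.
  const escaped = q.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return text.replace(new RegExp(escaped, "gi"), (match) => `[${match}]`);
}
```

Under these assumptions, `highlight("Organic Mango Pulp", "mango")` returns `"Organic [Mango] Pulp"`, and each entry from `groupBySupplier` carries the certificate link used for verification.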

Stack

Next.js 14 / TypeScript / Tailwind CSS / Supabase (PostgreSQL) / Vercel / Puppeteer / pdf-parse

Key Decisions

Supabase full-text search over Algolia or Elasticsearch: right-sized for a dataset of hundreds of suppliers and thousands of products, no extra service to pay for, and it runs on PostgreSQL's built-in full-text search
Scalable architecture: the web app is just a read layer; all complexity lives in the data pipeline, which makes it easy to re-run and to extend to new product categories
Multi-language PDF parsing: spelling is normalized (açaí = acai) so searches work regardless of how the certificate was written
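The spelling normalization mentioned above (açaí = acai) could look something like this sketch. It assumes accents are removed by Unicode decomposition and that matching is a simple substring check; `normalizeTerm` and `matchesQuery` are hypothetical names, and the real project may instead do this inside the database.

```typescript
// Hypothetical helper: fold accents, case, and extra whitespace so that
// "açaí", "Açaí" and "acai" all compare equal.
function normalizeTerm(text: string): string {
  return text
    .normalize("NFD")                // split letters from their combining accents
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/\s+/g, " ")            // collapse runs of whitespace
    .trim();
}

// True when the normalized product name contains the normalized query.
// An empty query matches nothing.
function matchesQuery(productName: string, query: string): boolean {
  const q = normalizeTerm(query);
  return q.length > 0 && normalizeTerm(productName).includes(q);
}
```

For example, `normalizeTerm("Açaí")` returns `"acai"`, so a search for `ACAI` matches a certificate line such as `Polpa de açaí orgânica`. Applying the same function to both the indexed text and the query is what keeps the two sides comparable.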

Impact

What used to take a sourcing team weeks of manual PDF review is now a 10-second search. The pipeline is re-runnable, so the data stays fresh as certifications update.
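One way a pipeline stays safely re-runnable is to upsert records by a stable key, so a second run overwrites existing suppliers instead of duplicating them. The sketch below illustrates that idea with an in-memory `Map` standing in for the database table. The `SupplierRecord` shape and `upsertSuppliers` function are hypothetical and not taken from the project.

```typescript
// Hypothetical record shape produced by the PDF parsing stage.
interface SupplierRecord {
  supplierId: string;
  name: string;
  products: string[];
}

// In-memory stand-in for the suppliers table, keyed by supplier ID.
type SupplierStore = Map<string, SupplierRecord>;

// Upsert each record by supplier ID and return how many IDs were new in this run.
function upsertSuppliers(store: SupplierStore, records: SupplierRecord[]): number {
  let inserted = 0;
  for (const record of records) {
    if (!store.has(record.supplierId)) inserted += 1;
    // Overwrite rather than append, copying the product list so later
    // changes to the input do not leak into the store.
    store.set(record.supplierId, {
      supplierId: record.supplierId,
      name: record.name,
      products: record.products.slice(),
    });
  }
  return inserted;
}
```

Running the same batch twice leaves the store unchanged, and a supplier whose certificate now lists different products simply has its product list replaced on the next run.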

Posted Apr 2, 2026

Created a search tool to streamline finding EU-organic certified suppliers. From Raw PDFs to a Live Search Tool.