Why: Single-pipeline task doesn't need LangChain's orchestration features. OpenAI's native structured outputs + Pydantic gives us schema enforcement with zero abstraction overhead.
When I'd use LangChain: Multi-step agents, tool routing, RAG systems, complex chains.
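A minimal sketch of the structured-outputs approach described above; the model name and the `LeadProfile` fields are illustrative assumptions, not the repo's actual schema:

```python
from pydantic import BaseModel, Field

# Hypothetical enrichment schema -- field names are illustrative only.
class LeadProfile(BaseModel):
    industry: str
    company_size: str = Field(description="e.g. '1-10', '11-50', '500+'")
    score: int = Field(ge=0, le=100)

def enrich_lead(raw_description: str) -> LeadProfile:
    # The SDK converts the Pydantic model into a JSON schema and the API
    # guarantees the response parses back into that model.
    from openai import OpenAI
    client = OpenAI()
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-mini",  # assumption; any structured-outputs model works
        messages=[{"role": "user", "content": f"Enrich this lead: {raw_description}"}],
        response_format=LeadProfile,
    )
    return completion.choices[0].message.parsed

# Pydantic enforces the schema even before any API call:
lead = LeadProfile(industry="SaaS", company_size="11-50", score=85)
```

The point of the zero-abstraction claim: the schema is one Pydantic class, and validation failures surface as normal Python exceptions rather than being hidden inside a chain.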
2. ThreadPoolExecutor (No Async)
Why: Simpler than async, works with the sync OpenAI client, and is a good fit for I/O-bound API calls. Processes 100 leads in approximately 20 seconds vs. 200 seconds synchronously.
When I'd use async: 1000+ concurrent requests, websockets, or integration with async frameworks.
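A self-contained sketch of the thread-pool pattern, with `time.sleep` standing in for the blocking API call (during real network I/O the GIL is released, so threads overlap the waits):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def enrich(lead: str) -> str:
    # Stand-in for a blocking OpenAI call; real I/O releases the GIL.
    time.sleep(0.01)
    return f"enriched:{lead}"

leads = [f"lead-{i}" for i in range(100)]

# 10 workers overlap the I/O waits that a sequential loop would pay one by one.
with ThreadPoolExecutor(max_workers=10) as pool:
    futures = {pool.submit(enrich, lead): lead for lead in leads}
    results = [f.result() for f in as_completed(futures)]
```

With 10 workers and a ~2 s call, 100 leads take roughly 100/10 × 2 s ≈ 20 s, which is where the 20 s vs. 200 s figure comes from.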
3. Tenacity for Retries
Why: The de facto standard Python retry library. Handles transient failures (rate limits, timeouts) with exponential backoff.
Alternative: Manual retry loops (extra lines of error-prone boilerplate).
4. Fallback Enrichment
Why: The challenge requires resilience when the LLM fails: rule-based keyword matching ensures the pipeline always produces output, even without an API key.
Tradeoff: Lower accuracy than the LLM (around 70% vs. the LLM's 90%), but it prevents total system failure.
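A minimal sketch of what rule-based fallback enrichment could look like; the keyword map below is a made-up illustration, not the project's actual rules:

```python
# Hypothetical keyword-to-industry rules; the real mapping lives in the repo.
INDUSTRY_KEYWORDS = {
    "saas": ["platform", "subscription", "cloud software"],
    "ecommerce": ["shop", "store", "checkout"],
    "fintech": ["payments", "banking", "lending"],
}

def fallback_enrich(description: str) -> dict:
    """Keyword matching used when the LLM call fails or no API key is set."""
    text = description.lower()
    for industry, keywords in INDUSTRY_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return {"industry": industry, "source": "rules", "confidence": "low"}
    return {"industry": "unknown", "source": "rules", "confidence": "low"}

result = fallback_enrich("Cloud software platform for HR teams")
```

Tagging the output with `source: rules` lets downstream consumers weight these lower-confidence results differently from LLM-derived ones.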
Production Considerations
Not implemented (out of scope for the time limit):
Async processing: Celery + Redis for 10k+ leads/hour
Cost optimization: Cache by domain (70% reduction), use batch API
Monitoring: LangSmith for prompt debugging, Datadog metrics
CRM integration: Webhooks to the CRM
Score explainability: Breakdown by rule
Architecture: FastAPI service
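To make the domain-caching idea above concrete, here is a sketch using the standard library's `lru_cache`; the 70% figure would hold if roughly 3 leads on average share each company domain (an assumption, not a measured number from this repo):

```python
from functools import lru_cache

@lru_cache(maxsize=10_000)
def enrich_domain(domain: str) -> str:
    # Placeholder for the expensive LLM enrichment call; only the first
    # lead per domain pays for it, later leads reuse the cached profile.
    return f"profile:{domain}"

# Three leads at acme.com -> one real call, two cache hits.
for lead_domain in ["acme.com", "acme.com", "acme.com", "globex.com"]:
    enrich_domain(lead_domain)

stats = enrich_domain.cache_info()  # hits=2, misses=2 for the loop above
```

In production this would be a shared cache (e.g. Redis) rather than per-process memory, but the keying strategy is the same.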
Project Structure
This README has been enriched by AI.
Posted Apr 10, 2026
Developed an AI lead enrichment and routing system using LLMs.