French Legal AI Assistant & Agentic RAG System
Overview
I designed, built, and deployed a specialized Legal AI Assistant for French lawyers using agentic RAG, legal data pipelines, vector search, reranking, open-source LLMs, and citation-grounded answer generation. The system allowed lawyers to ask legal questions and receive answers grounded in French law articles, legal references, and relevant judicial cases.
Problem / Challenge
Legal data is very different from normal document data. A generic RAG pipeline using fixed-size chunks often breaks legal meaning, misses important context, or retrieves incomplete references.
The main challenges were:
🔹 Legal documents had different structures and lengths
🔹 Articles and laws could not be randomly split into fixed-size chunks
🔹 Each answer needed traceable legal references
🔹 Retrieval had to understand legal scope, not just semantic similarity
🔹 The system needed to reduce hallucinations for legal users
🔹 Deployment had to respect privacy and regulatory requirements
My Expertise
I worked as the Lead AI Engineer / Agentic RAG Developer responsible for the complete system design and implementation.
My responsibilities included:
🔹 Legal data pipeline architecture
🔹 Document parsing and preprocessing
🔹 Custom legal chunking strategy
🔹 Vector database design
🔹 Agentic RAG workflow development
🔹 Retrieval optimization and reranking
🔹 Open-source LLM deployment
🔹 Backend API development with FastAPI
🔹 Secure Azure cloud deployment
🔹 Multi-tenant system support
French Legal Data Engineering Pipeline
I built an automated ETL pipeline to process thousands of French legal documents, articles, and judicial cases.
The pipeline handled:
🔹 Raw legal document ingestion
🔹 Text cleaning and normalization
🔹 Legal article extraction
🔹 Section-aware document structuring
🔹 Custom chunk generation
🔹 Metadata extraction for article number, article title, section, source, and reference
🔹 Embedding generation
🔹 Vector database ingestion
🔹 Repeatable updates for future legal data expansion
The chunking strategy was designed so legal articles were not cut in the middle or separated from their meaning.
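The article-aware chunking described above can be illustrated with a minimal sketch. It assumes each article starts on its own line with a heading such as "Article 1240" or "Article L. 1231-1"; the `ARTICLE_HEADING` pattern, the `LegalChunk` fields, and the `chunk_by_article` helper are hypothetical names chosen for this example, not the production pipeline.

```python
import re
from dataclasses import dataclass

# Assumed heading pattern: "Article" followed by a number such as
# "1240", "L. 1231-1" or "R. 4127-2". Real sources may need other patterns.
ARTICLE_HEADING = re.compile(
    r"^Article\s+(?P<number>(?:[LRD]\.?\s*)?\d+(?:-\d+)*)\s*$", re.MULTILINE
)


@dataclass
class LegalChunk:
    article_number: str  # metadata kept alongside the text for citations
    source: str
    text: str


def chunk_by_article(raw_text: str, source: str) -> list[LegalChunk]:
    """Split a legal text into one chunk per article, never mid-article."""
    matches = list(ARTICLE_HEADING.finditer(raw_text))
    chunks = []
    for i, match in enumerate(matches):
        # An article runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw_text)
        body = raw_text[match.end():end].strip()
        chunks.append(LegalChunk(match.group("number"), source, body))
    return chunks


sample = """Article 1240
Tout fait quelconque de l'homme, qui cause à autrui un dommage, oblige celui par la faute duquel il est arrivé à le réparer.

Article 1241
Chacun est responsable du dommage qu'il a causé non seulement par son fait, mais encore par sa négligence ou par son imprudence.
"""

chunks = chunk_by_article(sample, source="Code civil")
for chunk in chunks:
    print(chunk.article_number, "|", chunk.text[:40])
```

Because each chunk boundary coincides with an article heading, the article number travels with the text as metadata, which is what later makes article-level citations possible. Very long articles would still need a secondary split (for example by paragraph), which this sketch does not cover.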
Agentic RAG Workflow
Instead of using a simple one-step vector search, I built a LangGraph-based agentic RAG workflow.
The workflow included:
🔹 User query understanding
🔹 Legal intent detection
🔹 Legal domain and scope identification
🔹 Generation of 2 to 5 targeted legal search queries
🔹 Retrieval of relevant chunks for each query
🔹 Deduplication of repeated results
🔹 Reranking of retrieved legal evidence
🔹 Source-grounded answer generation
This improved tested retrieval accuracy from around 50% to 95%+.
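The multi-query retrieval, deduplication, and reranking steps listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the LangGraph implementation: the `retrieve_and_rerank` function, the chunk shape (`id`, `text`), the toy corpus, and the keyword-overlap scorer are all hypothetical stand-ins for the real vector search and reranker model.

```python
from typing import Callable

Chunk = dict  # assumed shape: {"id": str, "text": str}


def retrieve_and_rerank(
    question: str,
    queries: list[str],
    search: Callable[[str], list[Chunk]],
    score: Callable[[str, Chunk], float],
    top_k: int = 3,
) -> list[Chunk]:
    """Run every sub-query, drop duplicate chunks, then rerank against the question."""
    seen: dict[str, Chunk] = {}
    for query in queries:
        for chunk in search(query):
            seen.setdefault(chunk["id"], chunk)  # keep the first occurrence only
    ranked = sorted(seen.values(), key=lambda c: score(question, c), reverse=True)
    return ranked[:top_k]


# Toy stand-ins for the vector store and the reranker (illustrative only).
CORPUS = [
    {"id": "cc-1240", "text": "responsabilité faute dommage réparation"},
    {"id": "cc-1241", "text": "responsabilité négligence imprudence dommage"},
    {"id": "ct-L1231-1", "text": "contrat de travail rupture licenciement"},
]


def keyword_search(query: str) -> list[Chunk]:
    terms = set(query.lower().split())
    return [c for c in CORPUS if terms & set(c["text"].split())]


def overlap_score(question: str, chunk: Chunk) -> float:
    return len(set(question.lower().split()) & set(chunk["text"].split()))


question = "responsabilité pour négligence et dommage"
queries = ["responsabilité dommage", "négligence imprudence"]
results = retrieve_and_rerank(question, queries, keyword_search, overlap_score)
print([c["id"] for c in results])
```

Here the chunk `cc-1241` is returned by both sub-queries but appears only once in the final list, and the reranking step scores candidates against the original question rather than against the individual sub-queries.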
Retrieval, Citations & Case Law
The retrieval system was designed to make answers transparent and verifiable.
I implemented:
🔹 Vector search for semantic legal retrieval
🔹 Reranking to improve relevance
🔹 Metadata-based source traceability
🔹 Citation-backed answer generation
🔹 Article-level legal references
🔹 Typesense-based retrieval for French judicial cases
🔹 Supporting case law returned with legal answers
This allowed lawyers to verify the exact legal source behind each generated response.
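One way citation-backed generation can work is to number the retrieved sources in the prompt and then check that every citation in the model's answer points at a source that was actually provided. The sketch below assumes that approach; the `build_grounded_prompt` and `invalid_citations` helpers, the prompt wording, and the `[n]` citation format are hypothetical choices for illustration, and the source texts are shortened placeholders.

```python
import re


def build_grounded_prompt(question: str, sources: list[dict]) -> str:
    """Number each source so the model can cite it as [1], [2], ..."""
    lines = ["Answer using only the sources below. Cite them as [n]."]
    for n, src in enumerate(sources, start=1):
        lines.append(f"[{n}] {src['reference']}: {src['text']}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


def invalid_citations(answer: str, n_sources: int) -> list[int]:
    """Return cited numbers that do not match any provided source."""
    cited = {int(m) for m in re.findall(r"\[(\d+)\]", answer)}
    return sorted(n for n in cited if not 1 <= n <= n_sources)


sources = [
    {"reference": "Code civil, art. 1240", "text": "(article text placeholder)"},
    {"reference": "Code civil, art. 1241", "text": "(article text placeholder)"},
]

prompt = build_grounded_prompt("Who is liable for negligent damage?", sources)
print(prompt)

# A made-up model answer citing a source number that was never provided.
print(invalid_citations("Liability follows from [1] and [3].", len(sources)))
```

An answer that cites an unknown source number can be rejected or regenerated before it reaches the user, which is one simple guard against fabricated references.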
Open-Source LLM & Cloud Deployment
I evaluated and deployed open-source LLM infrastructure for private legal AI usage.
The deployment included:
🔹 Qwen2.5:14B for French legal reasoning
🔹 Ollama and vLLM for model serving
🔹 Embedding and reranker models on a private Azure GPU VM
🔹 NVIDIA T4 16GB GPU optimization
🔹 Python/FastAPI backend APIs
🔹 Secure Azure deployment in the France region
🔹 Multi-tenant isolated access
🔹 GitHub CI/CD and Linux server management
The system was designed for privacy, reliability, and regulatory compliance.
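A back-of-the-envelope calculation shows why a 14B-parameter model needs optimization to run on a 16 GB GPU. The sketch below estimates memory for the weights only, using a nominal 14 billion parameters; the 4-bit figure is an assumption used for illustration, since the quantization actually used is not stated here, and the estimate ignores the KV cache, activations, and any embedding or reranker models sharing the card.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3


N_PARAMS = 14e9   # nominal parameter count for a "14B" model (assumption)
GPU_MEMORY_GIB = 16  # NVIDIA T4

fp16 = weight_memory_gib(N_PARAMS, 16)  # roughly 26 GiB
int4 = weight_memory_gib(N_PARAMS, 4)   # roughly 6.5 GiB

for label, size in [("fp16", fp16), ("4-bit", int4)]:
    verdict = "fits" if size < GPU_MEMORY_GIB else "does not fit"
    print(f"{label}: {size:.1f} GiB of weights, {verdict} in {GPU_MEMORY_GIB} GiB")
```

Under these assumptions, half-precision weights alone exceed the card's memory, while a 4-bit quantized variant leaves headroom for the KV cache and the other models.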
Results
🔹 Built a production-ready legal AI assistant for lawyers
🔹 Improved retrieval accuracy from ~50% to 95%+ in tested scenarios
🔹 Reduced hallucinations through citation-grounded generation
🔹 Enabled lawyers to verify answers using article and case references
🔹 Created a scalable legal data pipeline for thousands of documents
🔹 Deployed private open-source LLM infrastructure for legal compliance
🔹 Delivered a strong foundation for future legal AI workflows
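A retrieval accuracy figure like the one above is typically computed over a labeled test set. The exact metric used is not specified here, so the sketch below assumes a common choice, hit rate at k: the share of test questions for which at least one expected article appears in the top k retrieved chunks. The `hit_rate_at_k` helper and all question and article IDs are made-up toy data, not the project's evaluation set.

```python
def hit_rate_at_k(
    retrieved: dict[str, list[str]],
    expected: dict[str, set[str]],
    k: int,
) -> float:
    """Share of questions with at least one expected ID in the top-k results."""
    hits = sum(
        1 for question, ids in retrieved.items() if expected[question] & set(ids[:k])
    )
    return hits / len(retrieved)


# Toy data: ranked chunk IDs returned per question, and the IDs a lawyer
# would consider correct for that question.
retrieved = {
    "q1": ["cc-1240", "cc-1241", "cc-1242"],
    "q2": ["ct-L1231-1", "cc-1240", "cc-1103"],
    "q3": ["cc-1103", "cc-1104", "cc-1240"],
    "q4": ["cc-9", "cc-16", "cc-1241"],
}
expected = {
    "q1": {"cc-1241"},
    "q2": {"ct-L1231-1"},
    "q3": {"cc-1240"},
    "q4": {"cc-1382"},
}

print(hit_rate_at_k(retrieved, expected, k=3))
print(hit_rate_at_k(retrieved, expected, k=1))
```

Reporting the metric together with k and the size of the test set makes a before-and-after comparison easier to interpret.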