AI-Powered Legal Research Platform for Semantic Case Search
The network for creativity
Join 1.25M professional creatives like you
Connect with clients, get discovered, and run your business 100% commission-free
Creatives on Contra have earned over $150M and we are just getting started
Project title
Caselaw AI — Semantic Legal Research Platform
One-line hook
AI-powered semantic search engine over 5.4M US legal cases, with 30M+ vector embeddings spanning 58 jurisdictions and 350+ years of case law.
Role
Solo engineer, responsible for architecture, ingestion pipeline, vector database, backend API, frontend, and deployment.
What I built
A legal research platform that lets you search millions of court cases by meaning instead of keywords. Traditional legal search matches keywords against case names and citations. Caselaw AI lets a lawyer type "cases where a company was held liable for an employee's off-duty conduct" and get relevant precedent back, even if none of those exact words appear in the case text.
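To illustrate what searching "by meaning" looks like mechanically, here is a minimal sketch of ranking cases by cosine similarity between embedding vectors. The three-dimensional vectors, the case IDs, and the function names are hypothetical and exist only for illustration; in the real platform the vectors come from an embedding model and the nearest-neighbour lookup is handled by the vector database rather than a Python loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_meaning(query_vec, case_vecs, top_k=3):
    """Return the top_k (case_id, score) pairs, most similar first."""
    scored = [
        (case_id, cosine_similarity(query_vec, vec))
        for case_id, vec in case_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy vectors: real embeddings have hundreds or thousands of dimensions.
TOY_CASES = {
    "employer_liability": [0.9, 0.1, 0.0],
    "contract_dispute": [0.1, 0.9, 0.1],
    "property_boundary": [0.0, 0.2, 0.9],
}
TOY_QUERY = [0.8, 0.2, 0.1]  # stands in for the embedded user question

if __name__ == "__main__":
    for case_id, score in rank_by_meaning(TOY_QUERY, TOY_CASES):
        print(f"{case_id}: {score:.3f}")
```

The query and the best-matching case share no literal words here. They are close only because their vectors point in a similar direction, which is the property that lets semantic search find precedent that keyword matching would miss.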
End to end:
Ingestion pipeline over 81GB of raw case data (1,000 parquet files from the Caselaw Access Project)
30M+ vector embeddings generated via OpenAI, stored in a 45GB+ Qdrant database
6.3GB SQLite index for metadata and full-text search
FastAPI backend with hybrid search (semantic + keyword + metadata filters)
React/TypeScript frontend with filters by jurisdiction, court, date, and case type
PDF export, saved cases, research notes
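The hybrid search step above combines a semantic ranking, a keyword ranking, and metadata filters. The post does not say how the results are merged, so the sketch below assumes reciprocal rank fusion (RRF), one common technique for this. The names CaseMeta, reciprocal_rank_fusion, and hybrid_search are hypothetical, and the two input rankings stand in for results that would come from the vector database and the full-text index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseMeta:
    """Hypothetical metadata record for one case."""
    case_id: str
    jurisdiction: str
    year: int


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of case IDs into one list.

    Each list contributes 1 / (k + rank) per case; k=60 is a commonly
    used default, assumed here rather than taken from the project.
    """
    scores = {}
    for ranking in rankings:
        for rank, case_id in enumerate(ranking, start=1):
            scores[case_id] = scores.get(case_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by case ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def hybrid_search(semantic_ranking, keyword_ranking, metadata,
                  jurisdiction=None, year_from=None, year_to=None):
    """Fuse the two rankings, then keep only cases that pass the filters."""
    def passes(case_id):
        meta = metadata.get(case_id)
        if meta is None:
            return False
        if jurisdiction is not None and meta.jurisdiction != jurisdiction:
            return False
        if year_from is not None and meta.year < year_from:
            return False
        if year_to is not None and meta.year > year_to:
            return False
        return True

    fused = reciprocal_rank_fusion([semantic_ranking, keyword_ranking])
    return [case_id for case_id in fused if passes(case_id)]


if __name__ == "__main__":
    metadata = {
        "a": CaseMeta("a", "cal", 1995),
        "b": CaseMeta("b", "ny", 2001),
        "c": CaseMeta("c", "cal", 1970),
        "d": CaseMeta("d", "cal", 2010),
    }
    print(hybrid_search(["a", "b", "c"], ["b", "d", "a"], metadata,
                        jurisdiction="cal"))
```

This sketch filters after fusion to keep the logic easy to follow. At the scale described here it is usually cheaper to push filters such as jurisdiction and date into the vector and full-text queries themselves, so that filtered-out cases never occupy result slots.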
Scale
5.4M legal cases
30M+ vector embeddings
58 US jurisdictions, 3,064 courts
Case law from 1662 to 2020
150GB total infrastructure footprint
Stack
Python, FastAPI, Qdrant, OpenAI embeddings, SQLite, React, TypeScript, Vite, Tailwind