Knowledge Graph RAG: 6-Stage Retrieval on Neo4j

Sergiu Nicoara

A 6-stage knowledge graph retrieval pipeline on Neo4j that combines vector search, lexical search, graph traversal, and neural re-scoring into a single grounded generation system. Classical RAG retrieves flat chunks with no understanding of relationships, no multi-hop reasoning, and no way to connect evidence across documents. This system does all three.

Retrieval pipeline

Stage 1-2: Vector ANN (3072d cosine) + BM25 lexical search
Stage 3: RRF fusion combining both ranked lists
Stage 4: Cross-encoder reranking (ms-marco-MiniLM-L-6-v2)
Stage 5: Depth-2 multi-hop graph traversal on Neo4j
Stage 6: GAT GNN re-scoring with query-adaptive α/β weights
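
As an illustration of Stage 3, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of chunk ids. The function name `rrf_fuse`, the chunk ids, and the constant `k=60` (a commonly used default) are assumptions for the example; the project's actual parameters are not stated here.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse several ranked lists of ids with Reciprocal Rank Fusion.

    Each id receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed default, not a value
    taken from the project.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of the vector and BM25 stages.
vector_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]
fused = rrf_fuse([vector_hits, bm25_hits])
print(fused)
```

Because RRF works on ranks rather than raw scores, it does not need the cosine and BM25 scores to be on a comparable scale.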

Knowledge base

The knowledge base is formally modeled with OWL ontology enforcement and OWL-RL reasoning. It features bitemporal provenance, 4-stage entity resolution (exact → fuzzy → embedding cosine ≥ 0.92), contradiction detection across 5 types, and a document authority hierarchy with SUPERSEDES chains.
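
The entity resolution cascade can be sketched roughly as follows. Only the three steps named above are covered (exact, fuzzy, embedding cosine ≥ 0.92). The fuzzy threshold of 0.90, the use of `difflib.SequenceMatcher`, the `resolve` function, and the toy 2-d vectors are all assumptions for illustration, not the project's implementation.

```python
import math
from difflib import SequenceMatcher

COSINE_THRESHOLD = 0.92  # stated in the text above
FUZZY_THRESHOLD = 0.90   # assumption: not given in the text


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def fuzzy(a, b):
    """String similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a, b).ratio()


def resolve(name, vector, known):
    """Map a mention to a canonical entity, trying cheap checks first.

    known: dict of canonical name -> embedding (hypothetical store).
    Returns (canonical_name, stage), or (None, "new") if nothing matches.
    """
    norm = name.strip().lower()
    # Step 1: exact match on the normalized string.
    for canonical in known:
        if canonical.strip().lower() == norm:
            return canonical, "exact"
    # Step 2: fuzzy string match against every known name.
    best = max(known, key=lambda c: fuzzy(norm, c.lower()), default=None)
    if best is not None and fuzzy(norm, best.lower()) >= FUZZY_THRESHOLD:
        return best, "fuzzy"
    # Step 3: embedding similarity with the 0.92 cosine cutoff.
    best = max(known, key=lambda c: cosine(vector, known[c]), default=None)
    if best is not None and cosine(vector, known[best]) >= COSINE_THRESHOLD:
        return best, "embedding"
    return None, "new"


# Toy 2-d embeddings; the real system uses 3072-d vectors.
known = {"Federal Aviation Administration": [1.0, 0.0], "EASA": [0.0, 1.0]}
print(resolve("Federal Aviation Administraton", [0.5, 0.5], known))
print(resolve("FAA", [0.99, 0.05], known))
```

Ordering the checks from cheapest to most expensive means the embedding comparison only runs for mentions that string matching cannot settle.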

Agentic fallback

An IRCoT dual-LLM fallback (8B routing + 70B synthesis) triggers automatically on hedge signals or zero-citation responses, replacing hallucinated answers with an explicit "insufficient context" refusal.
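
A minimal sketch of the trigger logic, under stated assumptions: the hedge phrase list, the `[n]` citation format, the refusal wording, and the `fallback` callable (standing in for the IRCoT path) are all hypothetical. The sketch also assumes that the refusal is returned when the fallback answer itself fails the same checks.

```python
import re

# Assumption: a small illustrative list, not the project's actual signals.
HEDGE_PHRASES = ("i'm not sure", "it is possible that", "might be", "cannot confirm")
# Assumption: citations appear as bracketed numbers such as [1].
CITATION_PATTERN = re.compile(r"\[\d+\]")
REFUSAL = "Insufficient context to answer this question."


def needs_fallback(answer):
    """True if the answer hedges or cites no source at all."""
    text = answer.lower()
    has_hedge = any(phrase in text for phrase in HEDGE_PHRASES)
    has_citation = CITATION_PATTERN.search(answer) is not None
    return has_hedge or not has_citation


def answer_with_fallback(question, answer, fallback):
    """Return the first answer if it is grounded, else try the fallback.

    fallback: callable taking the question and returning a new answer
    or None (a stand-in for the agentic path). If the retry is still
    ungrounded, an explicit refusal is returned instead.
    """
    if not needs_fallback(answer):
        return answer
    retry = fallback(question)
    if retry is not None and not needs_fallback(retry):
        return retry
    return REFUSAL


print(answer_with_fallback("q", "It might be 5 years.", lambda q: None))
```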

Results on real aerospace regulatory data

Faithfulness: 0.937
Context precision: 0.907
Context recall: 0.867
All measured with RAGAS, not eyeballed.
Stack: Python, FastAPI, Neo4j, LangChain, Redis, RabbitMQ, Docker, GCP.