Advanced RAG System with PDF Parsing

Dragutin Oreški

Data Scientist

ML Engineer

Data Engineer

LangChain

Ollama

Python

I designed and implemented a retrieval-augmented generation (RAG) system aimed at improving retrieval accuracy and efficiency. The project involved parsing PDFs to extract and structure their content, creating embeddings, and storing them in a vector database for retrieval. Techniques used include Rewrite-Retrieve-Read, HyDE (hypothetical document embeddings), a query router, and hierarchical indexing. Together, these methods improved the system’s retrieval accuracy and speed.
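To make the chunk, embed, store, and retrieve flow concrete, here is a minimal sketch in plain Python. It is not the project's actual code: the bag-of-words `embed` function stands in for a real embedding model, `VectorStore` is a hypothetical in-memory stand-in for the vector database, and `generate` in `hyde_search` is a placeholder for an LLM call (for example, a model served through Ollama). The sample chunks are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: term counts. Assumption: a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def chunk_text(text, size=8):
    """Split extracted PDF text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


class VectorStore:
    """Hypothetical in-memory store; stands in for a real vector database."""

    def __init__(self):
        self.items = []  # list of (embedding, chunk text) pairs

    def add(self, chunk):
        self.items.append((embed(chunk), chunk))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def hyde_search(store, question, generate, k=1):
    """HyDE: retrieve with the embedding of a hypothetical answer, not the raw question.

    `generate` is any callable mapping a question to draft answer text
    (an assumption here; in practice this would be an LLM call).
    """
    hypothetical_answer = generate(question)
    return store.search(hypothetical_answer, k)


# Usage with invented sample chunks.
store = VectorStore()
for chunk in [
    "Invoices are due within thirty days of receipt.",
    "The warranty covers battery defects for two years.",
    "Support is available by email on weekdays.",
]:
    store.add(chunk)

top = store.search("How long does the warranty cover the battery?")
```

The HyDE variant only changes what gets embedded at query time, so it can reuse the same store. A query router or hierarchical index would sit in front of `search` and choose which store, or which level of summaries, to query.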


Posted Aug 19, 2024

Generalized approach of parsing PDFs and creating a system to chat with them
