As organizations race to build customized LLMs, a common question I'm asked is: what tools are out there to streamline the process?
In this article, I show you how to build a fully functional application for conversing with a chatbot built on top of your documents. The application uses ChatGPT/GPT-4 (or any other large language model) to extract information from document text stored as embeddings in a vector database, and LangChain for prompt chaining. Here’s a preview:
First, create a new folder named `app` to hold the application's source code; this is the entry point for the Streamlit application. Inside it, create folders for the different tasks: extracting text from the PDF, creating text embeddings, storing the embeddings, and finally, chatting. The `app` directory looks like this:
App Directory Structure | Skanda Vivek
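A rough sketch of that layout is below; the specific file and folder names are my own placeholders rather than a prescribed structure:

```
app/
├── chatbot_app.py     # Streamlit entry point (hypothetical name)
├── pdf_parser/        # extract text from the uploaded PDF
├── embeddings/        # create text embeddings
├── vector_store/      # store and query embeddings
└── chat/              # prompt chaining and chat logic
```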
PDF Upload
Upload a PDF and extract text for further processing.
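A minimal sketch of this step, assuming Streamlit's `st.file_uploader` for the upload and the `pypdf` package for text extraction; the variable names are illustrative:

```python
# Upload a PDF in the browser and pull its text into a single string.
import streamlit as st
from pypdf import PdfReader

st.title("Chat with your PDF")

uploaded_file = st.file_uploader("Upload a PDF", type="pdf")

if uploaded_file is not None:
    reader = PdfReader(uploaded_file)
    # Concatenate the text of every page for downstream chunking/embedding
    raw_text = "\n".join(page.extract_text() or "" for page in reader.pages)
    st.success(f"Extracted text from {len(reader.pages)} pages.")
```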
A Streamlit app that lets users interact with the uploaded PDF: you ask questions about the document, and the chatbot answers them.
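A hedged sketch of that question-answering loop, assuming an OpenAI API key is configured, a classic (pre-0.1) `langchain` install, a local FAISS index, and the `raw_text` string from the extraction step above; the chain and vector-store choices here are assumptions, not necessarily what your app must use:

```python
# Chunk the extracted text, embed it, and answer questions over the chunks.
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import OpenAI

# Split the extracted text into overlapping chunks for embedding
splitter = CharacterTextSplitter(separator="\n", chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(raw_text)

# Embed the chunks and index them in an in-memory FAISS vector store
store = FAISS.from_texts(chunks, OpenAIEmbeddings())

# "Stuff" the most relevant chunks into a single prompt for the LLM
chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")

query = st.text_input("Ask a question about your PDF")
if query:
    docs = store.similarity_search(query)
    answer = chain.run(input_documents=docs, question=query)
    st.write(answer)
```

The `stuff` chain type simply concatenates the retrieved chunks into one prompt, which is fine for a handful of short passages; for longer documents, a `map_reduce` chain or a retriever-based chain tends to scale better.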