An open-source Python retrieval-augmented generation (RAG) app. Documents (PDF/DOCX/MD/TXT) are loaded, split into paragraph-aware overlapping chunks, embedded, and stored in a ChromaDB vector store. At query time the question is embedded and the closest passages are retrieved via cosine top-K search. They are injected into a system prompt that tells the model to answer only from those passages and to refuse when they do not contain the answer, and the answer is streamed back with its source chunks shown. LLM providers (Groq or OpenAI) and embedding providers (local sentence-transformers or OpenAI) are pluggable and selected via .env. The app ships with both a Streamlit UI and a CLI. An evaluation harness measures retrieval hit-rate and answer accuracy on a labelled Q/A set.
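To make the chunking and retrieval steps concrete, here is a minimal sketch in plain Python. It is not the project's actual code: the function names `chunk_paragraphs` and `top_k` and the parameters `max_chars`, `overlap`, and `k` are hypothetical, and in the app itself the vectors would come from the configured embedding provider and the search would be handled by ChromaDB.

```python
import math


def chunk_paragraphs(text, max_chars=500, overlap=1):
    """Group blank-line-separated paragraphs into chunks.

    A chunk is closed when adding the next paragraph would push it past
    max_chars. The last `overlap` paragraphs of a closed chunk are repeated
    at the start of the next one. Assumption: paragraphs are never split,
    so a chunk can exceed max_chars if a single paragraph (plus the
    carried-over overlap) is too long.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], []
    for para in paragraphs:
        candidate = current + [para]
        if current and len("\n\n".join(candidate)) > max_chars:
            chunks.append("\n\n".join(current))
            # Carry the tail of the closed chunk forward as overlap.
            current = current[-overlap:] if overlap > 0 else []
            candidate = current + [para]
        current = candidate
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=3):
    """Return (chunk_index, score) pairs for the k most similar chunks."""
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, score) for score, i in scored[:k]]
```

The indices returned by `top_k` point back into the chunk list, which is how retrieved passages can be shown next to the answer as sources. A brute-force scan like this is only for illustration; a vector store replaces it with an index.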