FlockParse is a fully local, AI-powered document intelligence platform that:
✅ Extracts text from PDFs with multiple methods (PyPDF2 and pdftotext)
✅ Converts PDFs to multiple formats (TXT, Markdown, DOCX)
✅ Uses Ollama embeddings (mxbai-embed-large) for semantic search
✅ Enables AI-powered chat with your document knowledge base using llama3.1
✅ Works entirely offline with no data sent to external servers
✅ Preserves original document names in all converted files

sudo apt-get install poppler-utils
brew install poppler
sudo apt-get install tesseract-ocr
brew install tesseract

http://0.0.0.0:8000 by default with the following endpoints:
/upload/ (POST) - Upload and process a PDF file
/summarize/{file_name} (GET) - Get an AI-generated summary of a document
/search/?query=your_query (GET) - Search for relevant documents

ps aux | grep ollama
sudo apt-get install tesseract-ocr (Linux)
pip install ocrmypdf
ocrmypdf input.pdf output.pdf
-layout option with pdftotext manually:
lsof -i :8000
pip install fastapi uvicorn

/converted_files - Stores the converted document formats (flockparsecli.py)
/knowledge_base - Contains the vector database and document chunks (flockparsecli.py)
/uploads - Temporary storage for uploaded documents (flock_ai_api.py)
/chroma_db - ChromaDB vector database (flock_ai_api.py)

Posted Apr 9, 2025
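The two GET endpoints listed above (/summarize/{file_name} and /search/?query=your_query) can be called from a small client. The following is a minimal sketch, not the project's own client code. It assumes the server is reachable at 127.0.0.1:8000 (0.0.0.0 is a bind address, so a client usually connects via loopback) and that responses are JSON. The helper names summarize_url, search_url, and get_json are hypothetical and exist only for this illustration.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: the API is reachable on the loopback address. 0.0.0.0 is the
# bind address from the text; clients normally connect to 127.0.0.1 instead.
BASE_URL = "http://127.0.0.1:8000"


def summarize_url(file_name: str, base_url: str = BASE_URL) -> str:
    """Build the GET URL for /summarize/{file_name}, escaping the file name."""
    return f"{base_url}/summarize/{quote(file_name, safe='')}"


def search_url(query: str, base_url: str = BASE_URL) -> str:
    """Build the GET URL for /search/?query=..., URL-encoding the query text."""
    return f"{base_url}/search/?{urlencode({'query': query})}"


def get_json(url: str, timeout: float = 30.0):
    """Fetch a URL and decode the body as JSON (the response format is assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the URLs; call get_json(...) once the API is running.
    print(search_url("quarterly revenue"))
    print(summarize_url("annual report.pdf"))
```

Escaping the file name and query keeps spaces and slashes from breaking the request path. The /upload/ endpoint is left out because the source does not state the expected form field name.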
FlockParser is an AI-powered PDF parser that extracts, processes, and organizes text from PDFs. It allows users to have extremely high-level conversations
Feb 14, 2025 - Ongoing