Open WebUI + RAG: self-hosted AI assistant with Knowledge Base
Zehael M
I'll deploy Open WebUI on your server: ChatGPT for your team with RAG knowledge base
I'll set up Open WebUI — the leading open-source ChatGPT alternative (80,000+ GitHub stars) — on your own server. Your team gets a familiar chat interface connected to Claude, GPT-4o, Gemini, or free local models via Ollama. The killer feature: RAG (Retrieval-Augmented Generation) — upload your company docs, policies, and knowledge base, and AI answers questions based on your actual data. Everything stays on your server — no documents ever leave your infrastructure.

What you get

ChatGPT-like interface for your entire team — familiar UX, zero learning curve
RAG system — AI answers based on your documents: policies, manuals, knowledge bases, FAQs
Multi-provider — Claude, GPT-4o, Gemini, DeepSeek, Mistral, or free local models via Ollama
Multi-user — roles (admin/user), access control, chat history per user
Document upload — PDF, DOCX, TXT, Markdown, CSV — AI parses and indexes automatically
Full privacy — documents and conversations never leave your server
Web search — AI can pull live information from the internet when needed

What's included

A complete, production-ready Open WebUI setup:
Open WebUI installed via Docker on your VPS
Nginx reverse proxy + free SSL (Let's Encrypt)
Firewall (UFW): unnecessary ports closed
1 AI provider configured (OpenAI, Claude, or Gemini)
Up to 3 user accounts
Basic UI customization (logo, welcome message)
Setup guide included
You'll have a working, secured AI chat platform — ready for your team on day one.
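For reference, the base install above boils down to a few commands. This is a sketch only: the `docker run` flags follow the upstream Open WebUI quick-start, and the host port, volume name, and UFW rules are assumptions you would adapt to your server.

```shell
# Run Open WebUI (sketch; flags follow the upstream quick-start).
# Bound to localhost only -- Nginx sits in front and terminates TLS.
# The named volume keeps chats, users, and uploaded documents across updates.
docker run -d --name open-webui --restart always \
  -p 127.0.0.1:3000:8080 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main

# UFW: allow only SSH and HTTPS, deny everything else by default
sudo ufw default deny incoming
sudo ufw allow OpenSSH
sudo ufw allow 443/tcp
sudo ufw enable
```

Binding to 127.0.0.1 (instead of the default public bind) is what keeps the raw container port off the internet; only Nginx on 443 is reachable from outside.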

Optional add-ons (priced per request)

Need RAG, local models, or enterprise features? Let's discuss your setup — I'll give you a custom quote based on scope:
RAG & Knowledge Base
Full RAG pipeline setup: embedding model, chunking strategy, retrieval optimization
Document upload and indexing (50+ documents)
Multiple themed knowledge bases (e.g., "Policies," "Product," "HR") with access control
Custom system prompts tailored to your business workflows
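To make the "chunking strategy and retrieval" part concrete, here is a typical starting point for the RAG settings, expressed as container environment variables. Treat it as a sketch: the variable names match recent Open WebUI releases, but names, defaults, and the embedding model shown here should be verified against the Open WebUI docs for your version.

```shell
# .env fragment for the Open WebUI container (verify names against your release)
RAG_EMBEDDING_ENGINE=ollama          # or "openai"; empty = built-in sentence-transformers
RAG_EMBEDDING_MODEL=nomic-embed-text # embedding model served by the chosen engine
CHUNK_SIZE=1000                      # characters per chunk; too large = vague retrieval
CHUNK_OVERLAP=100                    # overlap preserves context across chunk boundaries
RAG_TOP_K=5                          # number of chunks retrieved per query
```

Tuning these three knobs (chunk size, overlap, top-k) against your actual documents is most of what "retrieval optimization" means in practice.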
Local Models (Ollama)
Ollama installation with Llama 3, Mistral, Qwen — no API costs
GPU optimization (CUDA) for maximum inference speed
Model selection consulting — right model for your use case and hardware
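The Ollama side is a short install plus a model pull. The install script and model tags below are from upstream Ollama; the specific model is an example you would swap for whatever fits your hardware (see the sizing table further down).

```shell
# Install Ollama via the official install script
curl -fsSL https://ollama.com/install.sh | sh

# Pull a quantized 8B model (roughly 5 GB on disk; fits the 8+ GB RAM tier)
ollama pull llama3.1:8b

# Quick smoke test from the command line
ollama run llama3.1:8b "Say hello"
```

Once Ollama is running, Open WebUI discovers it as a local provider, so your team picks local and API models from the same dropdown.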
Enterprise & Access Control
Unlimited users with RBAC (groups, document-level permissions)
SSO integration (OAuth2 / LDAP / SAML)
S3-compatible storage for documents
Web search integration for real-time data
Custom Tools & Automation
Custom Tools/Functions for your business processes
API integrations with internal systems
Automated workflows and scheduled tasks
Maintenance & Support
Automated database and document backups
Usage monitoring (token consumption, user activity)
Full configuration documentation
Team onboarding session (30–60 min video call)
Extended post-setup support (7 / 14 / 30 days)
Monthly maintenance retainer
Just message me with what you need — I'll scope it out and send a quote within a few hours.

What I need from you

A VPS with Ubuntu/Debian (minimum 2 GB RAM, 2 CPU cores). No server yet? I'll help you choose one (from ~$5/mo)
SSH access (root or sudo)
Domain pointed to your server IP (optional, for SSL)
API key for your chosen AI provider (OpenAI, Claude, or Gemini)
For RAG and enterprise add-ons:
Documents for the knowledge base (PDF, DOCX, TXT, MD)
Description of knowledge base structure (topics, access levels)
User list (name, email, role)
For add-ons I may need additional access — we'll discuss specifics after you reach out.

Delivery time

1–2 business days for the base install. RAG pipeline and enterprise add-ons may take 2–4 days — confirmed after scoping.
FAQs
Setting up a "ChatGPT for your company" sounds simple until it isn't:
RAG is not plug-and-play — embedding models, chunking strategies, retrieval pipelines, and reranking all need to be configured correctly, or your AI gives useless answers
Default Open WebUI installs are wide open — no SSL, default credentials, exposed ports. Your confidential documents become public
Running local models (Ollama) is tricky — GPU passthrough, quantization levels, VRAM management, model selection — wrong choices mean slow or broken inference
Without proper RBAC, everyone sees everything — one user's financial docs visible to the entire team
Document processing pipelines break silently — PDFs fail to parse, chunks are too large, embeddings mismatch — you get confident wrong answers
I handle all of this — so your team gets accurate, private AI from day one.
ChatGPT Team costs $25/user/month. A team of 10 = $3,000/year. Open WebUI on your server: VPS ~$10–20/month + API costs only for what you use. First-year savings: up to $2,700+.

|                            | ChatGPT Team   | Open WebUI (my service)  |
|----------------------------|----------------|--------------------------|
| Annual cost (10 users)     | $3,000         | From $60 (once) + VPS    |
| Document privacy           | OpenAI servers | Your server only         |
| RAG on your docs           | Limited        | Full control             |
| Local models (no API cost) | No             | Ollama (Llama, Mistral)  |
| Customization              | Minimal        | Unlimited                |
| Year 1 savings             | —              | Up to $2,700             |

Your documents, conversations, and business data never leave your server. Period.
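As a back-of-envelope check on the savings figure, assuming a ~$15/mo VPS (the midpoint of the ~$10–20 range quoted) and excluding pay-as-you-go API usage:

```shell
# Year-1 cost comparison for a 10-person team (VPS price is an assumption)
users=10
chatgpt_per_user_month=25
chatgpt_annual=$(( users * chatgpt_per_user_month * 12 ))

setup_fee=60
vps_month=15
selfhosted_year1=$(( setup_fee + vps_month * 12 ))

echo "ChatGPT Team, year 1: \$${chatgpt_annual}"
echo "Self-hosted, year 1:  \$${selfhosted_year1} (plus API usage)"
echo "Difference:           \$$(( chatgpt_annual - selfhosted_year1 ))"
```

That difference lands around $2,760, which is where the "up to $2,700+" figure comes from; actual savings depend on your API usage.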
| Setup                       | RAM    | CPU      | Storage | Recommended VPS          |
|-----------------------------|--------|----------|---------|--------------------------|
| Base install (API only)     | 2 GB   | 2 cores  | 20 GB   | Hetzner CX22 (~$5/mo)    |
| RAG + documents             | 4 GB   | 2 cores  | 40 GB   | Hetzner CPX31 (~$11/mo)  |
| RAG + Ollama (local models) | 8+ GB  | 4 cores  | 60 GB   | Hetzner CPX41 (~$18/mo)  |
| Ollama + GPU                | 16+ GB | 4+ cores | 100 GB  | GPU VPS (from ~$40/mo)   |
Starting at $60
Duration: 2 days
Tags
AI-assistant
OpenWebUI
RAG
self-hosted AI
Service provided by
Zehael M
Amsterdam, Netherlands