Open WebUI + RAG: self-hosted AI assistant with Knowledge Base
Zehael M
I'll deploy Open WebUI on your server: ChatGPT for your team with RAG knowledge base
I'll set up Open WebUI — the leading open-source ChatGPT alternative (80,000+ GitHub stars) — on your own server. Your team gets a familiar chat interface connected to Claude, GPT-4o, Gemini, or free local models via Ollama. The killer feature: RAG (Retrieval-Augmented Generation) — upload your company docs, policies, and knowledge base, and AI answers questions based on your actual data. Everything stays on your server — no documents ever leave your infrastructure.

What you get

ChatGPT-like interface for your entire team — familiar UX, zero learning curve
RAG system — AI answers based on your documents: policies, manuals, knowledge bases, FAQs
Multi-provider — Claude, GPT-4o, Gemini, DeepSeek, Mistral, or free local models via Ollama
Multi-user — roles (admin/user), access control, chat history per user
Document upload — PDF, DOCX, TXT, Markdown, CSV — AI parses and indexes automatically
Full privacy — documents and conversations never leave your server
Web search — AI can pull live information from the internet when needed

What's included

A complete, production-ready Open WebUI setup:
Open WebUI installed via Docker on your VPS
Nginx reverse proxy + free SSL (Let's Encrypt)
Firewall (UFW): unnecessary ports closed
1 AI provider configured (OpenAI, Claude, or Gemini)
Up to 3 user accounts
Basic UI customization (logo, welcome message)
Setup guide included
You'll have a working, secured AI chat platform — ready for your team on day one.
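For reference, the base install above boils down to a few commands. This is a sketch only: the `docker run` flags follow the upstream Open WebUI quick-start, and the host port, volume name, and UFW rules are assumptions you would adapt to your server.

```shell
# Run Open WebUI (sketch; flags follow the upstream quick-start).
# Bound to localhost only -- Nginx sits in front and terminates TLS.
# The named volume keeps chats, users, and uploaded documents across updates.
docker run -d --name open-webui --restart always \
  -p 127.0.0.1:3000:8080 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main

# UFW: allow only SSH and HTTPS, deny everything else by default
sudo ufw default deny incoming
sudo ufw allow OpenSSH
sudo ufw allow 443/tcp
sudo ufw enable
```

Binding to 127.0.0.1 (instead of the default public bind) is what keeps the raw container port off the internet; only Nginx on 443 is reachable from outside.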

Optional add-ons (priced per request)

Need RAG, local models, or enterprise features? Let's discuss your setup — I'll give you a custom quote based on scope:
RAG & Knowledge Base
Full RAG pipeline setup: embedding model, chunking strategy, retrieval optimization
Document upload and indexing (50+ documents)
Multiple themed knowledge bases (e.g., "Policies," "Product," "HR") with access control
Custom system prompts tailored to your business workflows
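To make the "chunking strategy and retrieval" part concrete, here is a typical starting point for the RAG settings, expressed as container environment variables. Treat it as a sketch: the variable names match recent Open WebUI releases, but names, defaults, and the embedding model shown here should be verified against the Open WebUI docs for your version.

```shell
# .env fragment for the Open WebUI container (verify names against your release)
RAG_EMBEDDING_ENGINE=ollama          # or "openai"; empty = built-in sentence-transformers
RAG_EMBEDDING_MODEL=nomic-embed-text # embedding model served by the chosen engine
CHUNK_SIZE=1000                      # characters per chunk; too large = vague retrieval
CHUNK_OVERLAP=100                    # overlap preserves context across chunk boundaries
RAG_TOP_K=5                          # number of chunks retrieved per query
```

Tuning these three knobs (chunk size, overlap, top-k) against your actual documents is most of what "retrieval optimization" means in practice.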
Local Models (Ollama)
Ollama installation with Llama 3, Mistral, Qwen — no API costs
GPU optimization (CUDA) for maximum inference speed
Model selection consulting — right model for your use case and hardware
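The Ollama side is a short install plus a model pull. The install script and model tags below are from upstream Ollama; the specific model is an example you would swap for whatever fits your hardware (see the sizing table further down).

```shell
# Install Ollama via the official install script
curl -fsSL https://ollama.com/install.sh | sh

# Pull a quantized 8B model (roughly 5 GB on disk; fits the 8+ GB RAM tier)
ollama pull llama3.1:8b

# Quick smoke test from the command line
ollama run llama3.1:8b "Say hello"
```

Once Ollama is running, Open WebUI discovers it as a local provider, so your team picks local and API models from the same dropdown.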
Enterprise & Access Control
Unlimited users with RBAC (groups, document-level permissions)
SSO integration (OAuth2 / LDAP / SAML)
S3-compatible storage for documents
Web search integration for real-time data
Custom Tools & Automation
Custom Tools/Functions for your business processes
API integrations with internal systems
Automated workflows and scheduled tasks
Maintenance & Support
Automated database and document backups
Usage monitoring (token consumption, user activity)
Full configuration documentation
Team onboarding session (30–60 min video call)
Extended post-setup support (7 / 14 / 30 days)
Monthly maintenance retainer
Just message me with what you need — I'll scope it out and send a quote within a few hours.

What I need from you

A VPS with Ubuntu/Debian (minimum 2 GB RAM, 2 CPU cores). No server yet? I'll help you choose one (from ~$5/mo)
SSH access (root or sudo)
Domain pointed to your server IP (optional, for SSL)
API key for your chosen AI provider (OpenAI, Claude, or Gemini)
For RAG and enterprise add-ons:
Documents for the knowledge base (PDF, DOCX, TXT, MD)
Description of knowledge base structure (topics, access levels)
User list (name, email, role)
For add-ons I may need additional access — we'll discuss specifics after you reach out.

Delivery time

1–2 business days for the base install. RAG pipeline and enterprise add-ons may take 2–4 days — confirmed after scoping.
FAQs
Setting up a "ChatGPT for your company" sounds simple until it isn't:
RAG is not plug-and-play — embedding models, chunking strategies, retrieval pipelines, and reranking all need to be configured correctly, or your AI gives useless answers
Default Open WebUI installs are wide open — no SSL, default credentials, exposed ports. Your confidential documents become public
Running local models (Ollama) is tricky — GPU passthrough, quantization levels, VRAM management, model selection — wrong choices mean slow or broken inference
Without proper RBAC, everyone sees everything — one user's financial docs visible to the entire team
Document processing pipelines break silently — PDFs fail to parse, chunks are too large, embeddings mismatch — you get confident wrong answers
I handle all of this — so your team gets accurate, private AI from day one.
ChatGPT Team costs $25/user/month. A team of 10 = $3,000/year. Open WebUI on your server: VPS ~$10–20/month + API costs only for what you use. First-year savings: up to $2,700+.

|                            | ChatGPT Team   | Open WebUI (my service)  |
|----------------------------|----------------|--------------------------|
| Annual cost (10 users)     | $3,000         | From $60 (once) + VPS    |
| Document privacy           | OpenAI servers | Your server only         |
| RAG on your docs           | Limited        | Full control             |
| Local models (no API cost) | No             | Ollama (Llama, Mistral)  |
| Customization              | Minimal        | Unlimited                |
| Year 1 savings             | —              | Up to $2,700             |

Your documents, conversations, and business data never leave your server. Period.
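As a back-of-envelope check on the savings figure, assuming a ~$15/mo VPS (the midpoint of the ~$10–20 range quoted) and excluding pay-as-you-go API usage:

```shell
# Year-1 cost comparison for a 10-person team (VPS price is an assumption)
users=10
chatgpt_per_user_month=25
chatgpt_annual=$(( users * chatgpt_per_user_month * 12 ))

setup_fee=60
vps_month=15
selfhosted_year1=$(( setup_fee + vps_month * 12 ))

echo "ChatGPT Team, year 1: \$${chatgpt_annual}"
echo "Self-hosted, year 1:  \$${selfhosted_year1} (plus API usage)"
echo "Difference:           \$$(( chatgpt_annual - selfhosted_year1 ))"
```

That difference lands around $2,760, which is where the "up to $2,700+" figure comes from; actual savings depend on your API usage.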
| Setup                       | RAM    | CPU      | Storage | Recommended VPS          |
|-----------------------------|--------|----------|---------|--------------------------|
| Base install (API only)     | 2 GB   | 2 cores  | 20 GB   | Hetzner CX22 (~$5/mo)    |
| RAG + documents             | 4 GB   | 2 cores  | 40 GB   | Hetzner CPX31 (~$11/mo)  |
| RAG + Ollama (local models) | 8+ GB  | 4 cores  | 60 GB   | Hetzner CPX41 (~$18/mo)  |
| Ollama + GPU                | 16+ GB | 4+ cores | 100 GB  | GPU VPS (from ~$40/mo)   |
Starting at $60
Duration: 2 days
Tags
AI-assistant
OpenWebUI
RAG
self-hosted AI
Service provided by
Zehael M
Amsterdam, Netherlands