AI Video Highlights: Clips, Captions, and AI Chat by Rachid Og

AI Video Highlights: Clips, captions, and AI chat grounded in your video


PROJECT OVERVIEW:

A subscription SaaS that ingests YouTube or direct video URLs, runs an AI
pipeline to find highlight moments, renders real MP4 clips with optional
burned-in captions, and lets users chat with an AI assistant grounded in that
video’s transcript and highlights. Credits and billing run through Clerk, and
the system is split between a serverless app and a background worker.

ROLE:

Full-stack product engineer (architecture, UI, API, worker pipeline, DevOps)

PROBLEM:

Long-form video is everywhere; short-form feeds are where reach lives. Editors
waste hours finding cuts, writing blurbs, and re-watching to remember what was
said. There was no single place to auto-detect highlights, ship real clips,
caption them, ask grounded questions about the content, and meter usage for a
SaaS business model.

SOLUTION:

One dashboard: paste a URL and a background worker pipeline downloads the video
(up to 1080p), transcribes it (existing captions or Whisper), analyzes audio
energy, has GPT propose highlights, and exports MP4s with FFmpeg (stream copy
when possible). Users preview clips, toggle burned-in captions, and chat with an
assistant fed only that video’s transcript and clip text (timestamps in answers,
history in Postgres). Clerk handles auth and billing; credits gate heavy
actions; webhooks keep plans and balances aligned with the database.
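
As a rough illustration of the grounding step, the sketch below shows one way transcript segments and clip summaries could be flattened into a single system prompt with M:SS timestamps. The data shapes and the names `format_timestamp` and `build_system_prompt` are assumptions made for this example, not the project's actual code; the real chat endpoint runs on Hono and OpenAI, so this Python version only illustrates the logic.

```python
def format_timestamp(seconds):
    """Render a position in seconds as M:SS (minutes are not capped at 59)."""
    total = int(seconds)
    return f"{total // 60}:{total % 60:02d}"


def build_system_prompt(transcript, clips):
    """Assemble grounding context for the assistant.

    transcript: list of (start_seconds, text) tuples   (hypothetical shape)
    clips: list of dicts with 'start', 'end', 'summary' (hypothetical shape)
    """
    lines = [
        "Answer only from the transcript and highlights below.",
        "Cite moments as [M:SS].",
        "",
        "TRANSCRIPT:",
    ]
    for start, text in transcript:
        lines.append(f"[{format_timestamp(start)}] {text}")

    lines.append("")
    lines.append("HIGHLIGHTS:")
    for clip in clips:
        span = f"[{format_timestamp(clip['start'])}-{format_timestamp(clip['end'])}]"
        lines.append(f"{span} {clip['summary']}")

    return "\n".join(lines)
```

Because the prompt contains nothing but this one video's text, the assistant has no other source to draw on, and every line it can quote already carries a timestamp.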

KEY FEATURES (SHIP LIST)😍

• Auth & monetization: Clerk (sessions + Billing UI); subscriptionItem.*
webhooks → plan/credit upserts in Postgres; usage debits on import / pipeline.
• Async video pipeline: Celery worker runs yt-dlp (DASH ≤1080p) → captions or
faster-whisper → NumPy energy on audio → GPT highlight JSON → FFmpeg MP4
cuts (-c copy where possible); job rows drive a live stage UI on the video page.
• Grounded video chat: Hono + OpenAI; system context built from transcript +
clip summaries; answers cite [M:SS] timestamps; chat_history JSON persisted;
clear-thread API.
• Product surface: dashboard exports (docx/xlsx), token-based dark UI
(shadcn/ui + internal design system), landing + pricing pages driven from
shared plan config.
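
The "NumPy energy on audio" stage in the pipeline above can be pictured as a short-time RMS computation. The sketch below is a minimal version under stated assumptions: mono samples already decoded into an array, non-overlapping one-second windows, and loudness as the only signal. The function names and the window length are hypothetical; the write-up does not say how the project's worker actually scores audio.

```python
import numpy as np


def window_rms(samples, sample_rate, window_s=1.0):
    """Return RMS energy for consecutive non-overlapping windows of mono audio."""
    window = int(sample_rate * window_s)
    if window <= 0:
        raise ValueError("window must cover at least one sample")
    n_windows = len(samples) // window
    if n_windows == 0:
        return np.zeros(0)
    # Drop the trailing partial window, then reshape to (n_windows, window).
    frames = np.asarray(samples[: n_windows * window], dtype=np.float64)
    frames = frames.reshape(n_windows, window)
    return np.sqrt(np.mean(frames ** 2, axis=1))


def loudest_windows(energy, window_s=1.0, top_k=3):
    """Return start times (seconds) of the top_k highest-energy windows, loudest first."""
    order = np.argsort(energy)[::-1][:top_k]
    return [float(i * window_s) for i in order]
```

Start times like these could be passed to the highlight model alongside the transcript as hints about where the audio peaks, which is one plausible reason to compute energy before asking GPT for highlights.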

TECH STACK🔥

- Frontend: Next.js (App Router), React 19, Tailwind CSS, shadcn/ui,
TanStack Query, custom design-system components.
- API: Hono on Next route handlers, Zod validation, Clerk server auth.
- Data: PostgreSQL (Supabase), Drizzle ORM; Redis for Celery broker.
- Workers: Python, Celery, yt-dlp, FFmpeg, faster-whisper, OpenAI API,
NumPy (audio energy), Supabase Storage for production clips.
- Billing: Clerk Billing + Svix webhooks; plan → credits in Postgres.
- Deploy: Vercel (app), Fly.io (Dockerized worker), managed Redis.
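
To make the FFmpeg part of the worker concrete, here is a hedged sketch of how a clip command might be assembled: stream copy for plain cuts, and a single re-encode when captions are burned in (video filters cannot be combined with `-c copy`). The function name, the libx264/CRF 18 settings, and the assumption that the subtitle file is timed relative to the clip start are all illustrative choices, not details taken from the project.

```python
import subprocess


def build_cut_command(src, dst, start_s, end_s, subtitles_path=None):
    """Build an ffmpeg argument list that cuts [start_s, end_s) out of src.

    Without subtitles the streams are copied (fast, lossless, but the cut
    snaps to the nearest keyframe). With subtitles the video is re-encoded
    once so the captions can be burned in.
    """
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")

    cmd = [
        "ffmpeg", "-y",
        "-ss", f"{start_s:.3f}",            # seek on the input before decoding
        "-i", src,
        "-t", f"{end_s - start_s:.3f}",     # clip duration
    ]
    if subtitles_path is None:
        cmd += ["-c", "copy"]
    else:
        # Assumption: subtitles_path is timed from 0 at the clip start and
        # contains no characters that need escaping in a filter string.
        cmd += [
            "-vf", f"subtitles={subtitles_path}",
            "-c:v", "libx264", "-crf", "18", "-preset", "medium",
            "-c:a", "aac",
        ]
    cmd.append(dst)
    return cmd


def run_cut(cmd):
    """Run the command and raise CalledProcessError if ffmpeg fails."""
    subprocess.run(cmd, check=True)
```

Returning the argument list instead of running it directly keeps the command easy to log and to unit-test without FFmpeg installed.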

CHALLENGES & DECISIONS:

1. Webhook / billing correctness: Clerk emits plan data on subscriptionItem.*
events, and the user id lives under payer, not at the top level. Handlers
deep-scan the payload for the id and upsert credits, so they still work when a
subscription event arrives before user.created.
2. Video quality: pre-muxed YouTube MP4 caps at ~720p, so the pipeline requests
separate video and audio streams up to 1080p. Caption export re-encodes once
at high quality; plain cuts prefer -c copy.
3. Auth vs static assets: the public hero video and /assets/* must bypass Clerk
middleware so embeds and thumbnails are not redirected to sign-in.
4. YouTube embed UX: the iframe shows letterboxing and player chrome; mitigated
with URL params + CSS scale/crop inside a fixed hero frame.
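
The webhook decision in item 1 can be sketched as a recursive search plus an upsert. The example below is only an illustration of that logic in Python: the project's handlers live in the TypeScript API and write to Postgres, and the payload field names (`user_id`, `plan.slug`), the plan slugs, and the credit amounts here are assumptions, not Clerk's documented schema. An in-memory dict stands in for the credits table.

```python
from typing import Any, Optional

# Hypothetical plan slugs and credit grants, for illustration only.
PLAN_CREDITS = {"free": 10, "pro": 100}


def find_user_id(node: Any) -> Optional[str]:
    """Depth-first search for a non-empty 'user_id' string anywhere in the payload."""
    if isinstance(node, dict):
        value = node.get("user_id")
        if isinstance(value, str) and value:
            return value
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_user_id(child)
        if found:
            return found
    return None


def apply_subscription_event(event: dict, credits_table: dict) -> Optional[str]:
    """Upsert plan and credits for the user named in the event.

    The row is created if it does not exist yet, so the event is safe to
    process even when it arrives before the user-creation event.
    """
    data = event.get("data", {})
    user_id = find_user_id(data)
    if user_id is None:
        return None
    plan_obj = data.get("plan") if isinstance(data, dict) else None
    plan = plan_obj.get("slug", "free") if isinstance(plan_obj, dict) else "free"
    row = credits_table.setdefault(user_id, {"plan": "free", "credits": 0})
    row["plan"] = plan
    row["credits"] = PLAN_CREDITS.get(plan, 0)
    return user_id
```

In a real database the same idea maps to an insert-or-update keyed on the user id, which makes the handler order-independent.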

OUTCOME:

A coherent end-to-end SaaS: sign-up → paywall-ready credits → process video
→ inspect highlights and clips → ask the AI about the content (with persisted
chat) → export. The codebase separates concerns (UI, API, worker, storage)
so each layer can scale or be swapped (e.g. different object storage or
queue) without rewriting the product surface.

LINKS:

Live Demo Here
What the client had to say

Rachid has strong SaaS expertise, delivers clean UI/UX, maintains great communication, and completed everything within the deadline.

Abdullah ALmehmadi عبدالله المحمادي

Mar 18, 2026, Client

Posted Apr 8, 2026

Developed a SaaS to create video highlights and AI chat features.