AI-Powered Video Content Generator with Browser-based Processing

Wladimir Filho

🧩 The Challenge

Content creators spend hours manually writing optimized titles, descriptions, and other text for their videos on platforms like YouTube.
This called for a solution capable of:
Processing video files without uploading them to a server, for privacy and speed.
Converting videos to audio format for transcription.
Generating accurate transcriptions from audio content.
Creating platform-optimized content using customizable AI prompts.
Providing real-time feedback during processing stages.

✨ The Solution

I architected a full-stack application that processes videos entirely in the browser using FFmpeg WebAssembly, then leverages OpenAI's APIs for intelligent content generation.

šŸ› ļø Deep Dive

The frontend converts MP4 videos to compact MP3 audio entirely client-side, using FFmpeg compiled to WebAssembly. Because the original video file is never uploaded, this approach protects user privacy and reduces server load.
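A minimal sketch of how the in-browser conversion step might look, assuming an ffmpeg.wasm 0.12-style API (`writeFile`, `exec`, `readFile`). The `FfmpegLike` interface, the file names, and the 20 kbit/s default bitrate are illustrative assumptions, not details taken from the project.

```typescript
// Minimal subset of an FFmpeg WebAssembly instance that this sketch relies on.
// Assumption: modeled on the ffmpeg.wasm 0.12 API; the real object has more methods.
interface FfmpegLike {
  writeFile(path: string, data: Uint8Array): Promise<unknown>;
  exec(args: string[]): Promise<unknown>;
  readFile(path: string): Promise<Uint8Array | string>;
}

// Build the FFmpeg CLI arguments that extract the audio track as MP3.
// The default bitrate is a hypothetical choice for speech-only audio.
function buildMp3Args(input: string, output: string, bitrateKbps = 20): string[] {
  return [
    "-i", input,               // read the source video
    "-map", "0:a",             // keep only the audio stream(s)
    "-b:a", `${bitrateKbps}k`, // target audio bitrate
    "-acodec", "libmp3lame",   // encode as MP3
    output,
  ];
}

// Convert a video held in memory to MP3 bytes without leaving the browser.
async function convertVideoToMp3(ffmpeg: FfmpegLike, video: Uint8Array): Promise<Uint8Array> {
  await ffmpeg.writeFile("input.mp4", video);
  await ffmpeg.exec(buildMp3Args("input.mp4", "output.mp3"));
  const data = await ffmpeg.readFile("output.mp3");
  if (typeof data === "string") {
    throw new Error("Expected binary output from FFmpeg");
  }
  return data;
}
```

Keeping the argument list in a pure function makes the encoding settings easy to adjust and to check without loading the WebAssembly core.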
Once conversion finishes, the MP3 is securely uploaded to the Fastify backend, where OpenAI Whisper generates the transcription. Users can then apply customizable prompt templates, which inject the transcription into GPT-3.5 Turbo requests.
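The template step can be pictured as simple placeholder substitution. The sketch below assumes a `{transcription}` placeholder and a `renderPrompt` helper; both names, and the sample template, are hypothetical and chosen only for illustration.

```typescript
// Replace every occurrence of the placeholder with the transcription text.
function renderPrompt(
  template: string,
  transcription: string,
  placeholder = "{transcription}",
): string {
  // split/join avoids regex escaping and special "$" replacement patterns.
  return template.split(placeholder).join(transcription.trim());
}

// Hypothetical template in the spirit of a pre-built YouTube title prompt.
const titleTemplate = [
  "Suggest three short, catchy titles for a YouTube video.",
  "Base them on the transcription below.",
  "",
  "Transcription:",
  "'''{transcription}'''",
].join("\n");

const titlePrompt = renderPrompt(titleTemplate, "  Today we build a REST API with Fastify.  ");
console.log(titlePrompt);
```

The rendered string would then be sent as the message content of the chat completion request.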
The AI responses stream back in real time, so users see the output as it is generated. The system includes pre-built templates for YouTube titles and descriptions, and it also supports fully customizable prompts for any content generation need.
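On the client, streamed output is typically handled by appending each chunk to the text already shown. A minimal sketch, assuming the response arrives as a sequence of plain-text chunks; `createStreamAccumulator` is a hypothetical helper, not part of the project.

```typescript
// Collects streamed text chunks and reports the full text after each one,
// so the UI can re-render as soon as new content arrives.
function createStreamAccumulator(onUpdate: (fullText: string) => void) {
  let text = "";
  return {
    push(chunk: string): void {
      if (chunk.length === 0) return; // ignore empty chunks
      text += chunk;
      onUpdate(text);
    },
    current(): string {
      return text;
    },
  };
}

// Usage: in a browser the chunks would come from reading the response body
// incrementally; here they are simulated.
const updates: string[] = [];
const accumulator = createStreamAccumulator((fullText) => {
  updates.push(fullText);
});
for (const chunk of ["How to ", "", "build ", "an API"]) {
  accumulator.push(chunk);
}
console.log(accumulator.current()); // prints "How to build an API"
```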
Built with modern tech: React/TypeScript frontend with Tailwind CSS and Radix UI components, Node.js backend with Fastify for performance, Prisma ORM with SQLite for data persistence, and comprehensive Zod validation throughout.

🎉 The Outcome

The application successfully delivers a seamless content creation workflow that transforms hours of manual work into minutes of automated processing:
✓ Browser-based video processing with FFmpeg WebAssembly for instant conversions.
✓ High-accuracy transcriptions using OpenAI Whisper with custom prompt support.
✓ Real-time AI content generation with streaming responses for immediate feedback.
✓ Fully customizable prompt templates for any content type or platform.
✓ Modern, responsive UI with intuitive drag-and-drop video upload.
✓ Scalable architecture supporting temperature control and multiple AI models.

Posted Oct 10, 2025

I built a full-stack app that processes videos in-browser, generates transcriptions via OpenAI Whisper, and creates optimized content with GPT-3.5.