AI-Powered Video Transcription Service with Timestamping
Developed an AI audio processing backend in Python with FastAPI. Core features include video-to-text transcription with NVIDIA Parakeet, audio chunking for long files, and precise timestamping. Currently extending it with Text-to-Speech (TTS) synthesis using [Kokoro/NeMo TTS Model] to generate voice from text. Designed for scalability and efficient AI model serving.
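To illustrate how chunking and timestamping can fit together, here is a minimal sketch, not the project's actual code. It assumes fixed-length windows with a small overlap, and it assumes the model returns segment times relative to the start of each chunk. The names `Segment`, `plan_chunks`, and `to_global`, and the 30-second and 2-second defaults, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span; times are in seconds (hypothetical structure)."""
    text: str
    start: float
    end: float


def plan_chunks(duration, chunk_len=30.0, overlap=2.0):
    """Return (start, end) windows that cover [0, duration] with overlap.

    The window length and overlap are illustrative defaults, not values
    taken from the project.
    """
    if chunk_len <= overlap:
        raise ValueError("chunk_len must be greater than overlap")
    windows = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_len, duration)
        windows.append((start, end))
        if end >= duration:
            break
        # Step back by the overlap so words cut at a boundary appear whole
        # in the next window.
        start = end - overlap
    return windows


def to_global(segments, chunk_start):
    """Shift chunk-relative segment times onto the full file's timeline."""
    return [
        Segment(s.text, s.start + chunk_start, s.end + chunk_start)
        for s in segments
    ]
```

In a real pipeline each window would be cut from the extracted audio track, transcribed on its own, and its segments passed through `to_global` with that window's start time. Segments that fall inside the overlap would still need de-duplicating, which this sketch leaves out.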
Posted May 22, 2025
Developed an AI audio processing backend with video-to-text transcription powered by NVIDIA Parakeet (FastAPI, Python), paired with a React frontend.