AI Powered Speech Transcription and Insights Generation

Eswar T

Fullstack Engineer
Software Engineer
AI Developer
Flask
Node.js
Python
ISRO
Government of India
Indian Army
Vakya AI, Speech Separation App & Bhasya Setu
Introduction.
1.       Vakya AI is an offline Automatic Speech Recognition (ASR) tool that combines transcription with structured insights generation and an integrated chatbot. Users can reopen transcripts from a "Recent Files" page, query them interactively, and generate actionable insights. Presented as the first tool of its kind, Vakya AI does all of this in a secure, offline environment, a combination not available in any commercial product.
2.       Bhasya Setu is an innovative language translation application designed to facilitate seamless communication between speakers of different languages. It supports real-time translation in continuous and dual-speaker modes, ensuring accurate and efficient language conversion. While currently functional as an online tool, development of its offline version is underway to enhance accessibility and data security for sensitive environments.
3.       The Speech Separation App complements Vakya AI by segregating mixed speech into individual speaker voices, offering precise, time-stamped outputs for multi-speaker recordings. Together, these tools redefine speech processing for secure and operationally critical environments.
Key Projects.
4.       Vakya AI (First-of-its-kind ASR Tool).
(a)      Purpose.      An advanced ASR tool for offline transcription, insights generation, and interactive transcript exploration.
(b)      Features:
(i)       Utilizes advanced AI algorithms for precise voice recognition and transcription.
(ii)      Supports transcription in multiple languages, including English and Hindi, catering to diverse user needs.
(iii)      Allows users to query and explore transcripts interactively via an integrated chatbot.
(iv)     Offers audio editing tools, translation services, and text-to-speech capabilities for enhanced functionality.
(c)      Impact.        Redefines transcription and speech processing by offering unparalleled insights and usability, ensuring secure and efficient workflows.
5.       Speech Separation App (Speaker Segregation Tool).
(a)      Purpose.      An offline tool for separating mixed speech into individual speaker voices with high precision.
(b)      Features:
(i)       Identifies and segregates individual speakers in mixed, multi-speaker recordings.
(ii)      Operates offline, ensuring data confidentiality.
(c)      Impact.        Enables clear, actionable outputs from complex audio recordings and conferences, improving intelligence analysis.
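The time-stamped, per-speaker output described for the Speech Separation App can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Segment` type, the speaker labels, and both helper functions are hypothetical and do not reflect the app's actual data model or separation method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of speech attributed to a single speaker (hypothetical type)."""
    speaker: str   # illustrative label, e.g. "Speaker 1"
    start: float   # seconds from the start of the recording
    end: float     # seconds from the start of the recording


def group_by_speaker(segments):
    """Return {speaker: [(start, end), ...]}, each list ordered by start time."""
    grouped = defaultdict(list)
    for seg in sorted(segments, key=lambda s: s.start):
        grouped[seg.speaker].append((seg.start, seg.end))
    return dict(grouped)


def format_timestamp(seconds):
    """Format a time in seconds as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, rem_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(rem_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"
```

Grouping segments this way gives each speaker an ordered timeline, which is one plausible shape for the "individual speaker voices" output; the separation of the audio itself is outside the scope of this sketch.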
6.       Bhasya Setu.
(a)      Purpose.      Enables real-time language translation for smooth communication across diverse linguistic barriers.
(b)      Features:
(i)       Supports Continuous Mode, translating an ongoing conversation between two speakers.
(ii)      Provides Dual-Speaker Mode, allowing controlled speaking durations for two participants to avoid overlap and ensure clarity.
(iii)      Supports more than 100 languages, including Mandarin, for diverse user needs.
(iv)     Features an intuitive design with adjustable parameters like speaker labels and speaking durations.
(v)      Displays transcripts and plays translations in real time for an interactive experience.
(c)      Impact.        Streamlines multilingual communication, making it highly useful for operational briefings, training sessions, and cross-cultural interactions.
(d)      Offline Functionality.       The offline version, currently in development, will enable secure deployment in environments where internet access is restricted or unavailable.
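The Dual-Speaker Mode described above, with its adjustable speaker labels and speaking durations, can be sketched as a simple turn schedule. This is a minimal illustration, assuming fixed alternating windows; the `Turn` type and `build_turn_schedule` function are hypothetical names and are not taken from Bhasya Setu itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """A single speaking window for one participant (hypothetical type)."""
    speaker: str
    start: float  # seconds since the session began
    end: float    # seconds since the session began


def build_turn_schedule(labels, durations, rounds):
    """Alternate between two speakers, giving each a fixed speaking window.

    labels    -- two speaker labels, e.g. ("Speaker A", "Speaker B")
    durations -- speaking duration in seconds for each speaker
    rounds    -- number of times the pair of turns repeats
    """
    if len(labels) != 2 or len(durations) != 2:
        raise ValueError("dual-speaker mode expects two labels and two durations")
    if any(d <= 0 for d in durations):
        raise ValueError("speaking durations must be positive")

    schedule = []
    clock = 0.0
    for _ in range(rounds):
        for label, duration in zip(labels, durations):
            # Windows are back to back, so the two speakers never overlap.
            schedule.append(Turn(label, clock, clock + duration))
            clock += duration
    return schedule
```

Because each window starts exactly where the previous one ends, the schedule avoids overlap by construction, which is the property the mode is described as providing.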
 
 