Coqui - AI Voice Studio by Samuel Faseun

Coqui - AI Voice Studio

Samuel Faseun

Coqui Studio: A new way to do voice-overs, a better way.

Overview

Coqui Studio is an AI tool that lets game developers create expressive, high-quality voice dialogue using text-to-speech technology. Built as a web-based studio, it streamlines audio content creation without the need for traditional voice actors. I served as the Senior Product/Web Designer on this project, using GitHub, Figma, FigJam, and several Google Workspace applications to lead the team's design work from concept through implementation.

The Problem

Game developers often face high costs, time constraints, and creative bottlenecks when working with voice actors. Coqui Studio aimed to solve this by offering an AI-powered alternative that produces human-like, customizable speech. The tool needed to allow full speech editing and voice synthesis, be easy to navigate and responsive, and boost adoption by simplifying workflows. Key success metrics included usability, adoption rate, and the volume of speech generated.

Research & Discovery

We began with user interviews involving game developers and conducted a thorough competitive analysis of existing voice tools. The research revealed that only large studios could afford traditional voice actors, and even then, the process was often slow and stressful. Many developers recorded their own dialogue, which compromised quality. Existing tools didn’t support large-scale, expressive dialogue generation. These insights led to the creation of detailed user personas and journey maps that captured pain points and guided our planning.

Ideation & Strategy

Working closely with product and engineering teams, we defined the core architecture of the studio. We outlined four essential modules: Script Management, Casting/Character Management, Scene Management, and Audio Management. I created detailed user flows, information architecture, and initial wireframes to represent each module. Our design strategy emphasized clarity, modularity, and creative control, ensuring the studio felt powerful yet approachable.

Design Execution

The design of Coqui Studio was shaped to make advanced voice dialogue creation intuitive and accessible for game developers. The interface employed a modular layout that mirrored real production workflows, with clearly organized access to tools like the Script Manager, Voice Manager, and Audio Manager. Dialogue was structured per character, with controls such as speech rate and language selection placed alongside each line to minimize clicks and cognitive load.

A timeline-based editing approach allowed users to visually sequence and adjust dialogue, echoing familiar audio production tools but within a streamlined browser interface. Throughout the design process, static prototypes were tested with game developers, leading to key improvements. One major evolution was merging characters and voices into a unified Voice Manager, which simplified the assignment process. Script parsing was refined to auto-assign voices based on character metadata, and a multi-take system was introduced to let users compare and choose the best audio delivery for each line.

Accessibility remained a guiding principle. Color contrast, typography, and layout were all optimized for legibility and ease of use. For advanced users, the editor provided control over pitch, duration, and energy down to the phoneme level, giving them both flexibility and precision when fine-tuning dialogue.
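
As a rough illustration of the script parsing and multi-take ideas described in this section, here is a minimal Python sketch. It is not Coqui Studio's actual implementation: the "CHARACTER: line" script format, the voice IDs, and every name in it (VOICE_BY_CHARACTER, DialogueLine, parse_script, add_take, select_take) are assumptions made up for the example.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical character metadata: character name -> voice ID.
# Illustrative values only, not Coqui Studio's real data model.
VOICE_BY_CHARACTER = {
    "MIRA": "voice_warm_alto",
    "GUARD": "voice_gruff_bass",
}
DEFAULT_VOICE = "voice_narrator"

# Assumed script format: "Character: spoken text", one line of dialogue per row.
LINE_PATTERN = re.compile(r"^\s*([A-Za-z][\w ]*?)\s*:\s*(.+)$")


@dataclass
class DialogueLine:
    character: str
    text: str
    voice_id: str
    takes: List[str] = field(default_factory=list)  # references to rendered takes
    selected_take: Optional[int] = None             # index into takes, if chosen


def parse_script(script: str) -> List[DialogueLine]:
    """Split a script into dialogue lines and auto-assign a voice to each."""
    lines = []
    for raw in script.splitlines():
        match = LINE_PATTERN.match(raw)
        if not match:
            continue  # skip blank lines and anything not shaped like dialogue
        character = match.group(1).strip().upper()
        # Fall back to a default voice when the character has no metadata.
        voice_id = VOICE_BY_CHARACTER.get(character, DEFAULT_VOICE)
        lines.append(DialogueLine(character, match.group(2).strip(), voice_id))
    return lines


def add_take(line: DialogueLine, take_ref: str) -> int:
    """Attach another rendered take to a line and return its index."""
    line.takes.append(take_ref)
    return len(line.takes) - 1


def select_take(line: DialogueLine, index: int) -> None:
    """Mark one of the existing takes as the chosen delivery."""
    if not 0 <= index < len(line.takes):
        raise IndexError("no such take for this line")
    line.selected_take = index
```

In this sketch, a line spoken by a character without metadata falls back to a default voice instead of failing, so a whole script can be parsed in one pass and corrected afterwards. Keeping every take on the line and storing only the index of the chosen one means a user could switch between deliveries without regenerating audio.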

Testing & Validation

To ensure the studio's usability, a beta testing phase was conducted with selected game developers, many of whom had participated in the earlier user research, alongside extensive internal testing across product, design, and engineering. While testers appreciated the modular interface, they initially found it challenging to understand the flow from script upload to audio generation.

This feedback led to several critical improvements: a step-by-step onboarding tutorial was introduced to guide users through their first project; the Prompt-to-Performance feature was refined for better emotional accuracy; and a multi-take preview system was added for easier voice comparison. Additionally, collaboration tools like in-line comments, user permissions, and project sharing were prioritized. UI clarity was also enhanced through improved layout, iconography, and control management, making the studio more intuitive and team-friendly.

Results & Impact

The outcome of the Coqui Studio project was a fully functional, all-in-one web application that allowed game developers to generate, manage, and fine-tune realistic, emotionally expressive AI speech for their characters without needing voice actors or complex software. The studio successfully bridged the gap between technical innovation and user-centric design, empowering indie developers and small studios to bring their game narratives to life. It achieved a 60% user adoption rate in its first month and reduced the time needed to create dialogue by 30%. We saw a 95% success rate in onboarding completion, and thousands of audio takes were created within weeks of launch.

A user remarked, "This tool cuts out weeks of back-and-forth with voice actors. It feels like I finally have a speech synthesis studio in my browser."

Reflection

This project reinforced the importance of designing for real-world complexity without compromising clarity. Working with AI-powered tools meant balancing technical possibilities with user expectations. I learned how critical it is to collaborate closely with engineering and product early on to shape feasible yet ambitious solutions, and how deeply user insight can influence both product direction and feature scope.

Posted May 29, 2025

AI-powered voiceover tool that helps game developers create expressive dialogue without traditional voice actors.

Timeline

Mar 29, 2023 - Aug 29, 2023

Clients

Coqui