Play 3.0 mini: A real-time text-to-speech model
A lightweight, cost-efficient, multilingual text-to-speech model built for real-time conversational AI
< 130ms latency
30+ languages
Accurate voice cloning
On-prem deployments supported
Play 3.0 mini is the fastest generative text-to-speech AI model on the market. With state-of-the-art voice cloning and ultra-realistic voices across 30+ languages, Play 3.0 mini is a reliable, enterprise-grade model that can be used via API or deployed in your own cloud.
See Play 3.0 mini in action
Hold real conversations with real people in real time
For Sales and AI SDRs
Scale sales operations with cloned voices that can carry out full conversations or drop messages.
For Customer Support
Serve customers 24/7/365 with empathetic voices they’ll love talking to.
For Gaming
Power your characters and AI agents with expressive conversational voices.
Voices for Podcasts and more
Create entire AI Podcasts or edit existing ones using your AI Cloned Voice.
Model capabilities
Read the full model release post
Industry-leading latency with <130ms TTFB
Play 3.0 mini is heavily optimized for low latency, and its smaller footprint makes it far more cost-efficient than competing models. Host it yourself if you need it even faster.
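TTFB (time to first byte) is the delay between sending a request and receiving the first chunk of audio. If you want to check it in your own setup, a minimal sketch might look like the following. It assumes the audio arrives as an iterable of byte chunks; `measure_ttfb_ms` and `fake_tts_stream` are hypothetical names used for illustration and are not part of the Play API.

```python
import time
from typing import Callable, Iterable, Optional


def measure_ttfb_ms(open_stream: Callable[[], Iterable[bytes]]) -> Optional[float]:
    """Return milliseconds from request start to the first non-empty chunk.

    `open_stream` is any callable that starts the request and returns an
    iterable of audio byte chunks. Returns None if no audio ever arrives.
    """
    start = time.perf_counter()
    for chunk in open_stream():
        if chunk:  # skip empty keep-alive chunks
            return (time.perf_counter() - start) * 1000.0
    return None


def fake_tts_stream() -> Iterable[bytes]:
    """Hypothetical stand-in for a streaming TTS response (not a real API)."""
    time.sleep(0.05)       # simulated model and network delay before first audio
    yield b"\x00" * 320    # first audio chunk
    yield b"\x00" * 320    # later chunks do not affect TTFB


if __name__ == "__main__":
    ttfb = measure_ttfb_ms(fake_tts_stream)
    print(f"TTFB: {ttfb:.1f} ms")
```

To measure a real deployment, replace `fake_tts_stream` with a function that opens your actual streaming request. Measured values will include your own network round trip, so they depend on where the client runs relative to the model.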
Superior alphanumeric accuracy and reliability
Play 3.0 mini was fine-tuned specifically on a diverse dataset of alphanumeric phrases to make it reliable for critical use cases where important information such as phone numbers, passport numbers, dates, and currencies can’t be misread.
Supports 30+ languages and accents
Despite its small size, Play 3.0 mini supports 30+ languages, with a deep bench of voices available out of the box for the most common languages.
State-of-the-Art Voice Cloning across languages and accents
Want an accurate voice clone for your application? Play 3.0 mini is the industry’s most accurate voice cloning model, and it takes as little as 30 seconds.