Dialog: The world’s first emotive, contextual model for AI conversations

Dialog is a large voice AI model best suited for narrations, synthetic briefings, podcasts and dubbing where accurate and engaging conversational tone, prosody and emotion are required.

Speaker 1: Alright so, instead of our regularly scheduled programming, there's something hairier that came across my feed last night that I just need to discuss.

Speaker 2: Wait, where is this going?

Speaker 1: I uh, I just thought we'd take a little... detour. You know, take the scenic route down the path of mystery. Specifically, into the thick, mossy woods where something like, oh I don't know… *Bigfoot* himself might be lurking.

Speaker 2: Oh, for the love of, again, Briggs? Really? We did this last month. And the month before that. This is basically the podcast equivalent of your sad karaoke go-to.

Speaker 1: What? No! I'm just... providing the people what they want! Listen, there's new evidence, and the public demands more attention on it.

Speaker 2: The "public" is just you, Briggs. You're the one emailing us suggestions under fake names. We've all seen "Biggie O Footlore" in our inbox.

< 450ms latency

Optimized for multi-turn conversation

Wide range of prosody and emotion

On-prem deployments supported

See Dialog in action

Create engaging AI dialogs, podcasts and conversations with our proprietary Contextual Tone Prediction technology, which lets the model understand each turn in a conversation and generate speech with the right prosody and emotion.

AI podcast between hosts

Generate entire AI podcasts with any voices

Conversation between characters

Create engaging contextual conversations between multiple characters

Engaging narration

Generate rich dramatic narrative content

Dramatic dialogs for a scene

Prompt and direct to generate dramatic deliveries

Model capabilities

It sounds just like a human

Dialog beta was trained on hundreds of millions of real-world conversations and is approximately ten times larger than Play 3.0 mini. It closely matches human speech in prosody (intonation and pacing), making it far harder to tell that the speech comes from an AI model.

It uses the whole conversation as context

Unlike previous generations of speech models, Dialog understands the entire conversational context and how each sentence or speaker influences speech generation.

It’s easy to code

Dialog is easy to use and is available through our API and on platforms like Fal. It also supports WebSockets and streaming from LLMs.
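As a rough illustration of preparing a multi-turn script for a conversational speech model, the sketch below splits a "Speaker N:" script into ordered turns and serializes them into a request body. It is only a sketch under stated assumptions: the function names, the "model" and "turns" fields, and the voice identifiers are all hypothetical and are not taken from the Dialog API, so check the API reference for the real request format. No network call is made.

```python
import json
import re

# Matches labels such as "Speaker 1:" and captures the speaker number.
SPEAKER_LABEL = re.compile(r"Speaker (\d+):\s*")


def split_turns(script: str) -> list[dict]:
    """Split a 'Speaker N: ...' script into ordered turns."""
    # With one capture group, re.split yields [prefix, id, text, id, text, ...].
    parts = SPEAKER_LABEL.split(script)
    turns = []
    for i in range(1, len(parts) - 1, 2):
        text = parts[i + 1].strip()
        if text:
            turns.append({"speaker": int(parts[i]), "text": text})
    return turns


def build_request(script: str, voices: dict[int, str]) -> str:
    """Serialize a HYPOTHETICAL request body; every field name is an assumption."""
    turns = split_turns(script)
    missing = {t["speaker"] for t in turns} - set(voices)
    if missing:
        raise ValueError(f"no voice assigned for speakers: {sorted(missing)}")
    body = {
        "model": "dialog",  # hypothetical model identifier
        "turns": [
            {"voice": voices[t["speaker"]], "text": t["text"]} for t in turns
        ],
    }
    return json.dumps(body)


if __name__ == "__main__":
    script = "Speaker 1: Alright so, a little detour. Speaker 2: Wait, where is this going?"
    # "voice-a" and "voice-b" are placeholder voice identifiers.
    print(build_request(script, {1: "voice-a", 2: "voice-b"}))
```

Sending the whole conversation as one ordered list of turns, rather than one sentence at a time, is what would let a context-aware model take earlier turns into account when voicing later ones.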

State-of-the-art voice cloning across languages and accents

Dialog supports zero-shot voice cloning and fine-tuning to create custom voices that are indistinguishable from the original voice.

Dialog is enterprise-ready

Dialog is exceptionally well suited for generating AI conversations, podcasts or dialogs. We work with leading content creators to get them from idea to content faster. Get in touch with us to explore how Dialog can help.