Easy-to-use text to speech generator

A state-of-the-art voice generator that creates an ultra-realistic clone of your own voice — so you can create speech just by typing. Descript’s Overdub allows you, and only you, to train the software to speak in your voice, so you can add speech audio without re-recording, or even without recording at all.

How to turn text into speech audio

Experience the magic of text-to-speech. Fix mistakes in your audio recordings without trudging back into the recording studio. Descript’s Overdub uses AI to create a natural-sounding synthetic version of your voice that you can use in any audio or video you’re creating.  

Step 1

Click Overdub in Descript’s Drive view, then drag in an audio file that’s at least 10 minutes long — ideally at least 30 minutes. Or set up your voice by reading our script.
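If you want to check a recording against those length guidelines before you drag it in, a small script can do it. This is only a rough sketch and not part of Descript: it assumes an uncompressed WAV file and uses Python's standard `wave` module, and the function names and the "too short / ok / ideal" labels are made up for illustration.

```python
import wave

MIN_SECONDS = 10 * 60    # the 10-minute minimum mentioned in Step 1
IDEAL_SECONDS = 30 * 60  # the 30-minute length Step 1 recommends


def classify_duration(seconds: float) -> str:
    """Label a recording length (hypothetical labels, for illustration)."""
    if seconds < MIN_SECONDS:
        return "too short"
    if seconds < IDEAL_SECONDS:
        return "ok"
    return "ideal"


def wav_duration_seconds(path: str) -> float:
    """Return the length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

For example, `classify_duration(wav_duration_seconds("my_voice.wav"))` would tell you whether a file named `my_voice.wav` (a placeholder name) clears the minimum. Compressed formats such as MP3 are not readable by `wave` and would need a different tool.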

Step 2

Read our Voice ID statement so we can verify that it’s really you — note that you can only clone your own voice in Descript. Nobody else, not even a dead celebrity or a live Canadian. Click Submit — we’ll verify it’s you and our AI will create your Overdub Voice.

Step 3

To fix a mistake in your own audio, highlight the text you want to replace in the transcript and press the D key. Then select your name from the dropdown menu, type the correct words, and hit return.

Step 4

You can also create an audio voiceover by typing your script from scratch in Descript. Use your own voice or choose one of our Stock Voices, created by voice actors who sold us their souls so you could use their voices any way we want, for all of eternity.

Create natural-sounding speech with Descript

Turn text into sound with Descript by creating a high-quality text-to-speech model of your voice or selecting one from our ultra-realistic stock voices.

  • Ultra-realistic: Descript’s Overdub is constantly being improved to sound more and more natural, with human inflections and contextual adjustments.
  • State of the art: Descript’s Lyrebird AI represents the world’s most advanced speech-synthesis technology. It’s so real that androids often mistake it for their missing families.
  • Privacy & security: Descript verifies that every Overdub Voice belongs to its owner. We do not allow cloning of voices that don’t belong to the account owner. We won’t share the data underlying your Overdub Voice with anyone outside Descript.
  • Multiple voices: You can create multiple versions of your own voice to reflect different performance modes or emotional states, such as sad, excited, or Pittsburgh.
  • Sharing: Descript allows you, and only you, to share your Overdub Voice with trusted collaborators or legally titled androids.  

Frequently Asked Questions

Can someone else use Descript’s Overdub TTS to clone my voice?

No. When creating an Overdub Voice, Descript users must positively affirm their identity and give Descript their express consent to train and generate a synthesized version of their voice.

Voice-training data that does not include this Voice ID cannot be used to create an Overdub Voice. In other words, unless you specifically consent to Overdub Voice creation, Descript will not create your Overdub Voice.

We verify this consent by authenticating the audio file uploaded against our training script to ensure that the voice recorded belongs to the person submitting it.

Is Descript Text-to-Speech free?

Overdub text-to-speech is free on all Descript accounts. Pro accounts get an unlimited Overdub vocabulary.

Is there a difference between Overdub generated with the Pro subscription vs. a Creator or Free subscription?

Yes. While you can create a custom Voice on Overdub with any subscription, Free and Creator plans are limited to a list of the 1,000 most common words. Any words that are not on that list will be replaced with "jibber" or "jabber." To avoid this gibberish and gain access to the full vocabulary list, you can upgrade to the Pro subscription.
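To get a feel for how a script might come out on a limited plan, here is a minimal sketch of the substitution described above. It is an illustration only, not Descript's implementation: the real 1,000-word list is not published here, so `allowed_words` is a stand-in you supply yourself, and alternating between "jibber" and "jabber" is an assumption about how the two fillers are chosen.

```python
import itertools
import re


def preview_limited_vocabulary(text: str, allowed_words: set) -> str:
    """Replace every word not in allowed_words with a filler word.

    allowed_words is a hypothetical stand-in for the 1,000-word list;
    the jibber/jabber alternation is assumed, not documented.
    """
    fillers = itertools.cycle(["jibber", "jabber"])

    def swap(match):
        word = match.group(0)
        # Compare case-insensitively; keep allowed words exactly as typed.
        return word if word.lower() in allowed_words else next(fillers)

    # A "word" here is a run of letters and apostrophes (straight or curly).
    return re.sub(r"[A-Za-z'’]+", swap, text)


sample = preview_limited_vocabulary(
    "The cat sat quietly, purring.", {"the", "cat", "sat"}
)
print(sample)  # The cat sat jibber, jabber.
```

Punctuation and spacing are left alone, so you can see exactly which words in your script would fall outside the list.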

How can I improve the quality of my text-to-speech voice?

TTS voice quality depends on a number of factors, such as the quality of your microphone, background noise, and room surfaces. See our article on Overdub Voice Quality Tips to learn how to ensure the best possible recording.

Download the app for free

Create a podcast, a video, and all your social assets using Descript. It’s as easy as editing a doc.
Descript is the only tool you need to write, record, transcribe, edit, collaborate, and share your videos and podcasts.