Make Overdub’s Speech Synthesis Even Better With These Tips

Last week, we launched Overdub, an AI voice generator that allows you to create a realistic clone of your own voice. To help you make the most of the software, we’re offering some tips to ensure a production-ready Overdub voice, with crisp fidelity, realistic intonation, and natural expressiveness.

Improve your recording conditions

Record your training data in a quiet, acoustically “dead” room, and use an external microphone rather than your computer’s built-in one.

Record more training audio

While Overdub voices can be trained with as little as 10 minutes of audio, we recommend at least 30 minutes. The more training audio you provide, up to 90 minutes, the more likely you are to end up with a production-ready voice.

Open your training data project to record one of our supplemental scripts.
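If you keep copies of your training recordings as WAV files outside Descript, a small script can tell you roughly where you stand against the 10, 30, and 90 minute marks mentioned above. This is only a sketch using Python's standard library, not a Descript feature: the function names are made up for illustration, and it assumes uncompressed WAV files sitting directly in one folder.

```python
import wave
from pathlib import Path

# Thresholds taken from the guidance above, in minutes.
MIN_MINUTES = 10
RECOMMENDED_MINUTES = 30
MAX_USEFUL_MINUTES = 90


def wav_seconds(path):
    """Return the duration of one WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def total_minutes(folder):
    """Sum the durations of every .wav file directly inside folder."""
    seconds = sum(wav_seconds(p) for p in Path(folder).glob("*.wav"))
    return seconds / 60


def assess(minutes):
    """Describe a total duration relative to the article's thresholds."""
    if minutes < MIN_MINUTES:
        return "below the 10-minute minimum"
    if minutes < RECOMMENDED_MINUTES:
        return "trainable, but below the recommended 30 minutes"
    if minutes < MAX_USEFUL_MINUTES:
        return "at or above the recommended 30 minutes"
    return "at or beyond 90 minutes"
```

For example, `assess(total_minutes("training_audio"))` would report on a folder named `training_audio` (a placeholder name).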

Experiment with Styles

Styles let you reproduce the different delivery styles found in your real audio recordings. Every Overdub is generated using a Style, and your voice comes loaded with a default one, which might not be optimal for the content you are creating. To create a new Style, select a range of real audio that’s three to twenty-five seconds long (complete sentences are recommended), right-click, and select “Save as Style.” Learn more about setting up Styles.

Play with punctuation

Periods and commas affect Overdub intonation. Add and remove them to fine-tune delivery.

“Convert to audio” to tweak timing and boundaries

Right-click on an Overdub (or hover over the clip in the timeline) to convert it to normal Descript audio. Once it’s audio, you can fine-tune the word spacing and sentence boundaries just like any other clip. Learn more about Timeline Editing.

Overdub additional words

If you’re making an editorial correction to a single word and the result sounds unnatural, undo it and try overdubbing an extra word or two on either side of the correction as well.

To change pronunciation, spell a word as it sounds

If Overdub mispronounces a word, try a different (incorrect, but phonetic) spelling. Once you’ve got it sounding right, you can always convert the Overdub to audio (see “Convert to audio” above) to correct the spelling in the transcript.
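If you find yourself respelling the same words again and again, you could keep a list of them and apply it to a script before pasting the text into Descript. The sketch below works on plain text only and does not touch Descript itself. The function name and the word list are hypothetical examples, not words Overdub is known to mispronounce.

```python
import re

# Hypothetical examples only: swap in the words your own voice gets wrong.
RESPELLINGS = {
    "cache": "cash",
    "niche": "neesh",
}


def apply_respellings(text, respellings):
    """Replace whole words with their phonetic respellings, ignoring case."""
    if not respellings:
        return text
    lookup = {word.lower(): sound for word, sound in respellings.items()}
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in lookup) + r")\b",
        re.IGNORECASE,
    )
    # Look up each matched word in lowercase so "Cache" and "cache" both match.
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)
```

Keeping the list in one place also gives you a record of which spellings to correct in the transcript after converting the Overdub to audio.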

Ready to try a realistic voice generator for yourself?

Download Descript today and give Overdub a try. We have a feeling you’ll be impressed.
