Speech to Text Converter

Descript instantly turns speech into text in real time. Just start recording and watch our AI speech recognition transcribe your voice—with 95% accuracy—into text that’s ready to edit or export.

How to automatically convert speech to text with Descript

Step 1

Start a recording session or upload voice audio

Create a project in Descript, select record, and choose your microphone input to start a recording session. Or upload a voice file to convert the audio to text.

Step 2

Talk and let the AI transcribe

As you speak into your mic, Descript’s speech-to-text software turns what you say into text in real time. Don’t worry about filler words or mistakes; Descript makes it easy to find and remove those from both the generated text and recorded audio.

Step 3

Edit and export your text

Enter Correct mode (press the C key) to edit, apply formatting, highlight sections, and leave comments on your speech-to-text transcript. Filler words are highlighted; right-click one to remove some or all instances. When ready, export your text as HTML, Markdown, plain text, a Word file, or Rich Text Format.

Download the app for free

Create a podcast, a video, and all your social assets using Descript. It’s as easy as editing a doc.

A free speech-to-text converter like no other
Fast, accurate AI transcription that learns how you talk

Extend Descript’s online voice recognition with an expandable transcription glossary that helps it recognize hard-to-transcribe words like names and jargon.

Voice to text meets video editor

Record yourself talking and turn it into text, audio, and video that’s ready to edit in Descript’s timeline. You can format, search, highlight, and do anything else you’d do in a Google Doc, while taking advantage of features like text-to-speech, captions, and more.

22+ supported languages

Go from speech to text in over 22 different languages, plus English. Transcribe audio in French, Spanish, Italian, German, and other languages from around the world. Finnish? Oh, we’re just getting started.

Questions? We have answers
Are there free speech to text converters?

Yes, basic real-time speech-to-text conversion is included for free with most modern devices (Android, Mac, etc.). Descript also offers a 95% accurate speech-to-text converter for up to 1 hour per month for free.

How does speech-to-text conversion work?

Speech-to-text conversion uses AI models trained on large quantities of diverse data to recognize the acoustic qualities of specific words, despite people’s different speech patterns and accents, and then outputs those words as text.

Can I turn text into speech with Descript?

Yes! In addition to turning speech into text, Descript’s AI-powered Overdub feature generates human-sounding speech from a script in your choice of AI stock voices.

What languages are supported by Descript’s speech to text converter?

Descript supports speech-to-text conversion in Catalan, Finnish, Lithuanian, Slovak, Croatian, French (FR), Malay, Slovenian, Czech, German, Norwegian, Spanish (US), Danish, Hungarian, Polish, Swedish, Dutch, Italian, Portuguese (BR), Turkish.

How accurate is Descript’s speech to text?

Descript’s included AI transcription offers up to 95% accurate speech-to-text generation. We also offer a white-glove, pay-per-word transcription service with 99% accuracy. Expanding your transcription glossary makes the automatic transcription more accurate over time.
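
Curious what a figure like 95% means in practice? One common way to score a transcript is word accuracy: compare it word by word against a trusted reference and count the substitutions, insertions, and deletions. The Python sketch below shows that calculation. It is an illustration only; this page does not say how Descript measures its accuracy figures, and the function names `word_errors` and `word_accuracy` are hypothetical.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Count word-level edits (substitutions, insertions, deletions)
    needed to turn the reference transcript into the hypothesis."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 minus the word error rate (assumed metric, see lead-in)."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference transcript is empty")
    return 1.0 - word_errors(reference, hypothesis) / ref_len
```

With this metric, a 10-word reference that comes back with one wrong word scores 0.9, so 95% accuracy works out to roughly one error in every 20 words.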

What is the point of this tool?
Descript is the only tool you need to write, record, transcribe, edit, collaborate, and share your videos and podcasts.
More than just a voice to text converter
Descript is an AI-powered audio and video editing tool that lets you edit podcasts and videos like a doc.
  • Subtitles & captions
    Create captioned videos and subtitle files from the transcript generated when you convert speech into text with Descript.
  • Overdub
    Type with your voice or turn what you type into your voice with AI-powered voice cloning and Overdub.
  • Publishing
    Host your speech-to-text files in Descript, including both the transcript and the original voice recording.
  • Remove filler words
    Speak as you naturally would — don’t worry about slipping up or saying “um”. Remove all filler words in one go with just a couple of clicks.
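
If you have already exported a transcript as plain text, the same idea can be approximated outside the app. The Python sketch below strips a short list of filler words from text. It is a rough illustration, not how Descript does it: the `FILLERS` list and the `remove_fillers` function are hypothetical, and unlike Descript it only touches text, not the recorded audio.

```python
import re

# Hypothetical filler list for illustration; adjust it to your own speech.
FILLERS = ("um", "uh", "er", "hmm")


def remove_fillers(text: str, fillers=FILLERS) -> str:
    """Remove standalone filler words, plus a trailing comma and spaces."""
    alternatives = "|".join(re.escape(word) for word in fillers)
    # \b keeps the match to whole words, so "number" or "better" are untouched.
    pattern = re.compile(rf"\b(?:{alternatives})\b,?\s*", flags=re.IGNORECASE)
    cleaned = pattern.sub("", text)
    # Collapse any doubled spaces left behind by the removals.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(remove_fillers("Um, so this is, uh, the plan"))
```

Capitalization and punctuation around a removed word may still need a quick manual pass.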