Text to Speech with Descript: How to Use Overdub and Clone Your Voice with AI

Overview

In this section, the video introduces the Descript Overdub feature, highlighting how artificial intelligence can be used to generate voices.

Introduction to Descript Overdub and AI Features

  • Joey discusses Descript Overdub and AI features as essential tools for editing.
  • Two ways to turn text into spoken audio are explained: using stock voices or training a model for personalized voice generation.

Text-to-Audio Conversion Process

This section delves into the process of converting text to audio within Descript.

Converting Text to Spoken Audio

  • Accessing write mode allows users to type out text for narration or voiceover purposes.
  • Adding speaker labels helps identify different types of audio content within the project.
  • Text color indicates transcription status: blue text is not yet linked to audio, while black text is linked to an audio file.

Voice Assignment and Rendering

The focus here is on assigning voices and rendering audio within Descript.

Speaker Assignment and Voice Selection

  • The speaker panel enables assigning voices, including stock options like male/female variations.
  • Stock voices such as "Malcolm" can be auditioned before one is selected.

Audio Rendering Process

  • After selecting a voice, rendering occurs swiftly, generating waveforms for the typed-out narration.

Enhancing Audio Output

Tips on improving audio quality through punctuation and formatting adjustments are discussed in this segment.

Improving Audio Quality

  • Adding punctuation marks and line breaks improves how the typed text is pronounced in the generated audio.
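
The punctuation and line-break tip can be applied to a script before it is pasted into Descript. The following is a minimal sketch, not part of Descript itself: the function name `prepare_script` is hypothetical, and the sentence splitting is deliberately naive (abbreviations such as "Dr." would be split incorrectly).

```python
import re


def prepare_script(text):
    """Put each sentence on its own line and make sure it ends with punctuation."""
    # Split wherever ., !, or ? is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    lines = []
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue
        # Add a period if the sentence has no terminal punctuation.
        if sentence[-1] not in ".!?":
            sentence += "."
        lines.append(sentence)
    return "\n".join(lines)


if __name__ == "__main__":
    print(prepare_script("Welcome back. Today we cover Overdub"))
```

The output is plain text, so it can be reviewed and adjusted by hand before it is used for narration.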

Utilizing Generated Audio

This part focuses on utilizing generated audio files within projects effectively.

Practical Applications of Generated Audio

Training a Model on Your Own Voice

In this section, the speaker discusses the process of creating a voice model based on one's voice using existing audio or video recordings.

Creating a Voice Model

  • The speaker outlines two main methods for creating a voice model based on one's voice. The first method involves using existing projects, videos, or podcasts where the individual has spoken extensively. This serves as training data for the voice model.
  • Podcasts are highlighted as an ideal source for gathering audio data due to their long recordings with good audio quality, making them suitable for training voice models.
  • A minimum of about 10 minutes of audio featuring the individual's voice saying different things is recommended for effective training data. Properly assigning speaker labels to distinguish between voices is crucial when using recordings with multiple speakers.
  • When there is not enough existing audio, the alternative is to create a new voice project within Descript, upload audio files, and provide at least 10 minutes (ideally 30 minutes) of recorded speech as training data.
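
Before gathering recordings, it can help to check whether they add up to the 10-minute minimum or the 30-minute ideal mentioned above. The sketch below is an illustration only: it assumes the recordings are uncompressed WAV files, uses Python's standard `wave` module, and the function names and status labels are hypothetical rather than anything Descript provides.

```python
import wave

MIN_MINUTES = 10    # minimum amount of training audio mentioned in the video
IDEAL_MINUTES = 30  # ideal amount of training audio mentioned in the video


def wav_duration_seconds(path):
    """Return the length of one WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def training_data_status(paths):
    """Sum the durations of all files and compare against the thresholds."""
    total_minutes = sum(wav_duration_seconds(p) for p in paths) / 60
    if total_minutes >= IDEAL_MINUTES:
        status = "ideal"
    elif total_minutes >= MIN_MINUTES:
        status = "enough"
    else:
        status = "too short"
    return total_minutes, status
```

Calling `training_data_status(["episode1.wav", "episode2.wav"])` (hypothetical file names) returns the total minutes and a status label. The check counts file length only, so silence and other speakers in a recording still need to be accounted for by hand.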

Reading and Generating AI Voices

In this section, the speaker discusses submitting training data to generate an AI voice, switching speakers to use different models, and improving the generated voice with creative adjustments.

Training Data Submission and Voice Verification

  • The process involves reading the training text aloud and submitting the recording for model creation.
  • Voice verification is required to authorize the creation of a voice model.
  • Upon completion, an AI voice is generated based on the submitted data.

Creating Multiple Models Based on Different Microphones

  • The speaker plans to create separate models for different microphones or recording locations.
  • This approach aims to ensure natural-sounding voices tailored to specific recording scenarios.

Adjusting Speaker Labels and Enhancing Voice Quality

  • Changing speaker labels allows for switching between different trained voices effectively.
  • Creative adjustments, such as adding punctuation or spelling words phonetically, can improve how the generated voice sounds.
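
The phonetic-spelling tip can be kept consistent across a script with a small lookup table. This is a sketch under stated assumptions: the `RESPELLINGS` entries and the `respell` function are hypothetical examples, and which respellings actually sound right has to be found by listening to the generated audio.

```python
import re

# Hypothetical respellings; adjust after auditioning the generated voice.
RESPELLINGS = {
    "Nguyen": "Win",
    "GIF": "jif",
}


def respell(text, respellings):
    """Replace whole words with phonetic spellings, ignoring case."""
    for word, spoken in respellings.items():
        # Match the word only when it is not part of a longer word.
        pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
        text = re.sub(pattern, lambda match: spoken, text, flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    print(respell("Nguyen sent a GIF.", RESPELLINGS))
```

Keeping the original script unchanged and applying the respellings to a copy makes it easy to restore the correct spelling for captions or show notes.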

Utilizing Overdub for Editing

This segment focuses on using Overdub for editing: fixing audio recordings, adjusting text content, and keeping the modifications natural-sounding.

Overdub Functionality for Editing

  • Overdub is beneficial for short edits or clarifications in audio recordings.
  • It offers flexibility in adjusting text content by replacing or modifying specific words seamlessly.

Natural-Sounding Edits with Overdub

  • Expanding the selection to include the neighboring words helps a replaced word sound more natural.
  • The tool enables precise adjustments without altering the overall coherence of the audio content.

Rendering Audio Clips with AI

This part delves into rendering audio clips using AI-generated content, covering seamless alterations in spoken text within video contexts.

Seamless Audio Alterations in Videos

  • Altering spoken text within videos requires covering changes with b-roll footage to maintain visual coherence.

Reverting Overdub Edits

In this section, the speaker discusses adjusting Overdub clips and reverting to the original voice if needed.

Adjusting Overdub in Audio Clips

  • Trimming the Overdub clip brings back the original voice, which is useful when the original sounds better than the Overdub.

Overdub Experiments

This part focuses on experimenting with Overdub in audio clips and how it affects naturalness.

Experimenting with Overdub

  • The speaker talks about adjusting the balance between Overdub and original audio to make the result sound more natural.
  • In the speaker's experiments, changing words with Overdub usually produces a natural-sounding result.

Recap of AI Voice Options

Here, options for using AI voices and training them are discussed.

Using AI Voices

  • Options include training AI voices using personal recordings or stock voice options.
  • Training an AI voice on personal recordings is possible by providing sufficient data.

Recap of Voice Training

This segment elaborates on training AI voices using personal recordings for accurate voice replication.

Training AI Voices with Personal Recordings

  • By supplying enough personal audio recordings, one can train an AI voice to replicate their own or another person's voice accurately.
  • The process involves converting typed text into audio that sounds like the individual being trained.

Wrap-Up

The conclusion emphasizes seeking further tutorials for detailed guidance and assistance.

Conclusion and Call to Action

  • Encouragement is given to explore additional tutorials on related topics for comprehensive understanding.
  • Viewers are invited to ask specific questions or request more tutorials through comments.
  • Assistance will be provided in answering queries and creating more tutorials as needed.
Video description

Let's go over how to turn text into speech in Descript, and how to use Descript's Overdub feature to train a model on your own voice so you can use AI to say whatever you want.

📬 Get our free 2x weekly email packed with 5 links, 3 tools, and 1 tactic to level up your video skills: https://ntm.link/newsletter

Check out all of our Descript tutorials here: https://www.youtube.com/playlist?list=PLyJvr4CeQ1PwgbGGbaFwmLWhGcl0w3V8G

🖥️ Try out Descript ► https://ntm.link/descript
🔔 Get more videos like this ► https://ntm.link/subscribe

📺 MORE VIDEOS
Learn Descript in 15 Minutes [Full Tutorial] https://youtu.be/YEzJ_r7geuc
Why You Should Be Making Rough Cuts in Descript and Forget FCP/Premiere https://youtu.be/bbfQ6jXE5k0
How to Search & Filter Highlights in Descript https://youtu.be/22ObyDGFIFQ

⏱ CHAPTERS
00:00 - Intro
00:35 - Using Descript's Stock Voices
02:10 - Descript Overdub
08:35 - Training the Model
10:25 - Demo: Using Overdub
12:00 - Demo: Overdub and B-Roll
16:20 - Recap

🔔 Subscribe for weekly videos to level up your video marketing ► https://ntm.link/subscribe

New Territory Media ► https://ntm.link/home
TikTok ► https://ntm.link/tiktok
Instagram ► https://ntm.link/instagram
Facebook ► https://ntm.link/facebook
Twitter ► https://ntm.link/twitter
LinkedIn ► https://ntm.link/linkedin

* Some links go to affiliate programs where we earn a small commission

#VideoBrand #Descript #Overdub