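Text-to-speech (TTS) converts written text into synthesized speech. The best neural systems now approach human recordings in quality, and streaming variants begin producing audio before the full text is available, which reduces latency for live assistants. Voice cloning, which reproduces a specific person's voice, is a sensitive extension of TTS because it can be used to impersonate someone without consent.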
Definition
Generating spoken audio from text. Modern neural TTS sounds natural and is used for voice-overs, dubbing, assistants, and accessibility.
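One way to picture the streaming behavior is sentence-level chunking: text arrives in fragments (for example, from a language model), and each sentence is handed to the synthesizer as soon as it is complete. The sketch below is illustrative only. The names `sentences_from_stream`, `stream_speech`, and `fake_synthesize` are hypothetical, and `fake_synthesize` is a placeholder that returns bytes instead of calling a real TTS engine.

```python
import re
from typing import Iterable, Iterator

# Assumed sentence boundary: ., ! or ? followed by whitespace.
# Real systems use more careful segmentation (abbreviations, numbers, etc.).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(fragments: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in the incoming text."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        parts = _SENTENCE_END.split(buffer)
        # The last piece may be an unfinished sentence, so keep it buffered.
        buffer = parts.pop()
        for sentence in parts:
            if sentence:
                yield sentence
    # Flush whatever is left once the input ends.
    if buffer.strip():
        yield buffer.strip()


def fake_synthesize(sentence: str) -> bytes:
    """Hypothetical stand-in for a TTS engine; returns placeholder bytes."""
    return sentence.encode("utf-8")


def stream_speech(fragments: Iterable[str]) -> Iterator[bytes]:
    """Produce one audio chunk per sentence without waiting for the full text."""
    for sentence in sentences_from_stream(fragments):
        yield fake_synthesize(sentence)


if __name__ == "__main__":
    incoming = ["Hello wor", "ld. How are", " you? Fine"]
    for chunk in stream_speech(incoming):
        print(len(chunk), "bytes ready for playback")
```

Because both functions are generators, the first chunk is available after the first sentence boundary arrives, not after the whole input has been read. Production systems often chunk at a finer granularity than sentences, which this sketch does not attempt.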
Also known as
text-to-speech, speech synthesis