Speech Transcription and Synthesis
Use a pretrained model or third-party APIs for text-to-speech and speech-to-text
Audio Toolbox™ provides examples for small-vocabulary recognition and sound synthesis. To perform general speech-to-text transcription, use the speech2text function with the wav2vec 2.0 pretrained network. For text-to-speech and speech-to-text through popular third-party APIs, download the Audio Toolbox extended functionality from File Exchange. Supported APIs include Google®, IBM® Watson, Microsoft® Azure, and Amazon®.
You can interact with speech-to-text functionality graphically in the Signal Labeler app to quickly label regions of speech.
|Signal Labeler||Label signal attributes, regions, and points of interest, and extract features|
|speech2text||Transcribe speech signal to text|
|text2speech||Synthesize speech from text|
|speechClient||Interface with pretrained model or third-party speech service|
- Label Spoken Words in Audio Signals
Use Signal Labeler to label spoken words in an audio signal.