Speech Transcription and Synthesis
Use a pretrained model or third-party APIs for text-to-speech and speech-to-text
Audio Toolbox™ provides examples for small-vocabulary recognition and sound synthesis. Use the wav2vec 2.0 pretrained network to perform general speech-to-text transcription with speech2text. You can download Audio Toolbox extended functionality from File Exchange for text-to-speech and speech-to-text through interfaces to popular third-party APIs. Supported APIs include Google®, IBM® Watson, Microsoft® Azure, and Amazon®.
You can interact with speech-to-text functionality graphically in the Signal Labeler app to quickly label regions of speech.
Apps
Signal Labeler | Label signal attributes, regions, and points of interest, and extract features
Functions
speech2text | Transcribe speech signal to text
text2speech | Synthesize speech from text
speechClient | Interface with pretrained model or third-party speech service
Topics
- Label Spoken Words in Audio Signals
Use Signal Labeler to label spoken words in an audio signal.