speech2text
Syntax
Description
transcript = speech2text(audioIn,fs) transcribes speech in the input audio signal to text using a pretrained wav2vec 2.0 model.
Note
Using wav2vec 2.0 requires Deep Learning Toolbox™ and installing the pretrained model.
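As a minimal sketch of this syntax, the following reads a recording and transcribes it with the default model. It assumes the wav2vec 2.0 model is already installed and uses a sample file that ships with Audio Toolbox; the file name and the mono downmix step are illustrative choices, not requirements stated on this page.

```matlab
% Read a speech recording. This sample file ships with Audio Toolbox;
% substitute your own recording as needed.
[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");

% Downmix to one channel in case the file is multichannel (assumption:
% the function expects a single-channel signal).
audioIn = mean(audioIn,2);

% Transcribe using the default pretrained wav2vec 2.0 model.
transcript = speech2text(audioIn,fs);
disp(transcript)
```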
transcript = speech2text(audioIn,fs,Client=clientObj) transcribes speech using the specified pretrained deep learning model or third-party speech service.
Note
Using the Emformer pretrained model requires Deep Learning Toolbox and Audio Toolbox™ Interface for SpeechBrain and Torchaudio Libraries. You can download this support package from the Add-On Explorer. For more information, see Get and Manage Add-Ons.
To use third-party speech services, you must download the extended Audio Toolbox functionality from File Exchange. The File Exchange submission includes a tutorial to get started with the third-party services.
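A sketch of the Client syntax with a local pretrained model might look like the following. It assumes the client object is created with speechClient and that "emformer" is an accepted model name; check both against the speechClient reference page before relying on them. It also assumes the required support package is installed.

```matlab
% Read a speech recording (sample file that ships with Audio Toolbox).
[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");

% Create a client for the Emformer pretrained model.
% Assumption: speechClient accepts "emformer" as the model name.
clientObj = speechClient("emformer");

% Transcribe using the model selected by the client object.
transcript = speech2text(audioIn,fs,Client=clientObj);
disp(transcript)
```

Creating the client once and reusing it across calls keeps the model choice in one place, so switching to another model or service only changes the speechClient line.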
[transcript,rawOutput] = speech2text(___) also returns the unprocessed server output from the third-party speech service.
Examples
Input Arguments
Output Arguments
References
[1] Baevski, Alexei, Henry Zhou, Abdelrahman Mohamed, and Michael Auli. “Wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations,” 2020. https://doi.org/10.48550/ARXIV.2006.11477.
Version History
Introduced in R2022b