How can speech be converted to text?

Question

Ismat, 2 June 2023


https://jp.mathworks.com/matlabcentral/answers/1977109-how-can-speech-be-converted-to-text

Commented: Govind KM, 2 June 2023

Here is the flowchart of my system: User speaks >> Speech to Text conversion >> Text is sent to ChatGPT >> Process ends.

My question is regarding the "Speech to Text" block: Is the "Audio Toolbox" sufficient for this task, or is an external API like the Google Speech API also required?

Furthermore, does the "Audio Toolbox" support multiple languages, or is it limited to English only?


Answer 1

Govind KM, 2 June 2023


https://jp.mathworks.com/matlabcentral/answers/1977109-how-can-speech-be-converted-to-text#answer_1248799

Hi Ismat,

According to the documentation, Audio Toolbox lets you interface with third-party speech-to-text APIs from MATLAB. This requires the extended Audio Toolbox functionality available from File Exchange and one of the following APIs: Google Speech, IBM Watson Speech, Microsoft Azure Speech, or Amazon Transcribe (Amazon Transcribe requires R2022b or later).

Starting in MATLAB R2022b, you can convert speech to text using a pretrained wav2vec 2.0 model, which requires neither access to an external API nor the extended Audio Toolbox functionality from File Exchange. Using the wav2vec 2.0 model requires Deep Learning Toolbox. You can also perform speech transcription interactively using the Signal Labeler app.
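As a rough illustration of the wav2vec 2.0 route described above, here is a minimal sketch. It assumes R2022b or later with both Audio Toolbox and Deep Learning Toolbox installed; the file name "speech.wav" is hypothetical, and the client-first argument order for speech2text is the one I believe applies to R2022b, so please check the documentation for your release.

```matlab
% Minimal sketch, not a definitive implementation.
% Assumes Audio Toolbox + Deep Learning Toolbox, R2022b or later.
% "speech.wav" is a hypothetical file name; substitute your own recording.
[audioIn, fs] = audioread("speech.wav");

% Average the channels so a stereo file becomes a mono column vector.
audioIn = mean(audioIn, 2);

% Create a client for the pretrained wav2vec 2.0 model (no external API).
% MATLAB may ask you to download the model files the first time.
transcriber = speechClient("wav2vec2.0");

% Transcribe. The argument order shown here is an assumption based on the
% R2022b syntax; newer releases may prefer a different calling form.
transcript = speech2text(transcriber, audioIn, fs);
disp(transcript)
```

The resulting transcript is what you would then pass on to the next block of your flowchart.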

You can refer to these documentation links for further information on using these tools:

https://in.mathworks.com/help/audio/ug/speech2text.html

https://in.mathworks.com/matlabcentral/fileexchange/65266-speech2text

https://in.mathworks.com/help/signal/ug/label-spoken-words-in-audio-signals-using-external-api.html


Ismat, 2 June 2023

Thank you very much for your answer.

The wav2vec2.0 model requires both the Audio Toolbox and the Deep Learning Toolbox, which is not economical: as you know, each toolbox is an additional expense. I therefore still have the same question.

If I purchase the "Audio Toolbox," can I use the speech2text function?
Do I need an external API if I buy a license for the "Audio Toolbox"?

Govind KM, 2 June 2023

If you purchase a license for the Audio Toolbox, you can use the speech2text function. However, the function requires a client object as an input argument, which is an interface to either the wav2vec2.0 model or any of the four external APIs mentioned above.

Hence, you will need either the wav2vec2.0 model or one of the four mentioned external APIs to use the speech2text function effectively.
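To make the role of the client object concrete, here is a hedged sketch of the external-API route, using Google Speech as the example. It assumes the extended speech2text functionality from File Exchange is installed and that Google Cloud credentials are already set up the way that package describes. The file name is hypothetical, and the languageCode option with the value "en-US" is my assumption of how a language is selected for the Google client, so verify it against the speechClient documentation.

```matlab
% Sketch only, under the assumptions stated above.
% "speech.wav" is a hypothetical file name.
[audioIn, fs] = audioread("speech.wav");
audioIn = mean(audioIn, 2);   % mix down to a mono column vector

% Client object that interfaces with the Google Speech API.
% "languageCode" and "en-US" are assumed, illustrative values.
clientObj = speechClient("Google", "languageCode", "en-US");

% The same speech2text call is used; only the client object differs.
result = speech2text(clientObj, audioIn, fs);
disp(result)
```

Only the speechClient line changes between the local model and an external service, which is why the function itself comes with Audio Toolbox but still needs one of those back ends.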
