How to make a matrix of data to put into gmdistribution from separate sound files?

I'm trying to use the gmdistribution.fit function for a speech recognition program that will take in a sound file of a speaker saying different numbers and turn it into a text file. The function requires me to pass a matrix of sound data as a single variable, which it then uses to estimate the GMM parameters. I have 10 sound files for each digit that I probably need to combine into one variable for the function. How would I go about doing that?

Accepted Answer

Kris Fedorenko on 7 Aug 2017
Hi Jonathan!
You should be able to read your audio files into MATLAB using "audioread". This returns the audio data (an m-by-n matrix, where m is the number of samples and n is the number of audio channels) and the sample rate. Refer to the "audioread" documentation for more details.
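As a minimal sketch of what "audioread" returns, the snippet below first writes a short synthetic stereo tone with "audiowrite" so that it runs without any real recordings, then reads it back. The file name, sample rate, and tone frequencies are arbitrary choices for illustration, not part of the original question.

```matlab
% Write a short synthetic stereo tone so the example needs no external files.
Fs = 8000;                                   % sample rate in Hz (arbitrary)
t = (0:Fs-1).' / Fs;                         % one second of sample times
stereo = 0.5 * [sin(2*pi*440*t), sin(2*pi*660*t)];
filename = fullfile(tempdir, 'example_tone.wav');   % hypothetical file name
audiowrite(filename, stereo, Fs);

% Read it back: y is m-by-n (samples by channels), FsRead is the sample rate.
[y, FsRead] = audioread(filename);
[m, n] = size(y);
fprintf('%d samples, %d channel(s), %d Hz\n', m, n, FsRead);

% Average the channels to get a single-channel (mono) column vector.
mono = mean(y, 2);
```

If your recordings are already mono, n is 1 and the averaging step leaves the data unchanged.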
Now for "gmdistribution.fit" you need to input data as an n-by-d matrix, where n is the number of observations and d is the dimension of the data. Assuming that you would like to use the audio data as input to "gmdistribution.fit" and that your audio files have only one channel and are of the same length, you can construct the input using a workflow similar to this:
%% Read in audio files (assumed mono and of equal length)
[data1, Fs] = audioread('audiofile1.wav');
[data2, ~] = audioread('audiofile2.wav');
%% Make an input matrix
input_matrix = [data1 data2]; % concatenate audio data column-wise
% Transpose the input matrix so that the first dimension is the number of
% observations and the second is the number of dimensions:
input_matrix = input_matrix.';
%% Use the input matrix as input to "gmdistribution.fit"
number_of_components = 1; % placeholder value; choose to suit your data
obj = gmdistribution.fit(input_matrix, number_of_components);
Note that input to "gmdistribution.fit" should have more rows than columns (i.e. more observations than number of dimensions). Depending on the length of your audio files, their number might not be enough to use a Gaussian mixture model.
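With ten files per digit, the same idea can be written as a loop. The following is only a sketch under stated assumptions: it generates three synthetic mono files of different lengths (the file names and lengths are made up for illustration), keeps the first channel of each, and truncates every recording to the shortest one so that the rows line up. Truncation is just one way to equalize lengths and may not suit real speech data.

```matlab
% Create three synthetic mono files of different lengths so the sketch runs
% without real recordings (file names and lengths are hypothetical).
Fs = 8000;
lengths = [4000 5000 4500];
files = cell(numel(lengths), 1);
for k = 1:numel(lengths)
    files{k} = fullfile(tempdir, sprintf('digit_example_%d.wav', k));
    audiowrite(files{k}, 0.1 * randn(lengths(k), 1), Fs);
end

% Read each file and keep the first channel only.
signals = cell(numel(files), 1);
for k = 1:numel(files)
    y = audioread(files{k});
    signals{k} = y(:, 1);
end

% Truncate to the shortest recording and stack: one row per file.
minLen = min(cellfun(@numel, signals));
input_matrix = zeros(numel(files), minLen);
for k = 1:numel(files)
    input_matrix(k, :) = signals{k}(1:minLen).';
end
disp(size(input_matrix))
```

The resulting matrix has one row per file and one column per sample, so it has far more columns than rows. As the note above says, that shape does not give "gmdistribution.fit" enough observations, which is why the fitting call is left out of this sketch.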
Hope this helps!
Kris
