Word comparison using frequency domain.

Leon Ellis, 12 November 2021
Commented: Leon Ellis, 17 November 2021
Good day, I've managed to get the following from a sound wave using the FFT function:
The red circles are the peak values found using the findpeaks() function (I use both the descending sort order and MinPeakProminence). This means I have their x- and y-coordinates in descending order. I then take only the first 6 x and y elements, horizontally concatenate them, and save them to a .mat file so they can be compared with the .mat files of the other words. I do this for 5 different words and then compare their .mat files via the mean squared error function (immse). The word with the lowest error then corresponds to a value k, i.e. word 1, word 2, etc. The problem, however, is that it doesn't produce the right result. I say "Five" and the algorithm says I said "Two", because my recording more closely resembles the .mat file of the word "Two" (its peaks are closer).
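The steps above can be sketched roughly as follows. This is a minimal illustration, not the original code: the sample rate, the synthetic sums of sinusoids standing in for recorded words, the variable names, and the prominence threshold are all assumptions made for the example.

```matlab
% Minimal sketch of the described pipeline (not the original code).
% Requires Signal Processing Toolbox (findpeaks) and Image Processing Toolbox (immse).
fs = 8000;                 % assumed sample rate in Hz
t  = (0:fs-1) / fs;        % one second of samples, so FFT bins are 1 Hz apart
N  = numel(t);
nPeaks = 6;                % number of peaks kept per word, as in the post

% Hypothetical stand-ins for two recorded words: sums of six sinusoids.
wordFreqs = [200 450 700 1100 1600 2300;
             250 500 900 1300 1900 2600];
amps = [1.0 0.9 0.8 0.7 0.6 0.5];

signals = zeros(3, N);
signals(1,:) = amps * sin(2*pi * wordFreqs(1,:).' * t);  % template for word 1
signals(2,:) = amps * sin(2*pi * wordFreqs(2,:).' * t);  % template for word 2
signals(3,:) = 0.8 * signals(2,:);                        % new utterance: word 2, quieter

f = (0:N/2-1) * fs / N;    % frequency axis of the one-sided spectrum
features = zeros(3, 2*nPeaks);
for k = 1:3
    X = abs(fft(signals(k,:)));     % magnitude spectrum
    X = X(1:N/2);                   % keep the non-redundant half
    [pks, locs] = findpeaks(X, f, 'SortStr', 'descend', ...
                            'MinPeakProminence', 0.1 * max(X));
    % Keep the six strongest peaks and concatenate x- and y-coordinates.
    features(k,:) = [reshape(locs(1:nPeaks), 1, []), reshape(pks(1:nPeaks), 1, [])];
end

% Compare the new utterance (row 3) against each stored template.
errs = zeros(1, 2);
for k = 1:2
    errs(k) = immse(features(3,:), features(k,:));
end
[~, bestWord] = min(errs);
fprintf('Closest template: word %d\n', bestWord);
```

In this sketch the feature vectors are kept in memory instead of being saved to and loaded from .mat files. Each feature vector mixes peak frequencies (in Hz) with FFT magnitudes, so both contribute to the immse value on their own scales.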
Does anyone have any hints on where I could be going wrong, or what other step I might need to take to go from the graph above to detecting which word has been said? My code is quite a mess and I don't want to discourage help by posting it... It follows the exact method I described above for figuring out which word has been said. But if you'd like to help and request that I post it, I will. Thanks in advance!
13 comments
Leon Ellis, 17 November 2021
Thank you very much. Unfortunately it's a bit too late and I wasn't able to get it to work. I also don't think we're supposed to create an algorithm to train for word identification (we're just supposed to work with the audio file characteristics for identification). But thanks a lot for replying!


Answers (0)

