Extracting Audio File Frequency

Question

1 vote

rmscalculation.m

Hello there,

I need to find the frequency of the audio file for specific segments. In my code I find the segments of talking and take the fft of these portions and find the frequencies. But the problem arises at the frequency part I need to find different frequencies but find exactly the same values. Could you please help?

Thanks in advance.

Audio file : https://drive.google.com/drive/folders/1EQABtLT-Is-oEk5w_6U4b1-jB8PJxaMp?usp=sharing


Answer 1

William Rose on 15 Apr 2022

0 votes

@mehtap agirsoy,

I have listened to file A1.wav. The instances of singing are not at 15-second intervals, even though the code expects this. Therefore the segments analyzed do not always contain singing. The amplitude of the singing is small. There are significant unrelated background noises. The pitch being sung sounds like the E flat above middle C (Eflat4). Therefore the detected dominant frequency should be around 311.1 Hz.

Approximate times of vocalization, in seconds: 1-5, 22-27, 42-46, 61-66, 82-87, 102-107.

There is background talking during 61-66. There is coughing or some other background sound in 82-87.

Conclusion: The frequency analysis of file A1.wav by rmscalculation.m is affected by background noises and incorrect timing. The signal to noise level is poor.

Recommendations:

Improve the recording set up to increase signal amplitude and reduce background noise.
Edit the audio file to extract the exact segments that contain the singing which you want to analyze.

I have looked at your code: rmscalculation.m.

Analysis of the script:

rmscalculation.m has three nested loops.

The outer loop is: for k=1:number of participants.

The middle loop is: for l=1:number of tests. This loop reads in a different audio file on each pass. It computes the envelope of the signal as the moving average (with a width of 1000 points = 1/44 of a second) of the absolute value of the signal. The time when the moving average crosses a threshold is deemed to be the time when talking starts.

The inner loop is: for i=1:6. Each pass extracts a segment of the signal. The segment start times are 15 seconds apart. The segments are 4.9 seconds long. The power spectrum of the segment is determined. The frequency that has max. power, within the frequency range 236 to 367 Hz, is determined for each segment.

Does that sound correct?
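The envelope-and-threshold step described above can be sketched roughly as follows. This is only an illustration on synthetic data, not the code in rmscalculation.m: the sampling rate, the threshold value, and the use of filter() for the moving average are all assumptions.

```matlab
% Sketch only: synthetic data stands in for the recording. Fs, thresh and
% the use of filter() are assumptions, not taken from rmscalculation.m.
Fs = 44100;                               % assumed sampling rate (Hz)
t = (0:2*Fs-1)/Fs;                        % 2 s of time values
x = 0.01*randn(size(t));                  % low-level background noise
voiced = t >= 1;                          % "talking" starts at t = 1 s
x(voiced) = x(voiced) + 0.5*sin(2*pi*262*t(voiced));

win = 1000;                               % moving-average width (samples)
env = filter(ones(1,win)/win, 1, abs(x)); % causal moving average of |x|
thresh = 0.1;                             % assumed threshold
iStart = find(env > thresh, 1);           % first sample above the threshold
tStart = (iStart-1)/Fs;                   % estimated onset time (s)
fprintf('Estimated onset: %.3f s\n', tStart);
```

The estimate lags the true onset slightly, because the moving average needs part of its window filled with signal before it exceeds the threshold.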

The script rmscalculation.m does not run. It gives the error

Error using xlsread (line 136)
Unable to open file 'F4_A1'.
File 'F4_A1' not found.
Error in rmscalculation (line 11)
a = xlsread(fname1); % comand to read excel/ particle count file

I commented out the lines related to file F4_A1. Then the script ran without error. It does not display any results.

To see the results:

>> disp(seg_Freq')
261.9312
239.8933
261.9312
261.9312
255.0339
262.0995

The frequency range of 90% to 140% of the middle C frequency will allow detection of frequencies corresponding to pitches from just below B3 to just above F4.
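The arithmetic behind that statement can be checked with a short sketch. It assumes A440 equal temperament; the note list and variable names are chosen here for illustration.

```matlab
% Sketch assuming A440 equal temperament; names are illustrative.
fC4 = 440*2^(-9/12);                      % middle C, 9 semitones below A4 (~261.63 Hz)
band = [0.9 1.4]*fC4;                     % search band (~235.5 to ~366.3 Hz)
names = {'A#3','B3','C4','C#4','D4','D#4','E4','F4','F#4'};
semis = -11:-3;                           % semitone offsets from A4
fNote = 440*2.^(semis/12);                % note frequencies (Hz)
inBand = fNote >= band(1) & fNote <= band(2);
for k = 1:numel(names)
    fprintf('%-4s %7.2f Hz  in band: %d\n', names{k}, fNote(k), inBand(k));
end
```

Under these assumptions, B3 through F4 fall inside the band, while A#3 and F#4 fall just outside it.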


Answer 2

William Rose on 12 Apr 2022

0 votes

@mehtap agirsoy,

[moved my answer from a comment to an answer]

The Google Drive link you provided requires access permission. You may attach the audio file if you zip it first.

You probably know this already, but I will mention this just in case you do not know this:

When you compute the FFT or power spectrum of a segment of the signal, the frequencies of the FFT or power spectrum will be the same for each different segment (assuming the segment lengths are the same). The amplitude or power at each frequency will vary from segment to segment. You can compute the mean frequency for a segment, or you can compute the frequency with maximum power in each segment, etc. The script below does both, for an 8-second signal with gradually increasing frequency, divided into 0.5 second long segments. It plots the results. It appears that the max power frequency is better behaved than the mean frequency, in this example.

%% constants
Fs=8000;        %sampling rate (Hz)
T=8;            %signal duration (s)
wi=220*2*pi;    %initial frequency (radians/s)
wf=880*2*pi;    %final frequency (radians/s)
Tseg=0.5;       %segment duration (s)

%% compute the signal
dt=1/Fs;        %sampling interval
N=Fs*T;         %signal duration (samples)
t=dt*(0:N-1);   %vector of time values
phase=wi*t+(wf-wi)*t.*t/(2*T);  %phase for signal with changing frequency
x=cos(phase);   %signal amplitude

%% compute FFT of each segment
N1=Fs*Tseg;             %segment duration (samples)
Nseg=T/Tseg;            %number of segments
fmax=zeros(1,Nseg);     %allocate array for max. amplitude frequency of each segment
fmean=zeros(1,Nseg);    %allocate array for mean frequency of each segment
df=1/Tseg;              %frequency interval
f=(0:N1/2)*df;          %vector of frequencies, up to Nyquist frequency
Nf=length(f);           %number of frequencies in one-sided FFT
Y=zeros(Nf,Nseg);       %allocate array for FFTs
for i=1:Nseg
    X=fft(x((i-1)*N1+1:i*N1));
    Y(:,i)=abs(X(1:Nf));        %magnitude of one-sided FFT
    [~,indmax]=max(Y(:,i));     %index of largest element of Y
    fmax(i)=f(indmax);          %frequency with maximum amplitude (and power)
    fmean(i)=sum(f'.*Y(:,i))/sum(Y(:,i));   %mean frequency (amplitude-weighted)
end

%% plot results
figure;
subplot(211), plot(1:Nseg,fmax,'rx',1:Nseg,fmean,'bo');
xlabel('Segment'); ylabel('Frequency (Hz)');
legend('Max.Freq.','Mean Freq.'); grid on
title('Max & Mean Frequency vs. Segment')
subplot(212)
%one color per segment, running from red through green and blue to magenta
colorspec=[1,0,0;1,.33,0;1,.67,0;
    1,1,0;.67,1,0;.33,1,0;
    0,1,0;0,1,.33;0,1,.67;
    0,1,1;0,.67,1;0,.33,1;
    0,0,1;.5,0,1;
    1,0,1;1,0,.5];
for i=1:Nseg
    plot(f,Y(:,i),'Color',colorspec(i,:));
    hold on;
end
xlabel('Frequency (Hz)'); ylabel('Amplitude'); xlim([0,1200])
grid on; title('Amplitude Spectra for Segments')

Try it. Good luck.


mehtap agirsoy on 12 Apr 2022

Hi, many thanks for the help.

The zip file exceeds the size limit, so I added a Drive link but forgot to change the permissions. It is OK now. I'd be glad if you could check it when you have time.

My frequency results should fluctuate around 262 Hz. When I tried the max and mean frequencies, the results were 617 and 22049.9 respectively. My segment frequencies are:

261.931228637695

239.893341064453

261.931228637695

255.033874511719

262.099456787109

I'm not sure whether these are OK; they look a bit suspicious.

Answer 3

William Rose on 13 Apr 2022

0 votes

@mehtap agirsoy,

Middle C! The frequency sweep in my code goes from A3 to A5.

William Rose on 13 Apr 2022

I was able to see the file on google drive, which I could not do before. However, when I click "download" to put it on my drive - which I need to do in order to open it in Matlab - nothing happens. The Help for google drive says

"If you can't download a file: If you can't download a file, the owner may have disabled options to print, download, or copy for people with only comment or view access."

I suspect that's what is happening here. I can't help more since the file is impossible to access. Post a shorter file that fits within the zip limit.

mehtap agirsoy on 13 Apr 2022

So sorry for the inconvenience. When I compress the file it still exceeds the limit. Anyone with the link is an editor now.


Answer 4

William Rose on 16 Apr 2022

0 votes

estimateAudioFrequencies.m

@mehtap agirsoy,

I wrote a script that extracts 3 seconds of sound from each vocalization. As I said before, the times of note-singing are approximately: 1-5, 22-27, 42-46, 61-66, 82-87, 102-107 seconds.

Therefore I extract sound from 2-5, 23-26, 43-46, 62-65, 83-86, 103-106 seconds.

I measure the mean frequency and the frequency of maximum power in each segment.

The max.power frequencies are about 620-630 Hz, consistent with the subjects singing E flat 5, also known as the E flat above treble C. The expected frequency of this pitch is 622 Hz, with A440 equal temperament tuning.
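For reference, the expected pitch frequencies can be worked out with a short sketch, assuming A440 equal temperament. The variable names are illustrative, and the 620 and 630 Hz values are simply the ends of the range quoted above.

```matlab
% Sketch assuming A440 equal temperament; variable names are illustrative.
fRef = 440;                               % A4 reference (Hz)
fEb5 = fRef*2^(6/12);                     % E flat 5 is 6 semitones above A4
fEb4 = fEb5/2;                            % E flat 4 is one octave lower
fMeas = [620 630];                        % ends of the range quoted above (Hz)
cents = 1200*log2(fMeas/fEb5);            % deviation from E flat 5 in cents
fprintf('Eb4 = %.2f Hz, Eb5 = %.2f Hz\n', fEb4, fEb5);
fprintf('%.0f Hz is %+.1f cents from Eb5\n', [fMeas; cents]);
```

Both ends of the quoted range lie well within a semitone (100 cents) of E flat 5, and roughly an octave above E flat 4.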

The script plots the max frequency for each segment and the power spectrum for each segment.

You confined the frequency search to 0.9 - 1.4 times middle C. This singing signal has very little power in that frequency range. Most of the power is around 630 Hz. I initially thought these children were singing in octave 4 (using scientific pitch notation). Now I think they are singing an octave higher, in octave 5. It is not always easy to decide.

My code also creates a file, A1sel.wav, which is the selected audio segments, plus 1 second of silence after each segment. The graphical output from the script is below.
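As a rough illustration of the segment extraction described above (this is not the attached estimateAudioFrequencies.m, and not its graphical output), the steps might look like the sketch below. A synthetic 622 Hz tone stands in for A1.wav; the sampling rate, variable names, and the output file name in the last comment are assumptions.

```matlab
% Sketch only: a synthetic tone replaces A1.wav. Fs, the tone and the
% variable names are assumptions, not taken from the attached script.
Fs = 8000;                                % assumed sampling rate (Hz)
t = (0:110*Fs-1)'/Fs;                     % 110 s of time values (column)
x = 0.2*sin(2*pi*622*t);                  % stand-in for the sung note
tStart = [2 23 43 62 83 103];             % segment start times (s)
Tseg = 3;                                 % segment duration (s)
Nseg = Tseg*Fs;                           % samples per segment
gap = zeros(Fs,1);                        % 1 s of silence
f = (0:Nseg/2)'*Fs/Nseg;                  % one-sided frequency axis (Hz)
fmax = zeros(1,numel(tStart));            % max-power frequency per segment
y = [];                                   % concatenated output signal
for i = 1:numel(tStart)
    i0 = tStart(i)*Fs + 1;                % first sample of the segment
    seg = x(i0:i0+Nseg-1);                % 3 s segment
    P = abs(fft(seg)).^2;                 % power spectrum
    [~,imax] = max(P(1:numel(f)));        % peak of the one-sided spectrum
    fmax(i) = f(imax);
    y = [y; seg; gap];                    %#ok<AGROW> segment plus silence
end
disp(fmax)
% audiowrite('A1sel_demo.wav', y, Fs);    % hypothetical output file name
```

With a real recording, the peak search would probably need to exclude very low frequencies, since this sketch searches the whole one-sided spectrum including DC.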

mehtap agirsoy on 16 Apr 2022

I really appreciate your help. It's my first time with signal processing, and your explanations and code are awesome. Thanks a lot.

William Rose on 16 Apr 2022

@mehtap agirsoy, You are welcome. Good luck with your work!
