Can someone explain why in this code f3 and f2 are equal to (1048576,2)?

Axel Blaze, 3 May 2022
Edited: Jan, 4 May 2022
This is a project to separate the human voice from the background music of a given song.
I have a question about the code of this project. Why exactly are f3 and f2 assigned zeros(1048576,2)? (These two statements appear just after the 6th comment, in lines 26 and 27.) I would love to know the explanation for this.
% Program to separate human voice and background music from a music file
[a,fs]=audioread("C:\Users\benpa\AudioSample\Sunday.wav");
% setting length of array in order of 2^n and thus accordingly chopping our signal
b=a([1:1048576],:);
Length_audio=length(b);
df=fs/Length_audio;
frequency_audio=-fs/2:df:fs/2-df;
figure
% time domain plot
plot(b)
title(' Input Audio');
xlabel('Time(s)');
ylabel('Amplitude');
%sound(a,fs);
%%
FFT_audio_in=fftshift(fft(b))/length(fft(b));
f4=FFT_audio_in;
figure
plot(frequency_audio,abs(FFT_audio_in));
title('FFT of Input Audio');
xlabel('Frequency(Hz)');
ylabel('Amplitude');
% Initializing zero matrix of same size as that of original matrix
%f2 matrix is for background music and f3 contains human voice
f3=zeros(1048576,2);
f2=zeros(1048576,2);
%selecting particular band that dominates our signal i.e. has contributed
%maximum to our signal( decided by looking at amplitude in frequency domain)
%and making new matrix
% taking elements of matrix from m to n and 1048576-n to 1048576-m to get
% good result of selection of frequencies.
for i=1:1048576
for j=1:2
if (i<=400000 || i>=648576)
f2(i,j)=FFT_audio_in(i,j);
end
if ((i>=472288&&i<=514288))||(i>=534288&&i<=576288)
f3(i,j)=FFT_audio_in(i,j);
end
end
end
%f2 is for background music and f3 (has dominating part )is for voice of singer
%for converting fft of human voice to audio file
f1=(f3);
l1=length(f1);
sign=(ifft(ifftshift((f1)*length(b))));
fs=44100;
de=fs/l1;
fa=-fs/2:de:fs/2-de;
figure
plot(fa,abs(f1))
title('FFT of Human Voice Audio');
xlabel('Frequency(Hz)');
ylabel('Amplitude');
% we want real part of our signal, that's why we are extracting that using
% Re(z)=(z+z')/2
outh=(sign+transpose(ctranspose(sign)))*0.5;
audiowrite('human.wav',outh,fs);
figure
%plot of output
plot(outh);
%sound(outh,fs);
title('human voice Audio');
xlabel('time');
ylabel('Amplitude');
%%
f1=(f2);
l1=length(f1);
sign=(ifft(ifftshift((f1)*length(b))));
fs=44100;
de=fs/l1;
fa=-fs/2:de:fs/2-de;
figure
plot(fa,abs(f1))
title('FFT of Background sound ');
xlabel('Frequency(Hz)');
ylabel('Amplitude');
outb=(sign+transpose(ctranspose(sign)))*0.5;
% write the background track only; writing outb to 'human.wav' would overwrite the voice file
audiowrite('back.wav',outb,fs);
figure
%plot of output
plot(outb);
sound(outb,fs);
title('Background Audio');
xlabel('time');
ylabel('Amplitude');

Answers (2)

Steven Lord, 3 May 2022
The zeros function creates an array of the specified size where every element is 0.
A = zeros(3, 4) % a 3-by-4 array where all 12 elements are 0
A = 3×4
     0     0     0     0
     0     0     0     0
     0     0     0     0
Here it is used to preallocate space: the array gets one chunk of memory of its final size up front, and the later code fills it in. Without preallocation, MATLAB would repeatedly have to look for larger and larger chunks of memory, move the existing data there, and then store the new data.
Picture a bookshelf: if you have a dozen new books to put on it, do you move books around to free up space for one book then move more books around to free up space for that book plus its sequel, etc.? Or do you free up enough space for the entire series at once and then just slide them in one after the other?
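As a rough illustration of the bookshelf analogy, the sketch below fills the same column vector twice: once by growing it inside the loop and once after preallocating it with zeros. The size n and the variable names are hypothetical, chosen only for this example, and the timings will vary by machine and MATLAB release.

```matlab
% Hypothetical small size for illustration; the question uses 1048576 rows.
n = 20000;

% Growing the array: it may have to be reallocated and copied as it grows.
tic
grown = [];
for k = 1:n
    grown(k, 1) = k^2;
end
tGrown = toc;

% Preallocating: one allocation up front, then elements are filled in place.
tic
prealloc = zeros(n, 1);
for k = 1:n
    prealloc(k, 1) = k^2;
end
tPrealloc = toc;

fprintf('growing: %.4f s, preallocated: %.4f s\n', tGrown, tPrealloc);

% Both approaches produce the same values; only the memory handling differs.
sameResult = isequal(grown, prealloc);
```

Both loops produce the same result, so preallocation only affects how memory is handled, not what ends up in the array.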
1 Comment
Axel Blaze, 3 May 2022
The latter, I presume. Thanks a lot for the advice.



Jan, 3 May 2022
Edited: Jan, 3 May 2022
You mean:
f3 = zeros(1048576,2);
f2 = zeros(1048576,2);
Here the variables f2 and f3 are defined as matrices with the dimensions [1048576,2] containing zeros.
This is the size of the cropped input signal:
b = a(1:1048576, :); % No square brackets needed here
In the following loop, some sections of the processed signal are inserted into the arrays.
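To see which frequencies those sections correspond to, here is a minimal sketch of the row-to-frequency mapping implied by the question's frequency_audio axis. It assumes a sample rate of 44100 Hz (the value the question hard-codes later; the real file may differ), and rowToHz, voiceRows and centerRow are hypothetical names introduced for illustration.

```matlab
N  = 1048576;          % number of rows kept from the signal, i.e. 2^20
fs = 44100;            % ASSUMED sample rate (hard-coded later in the question)
df = fs / N;           % spacing between neighbouring FFT bins, in Hz

% After fftshift, row i holds the frequency -fs/2 + (i-1)*df,
% matching frequency_audio = -fs/2:df:fs/2-df from the question.
rowToHz = @(i) -fs/2 + (i - 1) * df;

voiceRows = [472288 514288 534288 576288];   % band edges used for f3
voiceHz   = rowToHz(voiceRows);
disp(voiceHz)

centerRow = N/2 + 1;   % row that holds 0 Hz (DC) after fftshift
```

Under that assumed sample rate, the two f3 bands sit at roughly 0.4 to 2.2 kHz on the negative and positive side of the spectrum, while f2 keeps the outer rows, i.e. the higher frequencies.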
By the way, the loop:
for i=1:1048576
for j=1:2
if (i<=400000 || i>=648576)
f2(i,j)=FFT_audio_in(i,j);
end
if ((i>=472288&&i<=514288))||(i>=534288&&i<=576288)
f3(i,j)=FFT_audio_in(i,j);
end
end
end
can be written as:
f2(1:400000, :) = FFT_audio_in(1:400000, :);
f2(648576:end, :) = FFT_audio_in(648576:end, :);
f3(472288:514288, :) = FFT_audio_in(472288:514288, :);
f3(534288:576288, :) = FFT_audio_in(534288:576288, :);
This:
transpose(ctranspose(sign))
is less efficient than the direct:
conj(sign)
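A quick check of that equivalence, as a sketch on a small made-up complex vector (the values and variable names are hypothetical). It also shows that the question's real-part formula, Re(z) = (z + conj(z))/2, gives the same values as the built-in real.

```matlab
% Small complex test vector (hypothetical values, for illustration only).
z = [1+2i; 3-4i; -5+0.5i];

viaTransposes = transpose(ctranspose(z));  % conjugate transpose, then plain transpose
viaConj       = conj(z);                   % complex conjugate directly

% The question's real-part formula Re(z) = (z + conj(z))/2 ...
realViaFormula = (z + conj(z)) * 0.5;
% ... yields the same values as the built-in real().
realDirect = real(z);

sameConj = isequal(viaTransposes, viaConj);
sameReal = isequal(realViaFormula, realDirect);
```

So the line computing outh in the question could probably be reduced to a single call to real, although that is a suggestion rather than something tested on the original audio.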
4 Comments
Axel Blaze, 3 May 2022
I do want to learn MATLAB thoroughly. Unfortunately, in this particular situation I don't have the time to do so, so I just want to understand the code as much as I can, as it is part of a mini project.
Yes, I understand that the code you gave does what the loop in my code does, but better, and it looks cleaner as well.
I was asking whether you could explain what the loop does in general, because looking at it I am not sure. Is it just taking the FFT of the values in the matrix from 1:400000, and so on for f3?
Jan, 4 May 2022
Edited: Jan, 4 May 2022
@Axel Blaze: As I have said already, the loop does exactly what my suggested replacement does:
It copies the rows with the indices 1:400000 and 648576:end (all columns) from FFT_audio_in to f2. And it copies the rows 472288:514288 and 534288:576288 (again all columns) from FFT_audio_in to f3.
There are no calculations inside the loop, just the copy of some blocks of data from the array FFT_audio_in to corresponding blocks in f2 and f3.
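The copy-only behaviour can be checked on a scaled-down example. This is a sketch with a hypothetical 10-by-2 stand-in for FFT_audio_in and made-up row limits (3 and 8) in place of 400000 and 648576.

```matlab
% Hypothetical small stand-in for FFT_audio_in (10 rows, 2 channels).
X = (1:10).' * [1 10] + 1i;

% Loop version, same shape as the loop in the question.
fLoop = zeros(10, 2);
for i = 1:10
    for j = 1:2
        if (i <= 3 || i >= 8)
            fLoop(i, j) = X(i, j);
        end
    end
end

% Vectorized version: copy whole blocks of rows, all columns at once.
fVec = zeros(10, 2);
fVec(1:3, :)   = X(1:3, :);
fVec(8:end, :) = X(8:end, :);

identical = isequal(fLoop, fVec);                    % same result either way
untouchedRowsAreZero = all(all(fVec(4:7, :) == 0));  % rows outside the bands keep their zeros
```

The rows that are not copied keep the zeros from the preallocation, which is what removes those frequencies from the output signal.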
It is not efficient to try to understand code of this quality. Too many details are written in a confusing way, e.g. the mentioned "transpose(ctranspose(x))" instead of "conj(x)". It is hard for a beginner to understand code written by another confused beginner.



Release: R2022a
