Unable to perform assignment because the size of the left side is 100-by-198 and the size of the right side is 100-by-98. Error in backgroundSpectrograms (line 50) Xbkg(:,:,:,ind) = filterBank * spec;

Question

Barb, 6 January 2020


https://jp.mathworks.com/matlabcentral/answers/499146-unable-to-perform-assignment-because-the-size-of-the-left-side-is-100-by-198-and-the-size-of-the-rig

Commented: imtiaz waheed, 6 February 2020

I am trying to compute the background spectrograms. The background recordings are the same ones used in https://www.mathworks.com/help/deeplearning/examples/deep-learning-speech-recognition.html

but I get this warning and error:

Warning: The FFT length is too small to compute the specified number of bands. Decrease the number of bands or increase the FFT length.
> In designAuditoryFilterBank (line 104)
In backgroundSpectrograms (line 20)
Unable to perform assignment because the size of the left side is 100-by-198 and the size of the right side is 100-by-98.
Error in backgroundSpectrograms (line 50)
Xbkg(:,:,:,ind) = filterBank * spec;

I don't know how to fix it. The background files are the same as in the example, so I don't understand what the error is about.

Please help me fix it. These are my settings:

ads = 1x1 audioDatastore

numBkgClips = 4000

volumeRange = [1e-4,1]

segmentDuration= 2

hopDuration = 0.010

numBands = 100

frameDuration = 0.025

FFT length = 512 (in backgroundSpectrograms)

Please help me with the values. If I set the FFT length to 1000, the warning goes away but the error remains.

I have to use these values for hopDuration, numBands, frameDuration, and segmentDuration because of my own WAV files.

When I run

adsBkg = subset(ads0,ads0.Labels=="_background_noise_");

numBkgClips = 4000;

volumeRange = [1e-4,1];

XBkg = backgroundSpectrograms(adsBkg,numBkgClips,volumeRange,segmentDuration,frameDuration,hopDuration,numBands);

XBkg = log10(XBkg + epsil);

I get the error above.

backgroundSpectrograms.m

% backgroundSpectrograms(ads,numBkgClips,volumeRange,segmentDuration,frameDuration,hopDuration,numBands)
% calculates numBkgClips spectrograms of background clips taken from the
% audio files in the |ads| datastore. Approximately the same number of
% clips is taken from each audio file. Before calculating spectrograms, the
% function rescales each audio clip with a factor sampled from a
% log-uniform distribution in the range given by volumeRange.
% segmentDuration is the total duration of the speech clips (in seconds),
% frameDuration the duration of each spectrogram frame, hopDuration the
% time shift between each spectrogram frame, and numBands the number of
% frequency bands.
function Xbkg = backgroundSpectrograms(ads,numBkgClips,volumeRange,segmentDuration,frameDuration,hopDuration,numBands)
disp("Computing background spectrograms...");
fs        = 16e3;
FFTLength = 512;
persistent filterBank
if isempty(filterBank)
    filterBank = designAuditoryFilterBank(fs,'FrequencyScale','bark',...
        'FFTLength',FFTLength,...
        'NumBands',numBands,...
        'FrequencyRange',[50,7000]);
end
logVolumeRange = log10(volumeRange);
numBkgFiles = numel(ads.Files);
numClipsPerFile = histcounts(1:numBkgClips,linspace(1,numBkgClips,numBkgFiles+1));
numHops = segmentDuration/hopDuration - 2;
Xbkg = zeros(numBands,numHops,1,numBkgClips,'single');
ind = 1;
for count = 1:numBkgFiles
    
    wave = read(ads);
    
    frameLength = frameDuration*fs;
    hopLength = hopDuration*fs;
    
    for j = 1:numClipsPerFile(count)
        indStart =  randi(numel(wave)-fs);
        logVolume = logVolumeRange(1) + diff(logVolumeRange)*rand;
        volume = 10^logVolume;
        x = wave(indStart:indStart+fs-1)*volume;
        x = max(min(x,1),-1);
        
        [~,~,~,spec] = spectrogram(x,hann(frameLength,'periodic'),frameLength - hopLength,FFTLength,'onesided');
        Xbkg(:,:,:,ind) = filterBank * spec;
        if mod(ind,1000)==0
            disp("Processed " + string(ind) + " background clips out of " + string(numBkgClips))
        end
        ind = ind + 1;
    end
end
disp("...done");
end
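
The two lines that spread clips across files and draw the volume factor can be run on their own to see what they produce. This is only a minimal sketch: numBkgFiles = 6 is an assumed value standing in for numel(ads.Files), and the other values are the ones given in the question.

```matlab
% Sketch only: numBkgFiles is an assumed stand-in for numel(ads.Files).
numBkgClips = 4000;
numBkgFiles = 6;
volumeRange = [1e-4,1];

% Split the clip indices 1..numBkgClips into numBkgFiles nearly equal bins.
numClipsPerFile = histcounts(1:numBkgClips,linspace(1,numBkgClips,numBkgFiles+1));

% Draw one volume factor from a log-uniform distribution over volumeRange.
logVolumeRange = log10(volumeRange);
logVolume = logVolumeRange(1) + diff(logVolumeRange)*rand;
volume = 10^logVolume;

fprintf('Clips per file: %s (total %d)\n',mat2str(numClipsPerFile),sum(numClipsPerFile));
fprintf('Sampled volume factor: %.2e\n',volume);
```

The per-file counts differ by at most one and add up to numBkgClips, which is why the loop index ind never runs past the last page of Xbkg.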

2 Comments

imtiaz waheed, 6 February 2020

numBkgClips = 4000;
volumeRange = [1e-4,1];
segmentDuration = 2;
hopDuration = 0.010;
numBands = 100;
frameDuration = 0.025;
FFTlength = 1024;
adsBkg = subset(ads,ads.Labels=='_background_noise_');
% ads is your datastore
XBkg = backgroundSpectrograms(adsBkg,numBkgClips,volumeRange,segmentDuration,frameDuration,hopDuration,numBands,FFTlength);

disp('Computing background spectrograms...');
logVolumeRange = log10(volumeRange);
numBkgFiles = numel(ads.Files);
numClipsPerFile = histcounts(1:numBkgClips,linspace(1,numBkgClips,numBkgFiles+1));
numHops = segmentDuration/hopDuration - 2;
Xbkg = zeros(numBands,numHops,1,numBkgClips,'single');
ind = 1;
for count = 1:numBkgFiles
    [wave,info] = read(ads);
    fs = info.SampleRate;
    frameLength = frameDuration*fs;
    hopLength = hopDuration*fs;
    for j = 1:numClipsPerFile(count)
        indStart = randi(numel(wave)-fs);
        logVolume = logVolumeRange(1) + diff(logVolumeRange)*rand;
        volume = 10^logVolume;
        x = wave(indStart:indStart+fs-1)*volume;
        x = max(min(x,1),-1);
        Xbkg(:,:,:,ind) = melSpectrogram(x,fs, ...
            'WindowLength',frameLength, ...
            'OverlapLength',frameLength - hopLength, ...
            'FFTLength',512, ...
            'NumBands',numBands, ...
            'FrequencyRange',[50,7000]);
        if mod(ind,1000)==0
            disp('Processed ' + string(ind) + ' background clips out of ' + string(numBkgClips))
        end
        ind = ind + 1;
    end
end
disp('...done');

imtiaz waheed, 6 February 2020

Can anyone please help me with this?


Answer 1

jibrahim, 7 January 2020


https://jp.mathworks.com/matlabcentral/answers/499146-unable-to-perform-assignment-because-the-size-of-the-left-side-is-100-by-198-and-the-size-of-the-rig#answer_409065


backgroundSpectrograms.m

Hi Barb,

There are two problems:

1) Since you asked for 100 bands in the auditory filter bank, the hard-coded FFT length (512) is too small. 1024 should work.

2) The code hard-codes the expected segment duration to 1 second (by using fs here: x = wave(indStart:indStart+fs-1)*volume;).

I modified and attached the code. This should run now:

numBkgClips = 4000;
volumeRange = [1e-4,1];
segmentDuration= 2;
hopDuration = 0.010;
numBands = 100;
frameDuration = 0.025;
FFTlength = 1024;
adsBkg = subset(ads,ads.Labels=="_background_noise_");
% ads is your datastore
XBkg = backgroundSpectrograms(adsBkg,numBkgClips,volumeRange,segmentDuration,frameDuration,hopDuration,numBands,FFTlength);
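
The 198 versus 98 in the error message can be reproduced with plain arithmetic. The sketch below assumes the column-count rule documented for spectrogram, floor((numSamples - overlapLength)/(windowLength - overlapLength)), and uses the values from the question. numFrames is a helper name introduced here for illustration only.

```matlab
fs = 16e3;
segmentDuration = 2;
frameDuration = 0.025;
hopDuration = 0.010;

frameLength = round(frameDuration*fs);      % window length in samples
hopLength = round(hopDuration*fs);          % hop in samples
overlapLength = frameLength - hopLength;    % overlap passed to spectrogram

% Illustrative helper: number of spectrogram columns for a clip of numSamples.
numFrames = @(numSamples) floor((numSamples - overlapLength)/hopLength);

colsOneSecond = numFrames(fs);                            % clip cut with indStart+fs-1
colsSegment = numFrames(round(segmentDuration*fs));       % clip cut with the full segment
numHops = round(segmentDuration/hopDuration) - 2;         % width preallocated for Xbkg

fprintf('1-second clip: %d columns\n',colsOneSecond);
fprintf('%g-second clip: %d columns\n',segmentDuration,colsSegment);
fprintf('Preallocated width: %d columns\n',numHops);
```

With these values the 1-second clip yields 98 columns while Xbkg is preallocated with 198, so one way to make both sides agree is to cut clips of segmentDuration*fs samples rather than fs samples.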

5 Comments (3 older comments not shown)

Barb, 16 January 2020

With your backgroundSpectrograms I get an error when I run this part of the code:

doTraining = true;
if doTraining
    trainedNet = trainNetwork(augimdsTrain,layers,options);
else
    load('commandNet.mat','trainedNet');
end

The error:

Error using trainNetwork (line 170)
Invalid validation data. The output size (3) of the last layer does not match the number of classes (4).

jibrahim, 16 January 2020

Make sure that the argument to the fullyConnectedLayer that precedes the softmaxLayer is equal to the number of classes you are trying to classify. It seems you have 4 classes, but you are using fullyConnectedLayer(3). If you do have only 3 classes, then the categorical validation array you are supplying may have an unused category. You can remove it using removecats:

YValidation = removecats(YValidation);
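
A small sketch of what removecats does in this situation, using made-up labels: the value set below declares four categories but the data uses only three, which is one way a 3-class network can end up facing a 4-class validation array.

```matlab
% Hypothetical labels: "unknown" is declared as a category but never used.
YValidation = categorical(["yes";"no";"yes";"background"], ...
    ["yes","no","background","unknown"]);
numBefore = numel(categories(YValidation));   % counts declared categories

YValidation = removecats(YValidation);        % drop categories with no elements
numAfter = numel(categories(YValidation));

fprintf('Categories before: %d, after: %d\n',numBefore,numAfter);
```

Counting the categories this way, rather than typing the number by hand, is also a convenient source for the fullyConnectedLayer size.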


Answer 2

Barb, 22 January 2020


https://jp.mathworks.com/matlabcentral/answers/499146-unable-to-perform-assignment-because-the-size-of-the-left-side-is-100-by-198-and-the-size-of-the-rig#answer_411439


OK, training now works, but I don't know how to fix the errors I get when I run this part of the code:

h = figure('Units','normalized','Position',[0.2 0.1 0.6 0.8]);
filterBank = designAuditoryFilterBank(fs,'FrequencyScale','bark',...
    'FFTLength',1024,...
    'NumBands',numBands,...
    'FrequencyRange',[50,7000]);
while ishandle(h)
    % Extract audio samples from the audio device and add the samples to
    % the buffer.
    x = audioIn();
    waveBuffer(1:end-numel(x)) = waveBuffer(numel(x)+1:end);
    waveBuffer(end-numel(x)+1:end) = x;
    % Compute the spectrogram of the latest audio samples.
    [~,~,~,spec] =  spectrogram(waveBuffer,hann(frameLength,'periodic'),frameLength - hopLength,512,'onesided');
    spec = filterBank * spec;
    spec = log10(spec + epsil);
    % Classify the current spectrogram, save the label to the label buffer,
    % and save the predicted probabilities to the probability buffer.
    [YPredicted,probs] = classify(trainedNet,spec,'ExecutionEnvironment','cpu');
    YBuffer(1:end-1)= YBuffer(2:end);
    YBuffer(end) = YPredicted;
    probBuffer(:,1:end-1) = probBuffer(:,2:end);
    probBuffer(:,end) = probs';
    % Plot the current waveform and spectrogram.
    subplot(2,1,1);
    plot(waveBuffer)
    axis tight
    ylim([-0.2,0.2])
    subplot(2,1,2)
    pcolor(spec)
    caxis([specMin+2 specMax])
    shading flat
   
    [YMode,count] = mode(YBuffer);
    countThreshold = ceil(classificationRate*0.2);
    maxProb = max(probBuffer(labels == YMode,:));
    probThreshold = 0.7;
    subplot(2,1,1);
    if YMode == "background" || count<countThreshold || maxProb < probThreshold
        title(" ")
    else
        title(string(YMode),'FontSize',20)
    end
    drawnow
end

Errors:

Error using DAGNetwork/calculatePredict>predictBatch (line 151)
Incorrect input size. The input images must have a size of [100 198 1].
Error in DAGNetwork/calculatePredict (line 17)
Y = predictBatch( ...
Error in DAGNetwork/classify (line 134)
scores = this.calculatePredict( ...
Error in SeriesNetwork/classify (line 502)
[labels, scores] = this.UnderlyingDAGNetwork.classify(X, varargin{:});
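
Independently of the input-size error, the snippet as posted designs the filter bank with 'FFTLength',1024 but calls spectrogram with an FFT length of 512. For a one-sided spectrum with an even FFT length there are FFTLength/2 + 1 frequency bins, and the product filterBank * spec is only defined when the filter bank's column count equals the spectrogram's row count. A quick check, with oneSidedBins as an illustrative helper that is not part of the original code:

```matlab
fs = 16e3;
oneSidedBins = @(fftLength) fftLength/2 + 1;   % valid for even FFT lengths

filterBankCols = oneSidedBins(1024);   % columns of a filter bank designed with FFTLength 1024
specRows = oneSidedBins(512);          % rows of a one-sided spectrogram with nfft = 512

fprintf('Filter bank columns: %d, spectrogram rows: %d\n',filterBankCols,specRows);
fprintf('Bin spacing: %.3f Hz (512) vs %.3f Hz (1024)\n',fs/512,fs/1024);

if filterBankCols ~= specRows
    disp('FFT lengths differ, so filterBank * spec would not be conformable.');
end
```

Keeping the FFT length in a single variable and passing it to both designAuditoryFilterBank and spectrogram avoids this kind of drift.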

1 Comment

jibrahim, 23 January 2020

Make sure the size of the image going into your network matches the image size you used in training:

[YPredicted,probs] = classify(trainedNet,spec,'ExecutionEnvironment','cpu');

It looks like the size of spec is not [100 198 1].

I remember you were generating spectrograms based on 2-second segments. Make sure waveBuffer really holds 2 seconds of audio. I think the original demo uses one second, so you might have to slightly change these three lines of code:

x = audioIn();

waveBuffer(1:end-numel(x)) = waveBuffer(numel(x)+1:end);

waveBuffer(end-numel(x)+1:end) = x;
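
As a rough sketch of what matters here: the three update lines work for any buffer length, so the thing to check is how waveBuffer is sized before the loop. The example below sizes it from segmentDuration. The audio frame is a dummy vector standing in for audioIn(), and its 512-sample length is an assumption.

```matlab
fs = 16e3;
segmentDuration = 2;        % seconds, as used for training
frameDuration = 0.025;
hopDuration = 0.010;

% Size the rolling buffer to hold exactly one training-length segment.
waveBuffer = zeros(round(segmentDuration*fs),1);

% Dummy frame standing in for x = audioIn(); 512 samples is an assumed size.
x = ones(512,1);
waveBuffer(1:end-numel(x)) = waveBuffer(numel(x)+1:end);   % shift old samples left
waveBuffer(end-numel(x)+1:end) = x;                        % append the newest samples

% Number of spectrogram columns this buffer produces with the question's settings.
frameLength = round(frameDuration*fs);
hopLength = round(hopDuration*fs);
numCols = floor((numel(waveBuffer) - (frameLength - hopLength))/hopLength);
fprintf('Buffer length: %d samples, spectrogram columns: %d\n',numel(waveBuffer),numCols);
```

If the column count printed here matches the second dimension of the network's input size, the spectrogram passed to classify should have the width the network was trained on.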
