MATLAB Answers

Do the batchnorm() inputs trainedMean and trainedVar have no effect on the result?

7 views (last 30 days)
cui on 12 Jul 2020
Commented: cui on 5 Jul 2021
Why does batchnorm() output the same result for random mean and variance inputs (dlY is always the same)?
height = 4;
width = 4;
channels = 3;
observations = 1;
X = rand(height,width,channels,observations);
dlX = dlarray(X,'SSCB');
offset = zeros(channels,1);
scaleFactor = ones(channels,1);
[dlY,mu,sigmaSq] = batchnorm(dlX,offset,scaleFactor)
useMean = rand(channels,1);
useVar = rand(channels,1);
[dlY,mu,sigmaSq] = batchnorm(dlX,offset,scaleFactor,useMean,useVar) % dlY is always the same ???

Accepted Answer

Katja Mogalle on 30 Jun 2021
Hello cui,
If I understand correctly, you're wondering why the normalized data returned by batchnorm is the same regardless of whether you specify mean (mu) and variance (sigmaSq) values as inputs.
There are basically two modes in which batchnorm is used in deep learning: training mode and inference mode.
Training mode
During training mode, mean and variance are computed directly from the current input data (aka "minibatch") and are used to normalize that minibatch of data. During training, several different minibatches of data are being processed and we're trying to compute running values for the mean and variance statistics so that we have approximate statistics for the entire data set.
For training mode, you can make use of the following two syntaxes:
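These two training-mode syntaxes look approximately like the following (taken from the batchnorm reference page; the output variable names are illustrative):

[dlY,popMu,popSigmaSq] = batchnorm(dlX,offset,scaleFactor)                                  % 3 inputs: statistics computed from dlX itself
[dlY,updatedMu,updatedSigmaSq] = batchnorm(dlX,offset,scaleFactor,runningMu,runningSigmaSq) % 5 inputs, 3 outputs

In both cases the data is normalized using the mean and variance of the current minibatch. The 5-input, 3-output form uses runningMu and runningSigmaSq only to compute the updated running statistics that it returns, not to normalize dlX.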
Inference mode
During inference mode, we want to normalize each minibatch in exactly the same way, using the same mu and sigmaSq, namely the statistics of the entire training data set.
For inference mode, you can make use of this syntax:
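That inference-mode syntax is, approximately (trainedMu and trainedSigmaSq being the data-set statistics saved from training):

dlY = batchnorm(dlX,offset,scaleFactor,trainedMu,trainedSigmaSq) % 5 inputs, 1 output

Because only one output is requested, batchnorm treats the supplied mean and variance as fixed population statistics and uses them directly to normalize dlX.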
In conclusion, I suspect you wanted to try the inference-mode syntax (5 input arguments, one output argument) instead of the second training-mode syntax mentioned above (5 input arguments, 3 output arguments).
I hope this helps.
1 Comment
cui on 5 Jul 2021
Thank you for your answer! Based on your explanation, here is a little more verification of the example:
%% test a sample
height = 4;
width = 4;
channels = 3;
observations = 1;
X = rand(height,width,channels,observations);
dlX = dlarray(X,'SSCB');
offset = zeros(channels,1);
scaleFactor = ones(channels,1);
[dlY1,mu1,sigmaSq1] = batchnorm(dlX,offset,scaleFactor);
%% Manual calculation
cal_mu1 = mean(dlX,[1,2,4]);
cal_sigmaSq1 = var(dlX,1,[1,2,4]);
cal_Y1 = (dlX - cal_mu1)./sqrt(cal_sigmaSq1);
% validate equality
eps = 10.^(-3);
assert(all(abs(mu1-squeeze(cal_mu1))<eps));
assert(all(abs(sigmaSq1-squeeze(cal_sigmaSq1))<eps));
assert(all(abs(dlY1-cal_Y1)<eps,'all'));
%% 5 inputs, 3 outputs
useMean = rand(channels,1);
useVar = rand(channels,1);
[dlY2,mu2,sigmaSq2] = batchnorm(dlX,offset,scaleFactor,useMean,useVar); % dlY2 equals dlY1: training mode, "useMean" is not used to normalize!
[dlY3,mu3,sigmaSq3] = batchnorm(dlX,offset,scaleFactor,useMean,useVar); % dlY3 equals dlY1: training mode, "useMean" is not used to normalize!
dlY4 = batchnorm(dlX,offset,scaleFactor,useMean,useVar); % dlY4 differs from dlY2 and dlY3: inference mode, "useMean" is used to normalize!
%% Manual calculation
decay = 0.1; % Same as the official default
cal_mu2 = decay*mean(dlX,[1,2,4])+(1-decay)*reshape(useMean,[1,1,channels,1]);
cal_sigmaSq2 = decay*var(dlX,1,[1,2,4])+(1-decay)*reshape(useVar,[1,1,channels,1]);
cal_Y2 = (dlX - mean(dlX,[1,2,4]))./sqrt(var(dlX,1,[1,2,4])); % does not use "cal_mu2" and "cal_sigmaSq2"!
% validate equality
eps = 10.^(-3);
assert(all(abs(mu2-squeeze(cal_mu2))<eps));
assert(all(abs(sigmaSq2-squeeze(cal_sigmaSq2))<eps));
assert(all(abs(dlY2-cal_Y2)<eps,'all'));
% validate dlY4
cal_Y4 = (dlX - reshape(useMean,[1,1,channels,1]))./sqrt(reshape(useVar,[1,1,channels,1]));
assert(all(abs(dlY4-cal_Y4)<eps,'all'));
Validation passed!


More Answers (0)

Release

R2020a
