The gradient of mini batches

Question

MAHSA YOUSEFI on 23 Nov 2020

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/658543-the-gradient-of-mini-batches

Commented: Mahesh Taparia on 21 Dec 2020

Accepted Answer: Mahesh Taparia

Hi there.

I would like you to confirm or reject my understanding of the following.

In the following code, if the mini-batch size is h,

[grad,loss] = dlfeval(@modelGradients,dlnet,dlX_miniBatch,Y_miniBatch);

is grad the average of the gradients of the loss over the h samples? That is, does it compute the gradients automatically and return

$$\text{grad} = \frac{1}{h} \sum_{i=1}^{h} \nabla \operatorname{loss}(y_i, \hat{y}_i)\ ?$$

Following on from this, to compute the total loss and gradient for the full batch, should we take the average of the losses and the average of the gradients over the number of mini-batches (say, 1000 mini-batches, each of size h)?


Answer 1

Mahesh Taparia on 14 Dec 2020

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/658543-the-gradient-of-mini-batches#answer_575280

Hi

The function dlfeval evaluates custom deep learning models. The loss is calculated according to what is defined in the modelGradients function, so if you compute the average loss in that function, dlfeval returns the averaged loss. For example, consider this modelGradients function: it calculates the average cross-entropy loss, so it returns the average loss. The gradients are calculated with respect to the loss function defined for the network.
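The second part of the question (averaging over mini-batches) can be worked through as a short derivation. This is a sketch that assumes the loss returned by modelGradients is the mean over the h observations of a mini-batch, that the N observations are split into K equal-sized mini-batches, and that every gradient is evaluated at the same parameter values. The symbols θ, B_b, K, N and L_b are introduced here for illustration and do not appear in the thread.

```latex
\begin{aligned}
L_b(\theta) &= \frac{1}{h} \sum_{i \in B_b} \operatorname{loss}\bigl(y_i, \hat{y}_i(\theta)\bigr),
&\quad
\nabla_\theta L_b(\theta) &= \frac{1}{h} \sum_{i \in B_b} \nabla_\theta \operatorname{loss}\bigl(y_i, \hat{y}_i(\theta)\bigr), \\[4pt]
L(\theta) &= \frac{1}{N} \sum_{i=1}^{N} \operatorname{loss}\bigl(y_i, \hat{y}_i(\theta)\bigr)
           = \frac{1}{K} \sum_{b=1}^{K} L_b(\theta),
&\quad
\nabla_\theta L(\theta) &= \frac{1}{K} \sum_{b=1}^{K} \nabla_\theta L_b(\theta),
\qquad N = K h.
\end{aligned}
```

The first line follows from linearity of the gradient: the gradient of a mean loss is the mean of the per-sample gradients. If the mini-batches have unequal sizes h_b, the plain average should be replaced by a weighted one with weights h_b / N. Note also that during training the parameters change after each mini-batch update, so gradients collected over an epoch are evaluated at different θ, and their average is then not the full-batch gradient at any single point.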

2 Comments

MAHSA YOUSEFI on 19 Dec 2020

In the example you mentioned, there is a mistake.

function [gradients, loss] = modelGradients(parameters, dlX, T)
    % Forward data through the model function.
    dlY = model(parameters,dlX);
    % Compute loss.
    loss = crossentropy(dlX,T);
    % Compute gradients.
    gradients = dlgradient(loss,parameters);
end

dlY, not dlX, must be passed to crossentropy!

Mahesh Taparia on 21 Dec 2020

Yes, the cross-entropy loss should be calculated between dlY and T. The documentation page will be updated.
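For reference, the fix in the quoted function is the single line `loss = crossentropy(dlY,T);`. The script below is a minimal sketch of the corrected pattern together with a numerical check of the averaging question. It makes several assumptions that are not from the thread: a toy linear model, a hand-written mean squared-error loss in place of crossentropy (to avoid depending on data-format settings), random data, and equal-sized mini-batches all evaluated at the same parameters. It requires Deep Learning Toolbox.

```matlab
% Toy check: the average of mini-batch gradients vs. the full-batch gradient.
% The model, loss, sizes and data below are hypothetical, for illustration only.
rng(0);                                   % reproducible random data
numFeatures = 3;
numOutputs  = 2;
h           = 4;                          % mini-batch size
numBatches  = 5;                          % equal-sized mini-batches
N           = h * numBatches;             % total number of observations

X = randn(numFeatures, N);                % columns are observations
T = randn(numOutputs, N);                 % targets

parameters.W = dlarray(randn(numOutputs, numFeatures));
parameters.b = dlarray(zeros(numOutputs, 1));

% Full-batch loss and gradient.
[gFull, lossFull] = dlfeval(@modelGradients, parameters, dlarray(X), T);

% Average the mini-batch losses and gradients (parameters are NOT updated).
lossAvg = 0;
gAvgW   = zeros(numOutputs, numFeatures);
for k = 1:numBatches
    idx = (k-1)*h + (1:h);                % columns belonging to mini-batch k
    [g, loss] = dlfeval(@modelGradients, parameters, dlarray(X(:,idx)), T(:,idx));
    lossAvg = lossAvg + extractdata(loss) / numBatches;
    gAvgW   = gAvgW   + extractdata(g.W)  / numBatches;
end

% Both differences should be at the level of floating-point round-off.
disp(abs(extractdata(lossFull) - lossAvg));
disp(max(abs(extractdata(gFull.W) - gAvgW), [], 'all'));

function [gradients, loss] = modelGradients(parameters, dlX, T)
    % Forward data through the model function.
    dlY = model(parameters, dlX);
    % Compute the loss from the predictions dlY (not the inputs dlX),
    % averaged over the observations in the batch.
    loss = mean(sum((dlY - T).^2, 1), 2);
    % Compute gradients of the averaged loss with respect to the parameters.
    gradients = dlgradient(loss, parameters);
end

function dlY = model(parameters, dlX)
    % Hypothetical linear model: one column of dlY per observation.
    dlY = parameters.W * dlX + parameters.b;
end
```

Because the loss inside modelGradients is a mean over the batch, the gradient that dlgradient returns is the mean of the per-sample gradients, and averaging over equal-sized mini-batches reproduces the full-batch values. In releases before R2024a, local functions in a script must appear at the end of the file, as they do here.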
