Why is dlgradient giving different answers?

Question

Vellapandi M Research Scholar 2023 年 12 月 18 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2061577-why-is-dlgradient-giving-different-answers

回答済み: Angelo Yeo 2023 年 12 月 18 日

When I use the dlgradient function to compute the gradient of the expression (Parameters.fc2.Weights * tanh(Parameters.fc1.Weights * y(:,1) + Parameters.fc1.Bias) + Parameters.fc2.Bias) with respect to Parameters.fc2.Bias, it yields varying results instead of a consistent value of 1. According to theoretical calculations, it should be 1, but for different values of y(:,i), I observe discrepancies. What might be the issue?

Parameters = struct;
stateSize = 1;
hiddenSize = 20;
Parameters.fc1 = struct;
sz_fc1 = [hiddenSize stateSize];
Parameters.fc1.Weights = initializeGlorot(sz_fc1, hiddenSize, stateSize);
Parameters.fc1.Bias = initializeZeros([hiddenSize 1]);
Parameters.fc2 = struct;
sz_fc2 = [stateSize hiddenSize];
Parameters.fc2.Weights = initializeGlorot(sz_fc2, stateSize, hiddenSize);
Parameters.fc2.Bias = initializeZeros([stateSize 1]);
y(:,1) = 1;
y(:,2) = 0.976;
gradient1.fc2.Bias = dlgradient(Parameters.fc2.Weights * (tanh(Parameters.fc1.Weights * y(:,1) + Parameters.fc1.Bias)) + Parameters.fc2.Bias, Parameters.fc2.Bias)
gradient2.fc2.Bias = dlgradient(Parameters.fc2.Weights * (tanh(Parameters.fc1.Weights * y(:,2) + Parameters.fc1.Bias)) + Parameters.fc2.Bias, Parameters.fc2.Bias)

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

Matt J 2023 年 12 月 18 日

Attach Parameters and y in a .mat file so we can test your code.

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Angelo Yeo 2023 年 12 月 18 日

1
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2061577-why-is-dlgradient-giving-different-answers#answer_1373237

MATLAB Online で開く

You can try to incorporate dlfeval when using dlgradient. You can get the results of 1's as expected.

Parameters = struct;
stateSize = 1;
hiddenSize = 20;
Parameters.fc1 = struct;
sz_fc1 = [hiddenSize stateSize];
Parameters.fc1.Weights = initializeGlorot(sz_fc1, hiddenSize, stateSize);
Parameters.fc1.Bias = initializeZeros([hiddenSize 1]);
Parameters.fc2 = struct;
sz_fc2 = [stateSize hiddenSize];
Parameters.fc2.Weights = initializeGlorot(sz_fc2, stateSize, hiddenSize);
Parameters.fc2.Bias = initializeZeros([stateSize 1]);
y(:,1) = 1;
y(:,2) = 0.976;
[res1, res2] = dlfeval(@gradFun, Parameters, y)
res1 = 
  1×1 single dlarray

     1
res2 = 
  1×1 single dlarray

     1
function [res1, res2] = gradFun(Parameters, y)
res1 = dlgradient(Parameters.fc2.Weights * (tanh(Parameters.fc1.Weights * y(:,1) + Parameters.fc1.Bias)) + Parameters.fc2.Bias, Parameters.fc2.Bias);
res2 = dlgradient(Parameters.fc2.Weights * (tanh(Parameters.fc1.Weights * y(:,2) + Parameters.fc1.Bias)) + Parameters.fc2.Bias, Parameters.fc2.Bias);
end
function weights = initializeGlorot(sz,numOut,numIn)
Z = 2*rand(sz,'single') - 1;
bound = sqrt(6 / (numIn + numOut));
weights = bound * Z;
weights = dlarray(weights);
end
function parameter = initializeZeros(sz)
parameter = zeros(sz,'single');
parameter = dlarray(parameter);
end

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

サインインしてコメントする。

Why is dlgradient giving different answers?

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

採用された回答

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

その他の回答 (0 件)

参考

カテゴリ

タグ

製品

Community Treasure Hunt

Why is dlgradient giving different answers?

1 件のコメント -1 件の古いコメントを表示-1 件の古いコメントを非表示

採用された回答

0 件のコメント -2 件の古いコメントを表示-2 件の古いコメントを非表示

その他の回答 (0 件)

参考

カテゴリ

タグ

製品

Community Treasure Hunt

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示