MATLAB Answers

Why is my neural network generating negative weights?

10 views (last 30 days)
Sebastián Quiroga Reyes on 25 Jul 2021
Edited: Sebastián Quiroga Reyes on 25 Jul 2021
Hi, I'm using reinforcement learning on a control problem. My goal is to find optimal values for the PID gains, and while searching I found this MATLAB example: https://la.mathworks.com/help/reinforcement-learning/ug/tune-pi-controller-using-td3.html .
In that example, the actor uses a custom layer called "fullyConnectedPILayer". Its description says:
"Gradient descent optimization can drive the weights to negative values. To avoid negative weights, replace normal fullyConnectedLayer with a fullyConnectedPILayer. This layer ensures that the weights are positive by implementing the function Y=abs(WEIGHTS)∗X. This layer is defined in fullyConnectedPILayer.m."
So the two weights are always supposed to be positive, but after training, my actor network has one negative weight (Ki = -0.0057) and one positive weight (Kp = 0.0455). The same example also says:
"The integral and proportional gains of the PI controller are the absolute weights of the actor representation. To obtain the weights, first extract the learnable parameters from the actor."
The example then uses the abs function to get the weights, so using the custom layer "fullyConnectedPILayer" doesn't seem to make any sense to me, because the actor network can still end up with negative weights.
The code of the layer is as follows:
classdef fullyConnectedPILayer < nnet.layer.Layer
    properties (Learnable)
        Weights
    end
    methods
        function obj = fullyConnectedPILayer(Weights,Name)
            % Set layer name
            obj.Name = Name;
            % Set layer description
            obj.Description = "fullyConnectedNonNegWeightLayer";
            % Set layer weights
            obj.Weights = Weights;
        end
        function Z = predict(obj, X)
            Z = fullyconnect(X, abs(obj.Weights), 0, 'DataFormat','CB');
        end
    end
end
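To make the layer's behavior concrete, here is a minimal sketch in plain MATLAB (no toolbox calls) of what predict appears to compute. It assumes the first weight corresponds to Ki and the second to Kp, and the observation vector x is a made-up value for illustration only. The point it illustrates is that abs is applied when the output is computed, while the stored Weights property itself is not modified.

```matlab
% Stored (raw) weights, using the values reported in this question.
% Assumption: element 1 corresponds to Ki and element 2 to Kp.
storedWeights = single([-0.0057 0.0455]);

% Hypothetical observation vector (illustrative values only).
x = single([0.5; 2.0]);

% What predict computes: abs is applied on the fly,
% the stored weights are left untouched.
effectiveWeights = abs(storedWeights);
z = effectiveWeights * x;

% The gains the controller would effectively use.
Ki = effectiveWeights(1);
Kp = effectiveWeights(2);

fprintf('stored    = [%g %g]\n', storedWeights);
fprintf('effective = [%g %g]\n', effectiveWeights);
fprintf('output z  = %g\n', z);
```

Under this reading, a negative stored value and a non-negative effective gain can coexist, which would be consistent with the example taking abs of the extracted parameters.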
The code for my actor network is exactly the same as the example:
initialGain = single([1e-3 2]);
actorNetwork = [
    featureInputLayer(numObservations,'Normalization','none','Name','state')
    fullyConnectedPILayer(initialGain, 'Action')];
actorOptions = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'state'},'Action',{'Action'},actorOptions);
I don't know why it generates a negative weight if I'm using this custom layer.
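One possible explanation, sketched under simplifying assumptions: the optimizer updates the raw stored weight, and nothing prevents an update step from carrying it across zero. The toy example below is not the TD3 update. It uses a single weight, a hypothetical input x, a hypothetical target, and a plain squared-error loss with one gradient-descent step, only to show how a stored weight can become negative even though the layer output uses abs(w).

```matlab
% Toy illustration: one gradient-descent step on y = abs(w) * x.
% All values are hypothetical and chosen only to show the effect.
w      = 1e-3;   % raw stored weight, starts positive
x      = 10;     % hypothetical input
target = 0;      % hypothetical target output
lr     = 0.1;    % hypothetical learning rate

% Forward pass, as in the layer: abs is applied to the weight.
y = abs(w) * x;

% Loss 0.5*(y - target)^2, so dLoss/dw = (y - target) * sign(w) * x.
grad = (y - target) * sign(w) * x;

% The update acts on the raw weight, not on abs(w).
wNew = w - lr * grad;

fprintf('stored weight after step: %g\n', wNew);
fprintf('effective gain abs(w):    %g\n', abs(wNew));
```

In this sketch the stored weight ends up below zero after the step, while abs(wNew) is still a valid non-negative gain.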

Answers (0)
