How to get the actor network of a trained policy gradient agent?
3 views (last 30 days)
Margarita Cabrera
11 July 2022
Commented: Margarita Cabrera, 18 July 2022
I have trained a policy gradient agent using the following MATLAB code:
net = [ featureInputLayer(2,'Normalization','none', ...
'Name','state')
fullyConnectedLayer(4,'Name','fc')
softmaxLayer('Name','actionProb') ];
actorOpts = rlRepresentationOptions('LearnRate',1e-2,'GradientThreshold',1);
actorCl = rlStochasticActorRepresentation(net,obsInfo,actInfo,'Observation',{'state'},actorOpts);
agentCl = rlPGAgent(actorCl)
opt = rlTrainingOptions(...
'MaxEpisodes',MaxEpi,...
'MaxStepsPerEpisode',100,...
'StopTrainingCriteria',"AverageReward",...
'StopTrainingValue',-5);
trainStats = train(agentCl,envCl,opt);
where envCl contains my custom environment, and obsInfo and actInfo have been previously generated as:
obsInfo = rlNumericSpec([1 2]);
obsInfo.Name = 'Observation';
obsInfo.Description = 'Bidimensional State';
obsInfo.LowerLimit = [1 1];
obsInfo.UpperLimit = [envClConst.Nrows envClConst.Ncolumns];
elements = 1:envClConst.Na; % Na, the number of actions, is 4 in this case
actInfo = rlFiniteSetSpec(elements);
actInfo.Name = 'actions';
actInfo.Description = 'different movements';
So the actions are discrete; the action set is [1 2 3 4].
Once the agent has been trained, I can generate episodes with very reasonable performance. The action in each state is generated randomly, but most of the time the generated action is the optimal one given the state.
My question is: how can I obtain the trained actor network? That is, how can I obtain the weights of the fully connected layer and of the four-output softmax layer of the actor? I'm interested in this network in order to know what the probabilities of each of the four actions are, given a state.
I have tried this:
actor = getActor(agentCl)
actor =
rlDiscreteCategoricalActor with properties:
Distribution: [1×1 rl.distribution.rlDiscreteGaussianDistribution]
ObservationInfo: [1×1 rl.util.rlNumericSpec]
ActionInfo: [1×1 rl.util.rlFiniteSetSpec]
UseDevice: "cpu"
but the Distribution field is a built-in object, so I cannot read the network weights or the action probabilities from it directly.
Thank you very much for your help.
Best regards.
Marga Cabrera
0 comments
Accepted Answer
Emmanouil Tzorakoleftherakis
18 July 2022
Hello,
To get the neural network model you can use
net = getModel(getActor(agent))
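The returned net is a dlnetwork object, so you can also evaluate it directly to get the action probabilities for a given observation. A minimal sketch, assuming the agentCl and observation spec from the question (the 'CB' dlarray format for the 2-element state is an assumption):

```matlab
% Extract the underlying dlnetwork from the trained actor
actorNet = getModel(getActor(agentCl));

% Evaluate it for an example observation, e.g. state [1 1]
% ('CB' = channel x batch; 2 channels for the bidimensional state)
obs = dlarray([1; 1], 'CB');
p = predict(actorNet, obs);   % softmax output: probability of each of the 4 actions
```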
To get learnable parameters you can use
getLearnableParameters(getActor(agent))
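Since the actor here is just a fully connected layer followed by a softmax, those learnable parameters are enough to reproduce the probabilities by hand, which is a useful cross-check. A sketch under the assumption that the parameters come back in layer order (weights first, then biases; variable names are illustrative):

```matlab
actor = getActor(agentCl);
params = getLearnableParameters(actor);
W = params{1};   % 4x2 weights of the 'fc' layer
b = params{2};   % 4x1 biases

obs = [1; 1];                     % example state
z = double(W)*obs + double(b);    % pre-softmax scores
p = exp(z) / sum(exp(z));         % softmax: probabilities of the 4 actions
```

These values should agree with the probabilityParameters output mentioned below.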
Note that this will give you the weights and biases, not the probabilities. For the probabilities of a discrete categorical actor, assuming for simplicity that the observation is [0;0], you can use this function (which is not documented):
p = probabilityParameters(getActor(agent),{[0;0]})
Hope that helps
More Answers (0)