evaluatePolicy.m output differs from Agent action


Victor Bayer on 11 Jul 2021


Edited: Victor Bayer on 22 Sep 2021

Dear Mathworks Team,

I have a MATLAB-defined RL setup with a DDPG agent that receives two observations and can choose action values between 0 and 1.

This is defined through:

numAct = 1;

actionInfo = rlNumericSpec([numAct 1],'LowerLimit',0 ,'UpperLimit', 1);

actionInfo.Name = 'sine_amplitude';

During training, only values between 0 and 1 are applied. The action is clipped at those limits, so the actionInfo is respected.

However, when I use the trained agent to generate a policy function as described in the MATLAB documentation (see https://www.mathworks.com/help/reinforcement-learning/ref/rl.agent.rldqnagent.generatepolicyfunction.html) and then evaluate that function, I also receive negative values.

For example

evaluatePolicy(reshape([-0.1515581,1],2,1,1))

returns -1

I use reshape to bring the data [-0.1515581,1] into shape (2,1,1), since that is the input shape the function expects.
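As a side note on the reshape call above: MATLAB drops trailing singleton dimensions, so reshaping to (2,1,1) should give the same 2-by-1 column vector as the colon operator. A minimal check using only base MATLAB (the variable names obsRow, obsReshaped and obsColumn are illustrative, not from the original post):

```matlab
% Observation values taken from the example call in the question.
obsRow = [-0.1515581, 1];

% Reshape as done in the question: 2 rows, 1 column, 1 page.
obsReshaped = reshape(obsRow, 2, 1, 1);

% Equivalent shorthand: the colon operator produces a column vector.
obsColumn = obsRow(:);

% Trailing singleton dimensions are not reported by size.
disp(size(obsReshaped));

% Both forms hold the same values in the same layout.
disp(isequal(obsReshaped, obsColumn));
```

This suggests the input shape itself is unlikely to be the cause of the negative output.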

My question is: why does this happen, and how can I change the generation of the evaluatePolicy function so that its output respects the action limits?
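One possible workaround, sketched under assumptions: if the generated function evaluates only the actor network and does not apply the limits from actionInfo (this is not confirmed anywhere in this thread), the output can be saturated manually after the call. In the sketch below, rawPolicy is a hypothetical stand-in that simply reproduces the -1 from the example above; in practice it would be replaced by @evaluatePolicy. The names clipAction and boundedPolicy are illustrative as well.

```matlab
% Limits copied from the actionInfo definition in the question.
lowerLimit = 0;
upperLimit = 1;

% Hypothetical stand-in for the generated policy function. It mimics the
% out-of-range value reported in the question. Replace this with
% @evaluatePolicy when the generated file is on the MATLAB path.
rawPolicy = @(obs) -1;

% Saturate an action to [lowerLimit, upperLimit].
clipAction = @(a) min(max(a, lowerLimit), upperLimit);

% Wrapped policy: evaluate the raw policy, then clip the result.
boundedPolicy = @(obs) clipAction(rawPolicy(obs));

% Same observation as in the question.
obs = reshape([-0.1515581, 1], 2, 1, 1);
action = boundedPolicy(obs);
disp(action);
```

Clipping only enforces the limits. It does not explain why the raw output leaves the range, and it may not reproduce exactly what the agent does during training.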

Answers (0)
