Reaching observation data and pass them to the learning process

Question

Esan freedom, 21 March 2024

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/2097231-reaching-observation-data-and-pass-them-to-the-learning-process

Commented: Esan freedom, 24 March 2024

Hi everyone,

Is it possible to access the observation data during training and choose the allowed actions at every time step based on the first element of the current observation?

What I need is something like this:

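obsInfo = rlNumericSpec([8 1]);
% Pseudocode: obsInfo.input(1) stands for the first observed value at the current time step
if obsInfo.input(1) < 0
    % Negative first observation: allow only non-positive actions
    actInfo = rlNumericSpec([4 1], ...
        LowerLimit=[-inf -inf -inf -inf]', ...
        UpperLimit=[0 0 0 0]');
elseif obsInfo.input(1) > 0
    % Positive first observation: allow only non-negative actions
    actInfo = rlNumericSpec([4 1], ...
        LowerLimit=[0 0 0 0]', ...
        UpperLimit=[inf inf inf inf]');
end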

1 comment

Esan freedom, 21 March 2024

@Emmanouil Tzorakoleftherakis

Answer 1

Emmanouil Tzorakoleftherakis, 21 March 2024

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/2097231-reaching-observation-data-and-pass-them-to-the-learning-process#answer_1428981

Edited: Emmanouil Tzorakoleftherakis, 21 March 2024

In general, you cannot change the observation or action specifications once they have been defined. That said, what you are trying to accomplish can be done in a different way. Depending on whether your environment is in MATLAB or Simulink, you can check whether the last observation was positive or negative and adjust the agent's output accordingly.

If you are using an off-policy agent, it is a good idea to make sure this adjustment is also reflected in the experience buffer. You can use, for example, the last action port for that.
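The adjustment described above can be sketched as a small MATLAB function, for example as the body of a MATLAB Function block placed after the agent in Simulink. This is only a sketch under assumptions: the name adjustAction, the choice to clamp with min/max (rather than, say, flipping signs), and the pass-through when the first observation is exactly zero are illustrative and do not come from the original answer.

```matlab
% Example call with a hypothetical raw agent output (4x1) and 8x1 observations.
rawAction = [0.5; -1.2; 0; 2.0];

aNeg  = adjustAction(rawAction, [-0.3; zeros(7,1)]);  % first observation < 0
aPos  = adjustAction(rawAction, [ 0.8; zeros(7,1)]);  % first observation > 0
aZero = adjustAction(rawAction, zeros(8,1));          % first observation == 0

function adjusted = adjustAction(action, obs)
% adjustAction  Hypothetical helper: restrict the action by the sign of obs(1).
%   action : raw agent output (4x1 in the question)
%   obs    : current observation vector (8x1 in the question)
    if obs(1) < 0
        % Mirrors UpperLimit = 0: keep only non-positive values
        adjusted = min(action, 0);
    elseif obs(1) > 0
        % Mirrors LowerLimit = 0: keep only non-negative values
        adjusted = max(action, 0);
    else
        % Assumption: leave the action unchanged when obs(1) is exactly zero
        adjusted = action;
    end
end
```

For an off-policy agent, the adjusted signal, rather than the raw agent output, is what would be routed back through the last action port mentioned above, so that the stored experiences match the action actually applied to the environment.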

1 comment

Esan freedom, 24 March 2024

Thank you so much.

As I'm using Simulink, I did it the way you suggested. Regards.
