Model Based Reinforcement Learning

Question

Rafael Basso on 10 Sep 2019

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/479701-model-based-reinforcement-learning

Edited: Jillian Eunice Oliveros on 26 Oct 2021

I'm trying to implement model-based reinforcement learning in MATLAB. I have a directed graph and I want to travel from an origin to a destination. The createMDP function can be used to build a very simple graph. The main problem is that the actions are generic: the same set of actions applies in every state. What I would like to do is allow only a subset of actions depending on the current state. One solution is to design a reward function that penalizes undesired or invalid actions, but that means a lot more training. So I'd like to speed up learning by allowing only specific actions depending on the current state. Is it possible to do that?

Answer 1

Neuropragmatist on 10 Sep 2019

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/479701-model-based-reinforcement-learning#answer_391235

Your agent shouldn't be able to take 'invalid' actions at all. For undesired actions, as you say, a correct reward function should, given enough time, lead to the correct learning, and this unconstrained approach would certainly be the most convincing.

You can of course restrict the agent's actions in specific circumstances, but I think you would need a good reason to do so, and you would need to be able to show that you are not just initialising your model with the parameters you expect at the end.

Hope this helps,

NP.
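
For illustration, restricting actions by state can be sketched as tabular Q-learning in which both the exploration step and the max in the update run only over the actions that are valid in the current state. This is a language-agnostic Python sketch, not MATLAB or Reinforcement Learning Toolbox code; the graph, the rewards, and every name below (GRAPH, valid_actions, train, greedy_path) are assumptions made up for the example.

```python
import random

# Hypothetical directed graph: node -> list of successor nodes.
# An action is "move to node a", so the valid actions in a state
# are exactly that node's outgoing edges.
GRAPH = {0: [1, 2], 1: [3], 2: [4], 3: [4], 4: []}
ORIGIN, DESTINATION = 0, 4
N = len(GRAPH)


def valid_actions(state):
    """Return the subset of actions allowed in this state."""
    return GRAPH[state]


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning that never selects an invalid action."""
    rng = random.Random(seed)
    # The table has an entry for every (state, action) pair, but only
    # valid pairs are ever read or updated.
    q = [[0.0] * N for _ in range(N)]
    for _ in range(episodes):
        s = ORIGIN
        while s != DESTINATION:
            actions = valid_actions(s)
            if rng.random() < epsilon:
                a = rng.choice(actions)                  # explore, valid only
            else:
                a = max(actions, key=lambda x: q[s][x])  # exploit, valid only
            s_next = a  # deterministic move along the chosen edge
            # Assumed reward: +10 on arrival, -1 per intermediate step.
            reward = 10.0 if s_next == DESTINATION else -1.0
            # The bootstrap target also maximizes over valid actions only;
            # the destination has none, so its value defaults to 0.
            best_next = max((q[s_next][b] for b in valid_actions(s_next)),
                            default=0.0)
            q[s][a] += alpha * (reward + gamma * best_next - q[s][a])
            s = s_next
    return q


def greedy_path(q):
    """Follow the highest-valued valid action from origin to destination."""
    path, s = [ORIGIN], ORIGIN
    while s != DESTINATION:
        s = max(valid_actions(s), key=lambda x: q[s][x])
        path.append(s)
    return path
```

Because invalid state-action pairs are never sampled, no training time is spent learning that they are bad, which is the speed-up the question asks about. The trade-off raised in the answer still applies: the mask encodes prior knowledge about the graph rather than something the agent learned.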

1 Comment

Jillian Eunice Oliveros on 25 Oct 2021

Edited: Jillian Eunice Oliveros on 26 Oct 2021

@Neuropragmatist Using createMDP, is it possible to add conditions (if/else) that determine which state the agent transitions into? For example, when the pixel intensity is more than 10, the transition would be to state 2; otherwise, it would be to state 3.
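
The rule described in this comment can be written as a small transition function. The sketch below is generic Python, not createMDP code: it assumes the condition is evaluated in an environment's step logic, and the function name, the threshold parameter, and the choice to treat states 2 and 3 as absorbing are all hypothetical.

```python
def next_state(state, pixel_intensity, threshold=10):
    """Hypothetical condition-dependent transition.

    From state 1, move to state 2 when the pixel intensity exceeds the
    threshold and to state 3 otherwise. States 2 and 3 are assumed to be
    absorbing, so they map to themselves.
    """
    if state == 1:
        return 2 if pixel_intensity > threshold else 3
    return state
```

For example, next_state(1, 11) gives 2 and next_state(1, 10) gives 3. A rule like this depends on an observation as well as on the state and action, so it sits more naturally in a custom step function than in a fixed table of transition probabilities.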
