Easy way to evaluate / compare the performance of an RL algorithm

Saurav Sthapit on 29 Jul 2020
Edited: 6 Aug 2020
I have a trained RL agent and would like to compare its performance with a dumb agent. I can run simout = sim(env, agent, simOpts) to evaluate the actual agent, but I would like to compare the simulation results with a couple of dumb agents that always take the same action or a random action. Is there an easy way to do this?
Currently, I have a separate Simulink model without the RL Agent block (replaced with a Constant block), and I log the observations and rewards using the Simulation Data Inspector.
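For reference, the trained-agent evaluation I'm running looks roughly like this (env and agent already exist in the workspace; the MaxSteps value is just an example):

simOpts = rlSimulationOptions('MaxSteps', 500);  % arbitrary episode length
simout = sim(env, agent, simOpts);               % evaluate the trained agent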
Thanks
Saurav

Answers (1)

Emmanouil Tzorakoleftherakis on 3 Aug 2020
Why not use a MATLAB Function block and implement the dummy agent in there? If you want random/constant actions, it should be just one line.
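For example, a minimal sketch of the MATLAB Function block body (the scalar action and the [-1, 1] range are assumptions; match them to your environment's action spec):

function action = dummyAgent(obs)
% obs is ignored by the dumb policies below
action = 0;              % constant action
% action = -1 + 2*rand;  % ...or a uniformly random action in [-1, 1]
end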
1 Comment
Saurav Sthapit on 6 Aug 2020
Edited: 6 Aug 2020
Thanks, that's an excellent suggestion for evaluating random actions.
However, when I do that (or use Constant blocks), I have to run the two statements below: the first to evaluate the random/dumb agent and the second to evaluate the trained agent.
logsout=sim(mdl)
simout=sim(env,agent,simOpts)
logsout and simout are not directly comparable, but logsout is available as a field of the simout.SimulationInfo struct.
I am wondering if this is the best approach or if there is an easier way to do it.
Also, simout contains the action, observation, and reward, but if the reward is a weighted sum of multiple rewards, I can't access the individual rewards. (Of course, I can compare logsout with simout.logsout.)
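For what it's worth, lining the two runs up currently looks something like this (the logged signal name 'reward' is hypothetical and depends on what the model actually logs):

out = sim(mdl);                                % constant/random-action model
simout = sim(env, agent, simOpts);             % trained agent
logsoutDumb = out.logsout;                     % Dataset logged by the dumb model
logsoutAgent = simout.SimulationInfo.logsout;  % agent run's logged data
rDumb = logsoutDumb.get('reward').Values;      % timeseries of logged rewards
rAgent = logsoutAgent.get('reward').Values;
plot(rDumb.Time, rDumb.Data, rAgent.Time, rAgent.Data)
legend('constant action', 'trained agent')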

Release

R2019a
