Reinforcement Learning Toolbox - When does algorithm train?

Hans-Joachim Steinort

17 Sep 2019


I am currently using the RL Toolbox with a DQN agent built into a long-running process simulation.

The maximum step count is currently 8000 steps per episode.

Unfortunately, the documentation seems a little ambiguous to me, so here is my question:

Does the train function of the RL Toolbox train the agent at the end of an episode, or during the episode once the step count exceeds the minibatch size (as in the baseline algorithms)?

Thank you in advance.


Accepted Answer

Emmanouil Tzorakoleftherakis on 25 Sep 2019

The implementation is based on the algorithm listed here.

The weights are updated at every time step.
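To make the distinction from the question concrete, here is a minimal Python sketch of a per-time-step update schedule. It is not the toolbox's code: `run_episode`, `update_fn`, and the placeholder transitions are hypothetical names for illustration, and the rule "update only once the replay buffer holds at least one minibatch" is an assumption taken from the wording of the question, not a confirmed detail of the toolbox.

```python
import random


def run_episode(num_steps, minibatch_size, buffer, update_fn):
    """Simulate one episode and return how many weight updates were triggered.

    Illustrative only: transitions are dummy tuples and update_fn stands in
    for a real gradient step on the Q-network.
    """
    updates = 0
    for t in range(num_steps):
        # Hypothetical placeholder for (state, action, reward, next_state).
        buffer.append((t, 0, 0.0, t + 1))

        # Per-step schedule: update inside the episode loop, as soon as the
        # buffer can supply a full minibatch (assumed rule, see lead-in).
        if len(buffer) >= minibatch_size:
            minibatch = random.sample(buffer, minibatch_size)
            update_fn(minibatch)
            updates += 1
    return updates


# 8000 steps per episode as in the question; a minibatch size of 64 is an
# arbitrary choice for the example.
n_updates = run_episode(8000, 64, [], lambda minibatch: None)
print(n_updates)  # 7937, i.e. 8000 - 64 + 1 updates within a single episode
```

Under an end-of-episode schedule the same episode would trigger a single round of training after step 8000, which is why the difference matters for long episodes.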

1 Comment

Hans-Joachim Steinort on 26 Sep 2019

"For each training time step" - that was the line I was looking for (although looking into the source code led me to the same conclusion).

After double-checking the baseline algorithms, I found that they do it the same way.

Thank you for your time!


Release: R2019a
