Expected reward blows up while training (DDPG agent, reinforcement learning)

Sayak Mukherjee, 12 Oct 2020
I am training a DDPG network, and after around 5000 iterations the model does not seem to converge, while the expected reward keeps increasing exponentially. What could be causing this, and how can I fix it?

Answers (1)

Emmanouil Tzorakoleftherakis, 12 Oct 2020
Edited: Emmanouil Tzorakoleftherakis, 12 Oct 2020
Hello,
This answer may be helpful.
I would make sure your reward signal outputs values that make sense, and also possibly simplify the critic network.
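One way to check whether the reward values "make sense" is to compare the critic's estimate against the largest return the rewards could actually produce. If every per-step reward satisfies |r| <= r_max and the discount factor gamma is below 1, the discounted return cannot exceed r_max / (1 - gamma) in magnitude. The sketch below is a minimal illustration in Python, not toolbox code: the function names, the logged rewards, and the critic estimate are all hypothetical, and r_max is taken from the observed rewards, so it is only as reliable as the log.

```python
import math

def discounted_return_bound(r_max, gamma):
    """Upper bound on |discounted return| when |r_t| <= r_max and 0 <= gamma < 1."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    return r_max / (1.0 - gamma)

def check_rewards(rewards, critic_estimate, gamma):
    """Flag non-finite rewards and a critic estimate above the attainable return."""
    non_finite = [r for r in rewards if not math.isfinite(r)]
    # Largest observed reward magnitude (assumption: the log is representative).
    r_max = max((abs(r) for r in rewards if math.isfinite(r)), default=0.0)
    bound = discounted_return_bound(r_max, gamma)
    return {
        "non_finite_rewards": len(non_finite),
        "max_abs_reward": r_max,
        "return_bound": bound,
        "critic_exceeds_bound": abs(critic_estimate) > bound,
    }

# Hypothetical logged values, for illustration only.
report = check_rewards([-1.0, 0.5, -2.0, 0.0], critic_estimate=5000.0, gamma=0.99)
print(report)
```

If the critic's estimate is far above this bound while the rewards themselves stay in a sensible range, the growth is more likely coming from the critic diverging than from the reward signal.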
  2 Comments
Sayak Mukherjee, 12 Oct 2020
Thanks for your answer.
What does simplifying the critic network mean? Does it mean using fewer nodes and hidden layers?
Emmanouil Tzorakoleftherakis, 12 Oct 2020
That's right
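To get a feel for how much "fewer nodes and hidden layers" shrinks the critic, one can count the learnable parameters of a fully connected network before and after. This is a rough Python sketch under stated assumptions: the critic is treated as a plain multilayer perceptron with a scalar output, and the input size (6) and both layer layouts are made-up examples, not values from this thread.

```python
def mlp_param_count(input_dim, hidden_sizes, output_dim=1):
    """Count weights and biases of a fully connected network."""
    sizes = [input_dim, *hidden_sizes, output_dim]
    # Each layer contributes n_in * n_out weights plus n_out biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

# Hypothetical layouts: observation and action concatenated into 6 inputs.
large = mlp_param_count(6, [400, 300, 300])
small = mlp_param_count(6, [64, 64])
print(f"large critic: {large} parameters, small critic: {small} parameters")
```

A smaller critic has fewer parameters to fit, which is the sense in which it is "simpler" here; whether it helps in a given problem still has to be checked by retraining.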
