Why do the DDPG episode rewards never change during the whole training process?
Guoge Tan
May 25, 2020
I'm training a DDPG agent using the Reinforcement Learning Toolbox in MATLAB R2020a for a path planning problem. As the plot shows, the episode rewards and average rewards never change over 5000 episodes. I used a simple neural network with three layers and 20 neurons, a learning rate of 0.01, and a gradient threshold of 1. I then tried setting the weights and biases of the fully connected layers and changing my reward function, but the result is the same.
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/300253/image.png)
Accepted Answer
Emmanouil Tzorakoleftherakis
May 26, 2020
It looks like the scales of Q0 and the episode reward are very different. Try unchecking "Show Episode Q0" to see if the episode reward changes. I would then simplify the critic network to make sure it outputs values on a similar scale to the episode reward.
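To see why a very large Q0 can make the reward curve look flat, here is a rough numeric sketch. It is written in plain Python rather than MATLAB and does not use the Reinforcement Learning Toolbox. The function name `visible_fraction` and all of the numbers are made up for illustration; none of them come from the plot in the question. The sketch only measures how much of a shared y-axis the reward curve would occupy once the axis is stretched to fit Q0 as well.

```python
# Illustrative numbers only; none of these values come from the thread.
def visible_fraction(series, all_plotted):
    """Fraction of a shared y-axis that `series` spans when the axis is
    scaled to fit every value in `all_plotted`."""
    axis_span = max(all_plotted) - min(all_plotted)
    if axis_span == 0:
        return 0.0
    return (max(series) - min(series)) / axis_span

# Hypothetical episode rewards that improve from -50 to -10.
episode_reward = [-50.0, -40.0, -30.0, -20.0, -10.0]
# Hypothetical critic estimates (Episode Q0) on a much larger scale.
episode_q0 = [0.0, 2.0e5, 5.0e5, 8.0e5, 1.0e6]

# Reward curve drawn on the same axis as Q0 versus on an axis of its own.
shared = visible_fraction(episode_reward, episode_reward + episode_q0)
alone = visible_fraction(episode_reward, episode_reward)

print(f"reward span on shared axis: {shared:.6%}")
print(f"reward span on its own axis: {alone:.0%}")
```

Under these assumed values the reward does improve, but on the shared axis it covers only a tiny sliver of the plot height and reads as a flat line. Hiding Q0 lets the axis rescale to the reward alone, which is why it is worth checking before concluding that the agent is not learning.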