Reinforcement learning DDPG action fluctuations
When I tried to train the path following control example in MATLAB, the training process produced the behaviour shown in the picture:

- The steering angle is constantly fluctuating.
- The acceleration is also constantly fluctuating.
- The reward does not converge cleanly: it is very noisy and seems to jump between a high reward and a low reward.
What could be causing this issue? I have seen the same behaviour in other projects. One thing I tried was penalising the fluctuation in the reward function with this term, inspired by a paper by Wang et al.:
10*[ (d/dt(current_action) * d/dt(previous_action)) < 0 ]
Please let me know how to avoid this problem. Thank you very much!
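For readers who want to see the penalty term concretely, here is a minimal sketch of one way to read it in discrete time. It assumes the derivatives are approximated by finite differences over the last three actions, and that the bracket is an indicator (1 when the action rate changes sign, 0 otherwise). The function name, the `dt` default and the argument names are hypothetical and do not come from the MATLAB example or from the paper.

```python
def fluctuation_penalty(a_curr, a_prev, a_prev2, dt=0.1, weight=10.0):
    """Return `weight` if the action rate changed sign between steps, else 0.

    a_curr, a_prev, a_prev2: the last three values of one action channel
    (for example the steering angle), newest first.
    dt: sample time used for the finite-difference rates (assumed value).
    """
    # Finite-difference approximation of d/dt(current_action)
    rate_curr = (a_curr - a_prev) / dt
    # Finite-difference approximation of d/dt(previous_action)
    rate_prev = (a_prev - a_prev2) / dt
    # Indicator: a negative product means the action reversed direction
    return weight if rate_curr * rate_prev < 0 else 0.0


# Usage: subtract the penalty from the step reward (base_reward is a placeholder)
base_reward = 1.0
reward = base_reward - fluctuation_penalty(0.2, -0.1, 0.3)
```

Because the term is an indicator, it penalises every reversal by the same amount regardless of its size, so a small chatter costs as much as a large swing.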
2 Comments
Emmanouil Tzorakoleftherakis
17 Nov 2020
Hello,
One clarification: are the scope signals you are showing on the right from during training or from after training?
Tech Logg Ding
17 Nov 2020