DDPG - Noise Model - sample time step - definition
Niklas Reinisch
2 Aug 2019
Commented: Emmanouil Tzorakoleftherakis
5 Aug 2019
Hello!
At the moment I am tuning the parameters of my DDPG algorithm, and I don't fully understand how the Ornstein-Uhlenbeck noise model is updated.
The MATLAB documentation describes the noise model update for a DDPG agent with a formula that is applied at every "sample time step".
(https://www.mathworks.com/help/reinforcement-learning/ref/rlddpgagentoptions.html Chapter: Input Arguments)
But how is the "sample time step" defined? Does it correspond to the episode count or to the step count of the RL training process?
Thanks
Niklas
Accepted Answer
Emmanouil Tzorakoleftherakis
2 Aug 2019
Hi Niklas,
This post should be helpful. By "sample time step" the documentation means the step count of the RL training process, not the episode count: each episode consists of a number of time steps, and noise is added to the selected action at the beginning of each time step. See this link for a description of DDPG (specifically step #1). The time step value can be specified in the agent options here.
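To make the "one update per time step" point concrete, here is a minimal Python sketch (not MATLAB, and not the toolbox implementation). The update rule follows the formula as I read it in the documentation, x(k) = x(k-1) + MeanAttractionConstant*(Mean - x(k-1))*Ts + Variance*randn*sqrt(Ts). The function names, the parameter values, and the choice to carry the noise state across episodes are all assumptions made for illustration.

```python
import math
import random


def ou_step(x, mean, attraction, std, ts, rng):
    """One Ornstein-Uhlenbeck update for a scalar noise state.

    Mirrors the documented form (as recalled, so treat it as an assumption):
    x(k) = x(k-1) + attraction*(mean - x(k-1))*ts + std*randn*sqrt(ts)
    """
    return x + attraction * (mean - x) * ts + std * rng.gauss(0.0, 1.0) * math.sqrt(ts)


def run_training(num_episodes, steps_per_episode, ts=0.1, seed=0):
    """Count noise updates: one per time step, not one per episode."""
    rng = random.Random(seed)
    x = 0.0  # assumption: state is carried across episodes in this sketch
    updates = 0
    history = []
    for _episode in range(num_episodes):
        for _step in range(steps_per_episode):
            # The noise is advanced at the start of every time step and
            # would then be added to the action chosen by the actor.
            x = ou_step(x, mean=0.0, attraction=0.15, std=0.3, ts=ts, rng=rng)
            updates += 1
            history.append(x)
    return updates, history


updates, history = run_training(num_episodes=3, steps_per_episode=5)
print(updates)  # 3 episodes * 5 steps each
```

With 3 episodes of 5 steps each, the noise model is advanced 15 times, once per time step.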
2 Comments
Emmanouil Tzorakoleftherakis
5 Aug 2019
Correct. This is why you want to keep the decay rate small or zero if you want to promote exploration.
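A rough Python sketch of why the decay rate matters, assuming the variance is multiplied by (1 - decay rate) at every time step and clipped at a minimum value, which is how I understand the documented rule. The function name and the numbers are hypothetical and only meant to show the trend.

```python
def decayed_variance(variance, decay_rate, num_steps, variance_min=0.0):
    """Apply variance = variance * (1 - decay_rate) once per time step.

    The floor at variance_min is an assumption that mirrors a minimum
    variance option; the values used below are illustrative only.
    """
    for _ in range(num_steps):
        variance = max(variance * (1.0 - decay_rate), variance_min)
    return variance


# Compare how much exploration noise is left after 10000 time steps.
for rate in (0.0, 1e-5, 1e-3):
    remaining = decayed_variance(1.0, rate, num_steps=10000)
    print(f"decay rate {rate:g}: remaining variance {remaining:.6f}")
```

Because the decay is applied per time step rather than per episode, even a seemingly small rate compounds quickly over a long training run, while a rate of zero leaves the noise unchanged.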