DDPG Agent: Noise settings without any visible consequence

1 view (last 30 days)
Tobias Michl on 25 Jul 2022
Despite the noise settings listed below, my RL agent's output sticks to the action limits for many consecutive steps (hundreds to thousands).
My understanding of the sequence order is:
  1. The actor receives the observation as input.
  2. The actor's tanh output layer produces an action in the range [-1, 1].
  3. Noise is added to the actor output.
  4. The agent outputs the actor output plus the additive noise.
Did I get this wrong? What am I missing?
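For what it's worth, the four steps above can be sketched in plain Python as follows. This is a simplified illustration, not the actual toolbox internals; the linear actor weights W are a hypothetical stand-in for a trained network:

```python
import numpy as np

def actor(observation, W):
    # Steps 1-2: actor maps the observation through a tanh output layer to [-1, 1]
    return np.tanh(W @ observation)

def agent_step(observation, W, noise, lower=-1.0, upper=1.0):
    action = actor(observation, W)       # tanh output in [-1, 1]
    noisy = action + noise               # step 3: additive exploration noise
    return np.clip(noisy, lower, upper)  # step 4: clipped to the action limits

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 4))           # 2 actions, 4 observations
obs = rng.standard_normal(4)
noise = 0.89443 * rng.standard_normal(2)  # std from the settings below
print(agent_step(obs, W, noise))
```

Note that with clipping in step 4, any noise sample large enough to push the sum past the limits produces a saturated action.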
I'm using:
  • DDPG Agent
  • actor output layer: tanh --> resulting action space: [-1, 1]
  • Agent sample time: Ts = 0.0005;
  • agentOptions.NoiseOptions.StandardDeviation = 0.89443;
  • actionInfo = rlNumericSpec([2 1], 'LowerLimit', [-1; -1], 'UpperLimit', [1; 1]);
  • used Ornstein-Uhlenbeck
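For reference, a common Euler discretization of an Ornstein-Uhlenbeck noise process with the settings above looks like the sketch below (plain Python, not the toolbox code). The mean-attraction constant of 0.15 is an assumed value here, so check your own agentOptions.NoiseOptions:

```python
import numpy as np

def ou_noise(n_steps, std=0.89443, ts=0.0005, theta=0.15, mu=0.0, seed=0):
    # Euler discretization of an Ornstein-Uhlenbeck process:
    #   x[k+1] = x[k] + theta*(mu - x[k])*ts + std*sqrt(ts)*N(0, 1)
    # The long-run (stationary) standard deviation is roughly
    # std / sqrt(2*theta), independent of the sample time ts.
    rng = np.random.default_rng(seed)
    x = np.zeros(n_steps)
    for k in range(n_steps - 1):
        x[k + 1] = (x[k] + theta * (mu - x[k]) * ts
                    + std * np.sqrt(ts) * rng.standard_normal())
    return x

x = ou_noise(200_000)  # 100 s of simulated time at ts = 0.0005
print(f"empirical noise std over the run: {x.std():.2f}")
print(f"theoretical stationary std: {0.89443 / np.sqrt(2 * 0.15):.2f}")
```

Under these assumptions the noise does not stay small: although each step's increment is scaled by sqrt(ts), the process accumulates toward a stationary spread well above 1, which is large relative to a [-1, 1] action range.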
Besides, if I set rlDDPGAgent('UseExplorationPolicy', true), does the agent then use a Gaussian distribution instead of the Ornstein-Uhlenbeck process?

1 Answer

Emmanouil Tzorakoleftherakis on 26 Jan 2023
Your standard deviation is very high compared to the action range you have set ([-1, 1]). As a result, when the noise is added to the tanh output, the sum almost always falls outside those limits and gets clipped to them. I would use a smaller standard deviation.
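A quick numerical check of this effect (a Python sketch using plain Gaussian noise for simplicity, just to show the scale of the problem): with a standard deviation of 0.89443, a large share of noisy actions falls outside [-1, 1] and is clipped, while a smaller value rarely saturates.

```python
import numpy as np

rng = np.random.default_rng(1)
# Plausible tanh-layer outputs (a stand-in for a trained actor's actions)
actions = np.tanh(rng.standard_normal(100_000))

clip_frac = {}
for std in (0.89443, 0.1):
    noisy = actions + std * rng.standard_normal(actions.size)
    clip_frac[std] = np.mean((noisy <= -1.0) | (noisy >= 1.0))
    print(f"std = {std}: {clip_frac[std]:.1%} of noisy actions "
          f"hit the [-1, 1] limits")
```

A rule of thumb sometimes suggested is to pick the noise so that its effective spread is a modest fraction of the action range, rather than comparable to the full range.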
