Steady state error in DDPG control

12 views (last 30 days)
Ari on 25 Dec 2024
Edited: Ari on 24 Jan 2025 at 7:21
I am trying to make some modifications to the Control Water Level in a Tank Using a DDPG Agent example. I want to reduce the sample time from 1.0 to 0.5, so I set Ts = 0.5. Consequently, I had to adjust StopTrainingValue, changing it from 2000 to 4000. The training process completed successfully, as can be seen below.
But something unexpected happened: these modifications introduced a steady-state error (or something similar) that wasn't there in the original example.
How can I overcome this steady-state error? Do I need to make additional adjustments, e.g., change the structure of the observations, the reward function, the actor/critic networks, StopTrainingCriteria, etc.?
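For reference, a minimal sketch of the two changes described above (variable names follow the example script; the Tf value, MaxEpisodes, and the stop criterion are assumptions about the original script, not something verified here):
Ts = 0.5;                                        % sample time, reduced from 1.0
Tf = 200;                                        % simulation time, unchanged (assumed value)
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes',200, ...                       % assumed example default
    'MaxStepsPerEpisode',ceil(Tf/Ts), ...        % recomputed from Tf and Ts
    'StopTrainingCriteria','AverageReward', ...  % assumed as in the example
    'StopTrainingValue',4000);                   % raised from 2000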
Update:
This is the error I get using the pre-trained agent (doTraining = false, no changes to the original example).
This is the error I get using the re-trained agent (doTraining = true, no changes to the original example).
  3 Comments
Ari on 26 Dec 2024
Edited: Ari on 26 Dec 2024
The original reward function is defined as reward = 10*(|e| < 0.1) - 1*(|e| >= 0.1) - 100*(h <= 0 || h >= 20), where e = reference - h is the error and h is the height of the water in the tank. I didn't touch this function. It works well in the original example.
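As a plain MATLAB sketch of that reward (the example builds it from Simulink blocks, so this is only an illustration with sample values for e and h):
e = 0.05;   % tracking error, e = reference - h
h = 10;     % water height in the tank
reward = 10*(abs(e) < 0.1) - 1*(abs(e) >= 0.1) - 100*(h <= 0 || h >= 20);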
Sam Chak on 26 Dec 2024

I see. This probably implies that changing the sampling time affects the learning efficiency of the RL algorithm in tuning the PI Controller gains.

You may manually adjust the tuning parameters, but you might as well use an optimization algorithm such as GA or PSO to auto-tune the other hyperparameters in RL.
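A rough sketch of that idea, assuming the Global Optimization Toolbox is available and that trainAgentWithParams is a hypothetical helper that rebuilds the agent with the given learning rates, trains it, and returns the negative average reward:
lb = [1e-5 1e-5];                                    % lower bounds on [actorLR, criticLR]
ub = [1e-2 1e-2];                                    % upper bounds
objective = @(p) trainAgentWithParams(p(1), p(2));   % hypothetical helper, not in the example
opts = optimoptions('ga', 'PopulationSize',10, 'MaxGenerations',5);
bestLR = ga(objective, 2, [], [], [], [], lb, ub, [], opts);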


Answers (2)

Divyanshu on 26 Dec 2024
To get the same results and avoid the error with sample time 0.5, you might have to change 'Tf' as well and set its value to '100'. This ensures that the 'MaxStepsPerEpisode' parameter of 'rlTrainingOptions' still has the value the example expects.
Since you only modified the sample time, an incorrect value of 'MaxStepsPerEpisode' was computed, and that may be the reason for the error.
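Concretely, what this suggestion keeps invariant (a small sketch using the values discussed in this thread):
Ts = 0.5;                  % the modified sample time
Tf = 100;                  % suggested change
maxSteps = ceil(Tf/Ts);    % = 200, the steps-per-episode count the example expects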
I hope this helps. However, to find the exact root cause, I might need a snapshot of the error message and the reproduction steps.
  1 Comment
Ari on 26 Dec 2024
Edited: Ari on 26 Dec 2024
The maximum number of environment steps to run per episode is set using MaxStepsPerEpisode = ceil(Tf/Ts), so it is adjusted automatically. Also, there is no error message whatsoever. To reproduce my result: download the example -> change Ts to 0.5 -> change StopTrainingValue to 4000 -> change doTraining to true -> run the simulation. You can add an integrator to see the steady-state error.
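If it helps anyone reproducing this: instead of adding an integrator block, the tail of the logged error signal can be inspected after the simulation. This is only a sketch and assumes the error signal is logged to the workspace under the name 'error', which is not part of the original example:
simOut = sim(mdl);                               % mdl holds the example's model name
e = simOut.logsout.get('error').Values.Data;     % hypothetical logged signal name
ssError = mean(e(end-20:end));                   % average error over the final samples
fprintf('Approximate steady-state error: %.4f\n', ssError);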



Ari on 24 Jan 2025 at 6:54
Edited: Ari on 24 Jan 2025 at 7:21
I came up with this solution:
- Remove the output limit from the integrator in the observation block.
- Add another pair of layers to the actor and critic networks, and use different random seeds.
This is the result I obtained:
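A rough sketch of the kind of network change I mean (layer sizes and names are illustrative, not the exact ones from the example):
rng(0);   % try a different random seed before building the networks
% critic observation path with one extra pair of hidden layers appended
statePath = [
    featureInputLayer(3,'Normalization','none','Name','state')
    fullyConnectedLayer(50,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(50,'Name','fc2')    % extra pair of layers
    reluLayer('Name','relu2')
    fullyConnectedLayer(25,'Name','fc3')];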
