When you trained the DDPG agent, what was the threshold that you allowed for the steady-state error?
Steady-state errors below the threshold should be rewarded; otherwise, they should be penalised.
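To make the thresholding idea concrete, here is a minimal sketch of such a reward term. The function name, the default threshold of 0.01, and the +1/-1 reward magnitudes are all assumptions chosen for illustration; they are not values taken from this thread, and a real reward would likely combine this term with others.

```python
def steady_state_reward(error, threshold=0.01, bonus=1.0, penalty=-1.0):
    """Reward the agent when |error| is strictly below the threshold, else penalise.

    All default values are hypothetical placeholders, not tuned settings.
    """
    return bonus if abs(error) < threshold else penalty


# Example usage with made-up error values:
print(steady_state_reward(0.005))  # inside the band, so the bonus is returned
print(steady_state_reward(-0.2))   # outside the band, so the penalty is returned
```

A tighter threshold asks for better tracking but makes the bonus rarer early in training, so it may be worth checking how often the agent actually reaches the band.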
I see. This probably implies that changing the sampling time affects the learning efficiency of the RL algorithm in tuning the PI Controller gains.
You may manually adjust the tuning parameter, but you might as well use an optimization algorithm such as a genetic algorithm (GA) or particle swarm optimization (PSO) to auto-tune all other hyperparameters in RL.
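As a rough illustration of the auto-tuning suggestion, the sketch below implements a basic global-best PSO in plain Python. It is only a sketch under stated assumptions: `toy_training_cost` is a hypothetical stand-in for "train the agent with these hyperparameters and return a final cost", and the two tuned quantities (log10 of the learning rate and the sampling time), their bounds, and the PSO coefficients are all illustrative choices, not recommendations.

```python
import random


def pso_minimize(objective, bounds, n_particles=15, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise objective(params) over box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                # each particle's best position
    best_val = [objective(p) for p in pos]        # each particle's best cost
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


def toy_training_cost(params):
    """Hypothetical placeholder for an RL training run; NOT a real training cost."""
    log_lr, sample_time = params
    return (log_lr + 3.0) ** 2 + (sample_time - 0.1) ** 2


# Assumed search box: log10(learning rate) in [-5, -1], sampling time in [0.01, 1.0] s.
search_bounds = [(-5.0, -1.0), (0.01, 1.0)]
best_params, best_cost = pso_minimize(toy_training_cost, search_bounds)
print(best_params, best_cost)
```

In practice each objective evaluation would be a full training run, so the particle count and iteration budget would need to be far smaller than in this toy setting, and results would be noisy from run to run.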