Training of RL DDPG Agent is not working (Control of an Inverted pendulum)

Question

Roy Nordstrom on 13 Apr 2023

Direct link to this question

https://jp.mathworks.com/matlabcentral/answers/1946693-training-of-rl-ddpg-agent-is-not-working-control-of-an-inverted-pendulum

Answered: Yash Sharma on 24 Nov 2023

This project started from a MathWorks example: Train DDPG Agent to Swing Up and Balance Pendulum.

The pendulum block in the model has been replaced with Simscape components. A DC electric motor and a controllable voltage supply have also been added. See my_simscape_pendulum_model.slx

I trained the agent using the settings in training.m

The session was stopped after 17 hours and 796 episodes. Early on I could see the pendulum rising to about 30 degrees above the downward hanging position before it stalled. This indicates to me that enough torque was being applied for the agent to use a back-and-forth rocking motion to raise the pendulum. However, after many hours the agent had not learned the rocking motion and seemed to be stuck in a bad policy. See the screenshot of the RL Episode Manager after it was stopped.

My research indicates that my learning rate or exploration options may need to be modified. However, I have not been able to find documentation on how to do this.

Do you have any suggestions?

Answer 1

Yash Sharma on 24 Nov 2023

Direct link to this answer

https://jp.mathworks.com/matlabcentral/answers/1946693-training-of-rl-ddpg-agent-is-not-working-control-of-an-inverted-pendulum#answer_1359037

I understand that you have a reinforcement learning DDPG agent and want to set the learning rate and exploration options of that agent. You can set the learning rate using "rlOptimizerOptions".
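
As a rough sketch of how this fits together: the learning rates of a DDPG agent are set by creating one rlOptimizerOptions object for the actor and one for the critic, then passing both to rlDDPGAgentOptions. The numeric values, the 0.05 s sample time, and the variable names below are illustrative assumptions, not settings taken from your training.m, and the snippet assumes a release that includes rlOptimizerOptions (R2022a or later).

```matlab
% Illustrative values only; tune them for your own model.
criticOptimizerOpts = rlOptimizerOptions( ...
    'LearnRate', 1e-3, ...          % assumed critic learning rate
    'GradientThreshold', 1);        % clip gradients to limit large updates
actorOptimizerOpts = rlOptimizerOptions( ...
    'LearnRate', 1e-4, ...          % assumed actor learning rate
    'GradientThreshold', 1);

% Attach the optimizer options to the DDPG agent options.
agentOpts = rlDDPGAgentOptions( ...
    'SampleTime', 0.05, ...         % assumed; use your model's sample time
    'ActorOptimizerOptions', actorOptimizerOpts, ...
    'CriticOptimizerOptions', criticOptimizerOpts);

% Pass agentOpts when constructing the agent, for example:
% agent = rlDDPGAgent(actor, critic, agentOpts);
```

A common starting point is to keep the actor learning rate lower than the critic learning rate, but the right values depend on your model, so treat these numbers as a place to begin experimenting.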

Here is the documentation for optimizer options and different exploration policies.

rlOptimizerOptions: https://www.mathworks.com/help/reinforcement-learning/ref/rl.option.rloptimizeroptions.html
rlEpsilonGreedyPolicy: https://www.mathworks.com/help/reinforcement-learning/ref/rl.policy.rlepsilongreedypolicy.html
rlAdditiveNoisePolicy: https://www.mathworks.com/help/reinforcement-learning/ref/rl.policy.rladditivenoisepolicy.html
rlStochasticActorPolicy: https://www.mathworks.com/help/reinforcement-learning/ref/rl.policy.rlstochasticactorpolicy.html
rlMaxQPolicy: https://www.mathworks.com/help/reinforcement-learning/ref/rl.policy.rlmaxqpolicy.html
rlDeterministicActorPolicy: https://www.mathworks.com/help/reinforcement-learning/ref/rl.policy.rldeterministicactorpolicy.html
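
In addition to the policy objects above, the exploration noise that a DDPG agent uses during training can be adjusted through the NoiseOptions property of rlDDPGAgentOptions (an Ornstein-Uhlenbeck noise model). The following is a minimal sketch with assumed values; the property names apply to recent releases, and older releases used Variance and VarianceDecayRate instead.

```matlab
% Illustrative values only, not taken from the original training.m.
agentOpts = rlDDPGAgentOptions;

% Initial spread of the exploration noise added to the actor output.
agentOpts.NoiseOptions.StandardDeviation = 0.6;
% Per-step decay; the standard deviation is scaled by (1 - rate) each step.
agentOpts.NoiseOptions.StandardDeviationDecayRate = 1e-5;
% Lower bound so that some exploration remains late in training.
agentOpts.NoiseOptions.StandardDeviationMin = 0.05;

% Rough check: number of agent steps until the noise level has halved.
decayRate = agentOpts.NoiseOptions.StandardDeviationDecayRate;
stepsToHalve = log(0.5) / log(1 - decayRate);
```

If the pendulum stalls partway up and the agent stops trying the rocking motion, it may be worth checking whether the noise is too small relative to the voltage range or decays too quickly for the number of steps in your training run. The documentation for rlDDPGAgentOptions suggests keeping StandardDeviation*sqrt(SampleTime) between roughly 1% and 10% of the action range.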

Hope this helps!
