Drone Control using Reinforcement Learning

Hassan Moin on 13 Oct 2021
Answered: Umeshraja on 10 Jun 2025
Hello!
I am trying to implement a simple drone altitude controller using Reinforcement Learning Toolbox and Simulink. I am taking the altitude error, its integral, and its derivative as observations, while the action is the desired thrust. My reward function is exp(-(1/0.5*err_z)^2). However, after a few iterations, my reward doesn't seem to increase. Can anyone provide insight into what is happening and what I can do to rectify this?
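A minimal sketch of the reward described above, evaluated at a few illustrative altitude errors; note that exp(-(1/0.5*err_z)^2) is equivalent to exp(-4*err_z^2):

reward = @(err_z) exp(-(err_z/0.5).^2);   % same as exp(-4*err_z.^2)
err_samples = [0 0.25 0.5 1 2];           % illustrative altitude errors in meters
disp([err_samples; reward(err_samples)])
% err_z:   0      0.25   0.5    1       2
% reward:  1.00   0.78   0.37   0.018   1.1e-7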
  2 Comments
Apoorv Pandey on 12 Jan 2023
I have the same problem. Were you able to find a solution?
Thank you
Darshan Prajapati on 4 Apr 2023
I am trying to implement fault tolerance for a quadcopter. Can you help me, for example with how to implement the environment and which blocks I should use?

Answers (1)

Umeshraja on 10 Jun 2025
I understand that the episode reward for your drone altitude controller is not improving.
Your training plot and episode statistics indicate that the episode reward quickly spikes in the first few episodes, then stabilizes at a low value (around 1.14), with no significant improvement across 200+ episodes. This can occur due to any of the following causes:
  • Reward Function Is Too Narrow: As previously mentioned, your reward function is very steep, so the agent receives almost zero reward except for very small errors. This makes it difficult for the agent to distinguish slightly better actions from slightly worse ones, which stalls learning. Make the reward less steep so the agent receives meaningful feedback over a wider range of errors, or add a linear or piecewise reward term for small errors (see the sketch after this list).
  • Learning Rates Are Too High: The displayed learning rates for both the actor and the critic are set to 2, which is extremely high for RL agents. Typical values are in the range of 1e-3 to 1e-4. High learning rates can cause unstable or stalled learning (see the sketch after this list).
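A minimal sketch of both suggestions, assuming a DDPG agent and a recent Reinforcement Learning Toolbox release (R2022a or newer, where optimizer settings live on the agent options object); the wider tolerance sigma and the small linear penalty are tuning assumptions, not prescribed values:

% 1) Less steep reward: widen the tolerance and add a gentle linear shaping term
sigma  = 2;                                                 % assumed tolerance, wider than 0.5 m
reward = @(err_z) exp(-(err_z/sigma).^2) - 0.1*abs(err_z);  % smooth peak plus linear penalty

% 2) Lower the actor/critic learning rates from 2 to typical values
agentOpts = rlDDPGAgentOptions;
agentOpts.ActorOptimizerOptions.LearnRate  = 1e-4;
agentOpts.CriticOptimizerOptions.LearnRate = 1e-3;
agentOpts.ActorOptimizerOptions.GradientThreshold  = 1;     % optional gradient clipping
agentOpts.CriticOptimizerOptions.GradientThreshold = 1;

On older releases, the same learning rates are set through rlRepresentationOptions when creating the actor and critic representations.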
If your observation or action signals are not normalized, the agent may also struggle to learn effectively. To learn more, please refer to the related MATLAB Answer on this topic.
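A minimal normalization sketch under assumed signal ranges and a hypothetical model name and agent block path; in the Simulink model, each observation would be divided by its assumed maximum (for example with a Gain block) before it reaches the RL Agent block:

maxErr = 5;    % assumed maximum altitude error [m]   -> Gain of 1/maxErr on err_z
maxInt = 10;   % assumed maximum error integral       -> Gain of 1/maxInt
maxDer = 2;    % assumed maximum error derivative     -> Gain of 1/maxDer

% Declare the scaled ranges explicitly in the specs
obsInfo = rlNumericSpec([3 1], 'LowerLimit', -1, 'UpperLimit', 1);
actInfo = rlNumericSpec([1 1], 'LowerLimit', 0, 'UpperLimit', 1);   % thrust as a 0-1 fraction

% 'droneModel' and the agent block path are hypothetical placeholders
env = rlSimulinkEnv('droneModel', 'droneModel/RL Agent', obsInfo, actInfo);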
You can also refer to other useful MathWorks resources on reinforcement learning with MATLAB.
I hope this helps!
