Control the exploration in soft actor-critic

Sayak Mukherjee on 22 Mar 2022
Answered: Ahmed R. Sayed on 4 Oct 2022
What is the best way to control exploration in a SAC agent? For a TD3 agent, I controlled exploration by adjusting the agent's variance parameter. Is there a similar option for the SAC agent? Currently the agent seems to explore more than necessary.

Answers (1)

Ahmed R. Sayed on 4 Oct 2022
Hi Mukherjee,
You can control the agent's exploration by adjusting the entropy temperature options "EntropyWeightOptions" in rlSACAgentOptions.
For example, a large EntropyWeight encourages the agent to explore the environment, so lowering it reduces exploration. Alternatively, adjust the temperature learning rate "LearnRate", which sets how quickly the weight is tuned to reach the target entropy "TargetEntropy" [1]. You can also keep the weight fixed by setting the learning rate to zero.
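As a rough sketch of the two approaches described above, the options could be set as follows. This assumes Reinforcement Learning Toolbox is installed and that the nested properties can be set with dot notation; the numeric values (0.1, 3e-4, -2) are illustrative assumptions only and would need tuning for your environment.

```matlab
% Sketch: limiting SAC exploration through the entropy weight options.
% All numeric values below are illustrative assumptions, not recommendations.
opts = rlSACAgentOptions;

% Approach 1: fixed, small entropy weight (no automatic tuning).
opts.EntropyWeightOptions.EntropyWeight = 0.1;  % smaller weight -> less exploration
opts.EntropyWeightOptions.LearnRate = 0;        % zero learning rate keeps the weight fixed

% Approach 2 (alternative): let the weight adapt toward a target entropy.
% opts.EntropyWeightOptions.EntropyWeight = 1;    % initial weight
% opts.EntropyWeightOptions.LearnRate = 3e-4;     % how fast the weight is tuned
% opts.EntropyWeightOptions.TargetEntropy = -2;   % lower target -> less random policy
```

The resulting opts object would then be passed to rlSACAgent when creating the agent. If the agent still explores too much with the first approach, reducing EntropyWeight further is the natural next step.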
[1] Haarnoja, Tuomas, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, et al. "Soft Actor-Critic Algorithms and Applications." Preprint, submitted January 29, 2019. https://arxiv.org/abs/1812.05905.

Release

R2021b
