Is it possible to implement a prioritized replay buffer (PER) in a TD3 agent?

Michael Müller on 18 Jun 2021
Answered: Ahmed R. Sayed on 30 Sep 2022
Hey,
I'm trying to implement a TD3 agent in MATLAB. Instead of using a replay buffer that draws mini-batch samples uniformly at random, I would like to implement a prioritized replay buffer. So far, I couldn't find an agent option to do so.
I would be very grateful if somebody could help me with my problem.
Thanks in advance for the answers.
Best regards
Michael

Answers (1)

Ahmed R. Sayed on 30 Sep 2022
By default, built-in off-policy agents (DQN, DDPG, TD3, SAC, MBPO) use an rlReplayMemory object as their experience buffer, and the agent samples data uniformly from this buffer. To perform nonuniform, prioritized sampling [1], which can improve sample efficiency when training your agent, use an rlPrioritizedReplayMemory object instead. Please refer to the documentation for rlPrioritizedReplayMemory.
[1] Schaul, Tom, John Quan, Ioannis Antonoglou, and David Silver. "Prioritized Experience Replay." arXiv:1511.05952 [cs], 25 February 2016. https://arxiv.org/abs/1511.05952.
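As a concrete illustration, here is a minimal sketch of swapping a TD3 agent's default buffer for a prioritized one. It assumes a release where rlPrioritizedReplayMemory is available (R2022b or newer) and uses a predefined double-integrator environment purely for brevity:

% Minimal sketch, assuming R2022b or newer and a predefined
% continuous-action environment for brevity.
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Create a TD3 agent with default actor and critic networks.
agent = rlTD3Agent(obsInfo, actInfo);

% Replace the default uniform rlReplayMemory with a prioritized buffer.
% The third argument is the buffer capacity.
agent.ExperienceBuffer = rlPrioritizedReplayMemory(obsInfo, actInfo, 1e6);

From there, training with train(agent, env, trainOpts) proceeds as usual; the prioritized sampling is handled inside the buffer object.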
