How to compute the gradients of a SAC agent for custom training? In addition, are the target critics updated automatically by MATLAB, given that agent = rlSACAgent()?

6 views (last 30 days)
I'm trying to train multiple SAC agents using parallel computing. I don't know how to compute the gradients of the agents using the dlfeval function, given that I have created a minibatchqueue for data processing. In addition, since the agents were created as agent = rlSACAgent(actor1,[critic1,critic2],agentOpts), should I introduce the target critics myself, or are they handled internally by MATLAB once I specify the smoothing factor tau or the update frequency of the target critics? And how can I update them?

Answers (1)

praguna manvi
praguna manvi on 4 Sep 2024
Edited: praguna manvi on 4 Sep 2024
The actor, critic, and target critic networks are all updated internally by the "train" function for agents defined as:
agent = rlSACAgent(actor,[critic1,critic2],agentOpts);
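In that case you do not create the target critics yourself; the agent maintains them internally, and you control how they are updated through the agent options. As a minimal sketch (TargetSmoothFactor and TargetUpdateFrequency are the relevant rlSACAgentOptions properties; the values below are illustrative):
agentOpts = rlSACAgentOptions( ...
    "TargetSmoothFactor",1e-3, ...   % tau for soft (Polyak) target updates
    "TargetUpdateFrequency",1);      % apply the target update every step
agent = rlSACAgent(actor,[critic1,critic2],agentOpts);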
You can find an example of training an rlSACAgent in this documentation:
For custom training, you can refer to this documentation, which outlines the functions needed:
Typically, you would use the "getValue" or "getAction" functions to extract network outputs, calculate a loss, and compute gradients with "dlgradient" inside a function evaluated by "dlfeval". Here is a link to another example of custom training using sampled minibatch experiences:
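For example, here is a minimal sketch of a single critic update (not a complete SAC loop). It assumes the critic's underlying dlnetwork is extracted with getModel, that your minibatchqueue mbq yields dlarray batches of observations, actions, and precomputed Bellman targets, and that criticLoss, tau, avgGrad, avgSqGrad, and iteration are names introduced here purely for illustration:
% Extract the critic's underlying dlnetwork; the target critic
% starts as a copy of the critic (dlnetwork is a value object).
criticNet = getModel(critic1);
targetNet = criticNet;

tau = 1e-3;                     % target smoothing factor (assumed value)
avgGrad = []; avgSqGrad = [];   % Adam optimizer state
iteration = 1;

% Sample one minibatch of dlarray data from the minibatchqueue.
[obsBatch,actBatch,targetBatch] = next(mbq);

% dlgradient must run inside a function evaluated by dlfeval.
[loss,grads] = dlfeval(@criticLoss,criticNet,obsBatch,actBatch,targetBatch);

% Apply the gradients, for example with adamupdate.
[criticNet,avgGrad,avgSqGrad] = adamupdate(criticNet,grads, ...
    avgGrad,avgSqGrad,iteration);

% Soft (Polyak) update of the target critic with smoothing factor tau.
targetNet.Learnables = dlupdate(@(t,c) (1-tau)*t + tau*c, ...
    targetNet.Learnables,criticNet.Learnables);

function [loss,gradients] = criticLoss(net,obs,act,targets)
    % Forward pass; assumes the critic network has separate
    % observation and action input layers.
    q = forward(net,obs,act);
    loss = mse(q,targets);                       % regression to targets
    gradients = dlgradient(loss,net.Learnables); % traced by dlfeval
end
The same pattern (a loss function evaluated via dlfeval, followed by adamupdate) applies to the actor, with the loss built from "getAction"/"getValue" outputs instead.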
