How to compute the gradients of a SAC agent for custom training? In addition, are the target critics updated automatically by MATLAB, given that agent = rlSACAgent()?

I'm trying to train multiple SAC agents using parallel computing, and I don't know how to compute the gradients of the agents using the dlfeval function; I have already created a minibatchqueue for data processing. In addition, given that the agents have been created as agent = rlSACAgent(actor1,[critic1,critic2],agentOpts), should I introduce the target critics myself, or are they handled internally by MATLAB through the smoothing factor tau or the target critic update frequency? And how can I update them?

Answers (1)

praguna manvi on 4 Sep 2024
Edited: praguna manvi on 4 Sep 2024
The critic and actor networks, as well as the target critics, are updated internally by the “train” function for agents defined as:
agent = rlSACAgent(actor,[critic1,critic2],agentOpts);
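In particular, you do not need to create the target critics yourself: the agent builds them as copies of critic1 and critic2 and updates them according to the TargetSmoothFactor and TargetUpdateFrequency properties of rlSACAgentOptions. A minimal sketch of that setup (actor, critic1, critic2, env, and trainOpts are assumed to already exist):

% Target critics are created and updated internally; you only pick the rule.
agentOpts = rlSACAgentOptions( ...
    "TargetSmoothFactor", 1e-3, ...   % tau for the soft (Polyak) target update
    "TargetUpdateFrequency", 1);      % apply the target update every learning step
agent = rlSACAgent(actor, [critic1, critic2], agentOpts);
% train updates the actor, both critics, and both target critics for you
trainingStats = train(agent, env, trainOpts);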
You can find an example of training an rlSACAgent in the documentation.
For custom training, you can refer to the documentation on custom training loops, which outlines the functions needed.
Typically, you would use the “getValue” or “getAction” functions to extract outputs, calculate the loss, and compute gradients with “dlgradient”. The documentation also includes an example of custom training using sampled minibatch experiences.
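As a rough sketch of that pattern, the snippet below computes critic gradients for one sampled minibatch with dlfeval/dlgradient and then soft-updates a target critic by hand, which is what a custom loop has to do in place of “train”. It assumes obsBatch and actBatch are formatted dlarray minibatches (e.g. "CB") read from your minibatchqueue, and that qTarget is the soft Q-learning target you computed from the target critics and the entropy term; the variable names here are illustrative, not part of the toolbox API.

% Pull the dlnetwork out of the rlQValueFunction critic
criticNet = getModel(critic1);
targetCriticNet = criticNet;          % target starts as a copy of the critic

% Adam state for this critic (kept across iterations)
avgGrad = []; avgSqGrad = []; iteration = 1;

% Evaluate the loss/gradient function under dlfeval so dlgradient can trace it
[loss, grads] = dlfeval(@criticGradients, criticNet, obsBatch, actBatch, qTarget);

% Apply the gradients with a dlarray optimizer such as adamupdate
[criticNet, avgGrad, avgSqGrad] = adamupdate( ...
    criticNet, grads, avgGrad, avgSqGrad, iteration);

% Soft (Polyak) target update, playing the role of TargetSmoothFactor
tau = 1e-3;
targetCriticNet.Learnables = dlupdate( ...
    @(t, s) (1 - tau)*t + tau*s, ...
    targetCriticNet.Learnables, criticNet.Learnables);

% Local function (place at the end of the script or in its own file)
function [loss, grads] = criticGradients(net, obsBatch, actBatch, qTarget)
    q = forward(net, obsBatch, actBatch);        % predicted Q-values
    loss = mse(q, qTarget);                      % regression toward the TD target
    grads = dlgradient(loss, net.Learnables);    % gradients w.r.t. all learnables
end

The same pattern applies to the second critic and to the actor (whose loss uses the entropy-regularized objective); if you prefer to keep working with the rl function objects directly, the “gradient” and “getLearnableParameters”/“setLearnableParameters” functions offer an alternative route.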
