How to compute the gradients of a SAC agent for custom training? In addition, are the target critics updated automatically by MATLAB, given that agent = rlSACAgent()?

I'm trying to train multiple SAC agents using parallel computing, and I don't know how to compute the gradients of the agents using the dlfeval function; I have already created a minibatchqueue for data processing. In addition, given that the agents have been created as agent = rlSACAgent(actor1,[critic1,critic2],agentOpts), should I introduce the target critics myself, or are they handled internally by MATLAB through the smoothing factor tau or the target critic update frequency? And how can I update them?

Answers (1)

praguna manvi on 4 Sep 2024
Edited: praguna manvi on 4 Sep 2024
The critic and actor networks, as well as the target critics, are updated internally by the “train” function for agents defined as:
agent = rlSACAgent(actor,[critic1,critic2],agentOpts);
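In particular, you do not need to create the target critics yourself: the agent builds them as copies of critic1 and critic2 and updates them according to the TargetSmoothFactor and TargetUpdateFrequency properties of rlSACAgentOptions. A minimal sketch of that setup (actor, critic1, critic2, env, and trainOpts are assumed to already exist):

% Target critics are created and updated internally; you only pick the rule.
agentOpts = rlSACAgentOptions( ...
    "TargetSmoothFactor", 1e-3, ...   % tau for the soft (Polyak) target update
    "TargetUpdateFrequency", 1);      % apply the target update every learning step
agent = rlSACAgent(actor, [critic1, critic2], agentOpts);
% train updates the actor, both critics, and both target critics for you
trainingStats = train(agent, env, trainOpts);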
You can find an example of training an rlSACAgent in the documentation.
For custom training, you can refer to the documentation on custom training loops, which outlines the functions needed.
Typically, you would use the “getValue” or “getAction” functions to extract outputs, calculate the loss, and compute gradients with “dlgradient”. The documentation also includes an example of custom training using sampled minibatch experiences.
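As a rough sketch of that pattern, the snippet below computes critic gradients for one sampled minibatch with dlfeval/dlgradient and then soft-updates a target critic by hand, which is what a custom loop has to do in place of “train”. It assumes obsBatch and actBatch are formatted dlarray minibatches (e.g. "CB") read from your minibatchqueue, and that qTarget is the soft Q-learning target you computed from the target critics and the entropy term; the variable names here are illustrative, not part of the toolbox API.

% Pull the dlnetwork out of the rlQValueFunction critic
criticNet = getModel(critic1);
targetCriticNet = criticNet;          % target starts as a copy of the critic

% Adam state for this critic (kept across iterations)
avgGrad = []; avgSqGrad = []; iteration = 1;

% Evaluate the loss/gradient function under dlfeval so dlgradient can trace it
[loss, grads] = dlfeval(@criticGradients, criticNet, obsBatch, actBatch, qTarget);

% Apply the gradients with a dlarray optimizer such as adamupdate
[criticNet, avgGrad, avgSqGrad] = adamupdate( ...
    criticNet, grads, avgGrad, avgSqGrad, iteration);

% Soft (Polyak) target update, playing the role of TargetSmoothFactor
tau = 1e-3;
targetCriticNet.Learnables = dlupdate( ...
    @(t, s) (1 - tau)*t + tau*s, ...
    targetCriticNet.Learnables, criticNet.Learnables);

% Local function (place at the end of the script or in its own file)
function [loss, grads] = criticGradients(net, obsBatch, actBatch, qTarget)
    q = forward(net, obsBatch, actBatch);        % predicted Q-values
    loss = mse(q, qTarget);                      % regression toward the TD target
    grads = dlgradient(loss, net.Learnables);    % gradients w.r.t. all learnables
end

The same pattern applies to the second critic and to the actor (whose loss uses the entropy-regularized objective); if you prefer to keep working with the rl function objects directly, the “gradient” and “getLearnableParameters”/“setLearnableParameters” functions offer an alternative route.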
