Creating an actorLossFunction for ContinuousDeterministicActor

Question

rtn on 24 May 2022

Answered: Takeshi Takahashi on 2 Jun 2022

Hi, in the example, the actor loss function for an rlDiscreteCategoricalActor is the following:

function loss = actorLossFunction(policy, lossData)
    policy = policy{1};
    % Create the action indication matrix.
    batchSize = lossData.batchSize;
    Z = repmat(lossData.actInfo.Elements',1,batchSize);
    actionIndicationMatrix = lossData.actionBatch(:,:) == Z;
    
    % Resize the discounted return to the size of policy.
    G = actionIndicationMatrix .* lossData.discountedReturn;
    G = reshape(G,size(policy));
    
    % Round any policy values less than eps to eps.
    policy(policy < eps) = eps;
    
    % Compute the loss.
    loss = -sum(G .* log(policy),'all');
end

Here are my action and observation specifications:

actInfo =

  rlNumericSpec with properties:

     LowerLimit: [2×1 double]
     UpperLimit: [2×1 double]
           Name: "CartPole Action"
    Description: [0×0 string]
      Dimension: [2 1]
       DataType: "double"

obsInfo =

  rlNumericSpec with properties:

     LowerLimit: -Inf
     UpperLimit: Inf
           Name: "CartPole States"
    Description: "pendulum_force, cart position, cart velocity"
      Dimension: [4 1501]
       DataType: "double"

Here is how I set up my actor:

actor = rlContinuousDeterministicActor(actorNet,obsInfo,actInfo);
actor = accelerate(actor,true);
actorOpts = rlOptimizerOptions('LearnRate',1e-3);
actorOptimizer = rlOptimizer(actorOpts);

To create my loss function can I do the following?

function loss = actorLossFunction(policy, lossData)
    policy = policy{1};
    % Create the action indication matrix.
    batchSize = lossData.batchSize;
    Z = repmat(lossData.actInfo.Dimension(1)',1,batchSize);
    actionIndicationMatrix = lossData.actionBatch(:,:) == Z;
    
    % Resize the discounted return to the size of policy.
    G = actionIndicationMatrix .* lossData.discountedReturn;
    G = reshape(G,size(policy));
    
    % Round any policy values less than eps to eps.
    policy(policy < eps) = eps;
    
    % Compute the loss.
    loss = -sum(G .* log(policy),'all');
    
end

Answer 1

Takeshi Takahashi on 2 Jun 2022

Please take a look at this example for rlContinuousDeterministicActor if you want to use it in a custom training loop.

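rlDiscreteCategoricalActor is for stochastic discrete actions, while rlContinuousDeterministicActor is for deterministic continuous actions, so you need a different formulation. The loss above takes the log of the action probabilities that a categorical actor outputs. A deterministic actor outputs the action itself rather than probabilities, so log(policy) and the action indication matrix have no meaning for it.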
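
To illustrate what a different formulation could look like, here is a minimal sketch of a deterministic-policy-gradient style loss. It keeps the actorLossFunction(policy, lossData) signature from the question. It is not taken from the linked example: the field lossData.dQdA is a hypothetical name for the gradient of a critic's Q-value with respect to the action, which this sketch assumes you compute separately and pass in as a constant with the same size as the actor output.

```matlab
function loss = actorLossFunction(policy, lossData)
    % For a deterministic actor, the network output is the action itself,
    % not a vector of action probabilities.
    action = policy{1};

    % ASSUMPTION: lossData.dQdA is a hypothetical field holding the critic
    % gradient dQ/da for each sample, computed outside this function and
    % treated as a constant. It must have as many elements as action.
    dQdA = reshape(lossData.dQdA, size(action));

    % Surrogate loss: its gradient with respect to the actor parameters is
    % -(1/batchSize) * sum(dQ/da .* da/dtheta), so minimizing it moves the
    % actions toward higher critic values.
    loss = -sum(dQdA .* action, 'all') / lossData.batchSize;
end
```

Unlike the categorical loss, there is no log and no clipping to eps here, because the actor output is a continuous action that can legitimately be zero or negative.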
