Why does reinforcement learning return different actions from sim() and getAction()?

Question

Shuyue Li on 7 Sep 2023

Direct link to this question

https://se.mathworks.com/matlabcentral/answers/2018301-why-reinforcement-learning-has-different-results-of-action-between-sim-and-getaction

Answered: Emmanouil Tzorakoleftherakis on 25 Sep 2023

Hi MATLAB Reinforcement Learning team,

I have a well-trained PPO actor-critic agent. I set UseExplorationPolicy to 0 and obtained actions from both sim() and getAction(), with no randomness in the environment. Both calls use the same agent and the same observations.

However, the actions returned by sim() and getAction() differ, although each set of actions is reproducible on its own.

I would therefore like to know how sim() generates actions. Do the actions come from the actor network? If so, why do the results differ when the network is the same?

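Code:

% Action computed directly from the agent for a single observation
actoraction = getAction(saved_agent,{testobstate});

% Custom environment built from reset and step function handles
ResetHandleT = @() myResetFunctionCNsim(testData,testobstate);
StepHandleT = @(Action,StockSaved) myStepFunctionCNsim(Action,StockSaved,testData,testobstate);
envT = rlFunctionEnv(observationInfo,actionInfo,StepHandleT,ResetHandleT);

% Actions produced by simulating the same agent in that environment
experience = sim(envT,saved_agent,simOpts);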

I look forward to your reply.

Sincerely,

Shuyue

Answer 1

Emmanouil Tzorakoleftherakis on 25 Sep 2023

Direct link to this answer

https://se.mathworks.com/matlabcentral/answers/2018301-why-reinforcement-learning-has-different-results-of-action-between-sim-and-getaction#answer_1317957

Hi,

Which release are you using? We tested R2023a and R2023b with UseExplorationPolicy = 0, and getAction and sim returned the same results. A reproduction model would help us investigate.
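
As a starting point for such a reproduction, the comparison could be set up roughly as in the sketch below. It assumes Reinforcement Learning Toolbox is installed. The toy environment, the spec sizes, obs0, the 0.9 decay, and the names resetFcn, stepFcn, simActions, and manualActions are hypothetical placeholders, not taken from the original question. The script logs the actions from sim(), then replays the episode by hand with reset, getAction, and step, and reports the largest difference between the two.

```matlab
% Sketch of a reproduction script (requires Reinforcement Learning Toolbox).
% All sizes, values, and names below are hypothetical placeholders.
rng(0);                                   % reproducible network initialization

obsInfo = rlNumericSpec([3 1]);
actInfo = rlNumericSpec([1 1],'LowerLimit',-1,'UpperLimit',1);

% Deterministic toy environment: each step scales the observation by 0.9.
obs0     = [0.1; -0.2; 0.3];
resetFcn = @() deal(obs0, struct('Obs',obs0));
stepFcn  = @(action,info) deal(0.9*info.Obs, 0, false, struct('Obs',0.9*info.Obs));
env      = rlFunctionEnv(obsInfo,actInfo,stepFcn,resetFcn);

% Default (untrained) PPO agent with exploration switched off.
agent = rlPPOAgent(obsInfo,actInfo);
agent.UseExplorationPolicy = false;

% Actions logged by sim().
simOpts    = rlSimulationOptions('MaxSteps',5);
experience = sim(env,agent,simOpts);
actName    = fieldnames(experience.Action);   % field name depends on the action spec
simActions = experience.Action.(actName{1}).Data(:);

% Actions from getAction() in a hand-written rollout of the same episode.
manualActions = zeros(size(simActions));
obs = reset(env);
for k = 1:numel(simActions)
    act = getAction(agent,{obs});
    manualActions(k) = act{1};
    obs = step(env,act{1});                   % next observation fed to the next call
end

maxDiff = max(abs(double(simActions) - manualActions));
fprintf('Largest difference between sim and getAction: %g\n', maxDiff);
```

To adapt this to the original setup, replace env and agent with envT and saved_agent. A nonzero difference at a given step would show where the observation passed to getAction() stops matching the one the agent receives inside sim().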
