Why do I get a different action every time for the same sample observations after deploying a trained RL policy?
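% Load the trained agent from the saved MAT-file.
load("agent0218_300016_40000.mat","agent");
% Recover the observation and action specifications from the agent.
obsInfo = getObservationInfo(agent);
actInfo = getActionInfo(agent);
% Build a custom function environment driven by the test data in test_sss.
ResetHandle = @() myResetFunction(test_sss);
StepHandle = @(Action,LoggedSignals) myStepFunction(Action,LoggedSignals,test_sss);
envT = rlFunctionEnv(obsInfo,actInfo,StepHandle,ResetHandle);
% Simulate the agent for at most one step per row of test_sss.
simOpts = rlSimulationOptions('MaxSteps',size(test_sss,1));
experience = sim(envT,agent,simOpts);
% Actions chosen by the agent during simulation ("bs" is the action channel name).
ac3 = squeeze(experience.Action.bs.Data);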
% Generate evaluatePolicy.m (and its agentData.mat data file) from the trained agent.
generatePolicyFunction(agent);
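% Evaluate the generated policy on the same observations, one row at a time.
action1 = zeros(size(ac3,1),1);
for iii = 1:size(ac3,1)
    observation1 = test_sss{iii,:};
    action1(iii,1) = evaluatePolicy(observation1);
end
% Total absolute difference between simulated and deployed actions (0 if they agree).
sum(abs(ac3-action1))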
Accepted Answer
More Answers (1)
de y on 24 Feb 2021