RL agent does not learn properly
Hello everyone,
I am trying to learn the Reinforcement Learning Toolbox and want to control the speed of a DC motor with an RL agent, replacing a PI controller. I based my setup on the water tank example, but I am running into problems during training.
First, the agent tends to drive itself to either the minimum (0 rpm) or the maximum (6000 rpm) and then stays there, even though it had already achieved good rewards in earlier episodes.
My reward function uses the error between the target and measured speed, expressed as a percentage. When I add a penalty so that the agent does not stay at 0 rpm, it still sits at 0 rpm and does not explore the rest of the range. I am also having trouble eliminating the remaining steady-state error.
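The actual reward is built from blocks inside the Simulink model, so it does not appear in the code below; the following is only a simplified sketch of the idea (the function name rewardSketch and the 100 rpm penalty threshold are illustrative, not my exact values):

% Simplified sketch of the reward logic (the real version is built
% from blocks in the Simulink model); speeds are in rpm
function r = rewardSketch(omega_ref, omega_meas)
    pct_error = abs(omega_ref - omega_meas)/omega_ref*100; % speed error in percent
    r = -pct_error;             % smaller error -> larger reward
    if omega_meas < 100         % extra penalty for parking near 0 rpm
        r = r - 10;
    end
end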
My code and some screenshots are below.

close all

% DC motor parameters
R  = 7.03;         % armature resistance [Ohm]
L  = 1.04e-3;      % armature inductance [H]
J  = 44.2e-7;      % rotor inertia [kg*m^2]
a  = 2.45e-6;      % friction coefficient
Kn = 250*2*pi/60;  % speed constant: 250 rpm/V in rad/(s*V)
Km = 38.2e-3;      % torque constant [N*m/A]
% Action: motor terminal voltage, limited to 0..24 V
actInfo = rlNumericSpec([1 1],'LowerLimit',0,'UpperLimit',24);
actInfo.Name = 'spannung'; % German for "voltage"
obsInfo = rlNumericSpec([3 1], ...
    'LowerLimit',[-inf -inf -inf]', ...
    'UpperLimit',[ inf  inf  inf]');
obsInfo.Name = 'observations';
obsInfo.Description = 'integrated error, error, and measured rpm';

env = rlSimulinkEnv("DCMotorRL2","DCMotorRL2/RL Agent", ...
    obsInfo,actInfo);
env.ResetFcn = @(in)localResetFcn(in);
Ts = 0.1;  % agent sample time [s]
Tf = 20;   % simulation time [s]
rng(0)     % fix the random seed for reproducibility
% Critic network: observation path and action path merged by an addition layer
statePath = [
    featureInputLayer(obsInfo.Dimension(1),Name="netObsIn")
    fullyConnectedLayer(50)
    reluLayer
    fullyConnectedLayer(25,Name="CriticStateFC2")];
actionPath = [
    featureInputLayer(actInfo.Dimension(1),Name="netActIn")
    fullyConnectedLayer(25,Name="CriticActionFC1")];
commonPath = [
    additionLayer(2,Name="add")
    reluLayer
    fullyConnectedLayer(1,Name="CriticOutput")];
criticNetwork = layerGraph();
criticNetwork = addLayers(criticNetwork,statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork, ...
    "CriticStateFC2","add/in1");
criticNetwork = connectLayers(criticNetwork, ...
    "CriticActionFC1","add/in2");
criticNetwork = dlnetwork(criticNetwork);
figure
plot(criticNetwork)
critic = rlQValueFunction(criticNetwork,obsInfo,actInfo, ...
    ObservationInputNames="netObsIn", ...
    ActionInputNames="netActIn");
% Actor network: deterministic policy mapping observations to a voltage
actorNetwork = [
    featureInputLayer(obsInfo.Dimension(1))
    fullyConnectedLayer(9) %3
    tanhLayer
    fullyConnectedLayer(actInfo.Dimension(1))];
actorNetwork = dlnetwork(actorNetwork);
actor = rlContinuousDeterministicActor(actorNetwork,obsInfo,actInfo);
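% Create the DDPG agent and set its hyperparameters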
agent = rlDDPGAgent(actor,critic);
agent.AgentOptions.SampleTime = Ts;
agent.AgentOptions.TargetSmoothFactor = 1e-3;
agent.AgentOptions.DiscountFactor = 1.0;
agent.AgentOptions.MiniBatchSize = 64;
agent.AgentOptions.ExperienceBufferLength = 1e6;
agent.AgentOptions.NoiseOptions.Variance = 0.8; %0.3
agent.AgentOptions.NoiseOptions.VarianceDecayRate = 1e-5; %-5
agent.AgentOptions.CriticOptimizerOptions.LearnRate = 1e-03;
agent.AgentOptions.CriticOptimizerOptions.GradientThreshold = 1;
agent.AgentOptions.ActorOptimizerOptions.LearnRate = 1e-04;
agent.AgentOptions.ActorOptimizerOptions.GradientThreshold = 1;
trainOpts = rlTrainingOptions( ...
    MaxEpisodes=4000, ...
    MaxStepsPerEpisode=ceil(Tf/Ts), ...
    ScoreAveragingWindowLength=20, ...
    Verbose=false, ...
    Plots="training-progress", ...
    StopTrainingCriteria="AverageReward", ...
    StopTrainingValue=800, ...
    SaveAgentCriteria="EpisodeCount", ...
    SaveAgentValue=600);
doTraining = true;
if doTraining
    % Train the agent.
    trainingStats = train(agent,env,trainOpts);
end
function in = localResetFcn(in)
    % Randomize the reference speed (rpm) at the start of each episode
    blk = 'DCMotorRL2/omega_ref';
    h = randi([2000,4000]);
    in = setBlockParameter(in,blk,'Value',num2str(h));

    % Optionally randomize the initial motor speed (given in 1/min):
    % h = randi([2000,4000])*(2*pi)/60;
    % blk = 'DCMotorRL2/DCMotor/Integrator1';
    % in = setBlockParameter(in,blk,'InitialCondition',num2str(h));
end