Reinforcement learning DDPG agent semi-active control issue

Dear MATLAB community,
I have implemented a reinforcement learning agent (DDPG) for controlling a semi-active suspension system in Simulink for my master's thesis. The Simulink model is a half-car model with two tires connected to a chassis body, and the agent should control the variable dampers of the front and rear axles. But in every training session, even with a huge number of episodes, the DDPG agent only learns a suboptimal control strategy. Mostly the result is the lowest possible damping for the rear axle and the maximum for the front axle, with only tiny control adjustments (example in the picture).
Description of the Model:
  • 13 continuous observations
  • 2 continuous actions
  • Reward function: negative quadratic chassis and pitch acceleration
  • Reset function loads a pseudorandom road profile each episode
  • Damping coefficient from 900 to 4300 Ns/m
  • Each episode lasts 10 seconds
I have tried all of these changes and the results are mostly the same:
  • 'NumHiddenUnit' of 25 and 256
  • Actor learn rate of 1e-3 and 1e-4
  • With and without parallel computing
  • 300, 1500 and 2000 episodes
My questions:
  • Why does my agent only make such small control steps?
  • Is it possible that my DDPG agent doesn't explore enough?
Sorry for my bad English, and thank you all for your help.
%% Agent creation
% Action space: two continuous damper commands, limited to the admissible damping range
actInfo = rlNumericSpec([2 1], ...
    'LowerLimit', hfmParam.dA.value(1), ...
    'UpperLimit', hfmParam.dA.value(2));
% Observation space: 13 continuous signals (last one limited to [0, 40])
obsInfo = rlNumericSpec([13 1], ...
    'LowerLimit', [-inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf 0]', ...
    'UpperLimit', [inf inf inf inf inf inf inf inf inf inf inf inf 40]');
%% Environment
env = rlSimulinkEnv(mdl, agentBlock, obsInfo, actInfo);
env.ResetFcn = @(in)localResetFcn(in);   % loads a pseudorandom road profile each episode
% Agent options
agentOpts = rlDDPGAgentOptions('SampleTime', tS);
initOpts = rlAgentInitializationOptions('NumHiddenUnit', obsInfo.Dimension(1)*2-1);
% Agent with default actor/critic networks
agent = rlDDPGAgent(obsInfo, actInfo, initOpts, agentOpts);
% Adjust the learn rates of the default critic and actor representations
critic = getCritic(agent);
critic.Options.LearnRate = 1e-3;
agent = setCritic(agent, critic);
actor = getActor(agent);
actor.Options.LearnRate = 1e-4;
agent = setActor(agent, actor);
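For reference, a simplified sketch of such a reset function (the variable name roadProfileIdx and the number of stored profiles are placeholders, not the actual implementation):
% Simplified sketch of a reset function that picks a pseudorandom road profile
% each episode (placeholder names, not the actual implementation).
function in = localResetFcn(in)
    profileIdx = randi(10);                               % choose one of 10 stored road profiles
    in = setVariable(in, 'roadProfileIdx', profileIdx);   % pass the selection to the Simulink model
end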

Accepted Answer

Emmanouil Tzorakoleftherakis
Hello,
This is very open-ended, so there could be a lot of ways to improve your setup. My guess is that the issue is closely related to the second question you raise above: if the agent does not explore enough, all the other parameters you played with won't make much difference.
First, it is important to understand how exploration works for DDPG. What happens is that we add noise sampled from a noise model to the deterministic policy output (step 1 here). If the parameters of the noise model are not tuned well, the added noise will be very small compared to your action range and the agent will not explore (which I suspect is what happens here, given that you do not tune the noise options in your code above).
Please take a look at this note in the doc. At a minimum, you should make sure that the variance of the noise model is between 1% and 10% of your action range. Then you can play with the variance decay rate. That should help you make some progress.
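For example, with an action range of 900 to 4300 Ns/m, tuning the exploration noise could look roughly like this (the 5% variance and the decay rate are illustrative starting points you would need to tune, not recommended values):
% Sketch of exploration-noise tuning for DDPG (values are illustrative starting points).
actRange = 4300 - 900;                               % action span in Ns/m
agentOpts = rlDDPGAgentOptions('SampleTime', tS);
agentOpts.NoiseOptions.Variance = 0.05 * actRange;   % roughly 5% of the action range
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;     % decay slowly so early exploration remains
agent = rlDDPGAgent(obsInfo, actInfo, initOpts, agentOpts);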
  5 Comments
Emmanouil Tzorakoleftherakis
Setting the appropriate noise parameters is a necessary step for a correct problem formulation; it does not guarantee successful learning. If the agent actions during training make sense, i.e., if the agent is exploring values that make sense, the next thing to look at is your reward signal.
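For example, a quadratic reward of the kind described in the question can be written with explicit weights so that the chassis and pitch terms can be balanced against each other (the weights and signal names below are placeholders):
% Sketch of a weighted quadratic reward (weights and signal names are placeholders).
wChassis = 1;                                            % weight on chassis (heave) acceleration
wPitch   = 1;                                            % weight on pitch acceleration
reward   = -(wChassis*chassisAcc^2 + wPitch*pitchAcc^2);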
Maha Mosalam on 1 Dec 2021
Hello,
If I have a very small action range, maybe between -0.001 and 0.001, how can I choose the exploration noise? The action actually does not change its value during the steps. Any help with that?

Release: R2020b