Unclear RL reward scheme

GCats on 25 May 2022
I'm looking at the reward scheme in the example at https://nl.mathworks.com/help/reinforcement-learning/ug/train-ddpg-agent-for-adaptive-cruise-control.html and I don't quite understand the role of the previous control input term in the reward scheme. Does the agent get a negative reward proportional to the control signal from the previous time step? I'm not sure what that means.
Anyone able to clarify? Thank you!
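
To make the term in question concrete, here is a minimal Python sketch of the reward. It assumes the linked example defines the reward as r_t = -(0.1*e_t^2 + u_{t-1}^2) + M_t, where e_t is the velocity error, u_{t-1} is the control input from the previous time step, and M_t is 1 when e_t^2 <= 0.25 and 0 otherwise. The function and argument names are hypothetical and are not taken from the example's code.

```python
def acc_reward(velocity_error: float, previous_control: float) -> float:
    """Sketch of the assumed reward r_t = -(0.1*e_t^2 + u_{t-1}^2) + M_t.

    The formula is an assumption about the linked example; the names
    used here are illustrative only.
    """
    # Penalty for deviating from the target velocity.
    tracking_penalty = 0.1 * velocity_error ** 2
    # Penalty on the squared control input applied at the previous step.
    effort_penalty = previous_control ** 2
    # Bonus M_t: 1 when the squared velocity error is at most 0.25, else 0.
    bonus = 1.0 if velocity_error ** 2 <= 0.25 else 0.0
    return -(tracking_penalty + effort_penalty) + bonus


if __name__ == "__main__":
    # Same velocity error, increasing previous control input.
    for u_prev in (0.0, 1.0, 2.0):
        print(u_prev, acc_reward(0.5, u_prev))
```

Under that assumed form, the penalty scales with the square of the previous control input rather than with the input itself, so large acceleration commands of either sign lower the reward.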

Answers (0)
