Is it possible to change RL action values under certain conditions?
I want my agent to output a target value, but in certain situations (when the reward drops dramatically) I want the agent to look for a better solution by letting it change the target value. I tried using an Initial Condition block to apply the target value at the start. However, my agent (PPO) always outputs an average value after some training episodes.
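The conditional behaviour described here (hold a target value, and let the agent's action replace it only after a sharp reward drop) can be sketched as a small gate between the agent and the environment. This is a minimal sketch, not the poster's model and not a Reinforcement Learning Toolbox API: the class name `TargetGate`, the parameter `drop_threshold`, and the rule "drop measured against the previous step's reward" are all assumptions made for illustration. It is written in Python only to show the logic; the same logic could be ported to a MATLAB Function block.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TargetGate:
    """Hypothetical gate: holds a target until the reward drops sharply."""

    initial_target: float   # value applied until the gate first opens
    drop_threshold: float   # assumed: minimum step-to-step reward drop that opens the gate
    target: float = field(init=False)
    prev_reward: Optional[float] = field(init=False, default=None)

    def __post_init__(self) -> None:
        # Start from the fixed target, like an initial condition.
        self.target = self.initial_target

    def step(self, agent_action: float, reward: float) -> float:
        """Return the target to apply for this step."""
        # The gate opens only if a previous reward exists and the reward
        # fell by more than drop_threshold since that step.
        dropped = (
            self.prev_reward is not None
            and (self.prev_reward - reward) > self.drop_threshold
        )
        if dropped:
            # Accept the agent's proposed value as the new target.
            self.target = agent_action
        self.prev_reward = reward
        return self.target


# Usage with made-up numbers: the target stays at 3.0 until the reward
# falls by more than 5.0 between consecutive steps.
gate = TargetGate(initial_target=3.0, drop_threshold=5.0)
applied = [
    gate.step(agent_action=6.0, reward=10.0),  # no previous reward yet
    gate.step(agent_action=6.0, reward=9.0),   # drop of 1.0, gate stays closed
    gate.step(agent_action=6.0, reward=2.0),   # drop of 7.0, gate opens
    gate.step(agent_action=4.0, reward=2.0),   # no drop, new target is held
]
```

One caveat with this design: while the gate is closed, the agent's action has no effect on the environment, so the reward in those steps says nothing about the action it chose. Whether that interacts with the averaging behaviour described above is not something this sketch addresses.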
5 Comments
Emmanouil Tzorakoleftherakis
on 18 May 2021
Can you provide some more information? What do you mean by letting the agent change the target value? Isn't that what happens by default every time the agent takes an action? What is the environment architecture?
black_cat
on 18 May 2021
Emmanouil Tzorakoleftherakis
on 19 May 2021
Thanks. It's still not clear to me what you mean by "However, this results in having an output of 3 since the agent is averaging it during training". If it's best to output a 6, the agent should do so, so why would it average the output? Unless you are talking about the average episode reward that you see in the Episode Manager?
Answers (0)