Why does the reinforcement learning agent seem not to learn anything?

HUNG JUI CHIU on 31 Mar 2021
Commented: Claudia AB on 13 Jan 2022 at 14:35
If the reward does not converge to a stable value, does that mean the RL agent is not learning anything?
The results show that the agent makes different choices in every training episode; it does not seem to learn anything useful from the previous ones.
Even when an episode gives a good reward and a good result, the agent does not stick to those good choices in the next episode. It tries other choices and then gets a bad result.
How can I deal with this problem?
Thanks for your help.
  1 Comment
Claudia AB on 13 Jan 2022 at 14:35
Hello, no matter how many different designs I try, I get something similar to your plot. I would like to know whether you were able to find a solution and, if so, how you did it.
My plot may have other problems too, because the average keeps going up and down, which I think is not good. If you have any idea how to obtain a smoother curve, I would be grateful for that too.
Thank you for the help.


Answers (1)

Tarunbir Gambhir on 27 May 2021
If the agent is not making good choices in later episodes, the exploration probability epsilon is likely still high, so the agent keeps picking random actions instead of the best action it has learned. Try increasing the "agentOptions.EpsilonGreedyExploration.EpsilonDecay" parameter so that epsilon shrinks faster and the agent exploits the previously learned Q-values in later episodes.
You can refer to this documentation page for more information on how the epsilon-greedy exploration parameters affect training.
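As a minimal sketch of the suggestion above, the options below set a faster decay for a DQN agent. The agent type (rlDQNAgentOptions) and all numeric values are assumptions for illustration; the question does not say which agent or settings were used. The sketch also assumes the update rule described in the toolbox documentation, where epsilon is multiplied by (1 - EpsilonDecay) after each training step until it reaches EpsilonMin.

```matlab
% Assumption: a DQN agent; the question does not state the agent type.
agentOptions = rlDQNAgentOptions;

% Illustrative values only; tune them for your own environment.
agentOptions.EpsilonGreedyExploration.Epsilon      = 1;     % initial probability of a random action
agentOptions.EpsilonGreedyExploration.EpsilonMin   = 0.01;  % floor at which epsilon stops decaying
agentOptions.EpsilonGreedyExploration.EpsilonDecay = 0.005; % larger value makes epsilon shrink faster
```

Under that update rule, these values bring epsilon from 1 down to 0.01 after roughly 920 training steps. With EpsilonDecay = 0.0005 it would take roughly 9,200 steps, so if the whole training run is shorter than that, the agent is still exploring heavily in the later episodes.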
