DQN Agent Training not working - takes longer than it should

Hello,
I'm new to DQN algorithms and to reinforcement learning in general, so I've been working through the MATLAB RL examples in the hope of learning more about the subject.
I've been trying to train the agent from the MATLAB Cart Pole DQN example. I'm using the script that opens when I type openExample('rl/MATLABCartPoleDQNExample') in the command window; I haven't modified a single line of code.
The following graph (episode reward vs. episode number) is the one provided by MATLAB with the example. By episode 38, the average reward is greater than 480, at which point training ends.
The next graph is what I obtained when I ran the code and trained the agent myself. It took 153 episodes (about half an hour for me) for the average reward to surpass 480, and the agent's performance 'wanders' around a lot along the way. Something else that intrigues me, and that I can't seem to understand, is why a few consecutive episodes with really high reward are suddenly followed by many episodes with really low reward.
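For reference, the stopping rule described above (training ends once the average reward exceeds 480) can be sketched in plain MATLAB. This is only an illustration, not the toolbox's actual implementation: the reward values are made up, the window length of 5 episodes is an assumption, and the names episodeReward, windowLength, stopValue and stopEpisode are hypothetical.

```matlab
% Hypothetical per-episode rewards (illustrative numbers, not from either run)
episodeReward = [20 35 500 500 60 45 500 500 500 500 500];
windowLength  = 5;     % assumed averaging window, in episodes
stopValue     = 480;   % threshold mentioned in the post

stopEpisode = NaN;     % stays NaN if the threshold is never exceeded
for k = 1:numel(episodeReward)
    % Average over the most recent windowLength episodes (fewer at the start)
    firstIdx  = max(1, k - windowLength + 1);
    avgReward = mean(episodeReward(firstIdx:k));
    if avgReward > stopValue
        stopEpisode = k;   % first episode at which the average exceeds the threshold
        break
    end
end
```

With these made-up numbers, a couple of low-reward episodes after an early pair of 500s keep the windowed average below 480 until the low episodes have left the window, so a single good episode is never enough to end training under this kind of rule.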
Side note: I excluded the Episode Q0 curve from the graph because it varied wildly, reaching values as high as 42720... I don't understand why that happens either.
My questions are:
Is this behavior normal in the training process of a DQN agent? Or does it have something to do with the computer I'm using (which I hope not, because it is brand new)?
Why is it that a few episodes with a really high reward are followed by a bunch of episodes with low reward?
I would also really appreciate some suggestions on how to fix this, if it shouldn't be happening.
Thanks a lot!!!

Answers (0)

Release

R2020a
