Hi @Fabian,
Thanks for your questions about the DDPG training options NumEpoch, MaxMiniBatchPerEpoch, and LearningFrequency. I checked the MathWorks documentation, and here's a quick rundown:
NumEpoch is the number of epochs, i.e., how many times the agent iterates over the sampled experience data within a single learning iteration. More epochs means more network updates per learning step, at the cost of extra computation.
MaxMiniBatchPerEpoch caps how many mini-batches the agent processes in each epoch, which keeps the compute time and resource use of a single learning step bounded.
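To make the interplay concrete: with NumEpoch = 2 and MaxMiniBatchPerEpoch = 100, for example, a single learning iteration performs at most 2 × 100 = 200 mini-batch gradient updates (fewer if the buffer doesn't yet hold enough samples).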
LearningFrequency controls how often the agent updates its networks relative to environment steps: a value of 4, for example, triggers one learning iteration every 4 environment steps. (If I'm reading the docs right, the default of -1 instead makes the agent learn at the end of each episode.)
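If it helps, here's a minimal sketch of how you'd set these together. The property names come from the rlDDPGAgentOptions doc page linked below; the numeric values are just illustrative, and note that these three options only exist in recent Reinforcement Learning Toolbox releases:

```matlab
% Minimal sketch: configuring the three options on a DDPG agent.
% Values are illustrative, not recommendations.
opt = rlDDPGAgentOptions;
opt.MiniBatchSize        = 64;   % samples drawn per mini-batch
opt.NumEpoch             = 2;    % passes over the sampled data per learning iteration
opt.MaxMiniBatchPerEpoch = 100;  % cap on mini-batches within each epoch
opt.LearningFrequency    = 4;    % one learning iteration every 4 environment steps

% Apply the options when building the agent, for example:
% agent = rlDDPGAgent(actor, critic, opt);   % actor/critic defined elsewhere
% or, on an existing agent:  agent.AgentOptions = opt;
```

With this configuration, every 4 environment steps the agent samples experiences and runs 2 epochs of up to 100 mini-batches of 64 samples each.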
If you want, you can take a look at the official docs here: [rlDDPGAgentOptions](https://www.mathworks.com/help/reinforcement-learning/ref/rlddpgagentoptions.html)
Let me know if you'd like more detail on any of these or help with anything else.
Hope this helps.