Train a DDPG agent to swing a pole with constraints
I'm currently working with the pendulum environment and the DDPG agent described in this document: https://nl.mathworks.com/help/reinforcement-learning/ug/train-ddpg-agent-to-swing-up-and-balance-pendulum.html
Now I would like to add some constraints in the Simulink model, between the observations and the agent (I believe this technique is called shielding). For example, I would like to constrain the angular speed of the pendulum before the observations are fed to the agent.
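To make the idea concrete, here is a rough sketch of the kind of clipping I have in mind, written as a function that could sit in a MATLAB Function block on the observation signal. It assumes the observation vector is ordered as in the linked example (sine of the angle, cosine of the angle, angular velocity). The function name `clipObservation` and the limit `maxSpeed` are hypothetical names I made up for illustration, not part of the example model.

```matlab
function obs = clipObservation(obs, maxSpeed)
% Saturate the angular-velocity entry of the observation vector.
% Assumed layout: obs = [sin(theta); cos(theta); thetadot]
% maxSpeed is a hypothetical positive limit in rad/s.
obs(3) = min(max(obs(3), -maxSpeed), maxSpeed);
end
```

For instance, `clipObservation([0; 1; 5], 2)` would pass the first two entries through unchanged and limit the third to 2. A Saturation block on the angular-velocity signal would presumably do the same thing without any code.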
I think one option could be to use the Constraint Enforcement block in Simulink, but I am not sure how to approach the implementation.
Could anyone help me get started on this problem? Thanks!