
SimulinkEnvWithAgent

Reinforcement learning environment with a dynamic model implemented in Simulink

Description

The SimulinkEnvWithAgent object represents a reinforcement learning environment that uses a dynamic model implemented in Simulink®. The environment object acts as an interface such that when you call sim or train, these functions in turn call the Simulink model to generate experiences for the agents.

Creation

To create a SimulinkEnvWithAgent object, use one of the following functions.

  • rlSimulinkEnv — Create an environment using a Simulink model with at least one RL Agent block.

  • createIntegratedEnv — Use a reference model as a reinforcement learning environment.

  • rlPredefinedEnv — Create one of the predefined Simulink reinforcement learning environments.

Properties


Model — Simulink model name

This property is read-only.

Simulink model name, returned as a string or character vector. The specified model must contain one or more RL Agent blocks.

AgentBlock — Agent block paths

This property is read-only.

Agent block paths, returned as a string or string array.

If Model contains a single RL Agent block for training, then AgentBlock is a string containing the block path.

If Model contains multiple RL Agent blocks for training, then AgentBlock is a string array, where each element contains the path of one agent block.

Model can contain RL Agent blocks whose path is not included in AgentBlock. Such agent blocks behave as part of the environment and select actions based on their current policies. When you call sim or train, the experiences of these agents are not returned and their policies are not updated.

The agent blocks can be inside of a model reference. For more information on configuring an agent block for reinforcement learning, see RL Agent.

ResetFcn — Reset behavior for the environment

Reset behavior for the environment, specified as a function handle or anonymous function handle. The function must have a single Simulink.SimulationInput input argument and a single Simulink.SimulationInput output argument. The output object specifies temporary changes applied to the model for the duration of the simulation or training episode. For more information about Simulink simulation input objects, see Simulink.SimulationInput (Simulink).

The reset function can set the initial state or parameters of the Simulink environment. For example, you can create a reset function that randomizes certain block states such that each training episode begins from different initial conditions.

If you have an existing reset function myResetFunction on the MATLAB® path, set ResetFcn using a handle to the function.

env.ResetFcn = @(in)myResetFunction(in);
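A sketch of what such a function might contain, assuming a model-workspace variable named x0 (a hypothetical name used only for illustration):

```matlab
function in = myResetFunction(in)
% Hypothetical reset function: randomize the initial condition
% variable x0 before each episode. Replace x0 with the variable
% names actually defined in your model.
in = setVariable(in,'x0',rand());
end
```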

If your reset behavior is simple, you can implement it using an anonymous function handle. For example, the following code uses setVariable (Simulink) to set the variable x0 to a random value in the Simulink.SimulationInput (Simulink) object in. The value of x0 that you specify overrides the existing x0 value in the model workspace for the duration of the simulation or training. The value of x0 is then reverted to the original when the simulation or training completes.

env.ResetFcn = @(in) setVariable(in,'x0',rand());

If you call the reset function of a SimulinkEnvWithAgent object that has an empty ResetFcn property, a Simulink.SimulationInput object for the unmodified Simulink model is returned.
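You can check the configured reset behavior by calling reset on the environment directly, which returns the Simulink.SimulationInput object that sim or train would use:

```matlab
% With an empty ResetFcn, this returns a Simulink.SimulationInput
% object describing the unmodified model; otherwise it returns the
% object produced by your reset function.
in = reset(env);
```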

The sim function calls the reset function to reset the environment at the start of each simulation episode, and the train function calls it at the start of each training episode.

For more information, see Reset Function for Simulink Environments.

Example: env.ResetFcn = @myResetFunction; sets the ResetFcn property of the SimulinkEnvWithAgent object env to the handle of the existing function myResetFunction.

UseFastRestart — Option to toggle fast restart

Option to toggle fast restart, specified as either "on" or "off". Fast restart allows you to perform iterative simulations without compiling a model or terminating the simulation each time.

For more information on fast restart, see How Fast Restart Improves Iterative Simulations (Simulink).

Example: env.UseFastRestart="off" sets the UseFastRestart property of the SimulinkEnvWithAgent object env to "off".

Object Functions

train — Train reinforcement learning agents within a specified environment
sim — Simulate trained reinforcement learning agents within a specified environment
getObservationInfo — Obtain observation data specifications from reinforcement learning environment, agent, or experience buffer
getActionInfo — Obtain action data specifications from reinforcement learning environment, agent, or experience buffer

Examples


Create a Simulink environment using the trained agent and corresponding Simulink model from the Control Water Level in a Tank Using a DDPG Agent example.

Load the agent in the MATLAB® workspace.

load WaterTankDDPG

Create an environment for the rlwatertank model, which contains an RL Agent block. Because the agent used by the block is already in the workspace, you do not need to pass the observation and action specifications to create the environment.

env = rlSimulinkEnv("rlwatertank","rlwatertank/RL Agent")
env = 
SimulinkEnvWithAgent with properties:

           Model : rlwatertank
      AgentBlock : rlwatertank/RL Agent
        ResetFcn : []
  UseFastRestart : on

Validate the environment by performing a short simulation for two sample times.

validateEnvironment(env)

You can now train and simulate the agent within the environment by using train and sim, respectively.
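For instance, a short simulation might look like the following sketch; the variable name agent (assumed to be loaded from the MAT file) and the step limit are assumptions:

```matlab
% Hypothetical simulation run: limit the episode to 200 steps.
simOpts = rlSimulationOptions("MaxSteps",200);
experience = sim(env,agent,simOpts);
```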

For this example, you use rlNumericSpec to define an observation space consisting of a single channel carrying a three-element vector. You then use rlFiniteSetSpec to define an action space consisting of a single channel carrying only one of three possible values. You then use these observation and action specifications to create a custom Simulink® environment that relies on the rlSimplePendulumModel Simulink model.

The model represents a simple frictionless pendulum that initially hangs in a downward position. Open the model.

mdl = "rlSimplePendulumModel";
open_system(mdl)

An rlNumericSpec object specifies an environment channel that carries signals (actions or observations) that belong to a continuous set. By contrast, an rlFiniteSetSpec object specifies a channel that carries signals that belong to a finite set (a set containing only a finite number of elements).

If you have an existing environment, you can extract its action or observation specifications (which in general are vectors of rlNumericSpec and rlFiniteSetSpec objects) using the getActionInfo or getObservationInfo functions.

In this example, instead, you need to create a new custom environment. To do so, you must first define the environment action and observation channels.

To define the channel that represents the observation space, use rlNumericSpec. The channel carries a vector containing three signals (the sine, cosine, and time derivative of the angle).

obsInfo = rlNumericSpec([3 1]) 
obsInfo = 
  rlNumericSpec with properties:

     LowerLimit: -Inf
     UpperLimit: Inf
           Name: [0×0 string]
    Description: [0×0 string]
      Dimension: [3 1]
       DataType: "double"

To define the channel that represents the action space, use rlFiniteSetSpec. The channel carries a scalar expressing the torque, which can take one of three possible values: -2 Nm, 0 Nm, or 2 Nm.

actInfo = rlFiniteSetSpec([-2 0 2])
actInfo = 
  rlFiniteSetSpec with properties:

       Elements: [3×1 double]
           Name: [0×0 string]
    Description: [0×0 string]
      Dimension: [1 1]
       DataType: "double"

You can use dot notation to assign property values for the rlNumericSpec and rlFiniteSetSpec objects.

obsInfo.Name = "observations";
actInfo.Name = "torque";

You can now use these specifications to create both a new custom environment and an agent object that works within your environment.

To create your custom Simulink environment, use rlSimulinkEnv. Specify the Simulink model as the first argument, the path of the agent block as the second argument, and then the observation and action specifications that you created in the previous steps. For more information on custom Simulink environments, see Create Custom Simulink Environments.

agentBlk = mdl + "/RL Agent";
env = rlSimulinkEnv(mdl,agentBlk,obsInfo,actInfo)
env = 
SimulinkEnvWithAgent with properties:

           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : []
  UseFastRestart : on

Specify a reset function using dot notation. For this example, randomly initialize theta0 in the model workspace using the setVariable (Simulink) function.

env.ResetFcn = @(in) setVariable(in,"theta0",randn,"Workspace",mdl)
env = 
SimulinkEnvWithAgent with properties:

           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : @(in)setVariable(in,"theta0",randn,"Workspace",mdl)
  UseFastRestart : on

Here, in is a Simulink.SimulationInput (Simulink) object, and the value of theta0 that you specify overrides the existing theta0 value in the model workspace for the duration of the simulation or training. The value of theta0 then reverts to the original when the simulation or training completes. For more information on reset functions, see Reset Function for Simulink Environments.

You can now use env (together with an agent object) as an argument to the built-in functions train and sim, which train and simulate the agent within the environment.
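As a sketch, a training run might look like the following; the agent variable and the option values are assumptions for illustration:

```matlab
% Hypothetical training run; create or load a compatible agent first.
trainOpts = rlTrainingOptions( ...
    "MaxEpisodes",500, ...
    "MaxStepsPerEpisode",500);
trainingStats = train(agent,env,trainOpts);
```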

Create an environment for the Simulink model from the example Train Multiple Agents to Perform Collaborative Task.

Load the file containing the agents. For this example, load the agents that have already been trained using decentralized learning.

load decentralizedAgents.mat

Create an environment for the rlCollaborativeTask model, which has two agent blocks. Because the agents used by the two blocks (agentA and agentB) are already in the workspace, you do not need to pass their observation and action specifications to create the environment.

env = rlSimulinkEnv( ...
    "rlCollaborativeTask", ...
    ["rlCollaborativeTask/Agent A","rlCollaborativeTask/Agent B"])
env = 
SimulinkEnvWithAgent with properties:

           Model : rlCollaborativeTask
      AgentBlock : [
                     rlCollaborativeTask/Agent A
                     rlCollaborativeTask/Agent B
                   ]
        ResetFcn : []
  UseFastRestart : on

It is good practice to specify a reset function for the environment such that agents start from random initial positions at the beginning of each episode. For an example, see the resetRobots function defined in Train Multiple Agents to Perform Collaborative Task.
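As a sketch, such a reset function could randomize workspace variables that set the initial positions; the variable names xA0 and xB0 below are hypothetical stand-ins for the ones defined in the actual model:

```matlab
% Hypothetical reset: randomize the initial position variables of
% both robots. setVariable returns the modified SimulationInput
% object, so the calls can be chained.
env.ResetFcn = @(in) setVariable( ...
    setVariable(in,"xA0",rand()),"xB0",rand());
```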

You can now simulate or train the agents within the environment using the sim or train functions, respectively.

Use the predefined "SimplePendulumModel-Continuous" keyword to create a continuous simple pendulum model reinforcement learning environment.

env = rlPredefinedEnv("SimplePendulumModel-Continuous")
env = 
SimulinkEnvWithAgent with properties:

           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : []
  UseFastRestart : on

This example shows how to use createIntegratedEnv to create an environment object starting from a Simulink model that implements the system with which the agent interacts and that does not contain an agent block. Such a system is often referred to as the plant, open-loop system, or reference system, while the whole (integrated) system that includes the agent is often referred to as the closed-loop system.

For this example, use the flying robot model described in Train DDPG Agent to Control Two-Thruster Sliding Vehicle as the reference (open-loop) system.

Open the flying robot model.

open_system("rlFlyingRobotEnv")

Initialize the state variables and sample time.

% initial model state variables
theta0 = 0;
x0 = -15;
y0 = 0;

% sample time
Ts = 0.4;

Create the Simulink model myIntegratedEnv containing the flying robot model connected in a closed loop to the agent block. The function also returns the reinforcement learning environment object env to be used for training.

env = createIntegratedEnv( ...
    "rlFlyingRobotEnv", ...
    "myIntegratedEnv")
env = 
SimulinkEnvWithAgent with properties:

           Model : myIntegratedEnv
      AgentBlock : myIntegratedEnv/RL Agent
        ResetFcn : []
  UseFastRestart : on

The function can also return the block path to the RL Agent block in the new integrated model, as well as the observation and action specifications for the reference model.

[~,agentBlk,observationInfo,actionInfo] = ...
    createIntegratedEnv( ...
    "rlFlyingRobotEnv","myIntegratedEnv")
agentBlk = 
"myIntegratedEnv/RL Agent"
observationInfo = 
  rlNumericSpec with properties:

     LowerLimit: -Inf
     UpperLimit: Inf
           Name: "observation"
    Description: [0×0 string]
      Dimension: [7 1]
       DataType: "double"

actionInfo = 
  rlNumericSpec with properties:

     LowerLimit: -Inf
     UpperLimit: Inf
           Name: "action"
    Description: [0×0 string]
      Dimension: [2 1]
       DataType: "double"

Returning the block path and specifications is useful when you need to modify descriptions, limits, or names in observationInfo and actionInfo. After modifying the specifications, you can then create an environment from the integrated model myIntegratedEnv using the rlSimulinkEnv function.
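For example, after renaming the channels, you might rebuild the environment as follows (a sketch; the new channel names are arbitrary, and the model name matches the integrated model created earlier):

```matlab
% Rename the observation and action channels, then recreate the
% environment from the integrated model with the modified
% specifications.
observationInfo.Name = "robotObservations";
actionInfo.Name = "thrusterCommands";
env = rlSimulinkEnv("myIntegratedEnv",agentBlk, ...
    observationInfo,actionInfo);
```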

Version History

Introduced in R2019a