getObservationInfo

Obtain observation data specifications from reinforcement learning environment, agent, or experience buffer

Description

obsInfo = getObservationInfo(env) extracts observation information from reinforcement learning environment env.

obsInfo = getObservationInfo(agent) extracts observation information from reinforcement learning agent agent.

obsInfo = getObservationInfo(buffer) extracts observation information from experience buffer buffer.

Examples

The reinforcement learning environment for this example is a longitudinal dynamics model comprising two cars, a leader and a follower. The vehicle model is also used in the Adaptive Cruise Control System Using Model Predictive Control (Model Predictive Control Toolbox) example.

Open the model.

mdl = "rlACCMdl";
open_system(mdl);

Specify the path to the agent block in the model.

agentblk = mdl + "/RL Agent";

Create the observation and action specifications.

% Observation specifications
obsInfo = rlNumericSpec([3 1],LowerLimit=-inf*ones(3,1),UpperLimit=inf*ones(3,1));
obsInfo.Name = "observations";
obsInfo.Description = "information on velocity error and ego velocity";

% Action specifications
actInfo = rlNumericSpec([1 1],LowerLimit=-3,UpperLimit=2);
actInfo.Name = "acceleration";

Create the environment object.

env = rlSimulinkEnv(mdl,agentblk,obsInfo,actInfo)
env = 
SimulinkEnvWithAgent with properties:

           Model : rlACCMdl
      AgentBlock : rlACCMdl/RL Agent
        ResetFcn : []
  UseFastRestart : on

The reinforcement learning environment env is a SimulinkEnvWithAgent object.

Extract the action and observation specifications from env.

actInfoExt = getActionInfo(env)
actInfoExt = 
  rlNumericSpec with properties:

     LowerLimit: -3
     UpperLimit: 2
           Name: "acceleration"
    Description: [0×0 string]
      Dimension: [1 1]
       DataType: "double"

obsInfoExt = getObservationInfo(env)
obsInfoExt = 
  rlNumericSpec with properties:

     LowerLimit: [3×1 double]
     UpperLimit: [3×1 double]
           Name: "observations"
    Description: "information on velocity error and ego velocity"
      Dimension: [3 1]
       DataType: "double"

The action information contains acceleration values, while the observation information contains the velocity error and ego vehicle velocity values.
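The same specifications can also be recovered from an agent rather than the environment. As a minimal sketch (assuming a default PPO agent is created from the specifications defined above):

```matlab
% Create a default agent from the existing specifications.
agent = rlPPOAgent(obsInfo,actInfo);

% The agent carries the same observation specification as the environment.
obsInfoFromAgent = getObservationInfo(agent);
```

Because the agent was constructed from obsInfo, the returned specification has the same Dimension, Name, and Description properties.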

Input Arguments

Environment, specified as follows:

  • MATLAB® environment, represented by one of the following objects.

    Among the MATLAB environments, only rlMultiAgentFunctionEnv and rlTurnBasedFunctionEnv support training multiple agents at the same time.

  • Simulink® environment, represented by a SimulinkEnvWithAgent object, and created using:

    • rlSimulinkEnv — This environment is created from a model that already contains one or more agent blocks, and supports training multiple agents at the same time.

    • createIntegratedEnv — This environment is created from a model that does not already contain an agent block, and does not support training multiple agents at the same time.

    A Simulink-based environment object acts as an interface so that the reinforcement learning simulation or training function calls the (compiled) Simulink model to generate experiences for the agents. Such an environment does not support using the reset and step functions.

Note

env is a handle object, so a function that does not return it as an output argument, such as train, can still update its internal states. For more information about handle objects, see Handle Object Behavior.

For more information on reinforcement learning environments, see Reinforcement Learning Environments and Create Custom Simulink Environments.

Example: env = rlPredefinedEnv("DoubleIntegrator-Continuous") creates a predefined environment that implements a continuous-action double-integrator system and assigns it to the variable env.

Agent, specified as one of the following reinforcement learning agent objects:

Note

agent is a handle object, so a function that does not return it as an output argument, such as train, can still update it. For more information about handle objects, see Handle Object Behavior.

For more information on reinforcement learning agents, see Reinforcement Learning Agents.

Example: agent = rlPPOAgent(rlNumericSpec([2 1]),rlNumericSpec([1 1])) creates the default rlPPOAgent object agent for an environment with an observation channel carrying a continuous two-element vector and an action channel carrying a continuous scalar.
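As a minimal sketch of recovering the specification from such an agent (using the default agent construction from the example above):

```matlab
% Default PPO agent for a continuous two-element observation channel
% and a continuous scalar action channel.
agent = rlPPOAgent(rlNumericSpec([2 1]),rlNumericSpec([1 1]));

% Recover the observation specification; its Dimension property is [2 1].
obsInfo = getObservationInfo(agent);
```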

Experience buffer, specified as one of the following replay memory objects.

Example: rlReplayMemory(rlNumericSpec([1 1]),rlFiniteSetSpec([0 1]),1e5)
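A minimal sketch of recovering the observation specification from such a replay memory (using the constructor call from the example line above):

```matlab
% Replay memory for a continuous scalar observation and a binary action,
% with a maximum length of 1e5 experiences.
buffer = rlReplayMemory(rlNumericSpec([1 1]),rlFiniteSetSpec([0 1]),1e5);

% Returns the observation specification passed at construction.
obsInfo = getObservationInfo(buffer);
```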

Output Arguments

Observation data specifications extracted from the reinforcement learning environment, returned as an array of one of the following:

Each element in the array defines the properties of an environment observation channel, such as its dimensions, data type, and name.

You can extract observationInfo from an existing environment, function approximator, or agent using getObservationInfo. You can also construct the specifications manually using rlFiniteSetSpec or rlNumericSpec.
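For instance, specifications can be constructed manually as follows (a sketch; the channel names and values are illustrative):

```matlab
% Continuous three-element observation channel.
obsInfo = rlNumericSpec([3 1]);
obsInfo.Name = "observations";

% Discrete observation channel taking one of three values.
obsInfoDiscrete = rlFiniteSetSpec([1 2 3]);
```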

Version History

Introduced in R2019a