rlContinuousGaussianActor
Stochastic Gaussian actor with a continuous action space for reinforcement learning agents
Description
This object implements a function approximator to be used as a stochastic actor within a reinforcement learning agent with a continuous action space. A continuous Gaussian actor takes an environment observation as input and returns as output a random action sampled from a Gaussian probability distribution whose mean and standard deviation depend on the observation, thereby implementing a stochastic policy. After you create an rlContinuousGaussianActor object, use it to create a suitable agent, such as an rlACAgent or rlPGAgent agent. For more information on creating actors and critics, see Create Policies and Value Functions.
Creation
Syntax
actor = rlContinuousGaussianActor(net,observationInfo,actionInfo,ActionMeanOutputNames=netMeanActName,ActionStandardDeviationOutputNames=netStdvActName)
actor = rlContinuousGaussianActor(net,observationInfo,actionInfo,ActionMeanOutputNames=netMeanActName,ActionStandardDeviationOutputNames=netStdvActName,ObservationInputNames=netObsNames)
actor = rlContinuousGaussianActor(___,UseDevice=useDevice)
Description
actor = rlContinuousGaussianActor(net,observationInfo,actionInfo,ActionMeanOutputNames=netMeanActName,ActionStandardDeviationOutputNames=netStdvActName) creates a Gaussian stochastic actor with a continuous action space, using the deep neural network net as function approximator. Here, net must have two differently named output layers, each with as many elements as the number of dimensions of the action space, as specified in actionInfo. The two output layers calculate the mean and standard deviation of each component of the action. The actor uses these layers, according to the names specified in the strings netMeanActName and netStdvActName, to represent the Gaussian probability distribution from which the action is sampled. The function sets the ObservationInfo and ActionInfo properties of actor to the input arguments observationInfo and actionInfo, respectively.
Note
actor does not enforce the constraints set by the action specification. Therefore, when you use this actor, you must enforce the action space constraints within the environment.
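As a minimal sketch of this syntax (the layer names, layer sizes, and specifications below are illustrative assumptions, not required values), the following builds a network with separate output paths for the mean and standard deviation and uses it to create the actor:

% Observation: 4-dimensional vector; action: 2-dimensional vector (illustrative)
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([2 1]);

% Common body of the network
commonPath = [
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(32)
    reluLayer(Name="commonOut")
    ];

% Output path for the mean of each action component
meanPath = [
    fullyConnectedLayer(prod(actInfo.Dimension), Name="meanFC")
    tanhLayer(Name="meanOut")
    ];

% Output path for the standard deviation of each action component
% (softplusLayer keeps the standard deviation nonnegative)
stdPath = [
    fullyConnectedLayer(prod(actInfo.Dimension), Name="stdFC")
    softplusLayer(Name="stdOut")
    ];

% Assemble and connect the three paths
lg = layerGraph(commonPath);
lg = addLayers(lg, meanPath);
lg = addLayers(lg, stdPath);
lg = connectLayers(lg, "commonOut", "meanFC");
lg = connectLayers(lg, "commonOut", "stdFC");
net = dlnetwork(lg);

% Create the actor, naming the mean and standard-deviation output layers
actor = rlContinuousGaussianActor(net, obsInfo, actInfo, ...
    ActionMeanOutputNames="meanOut", ...
    ActionStandardDeviationOutputNames="stdOut");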
actor = rlContinuousGaussianActor(net,observationInfo,actionInfo,ActionMeanOutputNames=netMeanActName,ActionStandardDeviationOutputNames=netStdvActName,ObservationInputNames=netObsNames) specifies the names of the network input layers to be associated with the environment observation channels. The function assigns, in sequential order, each environment observation channel specified in observationInfo to the layer specified by the corresponding name in the string array netObsNames. Therefore, the network input layers, ordered as the names in netObsNames, must have the same data type and dimensions as the observation specifications, as ordered in observationInfo.
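A compact sketch of this syntax, with a single illustrative observation channel whose input layer is mapped explicitly by name (all names and sizes below are assumptions made for the example):

obsInfo = rlNumericSpec([3 1]);
actInfo = rlNumericSpec([1 1]);

% Name the input layer so it can be listed in ObservationInputNames
lg = layerGraph(featureInputLayer(prod(obsInfo.Dimension), Name="obsIn"));
lg = addLayers(lg, fullyConnectedLayer(prod(actInfo.Dimension), Name="meanOut"));
lg = addLayers(lg, [
    fullyConnectedLayer(prod(actInfo.Dimension), Name="stdFC")
    softplusLayer(Name="stdOut")
    ]);
lg = connectLayers(lg, "obsIn", "meanOut");
lg = connectLayers(lg, "obsIn", "stdFC");
net = dlnetwork(lg);

% Explicitly associate the "obsIn" layer with the observation channel
actor = rlContinuousGaussianActor(net, obsInfo, actInfo, ...
    ActionMeanOutputNames="meanOut", ...
    ActionStandardDeviationOutputNames="stdOut", ...
    ObservationInputNames="obsIn");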
actor = rlContinuousGaussianActor(___,UseDevice=useDevice) specifies the device used to perform computational operations on the actor object, and sets the UseDevice property of actor to the useDevice input argument. You can use this syntax with any of the previous input-argument combinations.
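As a short sketch, assuming the net, obsInfo, and actInfo variables from the examples above and a supported GPU with Parallel Computing Toolbox, you can request GPU evaluation at construction time:

% Same constructor call, but performing actor computations on the GPU
actor = rlContinuousGaussianActor(net, obsInfo, actInfo, ...
    ActionMeanOutputNames="meanOut", ...
    ActionStandardDeviationOutputNames="stdOut", ...
    UseDevice="gpu");

actor.UseDevice   % confirm the device setting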
Input Arguments
Properties
Object Functions
rlACAgent | Actor-critic reinforcement learning agent
rlPGAgent | Policy gradient reinforcement learning agent
rlPPOAgent | Proximal policy optimization reinforcement learning agent
rlSACAgent | Soft actor-critic reinforcement learning agent
getAction | Obtain action from agent, actor, or policy object given environment observations
evaluate | Evaluate function approximator object given observation (or observation-action) input data
gradient | Evaluate gradient of function approximator object given observation and action input data
accelerate | Option to accelerate computation of gradient for approximator object based on neural network
getLearnableParameters | Obtain learnable parameter values from agent, function approximator, or policy object
setLearnableParameters | Set learnable parameter values of agent, function approximator, or policy object
setModel | Set function approximation model for actor or critic
getModel | Get function approximator model from actor or critic
Examples
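A brief workflow sketch, assuming the actor, obsInfo, and actInfo variables built in the sketches above (the critic network below is illustrative): evaluate the distribution parameters, sample an action, and use the actor in an actor-critic agent.

% Evaluate the network outputs (mean and standard deviation) for a random observation
obs = {rand(obsInfo.Dimension)};
distParams = evaluate(actor, obs);

% Sample a random action from the resulting Gaussian distribution
act = getAction(actor, obs);
act{1}

% Build a simple value-function critic over the same observation space
criticNet = dlnetwork([
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(32)
    reluLayer
    fullyConnectedLayer(1)
    ]);
critic = rlValueFunction(criticNet, obsInfo);

% Create an actor-critic agent from the actor and critic
agent = rlACAgent(actor, critic);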
Version History
Introduced in R2022a