Error appears when setting the multi-dimensional actions in Matlab Environment (Reinforcement Learning Toolbox)
1 回表示 (過去 30 日間)
古いコメントを表示
As shown in the following codes, three actions, whose ranges are [0.1 10], [0.1 10] and [0 pi], respectively, are set:
%% main.m
%% Observation
ObservationInfo = rlNumericSpec([7 1]);
ObservationInfo.Name = 'Obstacle Avoidance States';
ObservationInfo.Description = 'delta_x, delta_y, delta_z, delta_L, delta_V, pusi, theta';
%% Action
ActionInfo = rlNumericSpec([3 1],'LowerLimit',[0.1 0.1 0]','UpperLimit',[10 10 pi]');
ActionInfo.Name = 'Action';
%% Environment
env = rlFunctionEnv(ObservationInfo,ActionInfo,'myStepFunction','myResetFunction');
rng(0);
InitialObs = reset(env);
Then, the actions are assigned three variables in the function myStepFunction (Action, LoggedSignals) as follows:
function [NextObs,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)
para_rho1 = Action(1);
para_rho2 = Action(2);
para_theta = Action(3);
......
end
Run the main.m, it is normal.
However, when running the following instruction, an error appears:
step(env,10)
Index exceeds the number of array elements (1)
Error: myStepFunction (line 23)
para_rho2 = Action(2);
Why does the dimension of the action is changed as 1? How to address this error?
If I set the variables para_rho2 and para_theta as constants, and change the dimension of the action as [1 1] in rlNumericSpec, then the instruction step(env,10) can be normally executed.
0 件のコメント
回答 (1 件)
Ryan Comeau
2020 年 5 月 29 日
Hello, so I've take a look at the rocket lander code environment which MATLAB gives as an example. What they do, it that every action is scaled between 0 and 1. The maximum values for the actions are then stored in your environment properties as a vector of values defining the min and max values. When we hop into the step function, the actions get scaled. I know this seems strange, and i'm not sure if it's the best approach but i'm not a veteran of RL. So, what you should do is the following:
%% main.m
%% Observation
ObservationInfo = rlNumericSpec([7 1]);
ObservationInfo.Name = 'Obstacle Avoidance States';
ObservationInfo.Description = 'delta_x, delta_y, delta_z, delta_L, delta_V, pusi, theta';
%% Action
ActionInfo = rlNumericSpec([3 1 1],'LowerLimit',0,'UpperLimit',1);
ActionInfo.Name = 'Action';
%stuff...
function [NextObs,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)
para_rho1 = Action(1).*env.borders(1); %so open rocket lander from MATLAB and take a look
para_rho2 = Action(2).*env.borders(2);
para_theta = Action(3).*env.borders(3);
......
end
Hope this helps
RC
2 件のコメント
参考
カテゴリ
Help Center および File Exchange で Environments についてさらに検索
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!