Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689) (RL Toolbox)

I want to create a multi-discrete actor output: delta1 should output 1 or 0, and delta2 the same.
But I get the following errors:
Error using rl.env.AbstractEnv/simWithPolicy (line 70)
An error occurred while simulating "quarter_car" with the agent "agent".
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Invalid input argument type or size such as observation, reward, isdone or loggedSignals.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Unable to evaluate representation.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
The logical indices contain a true value outside of the array bounds.
I don't understand whether the error is caused by my code or by the Simulink model, or how to fix it. Here is the relevant code:
% create observation info
observationInfo = rlNumericSpec([numObs 1],'LowerLimit',-inf*ones(numObs,1),'UpperLimit',inf*ones(numObs,1));
observationInfo.Name = 'observation';
% create action Info
actionInfo = rlFiniteSetSpec({[0;0],[1;1]});
actionInfo.Name = 'actor';
% define environment
env = rlSimulinkEnv(mdl,agentblk,observationInfo,actionInfo);
rng(0)
actorNetwork = [
imageInputLayer([numObs 1 1],'Normalization','none','Name','observation')
fullyConnectedLayer(200,'Name','ActorFC1')
reluLayer('Name','ActorRelu1')
fullyConnectedLayer(150,'Name','ActorFC2')
reluLayer('Name','ActorRelu2')
fullyConnectedLayer(numAct,'Name','ActorFC3')
tanhLayer('Name','ActorTanh')];
actorOpts = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
actor = rlStochasticActorRepresentation(actorNetwork, observationInfo, actionInfo, 'Observation', {'observation'}, actorOpts);
agentOpts = rlPPOAgentOptions(...
'ExperienceHorizon',600,...
'ClipFactor',0.02,...
'EntropyLossWeight',0.01,...
'MiniBatchSize',128,...
'NumEpoch',3,...
'AdvantageEstimateMethod','gae',...
'GAEFactor',0.95,...
'SampleTime',h,...
'DiscountFactor',0.997);
agent = rlPPOAgent(actor,critic,agentOpts);
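For reference, since delta1 and delta2 are meant to take 0 or 1 independently, the finite action set may need to enumerate all four combinations, with the actor's last fully connected layer sized to match; this is a hypothetical sketch of the intended action space, not the code above:
% hypothetical: one element per combination of the two binary actions
actionInfo = rlFiniteSetSpec({[0;0],[0;1],[1;0],[1;1]});
actionInfo.Name = 'actor';
numAct = numel(actionInfo.Elements);   % 4 discrete actions
% for a discrete stochastic actor, the last fully connected layer has one
% output per possible action, typically followed by a softmax rather than a tanh
actorNetwork = [
imageInputLayer([numObs 1 1],'Normalization','none','Name','observation')
fullyConnectedLayer(200,'Name','ActorFC1')
reluLayer('Name','ActorRelu1')
fullyConnectedLayer(150,'Name','ActorFC2')
reluLayer('Name','ActorRelu2')
fullyConnectedLayer(numAct,'Name','ActorFC3')
softmaxLayer('Name','ActorProb')];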

1 Answer

Emmanouil Tzorakoleftherakis on 29 Nov 2020
Edited: Emmanouil Tzorakoleftherakis on 1 Dec 2020
Hello,
Based on the attached files, it seems like you are creating a PPO agent but using a Q network for the critic. If you look at this page, the PPO implementation in Reinforcement Learning Toolbox requires a V (value-function) critic. If you change your critic network to be equivalent to, e.g., this example, the errors go away.
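For reference, a minimal sketch of such a value-function critic that mirrors the actor's hidden layers (the layer sizes here are assumptions, not taken from the attached files):
criticNetwork = [
imageInputLayer([numObs 1 1],'Normalization','none','Name','observation')
fullyConnectedLayer(200,'Name','CriticFC1')
reluLayer('Name','CriticRelu1')
fullyConnectedLayer(150,'Name','CriticFC2')
reluLayer('Name','CriticRelu2')
fullyConnectedLayer(1,'Name','CriticOut')];   % scalar state value V(s)
criticOpts = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlValueRepresentation(criticNetwork,observationInfo,'Observation',{'observation'},criticOpts);
The PPO agent can then be built exactly as in your code with agent = rlPPOAgent(actor,critic,agentOpts);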
Hope that helps
4 Comments
Hong-Ruei Ciou on 1 Dec 2020
This is my Simulink model.
Thanks for your help.

