Setting Initial Conditions In Simulink Model while Training Reinforcement Learning Agent

Question

Hello,

I'm trying to randomize some constant values in a Simulink model between episodes while training an agent with the RL Agent block. Before adding the RL Agent block, I called a script from the 'init' callback of my Simulink model, and that script updated the workspace variables. However, when I train the agent with the train() function in MATLAB, the script in the 'init' callback does not appear to be called, because the ICs of my model do not change between episodes. This leads me to the following question.

Are callback methods not called when training an RL agent via the train function?

I then tried to create a ResetFcn for the environment object to replace the script I had been calling in the callback. I followed the example "Reinforcement Learning for Ball Balancing Using a Robot Manipulator" to write my ResetFcn and to access, within my model, the variables that the ResetFcn updates. The variables are used as parameters in Constant blocks and in Rigid Transform blocks. Note that I have also tried calling setVariable() with the "Workspace" argument to save the variables to the model workspace, but without success.

function in = tumorRandomization_RLFunction(in)

% tumorRandomizationScript;

mico=micoCodeGen;

%The random numbers below are relative to the centre of the phantom tissue

xTumorLocation=(rand(1)-0.5)/0.5*.075;

yTumorLocation=(rand(1)-0.5)/0.5*.075;

zTumorLocation=(rand(1)-1)*.1;

in = in.setVariable('xTumorLocation',xTumorLocation);

in = in.setVariable('yTumorLocation',yTumorLocation);

in = in.setVariable('zTumorLocation',zTumorLocation);

%These insertion points are wrt the centre of the phantom tissue

xInsertionPoint=(rand(1)-0.5)/0.5*.035;

yInsertionPoint=(rand(1)-0.5)/0.5*.035;

in = in.setVariable('xInsertionPoint',xInsertionPoint);

in = in.setVariable('yInsertionPoint',yInsertionPoint);

%Packaging tumor & insertion point location into vectors

tumorLocation=[xTumorLocation;yTumorLocation;zTumorLocation];

insertionLocation=[xInsertionPoint;yInsertionPoint;0.1];

in = in.setVariable('tumorLocation',tumorLocation);

in = in.setVariable('insertionLocation',insertionLocation);

fprintf("The tumor location is: %g %g %g\n",tumorLocation(1),tumorLocation(2),tumorLocation(3)); % %g prints non-integer values; fprintf writes to the Command Window

%Kinova offset

kinovaOffset=[0;-0.4;-0.135];

tumorLocation=tumorLocation+kinovaOffset;

insertionLocation=insertionLocation+kinovaOffset;

%Direction the needle should be oriented along

needleDirection=tumorLocation-insertionLocation;

needleDirection=needleDirection./norm(needleDirection);

in = in.setVariable('needleDirection',needleDirection);

%Position where EE should be located

offsetDistanceFromPhantom=0.22;

eeInsertionPosition=insertionLocation-offsetDistanceFromPhantom.*needleDirection;

in=in.setVariable('eeInsertionPosition',eeInsertionPosition);

%Create HT matrix for IK

X=[-needleDirection(2);needleDirection(1);0];

X=X./norm(X);

if(X(1)<0)

X=-1.*X;%This is to avoid twisting cables on the Kinova

end

Z=needleDirection;

Y=cross(Z,X);

Y=Y./norm(Y);

in=in.setVariable('X',X);

in=in.setVariable('Y',Y);

in=in.setVariable('Z',Z);

desiredTransform=[[X,Y,Z,eeInsertionPosition];[0 0 0 1]];

ik=inverseKinematics('RigidBodyTree',mico,'SolverAlgorithm','LevenbergMarquardt');

[q,solutionInfo]=ik('EE',desiredTransform,[0.9 0.9 0.9 0.8 0.8 0.8],[-1.591;4.118;1.834;-0.115;0.0281;0]);

IC=q;

in=in.setVariable('IC',IC);

transformIC=getTransform(mico,q,'EE','base');

in=in.setVariable('transformIC',transformIC);

% Check if an acceptable solution was found, if not try again

if(solutionInfo.PoseErrorNorm>1e-5)

in=tumorRandomization_RLFunction(in); % the function accepts a single input, so mico is not passed

end
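
For context, a reset function like the one above is normally attached to the environment object before calling train(). The snippet below is only a sketch of that wiring: the model name, the agent block path, and the observation/action dimensions are hypothetical placeholders, not values taken from the model in this question.

```matlab
% Hypothetical names: replace with your own model and RL Agent block path.
mdl      = 'needleInsertionModel';
agentBlk = [mdl '/RL Agent'];

% Placeholder specifications; the [6 1] dimensions are assumptions.
obsInfo = rlNumericSpec([6 1]);
actInfo = rlNumericSpec([6 1]);

% Create the Simulink environment and register the reset function.
env = rlSimulinkEnv(mdl, agentBlk, obsInfo, actInfo);

% The reset function receives a Simulink.SimulationInput object and must
% return it; it is called at the start of each training episode.
env.ResetFcn = @tumorRandomization_RLFunction;
```

With this in place, every variable assigned through setVariable() in the reset function is passed to the simulation for that episode.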

What's strange is that "IC" appears to be the only variable that updates properly (although I can't say for certain at this point). The elements of the 'IC' vector are used as the values of the 'Specify Target Position' parameter in different revolute joints. All the other variables set with setVariable() are used in either a Constant block or a Rigid Transform block. I have checked the spelling of every variable name and they are all correct. I should also note that I have not changed anything in the model (Constant and Rigid Transform blocks) between the version that worked with the callback method and the current version with the RL Agent block.
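
One way to separate a problem in the reset function from a problem in the model is to inspect the Simulink.SimulationInput object directly. The sketch below assumes a hypothetical model name and sets two variables by hand; the same loop can be run on the object returned by the reset function to list what it actually contains.

```matlab
% Hypothetical model name: replace with your own.
in = Simulink.SimulationInput('needleInsertionModel');

% Set two variables by hand, the same way the reset function does.
in = in.setVariable('xTumorLocation', 0.02);
in = in.setVariable('IC', zeros(6,1));

% List every variable stored on the SimulationInput object.
for k = 1:numel(in.Variables)
    v = in.Variables(k);
    fprintf('%s = %s\n', v.Name, mat2str(v.Value, 4));
end
```

If all expected names and values are listed, the reset function is doing its job and the remaining question is whether the blocks in the model pick up the new values.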

Any help you could offer would be much appreciated.

-Chris

Answer 1

Steve Miller, 28 Nov 2022

Edited: Steve Miller, 28 Nov 2022

The behavior you describe (some values update, others do not) indicates that the parameters in the Simscape Multibody model have not been configured as run-time parameters. A run-time parameter is a block parameter whose value can be changed before a simulation run begins without recompiling the model.

> What's strange is that "IC" appears to be the only variable updating properly (Although I can't say for certain at this point). The elements of the 'IC' vector are used as the values for the 'Specify Target Position' argument in different revolute joints.

Target positions in joints are often set to be run-time parameters, which may explain why this worked for your test.

> All the other variables being set with setVariable() are called from either a Constant Block or a Rigid Transform Block.

You will need to configure the parameters in the Rigid Transform blocks and PS Constant blocks as run-time parameters. The setting is in the Rigid Transform block dialog; similar settings exist in other block dialogs.
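
A quick way to check whether a parameter change reaches the model without recompiling is to run two simulations with different values under fast restart and compare a result. The following is only a sketch: the model name and the logged signal 'tumorPos' are hypothetical, and it assumes signal logging is enabled with the default 'logsout' name.

```matlab
mdl = 'needleInsertionModel';              % hypothetical model name

% Two simulation inputs that differ only in one variable.
in(1:2) = Simulink.SimulationInput(mdl);
in(1) = in(1).setVariable('xTumorLocation', -0.05);
in(2) = in(2).setVariable('xTumorLocation',  0.05);

% Fast restart skips recompilation between runs, so only parameters
% that can change without recompiling are expected to update.
out = sim(in, 'UseFastRestart', 'on');

% 'tumorPos' is a hypothetical logged signal that depends on xTumorLocation.
p1 = out(1).logsout.get('tumorPos').Values.Data(end, :);
p2 = out(2).logsout.get('tumorPos').Values.Data(end, :);
fprintf('Parameter change visible in results: %d\n', ~isequal(p1, p2));

% Possible alternative, at the cost of recompiling every episode:
% env.UseFastRestart = "off";   % env is the rlSimulinkEnv environment object
```

If the two results are identical, the variable is probably still feeding a compile-time parameter in one of the blocks.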

--Steve
