Simulating the environment while training an RL agent

plot(env);
trainingStats = train(agent,env,trainOpts);
We use this to simulate the agent while training, but it takes a lot of time since every episode is simulated. What if I wanted to simulate only every 100 episodes or so? How can I do that?

Answers (1)

Emmanouil Tzorakoleftherakis on 19 Feb 2021
Edited: 20 Feb 2021


I can interpret your question in three ways, so I will put my thoughts here and hopefully they will be sufficient.
1) Depending on the RL algorithm, training works differently. For example, DQN and DDPG perform an optimization step at every time step (which generally takes more time), whereas, e.g., PPO can work with batches of data. The latter seems closer to what you are referring to.
2) There is something called offline/batch reinforcement learning, where you have already collected data and use it to train offline. This also seems close to what you are describing, but there is no out-of-the-box way to do this in Reinforcement Learning Toolbox currently, i.e., you would have to write the implementation yourself.
3) Is your question perhaps about visualization (and not simulation)? If that's the case, visualizing the environment does indeed slow things down, so I would recommend using it only at the beginning to check whether the training setup is OK.
There is no standard way of visualizing the agent after every N episodes, but you can probably create a counter and plot/visualize what you need when you need it. Take a look at how the visualization is set up for this custom MATLAB environment. You can use the IsDone flag to increment the counter, and every 100 episodes you can call the 'updateplot' method.
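A minimal sketch of that counter idea, assuming a custom MATLAB environment class in the style of the referenced example. The `EpisodeCount` property, the `stepDynamics` helper, and the `updateplot` method are illustrative names for things you define yourself, not toolbox API:

```matlab
% Sketch of a per-episode visualization counter inside a custom
% environment's step method. EpisodeCount is a property you add to the
% environment class; stepDynamics stands in for your own dynamics code.
function [NextObs,Reward,IsDone,LoggedSignals] = step(this,Action)
    % Compute the transition as usual (hypothetical helper)
    [NextObs,Reward,IsDone,LoggedSignals] = stepDynamics(this,Action);

    if IsDone
        % One more episode has finished
        this.EpisodeCount = this.EpisodeCount + 1;
    end

    % Render only every 100th episode so training stays fast
    if IsDone && mod(this.EpisodeCount,100) == 0
        updateplot(this);  % the environment's own plotting method
    end
end
```

The key point is that the counter lives in the environment object itself, so it persists across the resets that `train` performs between episodes.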

5 Comments

ay_asa on 19 Feb 2021
Hello Emmanouil
Thank you for your response. I apologise for not asking a clearer question.
I am not asking about the algorithm; I am using the DDPG algorithm. My question is only about visualization, i.e., whether there is a way to visualize the simulation after every N episodes of training.
Emmanouil Tzorakoleftherakis on 19 Feb 2021
Edited: 19 Feb 2021
There is no standard way of doing this, but you can probably create a counter and plot/visualize what you need when you need it. The other option is to save the agent every N episodes and visualize its behavior afterwards offline. I added that to my response above.
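A rough sketch of that second option: save agents during training, then replay one offline with `sim`. The MAT-file name and the `saved_agent` variable follow the defaults I would expect from Reinforcement Learning Toolbox, so verify against what actually lands in your `SaveAgentDirectory`:

```matlab
% Save candidate agents during training, then visualize one offline.
trainOpts.SaveAgentCriteria  = "EpisodeCount";
trainOpts.SaveAgentValue     = 100;
trainOpts.SaveAgentDirectory = fullfile(pwd,"Agents");
trainingStats = train(agent,env,trainOpts);

% After training: load one of the saved agents and watch it run.
% The MAT-file name and 'saved_agent' variable are toolbox defaults;
% check your Agents folder to be sure.
data = load(fullfile(pwd,"Agents","Agent300.mat"));
plot(env)  % turn visualization back on for the replay
simOpts = rlSimulationOptions("MaxSteps",500);
experience = sim(env,data.saved_agent,simOpts);
```

Because visualization stays off during training, this keeps training fast and moves all rendering to the offline replay step.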
ay_asa on 19 Feb 2021
That might work actually. I will try that. Thank you.
ay_asa on 19 Feb 2021
trainOpts.SaveAgentCriteria = "EpisodeCount";
trainOpts.SaveAgentValue = 100;
trainOpts.SaveAgentDirectory = pwd + "\Agents";
This saves all the agents from the 100th episode onward. Is there any way to save an agent only at every 100th episode?
Emmanouil Tzorakoleftherakis on 20 Feb 2021
Hmm, you are right, that wouldn't work. I created an enhancement request for this feature.
In the meantime, since your question is about visualization, you should be able to do what you want by implementing a counter. Take a look at how the visualization is set up for this custom MATLAB environment. You can use the IsDone flag to increment the counter, and every 100 episodes you can call the 'updateplot' method.

