QTable reset when using train

Question

Hi,

I am using the Matlab Reinforcement Learning toolbox to train an rlQAgent.

The issue that I am facing is that the corresponding QTable, i.e., the output of the command getLearnableParameters(getCritic(qAgent)), is reset each time the train command is used.

Is it possible to avoid this reset, so that a previously trained agent can be trained further?

Thank you

Corrado

Answer 1

Emmanouil Tzorakoleftherakis, 19 May 2020

Edited: Emmanouil Tzorakoleftherakis, 20 May 2020

If you stop training, you should be able to continue from where you left off. I called 'train' on the basic grid world example a couple of times in a row, and the output of 'getLearnableParameters(getCritic(qAgent))' was different after each call. You can also save the trained agent and reload it later, so that you don't accidentally lose it.
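As a rough sketch of the save-and-reload idea, the script below trains a Q agent on the basic grid world, stores it in a MAT-file, loads it back, and trains the loaded copy further. The file name trainedQAgent.mat, the variable names qAgentResumed and qBefore, and the episode and step limits are placeholders chosen for illustration, not values from this thread.

```matlab
% Sketch only: file name, variable names and training limits are hypothetical.
env = rlPredefinedEnv("BasicGridWorld");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Table-based critic and Q agent, built as in the thread.
qTable = rlTable(obsInfo,actInfo);
qRepresentation = rlQValueRepresentation(qTable,obsInfo,actInfo);
qAgent = rlQAgent(qRepresentation,rlQAgentOptions);

trainOpts = rlTrainingOptions;
trainOpts.Plots = 'none';
trainOpts.MaxEpisodes = 5;
trainOpts.MaxStepsPerEpisode = 20;

train(qAgent,env,trainOpts);            % first training session
save('trainedQAgent.mat','qAgent');     % keep a copy of the agent on disk

loaded = load('trainedQAgent.mat');     % later, possibly in a new MATLAB session
qAgentResumed = loaded.qAgent;
qBefore = getLearnableParameters(getCritic(qAgentResumed));  % table as saved
train(qAgentResumed,env,trainOpts);     % further training of the loaded agent
```

Comparing qBefore{1} with the table read back after the second call to train shows which entries the additional training touched.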

Update:

There is a regularization term added to the loss which causes the other entries to change slightly. To avoid this, you can type:

qRepresentation.Options.L2RegularizationFactor=0;
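An L2 penalty presumably adds a gradient term proportional to each parameter, which would explain why entries for state-action pairs that were never visited still drift a little on every update. The sketch below shows one place the setting could go. It assumes the option should be in place before rlQAgent is called, and it passes an rlRepresentationOptions object to the constructor instead of assigning the property afterwards. The variable names repOpts and numChanged are made up for illustration. The class names are the ones used in this 2020 thread, and later releases may offer different ones.

```matlab
% Sketch only: repOpts and numChanged are hypothetical names.
env = rlPredefinedEnv("BasicGridWorld");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

qTable = rlTable(obsInfo,actInfo);
qTable.Table = randn(size(qTable.Table));   % custom initial Q-values

repOpts = rlRepresentationOptions;
repOpts.L2RegularizationFactor = 0;         % switch off the L2 penalty
qRepresentation = rlQValueRepresentation(qTable,obsInfo,actInfo,repOpts);
qAgent = rlQAgent(qRepresentation,rlQAgentOptions);

trainOpts = rlTrainingOptions;
trainOpts.Plots = 'none';
trainOpts.MaxEpisodes = 1;
trainOpts.MaxStepsPerEpisode = 1;

QTable0 = getLearnableParameters(getCritic(qAgent));
train(qAgent,env,trainOpts);
QTable1 = getLearnableParameters(getCritic(qAgent));

% Count how many table entries a single training step modified.
numChanged = nnz(QTable0{1} ~= QTable1{1});
disp(numChanged)
```

If the regularization is what perturbs the table, numChanged should stay small here instead of covering the whole table.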

Corrado Possieri, 20 May 2020

Edited: Corrado Possieri, 20 May 2020

I am actually trying to set the initial QTable for the agent.

If I run the code

env = rlPredefinedEnv("BasicGridWorld");
qTable = rlTable(getObservationInfo(env),getActionInfo(env));
qTable.Table = randn(size(qTable.Table));
qRepresentation = rlQValueRepresentation(qTable,getObservationInfo(env),getActionInfo(env));
agentOpts = rlQAgentOptions;
agentOpts.DiscountFactor = 1;
qAgent = rlQAgent(qRepresentation,agentOpts);
trainOpts = rlTrainingOptions;
trainOpts.Plots = 'none';
trainOpts.MaxEpisodes = 1;
trainOpts.MaxStepsPerEpisode = 1;
trainOpts.Verbose = 1;
QTable0 = getLearnableParameters(getCritic(qAgent));
train(qAgent,env,trainOpts);
QTable1 = getLearnableParameters(getCritic(qAgent));
train(qAgent,env,trainOpts);
QTable2 = getLearnableParameters(getCritic(qAgent));
disp(find(QTable0{1} ~= QTable1{1}))
disp(find(QTable1{1} ~= QTable2{1}))

I get what I expect, that is, only one or two entries of the QTable are changed.

However, if I try to force the initial value of the QTable:

env = rlPredefinedEnv("BasicGridWorld");
qTable = rlTable(getObservationInfo(env),getActionInfo(env));
qTable.Table = randn(size(qTable.Table));
qRepresentation = rlQValueRepresentation(qTable,getObservationInfo(env),getActionInfo(env));
agentOpts = rlQAgentOptions;
agentOpts.DiscountFactor = 1;
qAgent = rlQAgent(qRepresentation,agentOpts);
trainOpts = rlTrainingOptions;
trainOpts.Plots = 'none';
trainOpts.MaxEpisodes = 1;
trainOpts.MaxStepsPerEpisode = 1;
trainOpts.Verbose = 1;
QTable0 = getLearnableParameters(getCritic(qAgent));
train(qAgent,env,trainOpts);
QTable1 = getLearnableParameters(getCritic(qAgent));
train(qAgent,env,trainOpts);
QTable2 = getLearnableParameters(getCritic(qAgent));
disp(find(QTable0{1} ~= QTable1{1}))
disp(find(QTable1{1} ~= QTable2{1}))

all of its entries are perturbed, as if the QTable were somehow reinitialized.

Emmanouil Tzorakoleftherakis, 20 May 2020

I updated my answer above with a solution. Hope that helps.

Corrado Possieri, 20 May 2020

Thank you Emmanouil, this solved the issue.
