Confusion in agent and trainFromData options when using RNN/LSTM
My dataset contains numTraj trajectories, each with numSteps time-steps. For off-policy training, I filled the experience buffer with my data in a manner similar to the code below, which sets "IsDone" to 1 on the final time-step of every trajectory.
numStates = 5;
numActions = 1;
numSteps = 600;
numTraj = 100;
obsInfo = rlNumericSpec([numStates 1]);
actInfo = rlNumericSpec([numActions 1]);
buffer = rlReplayMemory(obsInfo,actInfo,numTraj*numSteps);
expBatch = struct;
for j = 1:numTraj % Generate random training data
    for i = 1:numSteps
        n = (j-1)*numSteps + i; % Linear index of step i in trajectory j
        expBatch(n).Observation = {rand(numStates, 1)};
        expBatch(n).Action = {rand(numActions, 1)};
        expBatch(n).Reward = rand(1, 1);
        expBatch(n).NextObservation = {rand(numStates, 1)};
        expBatch(n).IsDone = 0;
    end
    expBatch(n).IsDone = 1; % Mark the last step of trajectory j as terminal
end
append(buffer,expBatch);
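As a quick sanity check on the layout described above, the following minimal sketch rebuilds only the IsDone field with hypothetical small sizes (3 trajectories of 4 steps, chosen for illustration) and confirms that the terminal flag lands on the last step of each trajectory. It uses plain MATLAB only and does not touch the replay memory.

```matlab
% Hypothetical small sizes so the check is easy to inspect by eye
numSteps = 4;
numTraj = 3;

expBatch = struct;
for j = 1:numTraj
    for i = 1:numSteps
        n = (j-1)*numSteps + i;
        expBatch(n).IsDone = 0;
    end
    expBatch(n).IsDone = 1; % n still holds the last index of trajectory j
end

% Indices where IsDone is set, and where we expect them to be
doneIdx = find([expBatch.IsDone]);
expectedIdx = numSteps:numSteps:numTraj*numSteps;
assert(isequal(doneIdx, expectedIdx));
```

The same check can be run on the full-size expBatch before calling append, simply by keeping the original numSteps and numTraj values.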
Since I have a fixed number of trajectories and time-steps per trajectory, how should I be setting the following agent and trainFromData options?
rlSACAgentOptions (or options for other agents that can use RNNs):
- SequenceLength: Since all numTraj trajectories have numSteps time-steps, should SequenceLength = numSteps?
- MiniBatchSize: From this answer, it seems that MiniBatchSize should be set to numSteps as well. Is this correct?
- MaxMiniBatchPerEpoch: If MiniBatchSize = numSteps, then should MaxMiniBatchPerEpoch = numTraj if I want to use the whole dataset for training every epoch?
- NumStepsPerEpoch: Is this referring to the number of time-steps that are used for training in an epoch? If so, should this be set to numTraj*numSteps to use the whole dataset every epoch?
Answers (1)
Shivansh
26 June 2024
Hi Kundan!
I think the values you propose for the agent and trainFromData options are appropriate for your setup.
SequenceLength: You can set "SequenceLength" to "numSteps", since all "numTraj" trajectories have "numSteps" time-steps.
MiniBatchSize: "MiniBatchSize" should also be set to the number of time-steps in each trajectory, so that a mini-batch covers an entire trajectory during training.
MaxMiniBatchPerEpoch: Since "MiniBatchSize" is set to "numSteps", you should set this parameter to "numTraj".
NumStepsPerEpoch: This should be set to the total number of time-steps in your dataset, "numTraj * numSteps", so that the whole dataset is used in each epoch.
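The settings above could be written roughly as follows. This is only a sketch of the suggestions in this answer, not a verified recipe: it assumes a Reinforcement Learning Toolbox release in which rlSACAgentOptions exposes MaxMiniBatchPerEpoch, and it assumes that NumStepsPerEpoch belongs to rlTrainingFromDataOptions. It is worth confirming in the documentation for your release how MiniBatchSize and SequenceLength interact when the networks are recurrent.

```matlab
numSteps = 600; % time-steps per trajectory (from the question)
numTraj = 100;  % number of trajectories (from the question)

% Agent options, set as suggested in this answer
agentOpts = rlSACAgentOptions;
agentOpts.SequenceLength = numSteps;
agentOpts.MiniBatchSize = numSteps;
agentOpts.MaxMiniBatchPerEpoch = numTraj;

% Assumption: NumStepsPerEpoch is a property of rlTrainingFromDataOptions
tfdOpts = rlTrainingFromDataOptions;
tfdOpts.NumStepsPerEpoch = numTraj*numSteps;

% Under this answer's reading, one epoch then covers this many time-steps
stepsCovered = agentOpts.MiniBatchSize * agentOpts.MaxMiniBatchPerEpoch;
assert(stepsCovered == numTraj*numSteps);
```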
I hope it helps!