Why do my validation RMSE and loss increase after some epochs while my training RMSE and loss decrease?

arash rad on 22 Jan 2023
Edited: Aneela on 10 Sep 2024 at 8:42
Hello everyone,
I am trying to predict future traffic flow from previously collected data, so I am using an LSTM.
However, my validation loss and RMSE increase while my training loss and RMSE decrease. Because I am new to LSTMs, I don't know which parameters I should check to improve the model and its predictions.
The training progress plot is: [figure: training progress plot]
I have also tried different lag times for my predictions; the code below uses a lag of 4 time steps.
XTrain_ZaMir = (XTrain_ZaMir - mu_ZaMir)/sig_ZaMir;
YTrain_ZaMir = (YTrain_ZaMir - mu_ZaMir)/sig_ZaMir;
XTrain_ZaMir = XTrain_ZaMir(:,1:end-4);
YTrain_ZaMir = YTrain_ZaMir(:,5:end);
Test_ZaMir = [flowTe_ZaMir flowTeOther_ZaMir]';
nt = floor(0.7*length(Test_ZaMir));
YTest_ZaMir = Test_ZaMir(1,1:end);
XTest_ZaMir = Test_ZaMir(1,1:end); %One input
% XTest_ZaMir = Test_ZaMir(:,1:end); % More than One input
XTest_ZaMir = (XTest_ZaMir - mu_ZaMir)/sig_ZaMir;
YTest_ZaMir = (YTest_ZaMir - mu_ZaMir)/sig_ZaMir;
XVal_ZaMir = XTest_ZaMir(:,1:nt-4);
YVal_ZaMir = YTest_ZaMir(:,5:nt);
XTest_ZaMir = XTest_ZaMir(:,nt+4:end-1);
YTest_ZaMir = YTest_ZaMir(:,nt+5:end);
%% Layers and Options
numResponses = 1 ;
featureDimension = 1;
numHiddenUnits =200 ;
layers = [ ...
sequenceInputLayer(featureDimension)
lstmLayer(numHiddenUnits)
% dropoutLayer(0.002)
fullyConnectedLayer(numResponses)
regressionLayer
];
maxepochs = 250;
minibatchsize =128;
options = trainingOptions('adam', ... %%adam
'MaxEpochs',maxepochs, ...
'GradientThreshold',1, ...
'InitialLearnRate',0.005, ...
'ValidationData',{XVal_ZaMir,YVal_ZaMir},...
'ValidationFrequency',20,...
'Shuffle','every-epoch',...
'MiniBatchSize',minibatchsize,...
'LearnRateSchedule','piecewise', ...
'LearnRateDropPeriod',150, ...
'LearnRateDropFactor',0.005, ...
'Verbose',1, ...
'Plots','training-progress');
%% Train the Network
[net,info] = trainNetwork(XTrain_ZaMir,YTrain_ZaMir,layers,options);
[net,YPred_ZaMir]= predictAndUpdateState(net,XTest_ZaMir);
numTimeStepsTest= (0.5*floor(length(XTest_ZaMir)));
for i = 2:numTimeStepsTest
[net,YPred_ZaMir(:,i)] = predictAndUpdateState(net,XTest_ZaMir(:,i-1),'ExecutionEnvironment','cpu');
% net = resetState(net);
end
YTest_ZaMir = sig_ZaMir*YTest_ZaMir + mu_ZaMir;
YPred_ZaMir = sig_ZaMir*YPred_ZaMir + mu_ZaMir;

Answers (1)

Aneela on 10 Sep 2024 at 8:41
Edited: Aneela on 10 Sep 2024 at 8:42
Hi Arash,
Your LSTM model is “overfitting”: the training loss keeps decreasing while the validation loss increases, which means the network is fitting the training sequence more closely than it generalizes to unseen data.
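One way to check this from the training record is to look for the iteration at which the validation RMSE was lowest. The snippet below is a minimal sketch: it uses a hypothetical, hand-written stand-in for the info struct returned by trainNetwork (the numbers are made up for illustration), relying on the fact that iterations without a validation pass are stored as NaN.

```matlab
% Hypothetical stand-in for the info struct returned by trainNetwork.
% The values are invented for illustration; NaN marks iterations on
% which no validation pass was run.
info.TrainingRMSE   = [1.00 0.85 0.70 0.60 0.50 0.42 0.35 0.30 0.26 0.22];
info.ValidationRMSE = [NaN  0.90 NaN  0.62 NaN  0.55 NaN  0.61 NaN  0.74];

% min ignores NaN values by default, so this returns the best
% validated iteration and its RMSE.
[bestValRMSE, bestIter] = min(info.ValidationRMSE);

% Validation RMSE at the last validated iteration, for comparison.
lastValRMSE = info.ValidationRMSE(find(~isnan(info.ValidationRMSE), 1, 'last'));

fprintf('Lowest validation RMSE %.2f at iteration %d\n', bestValRMSE, bestIter);
fprintf('Final validation RMSE  %.2f\n', lastValRMSE);
```

If the final validation RMSE is clearly worse than the best one while the training RMSE is still falling, training ran past the point where the network generalized best.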
  • Add a “dropoutLayer” after the LSTM layer to reduce overfitting.
dropoutLayer(0.2)
  • The initial learning rate of 0.005 is fairly high and may overshoot good weights. Reduce it to 0.001 or even lower and see whether convergence improves.
  • Add L2 regularization, which reduces overfitting by penalizing large weights so that the network is discouraged from fitting noise in the training data. With trainNetwork, the global factor is set through the 'L2Regularization' option of trainingOptions; the 'WeightL2Factor' property of the “fullyConnectedLayer” is a multiplier on that global factor.
'L2Regularization',0.001, ... % inside the trainingOptions call
  • Implement early stopping by monitoring the validation loss, for example with the 'ValidationPatience' option of trainingOptions. This limits overfitting by stopping training once the validation loss has stopped improving.
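The suggestions above could be combined roughly as follows. This is only a sketch under assumptions: the dropout probability, learning rate, L2 factor and patience are illustrative values, not tuned ones, and XVal / YVal are synthetic placeholders standing in for XVal_ZaMir / YVal_ZaMir from the question. It requires Deep Learning Toolbox.

```matlab
% Sketch only: hyperparameter values are illustrative assumptions.
numResponses     = 1;
featureDimension = 1;
numHiddenUnits   = 200;   % value from the question

layers = [ ...
    sequenceInputLayer(featureDimension)
    lstmLayer(numHiddenUnits)
    dropoutLayer(0.2)     % drops 20% of the LSTM outputs during training
    fullyConnectedLayer(numResponses)
    regressionLayer];

% Synthetic placeholder validation sequence with a 4-step lag, so the
% snippet runs on its own. Replace with XVal_ZaMir and YVal_ZaMir.
series = randn(1,104);
XVal = series(:,1:end-4);
YVal = series(:,5:end);

options = trainingOptions('adam', ...
    'MaxEpochs',250, ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.001, ...   % lower than the original 0.005
    'L2Regularization',1e-3, ...    % global L2 (weight decay) factor
    'ValidationData',{XVal,YVal}, ...
    'ValidationFrequency',20, ...
    'ValidationPatience',5, ...     % stop after 5 validations without improvement
    'Verbose',0, ...
    'Plots','none');
```

The layers and options can then be passed to trainNetwork exactly as in the question. With 'ValidationPatience' set, training ends before 'MaxEpochs' if the validation loss fails to improve for the given number of consecutive validation checks.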
Refer to the following MathWorks documentation for more information on LSTM: https://www.mathworks.com/discovery/lstm.html