How to implement cross validation in neural network for time series prediction
I am using k-fold cross-validation to train a neural network that predicts a time series. I have an input time series and I am using the Nonlinear Autoregressive tool for time series. I am using 10-fold cross-validation and dividing the data set into 70% training, 15% validation, and 15% testing. But I really don't know how to write the code.
And to be honest, this is the first time I am using neural networks, so please keep your explanation simple!
This is something that I wrote,
k=10;
Indices=crossvalind('Kfold', length(X), 10);
X = tonndata(densig,true,false);
T = tonndata(densig,true,false);
trainFcn = 'trainlm';
inputDelays = 1:2;
feedbackDelays = 1:2;
hiddenLayerSize = [50 20 20];
X1=cell2mat(X);
T1=cell2mat(T);
for i=1:k
net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize,'open',trainFcn);
X1(i)=find(X1(Indices(i)));
T1(i)=find(T1(Indices(i)));
[x,xi,ai,t] = preparets(net,X1,{},T1);
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
net.trainParam.epochs = 5000;
[net,tr] = train(net,x,t,xi,ai);
end
y = net(x,xi,ai);
e = gsubtract(t,y);
performance = perform(net,t,y);
Please help
Thanks Baqar
1 Comment
Greg Heath
23 Aug 2017
The quickest way to get NN help is to run your program on one or more of the MATLAB examples from
doc nndatasets
and/or
help nndatasets
after initializing using
rng('default') % same as rng(0).
Hope this helps.
Greg
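As a minimal sketch of that advice, the following runs a small open-loop NARX net on one of the toolbox example datasets with the random number generator reset first. It assumes the Deep Learning Toolbox (formerly Neural Network Toolbox) is installed; the choice of `simplenarx_dataset`, the delays 1:2, and 10 hidden nodes are illustrative assumptions, not values taken from the question.

```matlab
rng('default');                              % reproducible weights and data division
[X, T] = simplenarx_dataset;                 % example NARX data (cell arrays)
net = narxnet(1:2, 1:2, 10);                 % assumed: delays 1:2, 10 hidden nodes
net.trainParam.showWindow = false;           % suppress the training GUI
[x, xi, ai, t] = preparets(net, X, {}, T);   % shift data to fill the delay states
[net, tr] = train(net, x, t, xi, ai);
y = net(x, xi, ai);
% Normalized MSE: 0 is a perfect fit, 1 is no better than predicting the mean
NMSE = perform(net, t, y) / var(cell2mat(t), 1);
```

Because the dataset and the seed are fixed, anyone rerunning the script on the same MATLAB release should be able to compare their numbers against yours.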
Answers (4)
Greg Heath
23 Aug 2017
Edited: Greg Heath
24 Aug 2017
UH-OH ! I do not have crossvalind.
CROSSVALIND IS NOT IN THE NN TOOLBOX!!!
However, I have posted crossvalidation results in both the NEWSGROUP and ANSWERS.
Your problem is doubly troubling because there are very few references that use cross-validation with
EITHER NNs OR TIMESERIES !!!
My search yields the following number of hits:
                       NEWSGROUP   ANSWERS
NEURAL                      4319      5130
TIMESERIES                   604      1696
NEURAL TIMESERIES             87       344
CROSSVAL                      51       119
CROSSVAL NEURAL                9        19
CROSSVAL TIMESERIES            0         3
The main reasons for so few examples are that
1. It IS VERY MUCH EASIER AND NO LESS VALID to design NNs with
multiple random data divisions.
2. TIMESERIES REQUIRE CONSTANT TIMESTEPS. As a result, the number
of relevant arrangements is severely limited.
3. The best way to get many design variations is merely to use
many trials with random initial weights.
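Points 1 and 3 can be sketched as a loop of independent designs, each with its own random data division and random initial weights. This is only an illustration under assumptions: the dataset (`simplenarx_dataset`), the number of trials, the delays, and the hidden layer size are all placeholders, and the Deep Learning Toolbox is required.

```matlab
[X, T] = simplenarx_dataset;          % example data; substitute your own series
Ntrials = 10;                         % assumed number of random designs
NMSEtst = zeros(Ntrials, 1);
rng('default');                       % seed once so the whole run is repeatable
for k = 1:Ntrials
    net = narxnet(1:2, 1:2, 10);      % new net, so train draws new random weights
    net.divideFcn = 'dividerand';     % random 70/15/15 division (the default)
    net.trainParam.showWindow = false;
    [x, xi, ai, t] = preparets(net, X, {}, T);
    [net, tr] = train(net, x, t, xi, ai);
    y  = cell2mat(net(x, xi, ai));
    tt = cell2mat(t);
    tst = tr.testInd;                 % time steps held out for testing
    NMSEtst(k) = mean((tt(tst) - y(tst)).^2) / var(tt, 1);
end
[bestNMSE, bestTrial] = min(NMSEtst); % summarize the trials
```

The spread of `NMSEtst` across trials gives a rough picture of how sensitive the design is to the division and the initial weights.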
Hope this helps.
Thank you for formally accepting my answer
Greg
1 Comment
Greg Heath
25 Aug 2017
I don't think you understand:
It is YOUR job to test YOUR code.
Use a MATLAB example dataset and initialize the rng to the zero state so that we can compare our results with yours.
Greg
Greg Heath
31 Dec 2017
If this is the 1st time you are using neural networks:
1. BOTH TIMESERIES AND CROSSVALIDATION ARE ADVANCED TOPICS. IF YOU HAVE A CHOICE, START WITH ELEMENTARY TOPICS
a. Regression/Function-Fitting
help fitnet
doc fitnet
b. Classification/Target-Identification
help patternnet
doc patternnet
c. Non-feedback Timeseries
help timedelaynet
doc timedelaynet
d. Feedback Timeseries
help narxnet
doc narxnet
2. I don't recommend crossvalidation for neural networks.
a. Multiple random weight initializations for each of a specified number of hidden nodes in a single-hidden-layer net tend to be sufficient and are orders of magnitude faster.
b. The goal is to minimize the number of hidden nodes subject to an upper limit on mean squared error (or cross-entropy for classification).
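A rough sketch of item 2, using the simpler function-fitting case: try increasing numbers of hidden nodes, with several random weight initializations each, and stop at the first net that meets an error limit. The dataset (`simplefit_dataset`), the limit `NMSEgoal`, and the search bounds `Hmax` and `Ntrials` are assumed values chosen for illustration.

```matlab
[x, t] = simplefit_dataset;           % example regression data (row vectors)
MSE00 = var(t, 1);                    % MSE of the naive constant-mean model
NMSEgoal = 0.01;                      % assumed upper limit on normalized MSE
Hmax = 10;  Ntrials = 10;             % assumed search bounds
rng('default');
Hbest = NaN;
for H = 1:Hmax                        % smallest candidate size first
    for k = 1:Ntrials                 % several random initializations per size
        net = fitnet(H);
        net.trainParam.showWindow = false;
        net = train(net, x, t);
        NMSE = perform(net, t, net(x)) / MSE00;
        if NMSE <= NMSEgoal
            Hbest = H;  bestNet = net;
            break
        end
    end
    if ~isnan(Hbest), break, end      % stop at the first size that meets the goal
end
```

If `Hbest` is still NaN after the loops, no design met the limit and either `Hmax` or `NMSEgoal` would need to be relaxed.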
Hope this helps.
Greg
orlem lima dos santos
19 Jan 2018
Hi again, I do not recommend using standard cross-validation (the crossval function) for time series prediction. For this type of case there is a technique known as "time series cross-validation" (https://robjhyndman.com/hyndsight/tscv/).
Unfortunately, there is no such function implemented in MATLAB, but there is one in Python's scikit-learn (http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html) that can help.
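The index bookkeeping for time series cross-validation is small enough to write by hand in MATLAB. The sketch below builds expanding-window splits in which every test block comes strictly after its training data; the block sizing is intended to follow the same idea as scikit-learn's TimeSeriesSplit, and `N`, `nSplits`, and the variable names are illustrative assumptions.

```matlab
N = 100;                                  % number of time steps (illustrative)
nSplits = 5;                              % assumed number of train/test splits
testSize = floor(N / (nSplits + 1));      % equal-sized test blocks
splits = cell(nSplits, 2);                % column 1: train indices, column 2: test
for k = 1:nSplits
    testEnd   = N - (nSplits - k) * testSize;
    testStart = testEnd - testSize + 1;
    splits{k, 1} = 1:testStart - 1;       % all data before the test block
    splits{k, 2} = testStart:testEnd;     % the next contiguous block
end
```

Each row of `splits` can then be used to index the series, training on `splits{k,1}` and evaluating on `splits{k,2}`, so that no future values leak into training.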
Greg Heath
19 Jan 2018
Edited: Greg Heath
19 Jan 2018
If you have to maintain the original spacing, one way to use f-fold XVAL in time series is illustrated below for f = 10
1. Divide the data into 10 blocks [ B1 B2 ... B10 ]
2. for i= 1: 10, test on Bi, train on the rest.
3. For example, if i =5,
a. Train on B1 to B4 using B1 for initial conditions
b. Continue training on B6 to B10 using B6 (NOT B4 !) for initial conditions
c. Compute separate SSEs for B5 and ~B5
4. Combine the i=1:10 SSEs for 2 separate results MSEtrn and MSEtst
5. To obtain a production series, you can test each of the 10 nets on all of the data
and combine them any way you choose (e.g., best, weighted average, ...)
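Steps 1 to 4 might look roughly like the following for a NARX net. This is a sketch under several assumptions: the dataset (`simplenarx_dataset`), the delays, and the hidden layer size are placeholders; `dividetrain` is used so that every point in a training segment is used for training; and it relies on a second call to `train` continuing from the current weights rather than reinitializing them. Each segment supplies its own initial delay states through `preparets`.

```matlab
[X, T] = simplenarx_dataset;              % example data (1x100 cell arrays)
N = numel(T);  f = 10;  d = 2;            % f blocks, maximum delay d
edges = round(linspace(0, N, f + 1));     % block i covers edges(i)+1 : edges(i+1)
SSEtrn = 0; SSEtst = 0; Ntrn = 0; Ntst = 0;
rng('default');
for i = 1:f
    tstIdx = edges(i)+1 : edges(i+1);             % block Bi, held out
    segs = {1:edges(i), edges(i+1)+1:N};          % contiguous runs before and after Bi
    net = narxnet(1:d, 1:d, 10);
    net.divideFcn = 'dividetrain';                % no internal val/test split
    net.trainParam.showWindow = false;
    for s = 1:2                                   % train on each run in turn
        idx = segs{s};
        if numel(idx) <= d, continue, end         % run empty or too short
        [x, xi, ai, t] = preparets(net, X(idx), {}, T(idx));
        net = train(net, x, t, xi, ai);           % second call continues training
    end
    for s = 1:2                                   % training error on ~Bi
        idx = segs{s};
        if numel(idx) <= d, continue, end
        [x, xi, ai, t] = preparets(net, X(idx), {}, T(idx));
        e = cell2mat(t) - cell2mat(net(x, xi, ai));
        SSEtrn = SSEtrn + sum(e.^2);  Ntrn = Ntrn + numel(e);
    end
    [x, xi, ai, t] = preparets(net, X(tstIdx), {}, T(tstIdx));
    e = cell2mat(t) - cell2mat(net(x, xi, ai));   % test error on Bi
    SSEtst = SSEtst + sum(e.^2);  Ntst = Ntst + numel(e);
end
MSEtrn = SSEtrn / Ntrn;                           % combined over the f folds
MSEtst = SSEtst / Ntst;
```

Note that the first `d` points of every segment are consumed as initial conditions, so they do not contribute to either SSE. Training the two runs one after the other may also bias the net toward the later run, which is a limitation of this simple version.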
Hope this helps.
Thank you for formally accepting my answer
Greg
P.S. I favor the normalized MSE,
NMSE = MSE/mean(var(target',1))
which is normally in the range 0 <= NMSE <= 1 and related to the statistical Rsquare (See Wikipedia)
Rsquare = 1 - NMSE
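A small numeric illustration of these two formulas, using made-up target and output vectors (the values are assumptions chosen only to show the arithmetic):

```matlab
target = [1 2 3 4 5 6];                  % made-up targets (one output row)
output = [1.1 1.9 3.2 3.8 5.1 5.9];      % made-up network outputs
e = target - output;
MSE = mean(e(:).^2);                     % mean squared error
MSE00 = mean(var(target', 1));           % variance per output row, averaged
NMSE = MSE / MSE00;                      % fraction of target variance left unexplained
Rsquare = 1 - NMSE;                      % fraction of target variance explained
```

The transpose in `var(target',1)` makes `var` work along time for each output row, and the second argument 1 normalizes by N rather than N-1.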