SUGGESTIONS FOR IMPROVEMENT(S) ON NARNET MULTISTEP AHEAD PREDICTIONS ON THE SOLAR DATASET?
Dear Reader,
This is a continuation of the following newsgroup thread: http://se.mathworks.com/matlabcentral/newsreader/view_thread/345189?s_tid=srchtitle I find it easier to use Answers than newsgroup threads, mostly because code and links can be formatted and because figures (images) can be attached.
INTRODUCTION
This is an analysis of how to perform multistep predictions on the solar dataset (doc nndatasets). The main improvement compared to previous iterations of this code is the added filtering of the signal. However, the results indicate that the predictability still needs to be improved. Unfortunately, it is not even possible to generate a usable prediction from a repetitive dataset constructed here from the solar dataset.
METHOD
I use a main program so that I can test 9 different training algorithms. Here is the code for main:
clear all
close all
% 1: g_train_method='trainlm';
% 2: g_train_method='trainbfg';
% 3: g_train_method='trainrp';
% 4: g_train_method='trainscg';
% 5: g_train_method='traincgb';
% 6: g_train_method='traincgf';
% 7: g_train_method='traincgp';
% 8: g_train_method='trainoss';
% 9: g_train_method='traingdx';
for i=1:9
current_training_method=i
[NMSEo_1(i),NMSEc(i),NMSEc2(i),elapsed(i)] = train_diff_method(i) % elapsed = training time; avoids shadowing the built-in toc
% test NMSEc - NMSEc2
end
figure;
plot(1:9,NMSEc,1:9,NMSEc2)
xlabel('Method number')
legend('NMSEc, without closed loop training','NMSEc2, with closed loop training')
figure;
plot(elapsed)
xlabel('Method number')
title('Training time (s)')
Then I use a function to evaluate the different training algorithms. One thing to note is that I used mapminmax to normalize the signal. I read somewhere that trainbr requires this, and as I see it, it can't do any harm if all methods use the normalized signal. (See the sketch below on keeping the normalization settings for later denormalization.)
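As a side note, mapminmax returns a settings structure that can be kept for mapping network outputs back to the original scale; the function below discards it. A minimal sketch of that usage (variable names tn, ps, yn are mine, not from the original code):
T = solar_dataset;
t = cell2mat(T);
[tn, ps] = mapminmax(t); % tn is t scaled to [-1, 1]; ps stores the mapping
Tn = num2cell(tn); % cell array form for preparets/train
% ... train the network on Tn and obtain a prediction yn ...
% y = mapminmax('reverse', yn, ps); % map predictions back to the original units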
function [NMSEo_1,NMSEc,NMSEc2,totaltime] = train_diff_method(train_method_no)
close all
% Select the training algorithm by index (1-9)
train_methods = {'trainlm','trainbfg','trainrp','trainscg','traincgb', ...
    'traincgf','traincgp','trainoss','traingdx'};
g_train_method = train_methods{train_method_no}
format shortG
% GEH1 = [ 'START TIMING' ]
% STJ1 = [ ' DONE ' ]
tic
plt=0;
T = solar_dataset;
t = cell2mat(T);
% t=[t(477:2400) t(477:2400) t(477:2400)];
% FILTERING, PLOTTING AND MAKING T A CELL ARRAY, THIS TIME BASED ON FILTERED t
[b,a]=butter(2,0.1);
t_filt=filtfilt(b,a,t);
plt=plt+1;
figure(plt);
subplot(211);
plot(1:length(t),t,1:length(t),t_filt);
title('SOLAR DATASET, UNFILTERED AND FILTERED');
subplot(212);
plot(1:length(t),t-t_filt);
title('RESIDUALS: FILTERED SIGNAL SUBTRACTED FROM RAW SIGNAL');
T=num2cell(t_filt);
T_map=mapminmax(cell2mat(T));
T=num2cell(T_map);
t = cell2mat(T);
[ O , N ] = size(t) % 'O' for "output"; displays [ 1 2899 ]
% GEH2 = [ ' REMOVE SEMICOLON [ O N] ==> [ 1 2899 ] ' ]
% STJ2 = [ ' DONE ' ]
% GEH3 = [ ' TO INSURE UNBIASED NONTRAINING PREDICTIONS'...
% ' USE DIVIDEBLOCK AND THEN USE Ntrn in NNCORR ' ]
% STJ3 = [ ' DONE ' ]
% GEH4 = [ ' Ntrn = N - 2*round(0.15*N) = 2029 ' ]
% STJ4 = [ ' OK ' ]
% ### One thing I think of initially is the use of training algorithm,
% should one always first use trainlm in all steps (OL and CL) or
% is there a scientific approach that can determine (before training is
% started) if e.g. trainbr would fit a specific signal better? ###
% GEH5 = [ ' I DON"T KNOW. BEST TO BEGIN WITH DEFAULTS' ...
% ' THEN SEE MY SPIELS ON OVERTRAINING, Hub AND MSEREG' ]
% STJ5 = [ ' ON THE SUBJECT OF Hub I FOUND SIGNIFICANT POSTS. I HAVEN'T FOCUSED ON OVERTRAINING YET BUT I WILL LOOK INTO THIS. MSEREG SEEMS TO BE OBSOLETE, ANY SUGGESTIONS WHAT TO USE INSTEAD? ']
MSE00 = var(t',1)
% GEH6 = [ ' TAKE MEAN IF if O > 1 ' ]
% STJ6 = [ ' GOOD POINT. IN THIS CASE O is equal to 1' ]
zt = zscore(t',1)';
plt=plt+1;
figure(plt);
subplot(211);
plot(t);
title('SOLAR DATASET');
subplot(212);
plot(zt);
title('SOLAR DATASET, STANDARDIZED')
% [ Xo, Xoi, Aoi, To ] = preparets( neto, {}, {}, T );
% to = cell2mat( To );
% zto = zscore(to,1);
% varto1 = mean(var(to',1));
% minmaxto = minmax([ to ; zto ])
% GEH7 = [ ' MINMAX zt CHECK FOR OUTLIERS? ' ]
% STJ7 = [ ' FIRST ITERATION I USE FILTFILT, HOPE TO USE WAVELET FILTERING IN THE NEAR FUTURE ']
% rng('default');
rng('shuffle');
L = floor(0.95*(2*N-1));
% GEH8 = [ 'JUST USE Ntrn, Ltrn FOR SIGLAGS ' ]
% ST8 = [ ' THE CODE BELOW, DOES IT LOOK OK? ']
for i = 1:100
n = randn(1,N);
autocorrn = nncorr( n,n, N-1, 'biased');
sortabsautocorrn = sort(abs(autocorrn));
thresh95(i) = sortabsautocorrn(L);
end
sigthresh95 = mean(thresh95); % 0.2194 % tested with * (times) 3 to decrease used fractions (outside dotted line)
% GEH9 = [ ' UH-OH ...THAT"S NOT WHAT I GET ! ' ]
% STJ9 = [ ' I UNDERSTAND AND AGREE, 0.2194 WAS FROM AN EARLIER SCRIPT ' ]
% GEH10 = [' ALSO GOOD TO KNOW OTHER STATS ']
% STJ10 = [ ' ALWAYS GOOD TO GET A FEEL FOR THE DATA, ANY OTHER PURPOSES? PERHAPS PARAMETERS TO STUDY WHEN THE SIZE OF i IS INCREASED/DECREASED? ' ]
minthresh95 = min(thresh95); % 0.024638
medthresh95 = median(thresh95); % 0.027242
meanthresh95 = mean(thresh95); %0.0273
stdthresh95 = std(thresh95); %0.0011573
maxthresh95 = max(thresh95); %0.030326
% [minthresh95 medthresh95 meanthresh95 stdthresh95 maxthresh95]
% GEH11 = [ 'MIGHT WANT TO KNOW WHAT IF max(i) > 100' ]
% STJ11 = [ ' I HAVE WORKED WITH DIFFERENT SIZES OF i, INCREASING i SHOULD GIVE BETTER STATISTICAL REPRESENTATION OF THE ABOVE PARAMETERS (minthresh95 etc.), HOWEVER, AN INCREASED i CAUSES ONLY MINOR CHANGES OF ABOVE PARAMETERS' ]
autocorrt = nncorr(zt,zt,N-1,'biased');
siglag95 = -1+ find(abs(autocorrt(N:2*N-1))>=sigthresh95);
% The three lines below decrease the length of siglag95 and FD
l_sig=length(siglag95);
r_l_sig=ceil(l_sig/15); % 15 used when three times the length is not used
siglag95=siglag95(1:r_l_sig);
FD=siglag95(2:end);
FD_length=length(FD);
d=max(FD);
% siglag95=0:1:156;
% siglag95 = -1+ find(abs(autocorrt(N:2*N-1))>=0.6);
% ### Memory is maxing out if I use all lag values from the above line
% probably this is one of the most important improvement areas ###
% GEH12 = [ '1. ONLY NEED A SUFFICIENT SUBSET ' ...
% ' 2. I THINK YOUR INITIAL CALCULATION IS FLAWED ' ]
% STJ12 = [ ' THE LENGTH OF siglag95 IS DECREASED, HOWEVER, THE MOST SIGNIFICANT WAVELENGTHS ARE STILL TAKEN INTO ACCOUNT, FOR MORE INFO, SEE: ...
% 'https://se.mathworks.com/matlabcentral/answers/284253-it-is-not-possible-to-predict-wavelengths-longer-than-the-maximum-index-value-of-the-fd-in-narxnet-a ' ]
plt = plt+1; figure(plt);
hold on
plot(0:N-1, -sigthresh95*ones(1,N),'b--');
plot(0:N-1, zeros(1,N),'k');
plot(0:N-1, sigthresh95*ones(1,N),'b--');
plot(0:N-1, autocorrt(N:2*N-1));
plot(siglag95,autocorrt(N+siglag95),'ro');
title('REDUCED NUMBER OF AUTOCORRELATIONS IN SOLAR DATASET');
% GEH13 = [ ' NOT SURE WHAT THAT MEANS ' ]
% STJ13 = [ 'FOR THIS VERSION OF THE CODE I HAVE REDUCED THE LENGTH OF ' ...
% ' SIGLAG95 TO ONLY INCLUDE ENOUGH DATAPOINTS SO THAT THE MOST SIGNIFICANT WAVELENGTH IS ACCOUNTED FOR ' ]
% defining the global parameters
% g_train_method='trainbr';
% http://se.mathworks.com/help/nnet/ug/choose-a-multilayer-neural-network-training-function.html
% Why isn't trainbr included in the above page?
% BR seems convincing, see http://www.ncbi.nlm.nih.gov/pubmed/19065804
g_no_epochs=5000;
g_trainRatio=74; % 66.67 74 89 50 61 21_22
g_valRatio=26; % 33.33 26 11 50 39 21_22
g_testRatio=0;
g_min_grad=2e-5;
g_trainmaxfail=100;
Ntrials=5;
% (n)-loop or the for(j)-loop? What difference do these three alternatives give? ###
% GEH14 = [ ' ALL YOU NEED TO KNOW IS WHAT THE RNG STATE IS' ...
% ' FOR A DESIGN YOU WANT TO REPRODUCE, SEE MY CODES' ]
% STJ14 = [ ' I THINK THAT WITH THE APPROACH (FOR AND WHILE LOOPS) I USE IN THIS CODE THE RNG STATE 'shuffle' GIVES THE CORRECT USAGE OF THE SCRIPT']
NFD = length(FD); % 147
LDB = max(FD); % 152
Ns = N-LDB; % 2747
Nseq = Ns*O; % 2747
% Nw = (NFD*O+1)*H+(H+1)*O % moved inside the first loop below
Hub = -1+ceil( (Nseq-O) / (NFD*O+O+1)); % 18
Hmax = ceil(Hub/10) % Hmax = 2 ==> Nseq >>Nw
min_nodes=1;
max_nodes=Hmax;
% GEH15 = [ ' JUSTIFICATION FOR THESE NUMBERS ? ... Hub = ?' ]
% STJ15 = [ ' Hub DEFINED ABOVE' ]
% STJX1 = [ ' WHAT WOULD JUSTIFY MORE THAN ONE HIDDEN LAYER? IS THERE ANY RESULT IN THIS SCRIPT THAT COULD SHOW THIS? E.G. NMSE, PREDICTIVE PERFORMANCE '...
% [ ' IS THIS A VALID REFERENCE FOR THE QUESTION? ftp://ftp.sas.com/pub/neural/FAQ3.html#A_hl ' ]
% GEH16 = [ ' Ntrials = Number of trials; NOT trails ']
% STJ16 = [ ' CORRECTED ' ]
% GEH17 = [ ' Define narnet( FD, n) in outer loop. Ntrials of initial random weights in inner loop '...
% ' but NOT random datadivision for timeseries .. get nonuniform time spacing' ]
%
% STJ17 = [ ' NARNET IS NOW DEFINED IN OUTER LOOP. WITH THIS COMMENT "I did not try randomizing these states here yet, possible future improvement..." I MEANT THAT I WOULD TRY DIFFERENT LEVELS OF TRN/VAL/TST. ']
% 'DOES NOT net.divideFcn='divideblock GUARANTEE THAT THE TIME
% SPACING IS UNIFORM?'...
% 'DEFINING NARNET IN THE OUTER LOOP (AS IT LOOKS LIKE NOW) DOES...
% 'NOT GIVE RANDOM WEIGHTS THUS A REPETITION OF Ntrials...
% ' WOULD NOT GIVE ANY ADDITIONAL INFORMATION ']
% GEH18 = ['MORE LATER']
% STJ18 = [ ' OK ']
best_Y=[];
NMSEin=Inf;
i=0;
for n=min_nodes:max_nodes
Nw = (NFD*O+1)*n+(n+1)*O;
% Nw when n = 1 : 154
% Nw when n = 2 : 307
% (HOWEVER, SIZE OF Nw SEEMS TO BE METHOD DEPENDENT)
i=i+1; % Counter for indices inside the second loop (do not tie up anything to n).
for j=1:Ntrials
%%%%start copy
net = narnet(FD, n);
% GEH19 = [ ' USE THE SHORTEST SUBSET AS POSSIBLE ' ]
% STJ19 = [ ' SEE COMMENTS ABOVE ON USING THE SIGNIFICANT WAVELENGTH ' ]
% Setup Division of Data for Training, Validation, Testing
net.trainFcn = g_train_method;
net.divideFcn='divideblock'; % I did not try randomizing these states here yet, possible future improvement...
net.trainParam.epochs = g_no_epochs;
net.divideParam.trainRatio = g_trainRatio/100;
net.divideParam.valRatio = g_valRatio/100;
net.divideParam.testRatio = g_testRatio/100;
net.trainParam.min_grad=g_min_grad;
net.trainParam.max_fail=g_trainmaxfail;
[Xo,Xoi,Aoi,To] = preparets(net,{},{},T);
%%%end copy
% GEH20 = [ 'FOR TIMESERIES PREDICTION TEST SUBSET'...
% ' MUST OCCUR AFTER THE TRAINING AND VAL SUBSETS']
% STJ20 = [ ' IS THIS BECAUSE THE TEST DATA SHOULD HAVE AS SIMILAR CHARATERISTICS AS POSSIBLE TO THE PREDICTED DATA? OR ARE THERE OTHER REASONS? ']
% GEH21 = [ 'WHY 10 ?' ]
% STJ21 = [ 'TO INVESTIGATE IF ERROR WOULD DECREASE. I'VE ALSO TRIED DIFFERENT LEVELS HERE, UP TO 100' ]
[ net ,tr, Yo, Eo, Xof, Aof] = train(net,Xo,To,Xoi,Aoi);
% view(net);
% Yo = net(Xo,Xoi,Aoi);
% Eo = gsubtract(To,Yo);
% GEH22 = [ 'COMMENT PREVIOUS TWO STATEMENTS' ]
% STJ22 = [ 'DONE' ]
NMSEo_temp(i,j) = mse(Eo)/MSE00;
epochs = tr.epoch(end);
% GEH23 = [ ' BETTER TO JUST USE A REASONABLE NUMBER OF DELAYS ' ]
% STJ23 = [ ' SEE COMMENT STJ13 ' ]
[n j NMSEo_temp(i,j) mse(Eo) Nw epochs] % Just to show what step is currently being calculated and to show the important parameters
end
NMSEo=min(NMSEo_temp(i,:));
% GEH24 = [' NO! FIND THE BEST DESIGN, NOT THE BEST AVERAGE!']
% STJ24 = [ ' DONE, SEE ADDED WHILE LOOP BELOW ' ]
% GEH25 = [ ' HOWEVER DO TABULATE SUMMARY STATS OF' ...
% ' MIN, MEDIAN, MEAN, STD AND MAX']
% STJ25 = [ ' FOR THIS VERSION OF THE CODE I HAVE FOCUSED ON THE MINIMUM ERROR ']
if (NMSEo<NMSEin)
fprintf('Best minimum NMSE so far: %4.6e (for %d hidden nodes)\n',NMSEo,n);
NMSEin=NMSEo;
best_Y=Yo;
N_use=n;
end
end
% searching for the smallest NMSEo from initial Ntrials
mean_NMSEo=mean(NMSEo_temp');
std_NMSEo=std(NMSEo_temp');
min_NMSEo=min(NMSEo_temp');
RESULTS
I have used the above code and investigated the following trn/val ratios (first number is trn, second is val): 50/50, 61/39, 74/26, 89/11, 93.5/6.5. The training method number refers to the following training methods (also listed in the code for "main"):
% 1: g_train_method='trainlm';
% 2: g_train_method='trainbfg';
% 3: g_train_method='trainrp';
% 4: g_train_method='trainscg';
% 5: g_train_method='traincgb';
% 6: g_train_method='traincgf';
% 7: g_train_method='traincgp';
% 8: g_train_method='trainoss';
% 9: g_train_method='traingdx';
Here is the "winning" NMSEo from the while loop. This NMSEo is selected on the basis that it is smaller (or equal) than the 10 NMSEo (Ntrials) generated in the above for loop.
Then I close the loop and obtain NMSEc, the error from closing the loop without any closed-loop training.
Here I have limited the y axis to 0 - 2. As you can see, no values are below 1.
Then I close the loop and also train the closed-loop network, which gives the following NMSEc2.
Here I have limited the y axis to 0 - 2. NMSEc2 is decreased compared to NMSEc; however, regardless of the method, I am still far from any desired value. (A sketch of the closed-loop computation follows.)
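For reference, here is a minimal sketch of how NMSEc and NMSEc2 could be computed from the trained open-loop net above (my reconstruction; the closed-loop code itself was not posted):
netc = closeloop(net); % convert the open-loop net to closed loop
[Xc, Xci, Aci, Tc] = preparets(netc, {}, {}, T);
Yc = netc(Xc, Xci, Aci);
NMSEc = mse(gsubtract(Tc, Yc))/MSE00 % closed loop, no retraining
[netc2, trc] = train(netc, Xc, Tc, Xci, Aci); % retrain in the closed-loop configuration
Yc2 = netc2(Xc, Xci, Aci);
NMSEc2 = mse(gsubtract(Tc, Yc2))/MSE00 % closed loop after retraining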
Please let me know if you would like me to submit tables of the data instead of figures (however, I hope the message is still clear: I have not obtained a desired low value of NMSEc or NMSEc2). In this results section I could show numerous figures that portray the high values of NMSEc or NMSEc2, but I see little meaning in uploading figures with that appearance:
METHOD - PART 2
So, back to the drawing board. I thought to myself: perhaps the solar dataset cannot be used for prediction. There might be some truth to this; I don't see any repeating pattern in the longer wavelengths (the longer wavelengths are around 130 points; for more info see this post: https://se.mathworks.com/matlabcentral/answers/284253-it-is-not-possible-to-predict-wavelengths-longer-than-the-maximum-index-value-of-the-fd-in-narxnet-a). So I decided to create a repeating pattern. I used three repetitions of a part of the solar dataset, namely (you can find this line in the function code, commented out):
t=[t(477:2400) t(477:2400) t(477:2400)];
The start and end of the sample, 477 and 2400, were selected so that the spliced signal is as continuous as possible; I used filtering and analysis of the diff to determine these values (a sketch of the continuity check follows). The content of FD is similar for "part 1" and this "part 2". To keep computational time reasonable I decreased Ntrials from 10 to 5. Here is a figure showing the "3*repeated solar dataset"; I've added some black rectangles to make it easier to see the recurring features.
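A minimal sketch of the diff-based continuity check described above (my reconstruction; 1924 = length(477:2400), so the splices sit at indices 1924 and 3848):
t = cell2mat(solar_dataset);
t3 = [t(477:2400) t(477:2400) t(477:2400)]; % three repetitions of the excerpt
d3 = diff(t3);
splice_jumps = d3([1924 3848]) % the steps across the two splice points
% small values relative to max(abs(d3)) indicate a nearly continuous splice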
For this analysis I used a trn/val ratio of 66.67/33.33; I thought the model could produce a small NMSEo if the val set were identical to recurring parts of the trn dataset.
RESULTS - PART 2
Here are NMSEo, NMSEc (loop closed, no training) and NMSEc2 (loop closed, training performed) for the 9 methods using the "3*repeated solar dataset":
Method        NMSEo         NMSEc      NMSEc2
1 trainlm     4.1823e-08    2.2568     2.2568
2 trainbfg    1.785e-06     243.24     13.607
3 trainrp     0.00042742    6.013      1.0029
4 trainscg    1.031e-05     28.174     27.614
5 traincgb    0.00013799    10.056     10.056
6 traincgf    0.00011407    9.0195     8.9929
7 traincgp    0.00013801    10.83      10.83
8 trainoss    2.9156e-05    10.871     1.0002
9 traingdx    0.00096596    5.2844     1.0013
DISCUSSION
I can think of a few things to further develop the approach, but at this stage I would be very thankful for a second opinion. I would be very happy if I had made one or a few simple mistakes whose correction would give me a better NMSEc; I really thought that using a repeated signal, as in part 2, would give an improved prediction with a small residual error. If there are no simple fixes here, my next step is to investigate using all of the significant FD obtained from nncorr.
3 comments
Greg Heath
27 May 2016
You are trying to analyze an unsolved problem, not giving a tutorial lecture on a proven technique.
Therefore, ANSWERS is the correct forum.
I don't like the idea of simultaneously posting in both forums. Rather, once you have a polished technique, post a tutorial in the NEWSGROUP with references to preliminary ANSWERS threads.
Greg
P.S. Interesting problem.
Accepted Answer
Greg Heath
30 May 2016
Here is my preliminary result with default 70/15/15 datadivision and H = 10 hidden nodes. However, NONDEFAULT DIVIDEBLOCK was used to impose TRUE FUTURE PREDICTION WITH CONSTANT TIMESTEPS!
close all, clear all, clc, plt=0, tic
T = solar_dataset ;
t = cell2mat(T) ;
[ O N ] = size(t) % [ 1 2899 ]
zt = zscore(t,1); tzt = [ t ; zt ];
meantzt = mean( tzt' )' %[ 51.727 ; -9.38e-16 ]
vartzt1 = var( tzt', 1)' % [1937.3 ; 1 ]
minmaxtzt = minmax(tzt)
% minmaxtzt = 0 253.8
% -1.1752 4.5911
% Possible outliers
Nout3 = numel(find(abs(zt) > 3)) % 27 ( < 0.93%)
Nout4 = numel(find(abs(zt) > 4)) % 4 ( < 0.14%)
% Keep outliers for now
Ntst = round(0.15*N), Nval = Ntst % 435, 435
Ntrn = N-Nval-Ntst % 2029
trnind = 1:Ntrn; % 1 : 2029
valind = Ntrn+1:Ntrn+Nval; % 2030 : 2464
tstind = Ntrn+Nval+1:N; % 2465 : 2899
ttrn = t(trnind); tval = t(valind); ttst = t(tstind);
plt = plt+1, figure(plt), hold on
plot( trnind, ttrn,'b' )
plot( valind, tval,'g' )
plot( tstind, ttst, 'r' )
legend('TRAIN','VAL','TEST')
title( ' SOLAR DATASET TIMESERIES ' )
rng('default');
Ltrn = floor(0.95*(2*Ntrn-1)) % 3854
imax = 200
for i = 1:imax
n = randn(1,Ntrn);
autocorrn = nncorr( n,n, Ntrn-1, 'biased');
sortabsautocorrn = sort(abs(autocorrn));
thresh95(i) = sortabsautocorrn(Ltrn);
end
% For imax = 100 & 200:
minthresh95 = min(thresh95) % 0.0293 0.0293
medthresh95 = median(thresh95)% 0.0325 0.0325
meanthresh95 = mean(thresh95) % 0.0325 0.0325
stdthresh95 = std(thresh95) % 0.0015 0.0014
maxthresh95 = max(thresh95) % 0.0355 0.0355
sigthresh95 = meanthresh95 % 0.0325
autocorrt = nncorr( zt, zt, Ntrn-1, 'biased' );
siglag95 = -1+ find(abs(autocorrt(Ntrn:2*Ntrn-1)) ...
>= sigthresh95);
Nsiglag95 = length(siglag95) % 1591 ~ 0.78*Ntrn
d = min(find(diff(siglag95)>1)) % 35
proof = siglag95(d-1:d+1) % 33 34 38
plt = plt+1; figure(plt);
hold on
plot(0:Ntrn-1, -sigthresh95*ones(1,Ntrn),'b--' );
plot(0:Ntrn-1, zeros(1,Ntrn),'k');
plot(0:Ntrn-1, sigthresh95*ones(1,Ntrn),'b--' );
plot(0:Ntrn-1, autocorrt(Ntrn:2*Ntrn-1));
plot( siglag95, autocorrt(Ntrn + siglag95), 'ro' );
title('SIGNIFICANT AUTOCORRELATIONS VALUES')
to = t(d+1:N);
No = N-d %2864
Notst = round(0.15*No) %430
Noval = Notst %430
Notrn = No-Noval-Notst %2004
H = 10 % Default
neto = narnet(1:d,H);
[ Xo, Xoi, Aoi, To ] = preparets(neto,{},{},T);
to = cell2mat(To);
[ O No ] = size(to) % [ 1 2864 ]
varto1 = var(to,1) % 1948.6
[neto tro Yo Eo Xof Aof ] = train(neto,Xo,To,Xoi,Aoi);
view(neto)
% Yo = neto(Xo,Xoi,Aoi); Eo = gsubtract(To,Yo);
NMSEo = mse(Eo)/varto1 % 0.11505
NMSEtrno = tro.best_perf/varto1 % 0.10849
NMSEvalo = tro.best_vperf/varto1 % 0.12139
NMSEtsto = tro.best_tperf/varto1 % 0.13927
Rsqo = 1-NMSEo % 0.88495
Rsqtrno = 1-NMSEtrno % 0.89151
Rsqvalo = 1-NMSEvalo % 0.87861
Rsqtsto = 1-NMSEtsto % 0.86073
plotresponse(To,Yo)
totaltime = toc % 24.0
% The next step is to use a double loop over hidden
% nodes
%
% Hmax <= Hub = (Ntrn*O-O)/(d+O+1) % 54
%
% and initial RNG states (varies initial weights)
Hope this helps.
Greg
2 comments
Greg Heath
31 May 2016 (edited)
I have found 4 errors in the previous post:
1. d = 34
2. DIVIDEBLOCK WAS NOT IMPLEMENTED.
3. Significant lags were obtained via zt instead of ztrn.
4. The RNG was not initialized before training.
I will replace it ASAP.
Greg
More Answers (4)
Greg Heath
1 Jun 2016 (edited)
close all, clear all, clc, plt=0, tic
% FOR UNBIASED PREDICTION OF FUTURE SOLAR FLARE EVENTS,
% 70/15/15 DIVIDEBLOCK DATADIVISION IS USED TO INSURE
% 1. UNIFORM TIME SAMPLING
% 2. ONLY TRAINING DATA (THE FIRST 70%) IS EXPLICITLY
%    USED TO CALCULATE NETWORK PARAMETERS
% HOWEVER, PERFORMANCE ON VALIDATION DATA (THE FOLLOWING
% 15%) IS MONITORED TO STOP TRAINING WHEN GENERALIZATION TO
% GOOD PERFORMANCE ON NONTRAINING DATA BEGINS TO FAIL
T = solar_dataset ; t = cell2mat(T) ;
[ O N ] = size(t) % [ 1 2899 ]
zt = zscore(t,1); tzt = [ t ; zt ];
meantzt = mean( tzt' )' %[ 51.727 ; -9.38e-16 ]
vartzt1 = var( tzt', 1)' % [1937.3 ; 1 ]
minmaxtzt = minmax(tzt)
% minmaxtzt = 0 253.8
% -1.1752 4.5911
% Possible outliers
Nout3 = numel(find(abs(zt) > 3)) % 27 ( < 0.93%)
Nout4 = numel(find(abs(zt) > 4)) % 4 ( < 0.14%)
% NOTE: WILL CURRENTLY IGNORE OUTLIER MITIGATION
Ntst = round(0.15*N), Nval = Ntst % 435, 435
Ntrn = N-Nval-Ntst % 2029
trnind = 1:Ntrn; % 1 : 2029
valind = Ntrn+1:Ntrn+Nval; % 2030 : 2464
tstind = Ntrn+Nval+1:N; % 2465 : 2899
ttrn = t(trnind); tval = t(valind);ttst = t(tstind);
plt = plt+1, figure(plt), hold on
plot( trnind, ttrn,'b' )
plot( valind, tval,'g' )
plot( tstind, ttst, 'r' )
legend('TRAIN','VAL','TEST')
title( ' SOLAR DATASET TIMESERIES ' )
rng('default');
Ltrn = floor(0.95*(2*Ntrn-1)) % 3854
imax = 200
for i = 1:imax
n = randn(1,Ntrn);
autocorrn = nncorr( n,n, Ntrn-1, 'biased');
sortabsautocorrn = sort(abs(autocorrn));
thresh95(i) = sortabsautocorrn(Ltrn);
end
% imax = 100 200
minthresh95 = min(thresh95) % 0.0293 0.0293
medthresh95 = median(thresh95)% 0.0325 0.0325
meanthresh95 = mean(thresh95) % 0.0325 0.0325
stdthresh95 = std(thresh95) % 0.0015 0.0014
maxthresh95 = max(thresh95) % 0.0355 0.0355
sigthresh95 = meanthresh95 % 0.0325
zttrn = zscore(ttrn',1)';
autocorrttrn = nncorr( zttrn, zttrn, Ntrn-1, 'biased' );
siglag95 = find(abs(autocorrttrn(Ntrn+1:2*Ntrn-1)) ...
>= sigthresh95);
Nsiglag95 = length(siglag95) % 1327 ~ 0.65*Ntrn
d = min(find(diff(siglag95)>1))% 35
proof = siglag95(d-1:d+1) % 34 35 39
plt = plt+1; figure(plt); hold on
plot(0:Ntrn-1, -sigthresh95*ones(1,Ntrn),'b--' );
plot(0:Ntrn-1, zeros(1,Ntrn),'k');
plot(0:Ntrn-1, sigthresh95*ones(1,Ntrn),'b--' );
plot(0:Ntrn-1, autocorrttrn(Ntrn:2*Ntrn-1));
plot( siglag95, autocorrttrn(Ntrn + siglag95), 'ro' );
title('SIGNIFICANT AUTOCORRELATIONS VALUES')
to = t(d+1:N);
No = N-d % 2864
Notst = round(0.15*No) % 430
Noval = Notst % 430
Notrn = No-Noval-Notst % 2004
H = 10 % Default
neto = narnet(1:d,H);
neto.divideFcn = 'divideblock';
[ Xo, Xoi, Aoi, To ] = preparets(neto,{},{},T);
to = cell2mat(To); varto1 = var(to,1) % 1948.6
% Desired: the number of unknown weights Nw = (d+1)*H+(H+1)*O should not exceed the number of training equations Notrneq = Notrn*O ==> H <= Hub, where
Hub = (Notrn*O-O)/(d+O+1) % 54.14
Hmin = 10, dH = 10, Hmax = 50
Ntrials = 10
rng('default')
j = 0
for h = Hmin:dH:Hmax
j = j+1
neto.layers{1}.dimensions = h;
for i = 1:Ntrials
neto = configure(neto,Xo,To);
[neto tro Yo Eo Xof Aof ]=train(neto,Xo,To,Xoi,Aoi);
% [Yo,Xof,Aof] = neto(Xo,Xoi,Aoi); Eo = gsubtract(To,Yo);
NMSEo(i,j) = mse(Eo)/varto1;
NMSEtrno(i,j) = tro.best_perf/varto1;
NMSEvalo(i,j) = tro.best_vperf/varto1;
NMSEtsto(i,j) = tro.best_tperf/varto1;
end
end
plotresponse(To,Yo)
NumH = Hmin:dH:Hmax
Rsqo = 1-NMSEo
Rsqtrno = 1-NMSEtrno
Rsqvalo = 1-NMSEvalo
Rsqtsto = 1-NMSEtsto
totaltime = toc
% Rsqo
% H=10 20 30 40 50
% 0.877 0.881* 0.878 0.790 0.775
% 0.880* 0.883* 0.776 0.744 0.794
% 0.858 0.847 0.768 0.650 0.834
% 0.436 0.886* 0.681 0.859 0.762
% 0.879 0.809 0.516 0.803 0.819
% 0.878 0.714 0.883* 0.646 0.880*
% 0.881* 0.881* 0.860 0.841 0.781
% 0.879 0.864 0.810 0.881* 0.778
% 0.887* 0.879 0.641 0.882* 0.549
% 0.875 0.869 0.883* 0.657 0.714
%
% Rsqtrno
%H=10 20 30 40 50
% 0.900 0.907 0.915 0.877 0.870
% 0.906 0.906 0.905 0.906 0.948
% 0.901 0.939 0.902 0.892 0.865
% 0.902 0.915 0.933 0.929 0.956
% 0.907 0.871 0.904 0.884 0.902
% 0.905 0.895 0.903 0.968 0.903
% 0.899 0.897 0.929 0.950 0.911
% 0.913 0.928 0.839 0.897 0.953
% 0.916 0.896 0.738 0.900 0.883
% 0.902 0.911 0.929 0.943 0.851
%
% Rsqvalo
%H=10 20 30 40 50
% 0.865 0.860 0.833 0.788 0.744
% 0.863 0.860 0.769 0.689 0.627
% 0.835 0.780 0.769 0.742 0.831
% 0.756 0.860 0.780 0.763 0.631
% 0.866 0.811 0.584 0.729 0.748
% 0.864 0.578 0.863 0.553 0.862
% 0.872* 0.875* 0.839 0.771 0.730
% 0.854 0.777 0.797 0.874* 0.770
% 0.855 0.874* 0.553 0.875* 0.687
% 0.865 0.856 0.821 0.730 0.678
%
% Rsqtsto
%H=10 20 30 40 50
% 0.785 0.777 0.752 0.388 0.359
% 0.777 0.801* 0.175 0.039 0.243
% 0.679 0.484 0.140 -0.570 0.688
% -2.06 0.778 -0.597 0.627 -0.014
% 0.764 0.510 -1.363 0.497 0.503
% 0.768 0.006 0.806* -0.762 0.786
% 0.807* 0.814* 0.559 0.406 0.223
% 0.745 0.650 0.685 0.814* -0.031
% 0.783 0.805* 0.274 0.805* -1.150
% 0.758 0.689 0.730 -0.752 0.113
%
% These test Rsquare values are not high enough to obtain
% a useful closed loop configuration by just closing the loop.
% The next step is to continue training the net after closing
% the loop and, if unsuccessful, to consider training the CL
% configuration with a new set of initial random weights.
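A minimal sketch of that next step, reusing neto, T and varto1 from the script above (my reconstruction based on the comments; the 0.01 target corresponds to NMSEc <= 0.01 from the earlier answer):
netc = closeloop(neto); % close the loop on the trained open-loop net
[Xc, Xci, Aci, Tc] = preparets(netc, {}, {}, T);
[netc, trc, Yc, Ec] = train(netc, Xc, Tc, Xci, Aci); % continue training in CL
NMSEc = mse(Ec)/varto1
if NMSEc > 0.01 % if unsuccessful, retry with new random initial weights
    netc = configure(netc, Xc, Tc);
    [netc, trc, Yc, Ec] = train(netc, Xc, Tc, Xci, Aci);
    NMSEc = mse(Ec)/varto1
end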
Hope this helps.
Thank you for formally accepting my answer
Greg
0 comments
Greg Heath
27 May 2016 (edited 28 May 2016)
For UNBIASED FUTURE PREDICTIONS of the SOLAR_DATASET, I see the
initial attack structured as follows:
1. DIVIDEBLOCK DATADIVISION
2. DEFAULT 0.7/0.15/0.15 TRN/VAL/TST datadivision ratios
3. TRN subset is known. VAL and TST subsets are unknown.
4. STATIONARY SUMMARY STATISTICS so that the
min/median/mean/max/stdv/siglags
of VAL & TST subsets are ASSUMED to be the same as those of the
TRN subset.
5. Emphasis on estimating a good combination of feedback delays FD =
1:d and number of hidden nodes, H, to obtain Rsqc >= 0.99
(i.e., NMSEc <= 0.01 ).
6. TRAINBR is only considered when the number of training equations,
Ntrneq ~ 0.7*Ntrn*O, is not sufficiently larger than the number of
unknown weights Nw = (d+1)*H+(H+1)*O (a worked instance follows this list).
7. Although performance function MSEREG is obsolete and has been
replaced by the REGULARIZATION OPTION in MSE, it is STILL
AVAILABLE! Unfortunately, there is a BUG in TRAINBR that does not
allow the use of a VAL subset. I say unfortunate because recent
investigations have shown that the combination can achieve
superior results!
8. My initial investigations have shown that
a. The summary stats are not stationary.
b. There are outliers which may be significantly detrimental.
9. The next steps are
a. Estimate the significant lags of the TRN subset
b. Try to find combinations of d and H to minimize Nw subject to
the OL (i.e., OPEN LOOP) performance constraint Rsqo >= 0.999
( i.e., NMSEo <= 0.001 ).
c. Determine what has to be done to the OL & CL (CLOSELOOP)
configurations to optimize the CL performance. If necessary,
consider outlier modifications.
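A worked instance of the check in point 6, using the solar-dataset numbers that appear in this thread (d = 35 and Ntrn = 2029 are from the earlier answer; H = 10 is the default):
O = 1; Ntrn = 2029; d = 35; H = 10;
Ntrneq = Ntrn*O % 2029 training equations
Nw = (d+1)*H + (H+1)*O % 371 unknown weights
Hub = (Ntrneq - O)/(d + O + 1) % ~54.8, upper bound on H
% Ntrneq >> Nw at H = 10, so TRAINBR would not be needed here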
Hope this helps.
Thank you for formally accepting my answer
Greg
0 comments
Staffan
11 Aug 2016 (edited 12 Aug 2016)
4 comments
Greg Heath
23 Aug 2016
% Dividetrain --> I've put some hours into this; this method does not increase the accuracy of the prediction. I don't think it is impossible to see improvement using dividetrain; however, I see three detriments: (1) increased likelihood of overfitting, (2) increased computational time, and (3) only marginal room for improvement compared to divideint.
(1) I have been operating under the edict: For unbiased timeseries prediction
max(indtrn,indval) < min(indtst)
So far, no success. Frustrated, I have cheated and looked at DIVIDETRAIN mainly to find out why everything is failing. Surprisingly, even this has failed.
Now I'm pretty sure that the basic problems are
a. Nonstationarity
b. Insufficient use of significant delays.
c. I'm on vacation and my wife wants to denerd me for a week.
(2) Overfitting is never a problem for me because I always
a. Compare the number of training equations Ntrneq with the number of
unknown weights Nw (Search NEWSREADER and ANSWERS using Ntrneq Nw).
b. Try to minimize the number of hidden nodes subject to a maximum allowed
error
(3). So far training time has been no problem at all.
(4). With default delays, nothing works for me. I think skipping the estimation of significant delays is my biggest mistake.
% Divideint 34/33/33 --> At this early stage I still like to produce a prediction that I can visualise. I think parameters such as NMSE or R2 for tst values are applicable, but such parameters would be more useful in an optimization (at a later stage, so to say). If I can visualise a prediction in a graph, I can analyse both the accuracy of the wavelengths and the amplitudes in one go.
% DIVIDE into stationary subsections --> Are you here thinking about e.g. using divideblock 44/26/30 and making sure that the data is (more) stationary within each of these divisions? Regarding stationary data, do you have a suggestion for what I should read on this subject? I get the concept that it is not easy to perform predictions on data containing larger trends, and I have brushed up on the Dickey-Fuller test (from wiki, mostly), but I think there is a better reference, more applicable to neural networks, than the ones I have come across. (A crude block-wise check is sketched below.)
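A crude block-wise stationarity check of the kind discussed here might look as follows (my sketch; the block count nb is arbitrary):
T = solar_dataset; t = cell2mat(T); N = length(t);
nb = 4; % number of equal-length blocks
tb = reshape(t(1:nb*floor(N/nb)), [], nb); % one block per column
blockmeans = mean(tb) % drifting block means suggest nonstationarity
blockvars = var(tb, 1) % as do strongly varying block variances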
My experience with trend removal has been limited to polynomials and the equivalent of replacing variables with differences.
% Difference the series to reduce non-stationarity --> Are you here suggesting that I should perform the NN calculation on the diff of the dataset? Simply:
T = solar_dataset;
t = cell2mat(T);
t_diff = diff(t);
T_diff = num2cell(t_diff); % the original post had num2cell(t) here, presumably a typo
% ... and so forth, using t_diff and T_diff instead of t and T
Yes. This is standard procedure. In fact, even higher-order differences are used (first differences remove linear trends, second differences remove quadratic trends, etc.).
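Note that predictions made on the differenced series have to be integrated back to the original scale; the inverse of diff is a cumulative sum. A minimal sketch (my addition, not from the thread):
t = cell2mat(solar_dataset);
t_diff = diff(t); % first difference removes a linear trend
t_rec = [t(1), t(1) + cumsum(t_diff)]; % rebuild the series from t(1) and the differences
max(abs(t_rec - t)) % ~0: the reconstruction is exact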
Frustrating problem.
More later
Greg
Shashank Bhatia
12 Aug 2018
I think this thread has the most complete material from Greg and Staffan. Thank you both for this deep discussion of a topic on which MathWorks has failed to give appropriate examples.
Staffan, it would be best if you could post your final code for this example.
Greg, please add anything if you have any further developments on this example.
I am trying to compile all the relevant NARNET information from Greg's various threads into one post that covers the most ground.
Thanks once again.
Sincerely, Shashank
Saira AlZadjali
14 Oct 2018
Dear Greg,
I am working on almost the same thing: predicting wind speed using narnet. It has been a long time, and I have made lots of changes to the network parameters, but they all give almost the same accuracy or worse. So I read your answers here in more depth and learned that we need to do the autocorrelation to get the FD. I followed the method above for my dataset, but I am getting a very large d value. Can you help me find a suitable delay and number of hidden neurons? I am getting open-loop MSE in the range 0.34xx to 0.38xx for all the tries I made with different delays, hidden neurons, training functions and algorithms. Finally I came to this autocorrelation method, but here I am stuck again: d is 8445, which is a very large number. Please advise.
data_1= xlsread('G:\SQU2\Solar_Wind_Project\MAtlabCodes\data');
T = data_1 (1:79146,2);
T = T'; % make it a row vector
t = T; plt = 0; % plt was not initialized in the posted excerpt
[ O N ] = size(t) % [ 1 79146 ]
zt = zscore(t,1); tzt = [ t ; zt ];
meantzt = mean( tzt' )' %[ 6.2702 ; 0.0000 ]
vartzt1 = var( tzt', 1)' % [10.4444 ; 1.0000 ]
minmaxtzt = minmax(tzt)
% minmaxtzt = 0.2100 18.1810
% -1.8752 3.6855
% Possible outliers
Nout2 = numel(find(abs(zt) > 2)) % 3305
Nout3 = numel(find(abs(zt) > 3)) % 221
% NOTE: WILL CURRENTLY IGNORE OUTLIER MITIGATION
Ntst = round(0.15*N), Nval = Ntst % 11872, 11872
Ntrn = N-Nval-Ntst % 55402
trnind = 1:Ntrn ; % 1 : 55402
valind = Ntrn+1:Ntrn+Nval; % 55403 : 67274
tstind = Ntrn+Nval+1:N; % 67275 : 79146
ttrn = t(trnind); tval = t(valind);ttst = t(tstind);
plt = plt+1, figure(plt), hold on
plot( trnind, ttrn,'b' )
plot( valind, tval,'g' )
plot( tstind, ttst, 'r' )
legend('TRAIN','VAL','TEST')
title( ' WIND SPEED DATASET TIMESERIES ' )
rng('default');
Ltrn = floor(0.95*(2*Ntrn-1)) % 105262
imax = 100
for i = 1:imax
n = randn(1,Ntrn);
autocorrn = nncorr( n,n, Ntrn-1, 'biased');
sortabsautocorrn = sort(abs(autocorrn));
thresh95(i) = sortabsautocorrn(Ltrn);
end
% imax = 100
minthresh95 = min(thresh95) % 0.0061
medthresh95 = median(thresh95)% 0.0062
meanthresh95 = mean(thresh95) % 0.0062
stdthresh95 = std(thresh95) % 5.1053e-05
maxthresh95 = max(thresh95) % 0.0064
sigthresh95 = meanthresh95 % 0.0062
zttrn = zscore(ttrn',1)';
autocorrttrn = nncorr( zttrn, zttrn, Ntrn-1, 'biased' );
siglag95 = find(abs(autocorrttrn(Ntrn+1:2*Ntrn-1)) ...
>= sigthresh95);
Nsiglag95 = length(siglag95) % 52291 ~ 0.9438*Ntrn
d = min(find(diff(siglag95)>1))% 8445
proof = siglag95(d-1:d+1) % 8444 8445 8470
plt = plt+1; figure(plt); hold on
plot(0:Ntrn-1, -sigthresh95*ones(1,Ntrn),'b--' );
plot(0:Ntrn-1, zeros(1,Ntrn),'k');
plot(0:Ntrn-1, sigthresh95*ones(1,Ntrn),'b--' );
plot(0:Ntrn-1, autocorrttrn(Ntrn:2*Ntrn-1));
plot( siglag95, autocorrttrn(Ntrn + siglag95), 'ro' );
title('SIGNIFICANT AUTOCORRELATIONS VALUES')
to = t(d+1:N);
No = N-d % 70701
Notst = round(0.15*No) % 10605
Noval = Notst % 10605
Notrn = No-Noval-Notst % 49491
0 comments