Linear regression on training set
I have some data that I want to divide into a training set and a validation set, so that I can do linear regression on the training set to find y0 and r. The training set should contain at least 50% of the data. My code so far is below:
A=[130, 300, 400, 500, 650, 1075, 2222, 2550, 3300]';
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008];
idx=randperm(numel(A))
subSet1=A(idx(1:5)) %Trainingset
subSet2=A(idx(6:end)) %Validationset
If I can assume the function is exponential, y(t) = y0*e^(r*t), how do I continue: how do I plot the training set and find y0 and r?
Thankful for all help!
9 Comments
J. Alex Lee
10 Sep 2020
you already identified that your regression can be made into linear form, so that's already a big hint for you...
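In other words, the hint is that taking the natural log of both sides turns the exponential model into a straight line:

ln(y) = ln(y0) + r*t

which is linear in t with slope r and intercept ln(y0), so an ordinary linear fit recovers both parameters.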
katara
10 Sep 2020
Johannes Hougaard
10 Sep 2020
The five t values corresponding to the randomly chosen A values are selected by applying the idx vector the same way you do for A.
A=[130, 300, 400, 500, 650, 1075, 2222, 2550, 3300]';
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008];
idx=randperm(numel(A));
subSet1=A(idx(1:5)); %Trainingset
subSet2=A(idx(6:end)); %Validationset
t1 = t(idx(1:5)); %t values for Trainingset
y=log(subSet1);
c=polyfit(t1,y, 1)
r=c(1);
lny0=c(2);
y0=exp(c(2));
y2 = y0*exp(r*t);
plot(t,y2,'*')
And to apply your polyfit result you could just use polyval.
% Or you could use
y2 = exp(polyval(c,t));
plot(t,y2);
Johannes has the right approach (maybe it can be written as an answer). It can be generalized to any size dataset using
idx = randperm(numel(A));
nTrain = ceil(numel(A)/2);
% nTest = numel(A)-nTrain; % if needed
trainIdx = idx(1:nTrain);
testIdx = idx(nTrain+1:end);
trainSet = [A(trainIdx); t(trainIdx)]; % assuming A and t are row vectors
testSet = [A(testIdx); t(testIdx)]; % same assumption
% Then proceed with fitting on the trainSet and measuring
% error on the testSet
Also note that if you're planning on using a more rigorous cross validation, use cvpartition to partition your data.
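A rough sketch of how cvpartition could be used here, assuming the Statistics and Machine Learning Toolbox is available (the 40% hold-out fraction is just an example):

cv = cvpartition(numel(A), 'HoldOut', 0.4); % ~60% training, ~40% validation
trainA = A(training(cv)); trainT = t(training(cv));
testA = A(test(cv)); testT = t(test(cv));
c = polyfit(trainT(:), log(trainA(:)), 1); % fit on the training set
err = testA(:) - exp(polyval(c, testT(:)));
rmse = sqrt(mean(err.^2)) % error on the validation set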
katara
10 Sep 2020
J. Alex Lee
10 Sep 2020
You just need to exponentiate the result of polyval (remember you took the log), and I would wager the plot you really want is
plot(t,A,'*',t,exp(polyval(c,t)))
Or if I may:
A=[130, 300, 400, 500, 650, 1075, 2222, 2550, 3300];
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008];
idx=randperm(numel(A));
subSet1=A(idx(1:5)); %Trainingset
subSet2=A(idx(6:end)); %Validationset
t1=t(idx(1:5)); %t values for Trainingset
t2=t(idx(6:end)); %t values for Validationset
y=log(subSet1);
c=polyfit(t1,y, 1)
p=polyval(c,t);
r=c(1);
y0=exp(c(2));
yMdlFn = @(t)(y0*exp(r*t));
% to evaluate on test set
yMdlTest = yMdlFn(t2)
% more comprehensive plot
figure(1); cla; hold on
plot(t1,subSet1,'*')
plot(t2,subSet2,'o')
fplot(yMdlFn,[1929,2009])
But I'd also recommend implementing Adam's generalization to arbitrarily large data sets partitioned into arbitrarily sized training and test sets (although I think the code as posted doesn't work).
Image Analyst
10 Sep 2020
If you want a log fit, use fitnlm() rather than polyfit().
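A sketch of that approach (fitnlm is in the Statistics and Machine Learning Toolbox; the starting values b0 here are just guesses, e.g. seeded from the earlier polyfit coefficients c, and centering the years, e.g. t - 1930, may help the solver converge):

modelFun = @(b, t) b(1) .* exp(b(2) .* t); % y = y0*exp(r*t)
b0 = [exp(c(2)), c(1)]; % initial [y0, r] from the log-linear polyfit
mdl = fitnlm(t1(:), subSet1(:), modelFun, b0);
yPred = predict(mdl, t2(:)); % evaluate on the validation set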
J. Alex Lee
10 Sep 2020
I would take linear least squares anywhere I can get it, including in this situation. Linear fitting doesn't require initial guesses, is guaranteed to give a "result", and is faster. You could then use the result of the polyfit to do a nonlinear fit, if you want to define the least squares differently. But you're still left with a choice of how to define your residual anyway, so you have a lot more things to worry about if you care to that level with nonlinear fitting.
Answers (1)
Johannes Hougaard
11 Sep 2020
The five t values corresponding to the randomly chosen A values are selected by applying the idx vector the same way you do for A.
A=[130, 300, 400, 500, 650, 1075, 2222, 2550, 3300]';
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008];
idx=randperm(numel(A));
subSet1=A(idx(1:5)); %Trainingset
subSet2=A(idx(6:end)); %Validationset
t1 = t(idx(1:5)); %t values for Trainingset
y=log(subSet1);
c=polyfit(t1,y, 1)
r=c(1);
lny0=c(2);
y0=exp(c(2));
y2 = y0*exp(r*t);
plot(t,y2,'*')
And to apply your polyfit result you could just use polyval.
% Or you could use
y2 = exp(polyval(c,t));
plot(t,y2);