Matlab: Error using classreg.learning.FitTemplate/fit with hyperparameter optimization of SVM
23 ビュー (過去 30 日間)
古いコメントを表示
I am using Bayesian optimization (bayesopt function) in Matlab for hyperparameter optimization of SVM classifier. The optimization goal is to minimize 10-fold cross validation error. Here is the code that I use:
KernelFlag = 1;
c = cvpartition(size(XTrain,1),'KFold',10);
sigma = optimizableVariable('sigma',[1e-5,1e5],'Transform','log');
box = optimizableVariable('box',[1e-5,1e5],'Transform','log');
polyOrder = optimizableVariable('polyOrder',[2,4]);
fun = @(z)mysvmfunTest(z,XTrain,yTrain,c,classNames,KernelFlag);
results = bayesopt(fun,[sigma,box,polyOrder],'IsObjectiveDeterministic',true,...
'PlotFcn',{@plotMinObjective},...
'AcquisitionFunctionName','expected-improvement-plus');
and mysvmfunTest:
function [objective] = mysvmfunTest(z,X,Y,c,classNames,KernelFlag)
if KernelFlag == 1
t = templateSVM('Standardize',1,'KernelFunction','RBF',...
'BoxConstraint',z.box,'KernelScale',z.sigma,'RemoveDuplicates',true);
elseif KernelFlag == 2
t = templateSVM('Standardize',1,'KernelFunction','polynomial',...
'BoxConstraint',z.box,'KernelScale',z.sigma,'PolynomialOrder',z.polyOrder,...
'RemoveDuplicates',true);
else
t = templateSVM('Standardize',1,'KernelFunction','linear',...
'BoxConstraint',z.box,'KernelScale',z.sigma,...
'RemoveDuplicates',true);
end
SVMModel = fitcecoc(X,Y,'Learners',t,'ClassNames',classNames);
cvModel = crossval(SVMModel,'CVPartition',c);
objective = kfoldLoss(cvModel);
I have used this code before, with different datasets. But, lately when I try to use it on a new dataset, it throws me an error:
Error using classreg.learning.FitTemplate/fit (line 249) You passed a cvpartition object for 27152 observations, but the input data have only 10395 observations. Some observations may have been removed because they have NaN values for all predictors, missing response values or zero weights. When cross-validating an existing object, consider using the RowsUsed property to determine what size partition is required.
I checked all the data, there is no nan, or missing values in my data. I even removed all the samples which have any feature between 0 and .01 (all my features are positive). Still have the same problem and get the same error. I guess the error is due to the existence of samples that are perhaps too close, resulting into removal of many of the observations, but I am not sure that is the case. Any idea where this error might come from or any suggestion how I can solve this issue?
0 件のコメント
採用された回答
Ilya
2018 年 10 月 26 日
You are passing ClassNames to fitcecoc - are your ClassNames a subset of all class names you have in yTrain?
Train one ECOC model using
SVMModel = fitcecoc(XTrain,yTrain,'Learners',t,'ClassNames',classNames);
and look at the size of property X in SVMModel. Does it have as many rows as XTrain does?
その他の回答 (1 件)
Don Mathis
2018 年 10 月 26 日
編集済み: Don Mathis
2018 年 10 月 26 日
Maybe your use of 'RemoveDuplicates' is causing observations to be removed?
I ran your code on some synthetic data that has no duplicates in XTrain and it works fine:
XTrain = rand(1000,10);
yTrain = categorical(round(XTrain(:,1)*3));
classNames = categories(yTrain);
KernelFlag = 1;
c = cvpartition(size(XTrain,1),'KFold',10);
sigma = optimizableVariable('sigma',[1e-5,1e5],'Transform','log');
box = optimizableVariable('box',[1e-5,1e5],'Transform','log');
polyOrder = optimizableVariable('polyOrder',[2,4]);
fun = @(z)mysvmfunTest(z,XTrain,yTrain,c,classNames,KernelFlag);
results = bayesopt(fun,[sigma,box,polyOrder],'IsObjectiveDeterministic',true,...
'PlotFcn',{@plotMinObjective},...
'AcquisitionFunctionName','expected-improvement-plus');
function [objective] = mysvmfunTest(z,X,Y,c,classNames,KernelFlag)
if KernelFlag == 1
t = templateSVM('Standardize',1,'KernelFunction','RBF',...
'BoxConstraint',z.box,'KernelScale',z.sigma,'RemoveDuplicates',true);
elseif KernelFlag == 2
t = templateSVM('Standardize',1,'KernelFunction','polynomial',...
'BoxConstraint',z.box,'KernelScale',z.sigma,'PolynomialOrder',z.polyOrder,...
'RemoveDuplicates',true);
else
t = templateSVM('Standardize',1,'KernelFunction','linear',...
'BoxConstraint',z.box,'KernelScale',z.sigma,...
'RemoveDuplicates',true);
end
SVMModel = fitcecoc(X,Y,'Learners',t,'ClassNames',classNames);
cvModel = crossval(SVMModel,'CVPartition',c);
objective = kfoldLoss(cvModel);
end
By the way, it's probably best to declare polyOrder to be an integer:
polyOrder = optimizableVariable('polyOrder',[2,4],'Type','integer');
参考
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!