Hyperparameter tuning with fitclinear
Hello Matlab community,
I would like to run an SVM classification on my high-dimensional data. I decided to use fitclinear to do so. I would like to tune lambda.
What I don't understand is the cross-validation that takes place in the HyperparameterOptimizationOptions field.
'MaxObjectiveEvaluations' defaults to 30 and 'Kfold' defaults to 5. In my script, I choose to tune lambda, and the result is a ranked list of 30 lambda values. I do not understand where exactly the cross-validation happens.
Here is a simplified example of my code:
% 1. Load data
x = data.data;
y = labels;
% 2. CV partition
CV = cvpartition(data.sex, 'KFold', 5);
for i = 1:5
    x_train = x(CV.training(i), :);
    y_train = y(CV.training(i));
    x_test = x(CV.test(i), :);
    y_test = y(CV.test(i));
    % 3. Normalization (test data uses the training center and scale)
    [x_train_norm, C, S] = normalize(x_train);
    x_test_norm = normalize(x_test, 'center', C, 'scale', S);
    % 4. Hyperparameter (lambda) tuning
    VariableDescriptions = hyperparameters('fitclinear', x_train_norm, y_train);
    [mdl, ~, HyperparameterOptimizationResults] = fitclinear(x_train_norm', y_train, ...
        'ObservationsIn', 'columns', 'OptimizeHyperparameters', VariableDescriptions(1,1), ...
        'HyperparameterOptimizationOptions', struct('Optimizer', 'randomsearch', 'AcquisitionFunctionName', ...
        'expected-improvement-plus', 'Verbose', 0));
    % I am choosing 'OptimizeHyperparameters', VariableDescriptions(1,1)
    % here because I only want to tune Lambda
    % 5. Find best lambda out of the 30 MaxObjectiveEvaluations
    idx = find(HyperparameterOptimizationResults.Rank == 1);
    lambda = HyperparameterOptimizationResults.Lambda(idx);
    % 6. Train final SVM model
    finalModel = fitclinear(x_train_norm', y_train, 'ObservationsIn', 'columns', ...
        'Lambda', lambda);
    % 7. Predict labels for test data
    [predictionsY, scores] = predict(finalModel, x_test_norm);
end
In this example, when the hyperparameter tuning happens in Step 4, is x_train_norm further split into 5 training/test groups? And are the 30 lambdas then evaluated using these 5 training/test groups of x_train_norm? Is this process equivalent to nested cross-validation?
I appreciate the help!
Best,
Nasia
Answers (1)
Drew
23 Aug 2023
The short answer is yes. The code you shared is doing "nested cross-validation": your cvpartition loop is the outer level, and the hyperparameter optimization inside fitclinear is the inner level, because HyperparameterOptimizationOptions uses 'Kfold',5 cross-validation by default. Each of the 30 candidate lambda values is therefore scored by its 5-fold cross-validation loss on x_train_norm only. This is documented at https://www.mathworks.com/help/stats/fitclinear.html.
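To make the two levels easier to see, here is a minimal sketch with the inner 'Kfold' setting written out instead of left at its default. It is an illustration under assumptions, not your exact pipeline: the data are synthetic, the variable names (X, y, outerCV, outerLoss) are made up for the example, and it needs Statistics and Machine Learning Toolbox.

```matlab
% Sketch only: synthetic data stands in for the real data set.
rng(0);                                    % reproducible folds and random search
n = 200; p = 50;                           % hypothetical sample and feature counts
X = randn(n, p);
y = double(X(:,1) + 0.5*randn(n,1) > 0);   % hypothetical binary labels

% Outer level: 5 folds used only to estimate generalization error.
outerCV = cvpartition(y, 'KFold', 5);
outerLoss = zeros(outerCV.NumTestSets, 1);

for i = 1:outerCV.NumTestSets
    trainIdx = training(outerCV, i);
    testIdx  = test(outerCV, i);

    % Normalize with statistics from the outer-training fold only.
    [XtrainNorm, C, S] = normalize(X(trainIdx, :));
    XtestNorm = normalize(X(testIdx, :), 'center', C, 'scale', S);

    % Inner level: each of the 30 candidate Lambda values is scored by
    % 5-fold cross-validation on the outer-training data alone.
    % 'Kfold',5 and 'MaxObjectiveEvaluations',30 are the defaults,
    % stated explicitly here for clarity.
    mdl = fitclinear(XtrainNorm, y(trainIdx), ...
        'Learner', 'svm', ...
        'OptimizeHyperparameters', {'Lambda'}, ...
        'HyperparameterOptimizationOptions', struct( ...
            'Optimizer', 'randomsearch', ...
            'MaxObjectiveEvaluations', 30, ...
            'Kfold', 5, ...
            'ShowPlots', false, ...
            'Verbose', 0));

    % Classification error on the held-out outer fold, which the
    % inner search never saw.
    outerLoss(i) = loss(mdl, XtestNorm, y(testIdx));
end

meanOuterLoss = mean(outerLoss);           % nested-CV error estimate
```

As far as I understand the documentation, the model returned by fitclinear after optimization is already trained on all of the data passed in with the best lambda found, so Steps 5 and 6 in your code may be unnecessary. Please check that against the doc page for your release.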
If this answer is helpful for you, please remember to accept the answer.