How to do cross-validation with PLS feature extraction before SVM?

Juuso Korhonen, 17 June 2021
Commented: Rishik Ramena, 30 August 2021
Hi,
I would like to know the best way to do cross-validation with a pipeline where PLS feature extraction is done before fitting an SVM. Here is my current code:
% Hold-out split (train: 80%, test: 20%)
rng default;
cv = cvpartition(size(X,1),'HoldOut',0.2); % 0.2 = fraction held out for testing
idx = test(cv);
% Separate to training and test data
XTrain = X(~idx,:);
YTrain = Y(~idx, :);
XTest = X(idx,:);
YTest = Y(idx, :);
n_components = 10; % We should optimize this
[XL,yl,XS,YS,beta,PCTVAR, MSE, stats] = plsregress(XTrain,YTrain,n_components);
W = stats.W;
SVMModel = fitcsvm(XS,YTrain,'Standardize',false,'KernelFunction','rbf',...
'KernelScale','auto'); % I would like to have parameter optimization here
% PLS does centering of the data, X0 = X - mean(X)
% XS = X0 * W
XS_test = (XTest - mean(XTrain)) * W;
YPred = predict(SVMModel, XS_test);
accuracy = sum(YPred == YTest)/length(YPred)
Using fitcsvm(..., 'OptimizeHyperparameters', 'all') isn't suitable here: plsregress is fitted on the whole of XTrain to obtain XS, so information leaks between the k folds during the internal cross-validation. Is there a hyperparameter optimization function in MATLAB that can take the whole PLS+SVM pipeline as the fitting function?
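To make the leakage concern concrete, here is a minimal sketch of a leak-free loop in which plsregress is refitted on the training part of every fold and the held-out part is projected with that fold's mean and weights. It is only an illustration: the synthetic X and Y, the five folds and the component grid 1:10 are assumptions, not values from my real data, and the SVM settings are simply copied from the code above.

```matlab
% Sketch only: synthetic stand-in data replaces the real X and Y.
rng default;
X = randn(200, 30);
Y = double(X(:,1) + X(:,2) > 0);   % two classes, coded 0 and 1

K = 5;                             % number of folds (assumed)
cvp = cvpartition(Y, 'KFold', K);  % stratified k-fold partition
compGrid = 1:10;                   % candidate numbers of PLS components (assumed)
cvErr = zeros(size(compGrid));

for c = 1:numel(compGrid)
    nWrong = 0;
    for k = 1:K
        trIdx = training(cvp, k);
        teIdx = test(cvp, k);
        XTr = X(trIdx,:);
        YTr = Y(trIdx);
        % Fit PLS on the training fold only, so the held-out fold never
        % influences the weights W or the centering mean.
        [~,~,XS,~,~,~,~,stats] = plsregress(XTr, YTr, compGrid(c));
        mu = mean(XTr, 1);
        mdl = fitcsvm(XS, YTr, 'Standardize', false, ...
            'KernelFunction', 'rbf', 'KernelScale', 'auto');
        % Project the held-out fold with the training-fold mean and weights
        XSte = (X(teIdx,:) - mu) * stats.W;
        nWrong = nWrong + sum(predict(mdl, XSte) ~= Y(teIdx));
    end
    cvErr(c) = nWrong / numel(Y);  % cross-validated error for this grid point
end

[bestErr, iBest] = min(cvErr);
bestComp = compGrid(iBest);        % component count with the lowest CV error
```

This only tunes n_components by grid search; the SVM hyperparameters would need the same treatment inside the loop, which is why I am asking for something more general.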
1 Comment
Rishik Ramena, 30 August 2021
Yes, your analysis is correct. The hyperparameter optimization built into fitcsvm isn't suitable here because of the information leakage between the k folds. There are no built-in hyperparameter optimization functions in MATLAB that can be used for the whole PLS+SVM setup.
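One possible workaround, sketched here under assumptions rather than offered as a tested recipe: since no single call covers the whole pipeline, the PLS+SVM steps can be wrapped in a user-written cross-validation loss and handed to a general-purpose optimizer such as bayesopt (Statistics and Machine Learning Toolbox). The helper name plsSvmCvLoss, the synthetic data, the variable ranges and the evaluation budget are all hypothetical placeholders. The code is meant to be saved as a script file, because the local function at the end requires R2016b or later.

```matlab
% Sketch only: placeholder data and search ranges, not tuned values.
rng default;
X = randn(200, 30);
Y = double(X(:,1) + X(:,2) > 0);    % two classes, coded 0 and 1
cvp = cvpartition(Y, 'KFold', 5);   % same folds reused for every evaluation

vNComp  = optimizableVariable('nComp',  [1 10],     'Type', 'integer');
vBoxC   = optimizableVariable('boxC',   [1e-2 1e2], 'Transform', 'log');
vKScale = optimizableVariable('kScale', [1e-2 1e2], 'Transform', 'log');
vars = [vNComp, vBoxC, vKScale];

% bayesopt passes a one-row table p whose columns are the variables above
objective = @(p) plsSvmCvLoss(p, X, Y, cvp);
results = bayesopt(objective, vars, 'MaxObjectiveEvaluations', 15, ...
    'IsObjectiveDeterministic', true, 'Verbose', 0, 'PlotFcn', []);
best = results.XAtMinObjective;     % one-row table: nComp, boxC, kScale

function loss = plsSvmCvLoss(p, X, Y, cvp)
    % Hypothetical helper: cross-validated misclassification rate with
    % PLS refitted inside each fold, so no information leaks between folds.
    nWrong = 0;
    for k = 1:cvp.NumTestSets
        tr = training(cvp, k);
        te = test(cvp, k);
        [~,~,XS,~,~,~,~,stats] = plsregress(X(tr,:), Y(tr), p.nComp);
        mdl = fitcsvm(XS, Y(tr), 'KernelFunction', 'rbf', ...
            'BoxConstraint', p.boxC, 'KernelScale', p.kScale, ...
            'Standardize', false);
        % Center with the training-fold mean, then apply the PLS weights
        XSte = (X(te,:) - mean(X(tr,:), 1)) * stats.W;
        nWrong = nWrong + sum(predict(mdl, XSte) ~= Y(te));
    end
    loss = nWrong / numel(Y);
end
```

The selected settings should still be checked on a separate hold-out set, such as the 20% test split in the question, because the cross-validated loss that drove the search is optimistically biased.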


Answers (0)

Release: R2021a
