Feature selection for SVM classifier

Jos Huigen, 25 June 2019
I am trying to have MATLAB perform feature selection so that I can train an SVM classifier on my data and check the best achievable performance for each number of features. In my script, I first test how well each feature separates the two groups ("healthy" and "sick") with two-sample t-tests. The t-tests already suggest which features should be best, since the feature with the lowest p-value should discriminate best between the groups, but I want the selection to be done by the sequentialfs command. The problem is that sequentialfs selects different features (genes) than I would have chosen from the p-values: my first-choice feature would be A, but the feature selection picks B. Could anyone check whether something is wrong with either the t-tests or the feature selection? I have attached the dataset matrix to this message. Any help is greatly appreciated!
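load samples1
% Binarize the class label from column 12: 0 = healthy (< 3), 1 = sick (>= 3)
ID = samples1(:,12);
ID(ID<3) = 0;
ID(ID>=3) = 1;
samples1(:,13) = ID;   % store the binary label in column 13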
%% Significance of the feature differences between the sick and healthy groups
sick = find(samples1(1:60,12)>=3);
healthy = find(samples1(1:60,12)<3);
sick2 = samples1(sick,:);
healthy2 = samples1(healthy,:);
% ttest2 tests each column separately; p(2:7) belong to the six features used below
[h,p,ci,stats] = ttest2(healthy2,sick2);
%% Train/test division
x_train = samples1(1:60,2:7);
y_train = samples1(1:60,13);
x_test = samples1(61:end,2:7);
y_test = samples1(61:end,13);
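%% CV partition (leave-one-out over the training observations)
c = cvpartition(numel(y_train),'LeaveOut');
%% Feature selection
opts = statset('Display','iter');
% Criterion: number of misclassified held-out observations for an RBF SVM
classf = @(XT, yT, Xt, yt) ...
    sum(predict(fitcsvm(XT, yT,'KernelFunction','rbf','KernelScale','auto'), Xt) ~= yt);
% 'nfeatures',6 forces all six features in, so fs is all true; read history for the selection order
[fs, history] = sequentialfs(classf, x_train, y_train, 'cv', c, 'options', opts,'nfeatures',6);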
%% Best hyperparameters
X_train_w_best_feature = x_train(:,fs);
Mdl = fitcsvm(X_train_w_best_feature,y_train,'KernelFunction','rbf','OptimizeHyperparameters','auto',...
    'HyperparameterOptimizationOptions',struct('AcquisitionFunctionName',...
    'expected-improvement-plus','ShowPlots',true)); % Bayesian optimization
%% Final test with test set
X_test_w_best_feature = x_test(:,fs);
test_accuracy_for_iter = sum((predict(Mdl,X_test_w_best_feature) == y_test))/length(y_test)*100
%% Extract error rate
label = predict(Mdl, X_test_w_best_feature)
L=loss(Mdl,X_test_w_best_feature,y_test)
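
For comparing the two rankings side by side, here is a minimal sketch, not a diagnosis of the attached data. ttest2 scores each feature on its own by the difference in group means, whereas sequentialfs scores candidate subsets by the cross-validated misclassification of the RBF SVM, so the two need not agree on the first feature. The synthetic matrix X, the labels y, and the names crit, ttestOrder and sfsOrder are assumptions made for illustration; they stand in for samples1 and are not part of the script above. Feature 2 is built with equal means but different variances, a case where a t-test on means may see little while an RBF SVM may still find a difference.

```matlab
% Sketch: univariate t-test ranking vs. the order in which sequentialfs adds features.
% Synthetic data stands in for samples1 (assumption). Requires the Statistics and
% Machine Learning Toolbox.
rng(1);                                    % reproducible example
n = 60;
y = [zeros(n/2,1); ones(n/2,1)];           % 0 = healthy, 1 = sick
X = randn(n,6);                            % six hypothetical features
X(:,1) = X(:,1) + 1.0*y;                   % feature 1: group means differ
X(:,2) = X(:,2) .* (1 + 2*y);              % feature 2: variances differ, means do not

% Filter ranking: one two-sample t-test per column, sorted by p-value
[~,p] = ttest2(X(y==0,:), X(y==1,:));
[~,ttestOrder] = sort(p,'ascend');

% Wrapper ranking: same criterion as in the script (misclassification count)
crit = @(XT,yT,Xt,yt) sum(predict(fitcsvm(XT,yT,'KernelFunction','rbf', ...
    'KernelScale','auto'),Xt) ~= yt);
c = cvpartition(n,'LeaveOut');
[~,history] = sequentialfs(crit,X,y,'cv',c,'nfeatures',6);

% history.In(k,:) marks the features selected after step k, so differencing
% consecutive rows leaves a single 1 per row: the feature added at that step
added = diff(double([false(1,6); history.In]));
[sfsOrder,~] = find(added');               % feature index per step, in step order

disp(table((1:6)', ttestOrder(:), sfsOrder(:), history.Crit(:), ...
    'VariableNames',{'Rank','ByTTest','BySequentialFS','Criterion'}));
```

The Criterion column is the cross-validated misclassification rate after each step, which is also the per-number-of-features performance the script is after. To run this on the real data, X and y would be replaced by x_train and y_train.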

Answers (0)


Release

R2019a
