kfoldfun

Cross-validate function using cross-validated ECOC model

Description

vals = kfoldfun(CVMdl,fun) cross-validates the function fun by applying fun to the data stored in the cross-validated ECOC model CVMdl. You must pass fun as a function handle.
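
For instance, the following sketch (not part of the documented syntax) passes an anonymous function that ignores the training data and returns the number of validation observations in each fold. It assumes CVMdl is an existing cross-validated ECOC model, such as one returned by fitcecoc with 'CrossVal','on'.

nTestObs = kfoldfun(CVMdl, ...
    @(CMP,Xtrain,Ytrain,Wtrain,Xtest,Ytest,Wtest) numel(Ytest)); % One count per fold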

Examples

Train a multiclass ECOC classifier, and then cross-validate the model using a custom k-fold loss function.

Load Fisher’s iris data set. Specify the predictor data X, the response data Y, and the order of the classes in Y.

load fisheriris
X = meas;
Y = categorical(species);
classOrder = unique(Y); % Class order
rng(1); % For reproducibility

Train and cross-validate an ECOC model using support vector machine (SVM) binary classifiers. Standardize the predictors using an SVM template, and specify the class order.

t = templateSVM('Standardize',1);
CVMdl = fitcecoc(X,Y,'CrossVal','on','Learners',t,...
    'ClassNames',classOrder);

CVMdl is a ClassificationPartitionedECOC model. By default, the software implements 10-fold cross-validation.
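
If you prefer a different number of folds, one possible variation (an illustrative sketch, not part of this example) is to replace 'CrossVal','on' with the 'KFold' name-value argument.

CVMdl5 = fitcecoc(X,Y,'KFold',5,'Learners',t,...
    'ClassNames',classOrder); % 5-fold cross-validation instead of the default 10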

Compute the classification error (proportion of misclassified observations) for the validation-fold observations.

L = kfoldLoss(CVMdl)
L = 0.0400

Examine the result when the cost of misclassifying a flower as versicolor is 10 and the cost of any other error is 1. Write a function named noversicolor that assigns a cost of 1 for general misclassification and a cost of 10 for misclassifying a flower as versicolor.

If you use the live script file for this example, the noversicolor function is already included at the end of the file. Otherwise, you need to create this function at the end of your .m file or add it as a file on the MATLAB® path.

Compute the mean misclassification error with the noversicolor cost.

foldLoss = kfoldfun(CVMdl,@noversicolor);
mean(foldLoss)
ans = single
    0.0667

This code creates the function noversicolor.

function averageCost = noversicolor(CMP,Xtrain,Ytrain,Wtrain,Xtest,Ytest,Wtest)
% noversicolor: Example custom cross-validation function that assigns a cost of
%   10 for misclassifying versicolor irises and a cost of 1 for misclassifying
%   the other irises. This example function requires the fisheriris data
%   set.
Ypredict = predict(CMP,Xtest);
misclassified = not(strcmp(Ypredict,Ytest)); % Different result
classifiedAsVersicolor = strcmp(Ypredict,'versicolor'); % Index of bad decisions
cost = sum(misclassified) + ...
    9*sum(misclassified & classifiedAsVersicolor); % Total differences
averageCost = single(cost/numel(Ytest)); % Average error
end

Input Arguments

CVMdl — Cross-validated ECOC model, specified as a ClassificationPartitionedECOC model.

fun — Cross-validated function, specified as a function handle. fun has this syntax:

testvals = fun(CMP,Xtrain,Ytrain,Wtrain,Xtest,Ytest,Wtest)
  • CMP is a compact model stored in one element of the CVMdl.Trained property.

  • Xtrain is the training matrix of predictor values.

  • Ytrain is the training array of response values.

  • Wtrain is the set of training weights for observations.

  • Xtest and Ytest are the validation data, with associated weights Wtest.

  • The returned value testvals must have the same size across all folds.

Data Types: function_handle
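
For example, here is a minimal sketch (assuming a cross-validated ECOC model CVMdl such as the one trained in the example above): a handle that needs only the compact model and the validation data can ignore the remaining inputs. This one returns each fold's validation classification error by calling the compact model's loss method.

foldErr = kfoldfun(CVMdl, ...
    @(CMP,Xtrain,Ytrain,Wtrain,Xtest,Ytest,Wtest) loss(CMP,Xtest,Ytest));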

Output Arguments

vals — Cross-validation results, returned as a numeric matrix. vals contains the testvals outputs from fun, concatenated vertically over all the folds. For example, if testvals from every fold is a numeric vector of length n, then kfoldfun returns a KFold-by-n numeric matrix with one row per fold.
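
For instance, in a hedged sketch using the 10-fold CVMdl from the example above, a function that returns a two-element row vector per fold produces a 10-by-2 matrix.

foldSizes = kfoldfun(CVMdl, ...
    @(CMP,Xtrain,Ytrain,Wtrain,Xtest,Ytest,Wtest) ...
    [size(Xtrain,1) size(Xtest,1)]); % Training and validation fold sizes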

Version History

Introduced in R2014b