Data Partition using cvpartition: Warning

Question

0 votes

Dear All;

I am trying to divide my data using cvpartition with the 'KFold' option, for cross-validation of a neural network. I have a function to do that, shown below. It works, but it gives a warning message and I do not know why:

Warning: One or more folds do not contain points from all the groups.
> In internal.stats.cvpartitionImpl>stra_kfoldcv (line 364)
  In internal.stats.cvpartitionImpl/rerandom (line 315)
  In internal.stats.cvpartitionInMemoryImpl (line 166)
  In cvpartition (line 175)
  In jFFNN_REG (line 14)
  In NN_Kfold_Regression (line 8)

Function:

function [FFNN,Pred,Actual]=jFFNN_REG(input,output,kfold,Hiddens,Maxepochs)
% Layer
if length(Hiddens)==1
    h1=Hiddens(1); net=fitnet(h1);
elseif length(Hiddens)==2
    h1=Hiddens(1); h2=Hiddens(2); net=fitnet([h1 h2]);
elseif length(Hiddens)==3
    h1=Hiddens(1); h2=Hiddens(2);
    h3=Hiddens(3); net=fitnet([h1 h2 h3]);
end
%rng('default');
% Divide data into k-folds
fold=cvpartition(output,'kfold',kfold);
% Preallocate
pred2=[]; ytest2=[]; Afold=zeros(kfold,1);
% Neural network start
for i=1:kfold
    % Get indices of training & testing sets
    trainIdx=fold.training(i); testIdx=fold.test(i);
    % Get training & testing inputs and targets
    xtrain=input(trainIdx,:); ytrain=output(trainIdx);
    xtest=input(testIdx,:); ytest=output(testIdx);
    % Set maximum epochs
    net.trainParam.epochs= Maxepochs;
    % Train model
    net=train(net,xtrain',ytrain');
    % Perform testing
    pred=net(xtest');
    % Performance
    tstPerform = perform(net, ytest', pred);
    % Store performance for each fold
    Afold(i)=tstPerform;
    % Store temporary result for each fold
    pred2=[pred2(1:end,:),pred]; ytest2=[ytest2(1:end);ytest];
end


Answer 1

Divya Gaddipati, 5 Aug 2019

1 vote

c = cvpartition(n,'KFold',k)

This syntax randomly splits the n observations into k disjoint sets of roughly equal size. It therefore does not guarantee that every one of the k sets includes samples from all the classes. If your dataset is highly imbalanced, some of the sets might contain no samples from the minority class.
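For a regression target like the one in the question, a minimal sketch of this syntax might look like the following. It assumes the warning arises because cvpartition treats a vector passed as its first argument as a grouping variable, so every distinct value of a continuous target becomes its own group (the `stra_kfoldcv` entry in the stack trace suggests a stratified split was attempted). The data and variable names here are synthetic and illustrative only.

```matlab
% Synthetic regression data (illustrative assumption, not the asker's data)
rng(0);                                    % fix the seed so the split is repeatable
X = rand(100, 3);                          % 100 observations, 3 features
y = X * [1; 2; 3] + 0.1 * randn(100, 1);   % continuous target

k = 5;
n = size(X, 1);                            % number of observations, not the target itself

% Passing a scalar n gives a plain random k-fold split with no grouping
c = cvpartition(n, 'KFold', k);

for i = 1:c.NumTestSets
    trainIdx = training(c, i);             % logical index of training rows
    testIdx  = test(c, i);                 % logical index of test rows
    fprintf('Fold %d: %d train, %d test\n', i, nnz(trainIdx), nnz(testIdx));
end
```

In the function from the question, this would correspond to passing something like `numel(output)` instead of `output` as the first argument.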

c = cvpartition(group,'KFold',k,'Stratify',true)

This syntax, by contrast, ensures that each of the k sets contains approximately the same percentage of samples from each class as the complete set.

When the target classes are heavily imbalanced, stratified sampling is recommended so that relative class frequencies are approximately preserved in each training and validation fold.
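As a rough illustration of stratification on a classification target, the sketch below builds a made-up 90/10 imbalanced label vector and counts the minority samples in each test fold. The labels and variable names are assumptions for the example, and it presumes a release in which cvpartition accepts the 'Stratify' option.

```matlab
% Made-up imbalanced labels: 90 majority, 10 minority (illustrative only)
rng(0);
group = [repmat({'major'}, 90, 1); repmat({'minor'}, 10, 1)];

k = 5;
c = cvpartition(group, 'KFold', k, 'Stratify', true);

for i = 1:c.NumTestSets
    testIdx = test(c, i);                              % logical index of test rows
    nMinor  = nnz(strcmp(group(testIdx), 'minor'));    % minority count in this fold
    fprintf('Fold %d: %d of %d test samples are minority\n', ...
            i, nMinor, nnz(testIdx));
end
```

Stratifying only makes sense when the first argument is a genuine class label; for a continuous target there are no classes to balance.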

For more syntaxes of this function, refer to the cvpartition documentation.
