How to divide a data set randomly into training and testing data sets?

Hello guys, I have a dataset stored as a 399*6 matrix of type double, and I want to divide it randomly into two subsets, a training set and a testing set, using cross-validation.
I have tried this code but didn't get what I want: https://www.mathworks.com/help/stats/cvpartition-class.html
Could anyone help me do that?
Expected outputs:
training_data: k*6 double
testing_data: l*6 double

Accepted Answer

KSSV on 16 Apr 2018
Edited: KSSV on 16 Apr 2018

18 votes

Let A be your data of size 399*6. To divide data into training and testing with given percentage:
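[m,n] = size(A) ;   % m rows (samples), n columns (features)
P = 0.70 ;          % fraction of rows used for training
idx = randperm(m) ; % random permutation of the row indices
Training = A(idx(1:round(P*m)),:) ;
Testing = A(idx(round(P*m)+1:end),:) ;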

20 Comments

chocho on 16 Apr 2018
@KSSV I have tried your code, but I want to divide it randomly by cross-validation.
KSSV on 16 Apr 2018
Edited the answer.
chocho on 16 Apr 2018
@KSSV Now it works, but there is a problem: when I sum up the training and testing sets there is one extra row. My current data is 359*7, and I got a training set of 251*7 and a testing set of 109*7; 251+109=360?
KSSV on 16 Apr 2018
Edited the answer. You need to add one.
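The extra row reported above is what you would see if the testing slice started at round(P*m) instead of round(P*m)+1, so that one row lands in both sets. A minimal sketch of that off-by-one, assuming a random 359-by-7 stand-in matrix for the data described in the comment (the names nTrain and TestingWrong are only for illustration):

```matlab
A = rand(359, 7);            % stand-in for the 359*7 data from the comment
[m, ~] = size(A);
P = 0.70;
idx = randperm(m);           % random ordering of the row indices
nTrain = round(P*m);         % number of training rows

Training = A(idx(1:nTrain), :);
TestingWrong = A(idx(nTrain:end), :);     % starts at nTrain: shares one row with Training
Testing = A(idx(nTrain+1:end), :);        % starts one past the last training row

fprintf('overlapping split: %d + %d = %d rows\n', ...
    size(Training,1), size(TestingWrong,1), size(Training,1) + size(TestingWrong,1));
fprintf('correct split:     %d + %d = %d rows\n', ...
    size(Training,1), size(Testing,1), size(Training,1) + size(Testing,1));
```

With the +1 in place, the two sets are disjoint and their row counts add up to m.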
chocho on 16 Apr 2018
Thanks
Jaspalsingh Virdi on 16 Aug 2018
Thanks it worked
Timothy Tizhe Fidelis on 9 May 2019
Thanks also. It worked for me as well.
Faiq Ahmad khan on 27 Jun 2019
Thanks, it worked for me too.
doaa khalil on 4 Jul 2019
I get this error with this code; can you help me, please?
Array formation and parentheses-style indexing with objects of class 'matlab.io.datastore.ImageDatastore' is not allowed. Use objects of class 'matlab.io.datastore.ImageDatastore' only as scalars or use a cell array.
Here is my code:
%Load Data
unzip('CropSet.zip');
imds = imageDatastore('CropSet', ...
'IncludeSubfolders',true, ...
'LabelSource','foldernames');
%Use countEachLabel to summarize the number of images per category.
tbl = countEachLabel(imds)
%Divide the data into training and validation data sets
rng('default') % For reproducibility
[m,n] = size(imds) ;
P = 0.70 ;
idx = randperm(m) ;
imdsTrain= imds(idx(1:round(P*m)), :);
imdsValidation = imds(idx(round(P*m)+1:end),:);
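The error above arises because an ImageDatastore is a single (scalar) object: size(imds) is 1-by-1 and the datastore cannot be indexed with parentheses like a matrix. One option is splitEachLabel, which splits the files of each label by a given proportion. A minimal sketch, assuming a throwaway folder of tiny random images as a hypothetical stand-in for 'CropSet' (the folder names classA and classB are made up):

```matlab
% Build a tiny throwaway image set (hypothetical stand-in for 'CropSet')
root = tempname;
classes = {'classA', 'classB'};
for c = 1:numel(classes)
    mkdir(fullfile(root, classes{c}));
    for k = 1:10
        img = uint8(255 * rand(8, 8));
        imwrite(img, fullfile(root, classes{c}, sprintf('img%02d.png', k)));
    end
end

imds = imageDatastore(root, ...
    'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');

rng('default')   % for reproducibility
% Randomly assign 70% of the files of each label to training, the rest to validation
[imdsTrain, imdsValidation] = splitEachLabel(imds, 0.7, 'randomized');

countEachLabel(imdsTrain)
countEachLabel(imdsValidation)
```

Because the split is done per label, both subsets keep roughly the class proportions of the original datastore.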
k priya on 28 Dec 2019
Thanks a lot for your work.
Aliyuda Ali on 23 Jan 2020
Many thanks @KSSV. The code worked for me.
MAT-Magic on 4 Feb 2020
Dear KSSV,
For training, we take the first 70% of the data. After applying your code the split is done, but why are the values in, say, the training set shuffled? Why are the rows of the training set not in the same order as in the given data? Thanks.
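The rows come out shuffled because idx = randperm(m) is a random ordering of the row indices, and indexing A with it reorders the rows. If the random selection should stay but the original row order should be preserved within each subset, sorting the selected indices is one option. A minimal sketch, assuming a small stand-in matrix (trainIdx and testIdx are illustrative names):

```matlab
A = (1:10)' * [1 10];        % stand-in data: row k is [k, 10*k]
[m, ~] = size(A);
P = 0.70;
idx = randperm(m);           % random ordering of the row indices
nTrain = round(P*m);

trainIdx = sort(idx(1:nTrain));       % same random selection, ascending row order
testIdx  = sort(idx(nTrain+1:end));

Training = A(trainIdx, :);
Testing  = A(testIdx, :);
```

If no randomness is wanted at all, A(1:nTrain,:) and A(nTrain+1:end,:) take the first 70% and the remainder directly.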
BANDARU UMAMADHURI on 14 Oct 2020
What if we need 70% of the data for training and 15% each for testing and validation? Can we use the same command used for testing for validation as well?
KSSV on 15 Oct 2020
data = rand(100,3) ; % your data
%
[m,n] = size(data) ;
idx = randperm(m) ; % shuffle the row indices
Training = data(idx(1:round(m*0.70)),:) ;
Testing = data(idx(round(m*0.70)+1:round(m*0.85)),:) ;
Validation = data(idx(round(m*0.85)+1:end),:) ;
Susan on 8 Jun 2021
@KSSV How is the cross-validation implemented here? Thanks.
KSSV on 9 Jun 2021
[r,c] = size(A) ;
P1 = 0.70 ; P2 = 0.85 ;
idx = randperm(r) ;
m = round(P1*r) ; n = round(P2*r) ;
Training = A(idx(1:m),:) ;
Validation = A(idx(m+1:n),:) ;
Testing = A(idx(n+1:end),:) ;
Susan on 9 Jun 2021
Edited: Susan on 9 Jun 2021
@KSSV Thanks for your response. I think I need your help to understand the definition of cross-validation here. Is this cross-validation unrelated to something like k-fold cross-validation?
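For comparison, the randperm-based code in this thread produces a single random holdout split, whereas k-fold cross-validation rotates the held-out part over k disjoint folds. A minimal sketch of the k-fold case with cvpartition, assuming the Statistics and Machine Learning Toolbox is installed, a random stand-in for the 399*6 data, and an arbitrary choice of k = 5:

```matlab
A = rand(399, 6);                 % stand-in for the 399*6 data
k = 5;                            % number of folds (arbitrary choice)
c = cvpartition(size(A,1), 'KFold', k);

testCounts = zeros(k, 1);
for i = 1:k
    trainData = A(training(c, i), :);   % rows used for fitting in fold i
    testData  = A(test(c, i), :);       % held-out rows for fold i
    testCounts(i) = size(testData, 1);
    fprintf('fold %d: %d training rows, %d testing rows\n', ...
        i, size(trainData,1), size(testData,1));
end
```

Each row is held out exactly once across the k folds, so the test counts add up to the number of rows in A.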
Saraswathi S on 30 Jun 2021
Thank you so much
Gentil Andres Collazos Escobar on 10 Sep 2021
Thank you!!
Abhijit Bhattacharjee on 4 Mar 2023
If it hasn't been covered already, you can also use cvpartition to split the dataset. See THIS answer for more details.


More Answers (8)

Jeremy Breytenbach on 24 May 2019
Edited: Jeremy Breytenbach on 24 May 2019

3 votes

Hi there.
If you have the Deep Learning Toolbox, you can use the function dividerand: https://www.mathworks.com/help/deeplearning/ref/dividerand.html
[trainInd,valInd,testInd] = dividerand(Q,trainRatio,valRatio,testRatio) takes the number of samples Q and returns random indices that divide them into three sets: training, validation, and testing.
ALDO on 2 Feb 2020

2 votes

You can use the helper function helperRandomSplit, which performs the random split. It accepts the desired split percentage for the training data and the data itself (Data), and outputs two data sets along with a set of labels for each. Each row of trainData and testData is a signal. Each element of trainLabels and testLabels contains the class label for the corresponding row of the data matrices.
percent_train = 70;
[trainData,testData,trainLabels,testLabels] = ...
helperRandomSplit(percent_train,Data);
Make sure you have the proper toolbox to use it.
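helperRandomSplit does not appear to be a built-in function, so it may not be on your path. The same idea can be sketched in base MATLAB by applying one random permutation to both the data and the labels. This is a stand-in under stated assumptions, not the helper's actual implementation: Data holds one signal per row, Labels holds one numeric class label per row, and both are random placeholders here.

```matlab
Data = rand(20, 5);                        % stand-in: 20 signals, one per row
Labels = [ones(10, 1); 2 * ones(10, 1)];   % stand-in class labels, one per row
percent_train = 70;

n = size(Data, 1);
idx = randperm(n);                         % one permutation shared by data and labels
nTrain = round(percent_train / 100 * n);

trainData   = Data(idx(1:nTrain), :);
trainLabels = Labels(idx(1:nTrain));
testData    = Data(idx(nTrain+1:end), :);
testLabels  = Labels(idx(nTrain+1:end));
```

Note that this plain random split does not balance the classes between the two sets; a per-class split would need the permutation applied within each label.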

1 Comment

Lucrezia Cester on 7 Feb 2021
Could you please send a link to this function?


sidra ashiq on 23 Nov 2018

1 vote

Training = A(idx(1:round(P*m)),:) ;
What is the A function?

2 Comments

Mohamed Marei on 17 Dec 2018
A is the vector or array indexed by the elements inside the parentheses. It is not a function.
madhan ravi on 17 Dec 2018
A is a matrix.


Mehernaz Savai on 26 May 2022
Edited: Mehernaz Savai on 26 May 2022

1 vote

You can partition data in a number of ways:
Let X be your input matrix. You can also use a similar workflow for tables.
If you have the Statistics and Machine Learning Toolbox, you can use cvpartition as follows:
% Partition with 40% of the data held out for testing
hpartition = cvpartition(size(X,1),'Holdout',0.4);
% Extract indices for training and test
trainId = training(hpartition);
testId = test(hpartition);
% Use the indices to partition the matrix
trainData = X(trainId,:);
testData = X(testId,:);
If you have the Deep Learning Toolbox, you can use dividerand as follows:
% Partition with a 60:20:20 ratio for training, validation and testing
% respectively
[trainId,valId,testId] = dividerand(size(X,1),0.6,0.2,0.2);
% Use the indices to partition the matrix
trainData = X(trainId,:);
valData = X(valId,:);
testData = X(testId,:);
Pramod Hullole on 5 Mar 2019

0 votes

Hello sir,
I'm new to neural networks. I am working on my project, which is leaf disease detection using image processing. I am done with feature extraction and am not sure what the next step is. I know that I should apply a neural network and divide the data into training and testing sets, but I don't understand how to proceed in practice. Please help me through this; please send the steps, each in detail.

1 Comment

Savas Yaguzluk on 8 Mar 2019
Dear Pramod,
Open a new topic and ask your question there, so people can see your topic title and help you.


Hossein Amini on 15 Jul 2019

0 votes

Hi there, it worked for me, but I have a problem in the rest of the code. The newrb documentation explains how to write the code, but no matter how many times I tried, I got an error like the one below.
error.JPG
Hossein Amini on 15 Jul 2019

0 votes

[z,r] = size(X);
idx = randperm(z);
TrainX = (X(idx(1:round(Ptrain.*z)),:))';
TrainY = (Y(idx(1:round(Ptrain.*z)),:))';
TestX = (X(idx(round(Ptrain.*z)+1:end),:))';
TestY = (Y(idx(round(Ptrain.*z)+1:end),:))';
If I'm not mistaken, according to the newrb documentation the input and output data should have the same number of columns (for example 4x266 and 1x266), which is why I transposed the matrices. But the error I got refers to a zeros matrix. I don't know how to handle that.
ranjana roy chowdhury on 15 Jul 2019

0 votes

The dataset is the WS Dream dataset, of size 339*5825. The entries have values between 0 and 0.1; a few entries are -1. I want to set 96% of this dataset to 0, excluding the entries that are -1.
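One reading of this request is to pick 96% of the entries that are not -1 at random and set them to 0, leaving the -1 entries untouched. A minimal sketch under that assumption, with a random stand-in matrix in place of the actual dataset (D, valid, nZero and pick are illustrative names, and the 5% share of -1 entries is made up):

```matlab
D = 0.1 * rand(339, 5825);               % stand-in values between 0 and 0.1
D(rand(size(D)) < 0.05) = -1;            % mark some entries as -1
nMissing = nnz(D == -1);                 % remember how many -1 entries there are

valid = find(D ~= -1);                   % linear indices of entries that may be zeroed
nZero = round(0.96 * numel(valid));      % how many of them to set to 0
pick = valid(randperm(numel(valid), nZero));   % random subset without repetition

D(pick) = 0;
```

The remaining 4% of the non -1 entries keep their original values, and the -1 entries are never selected because they are excluded from valid.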
