Split array into training and testing

Ihsan Yassin on 21 Dec 2016
Answered: indhumathi karuppaiya on 2 Jun 2020
Hi,
I have a data set, DataA (106x14). I want to split the rows into two sets, one for training and one for testing. Here is my code:
[trainA,testA] = divideblock(DataA.', .7, .3); % 70% for training 30% for testing.
trainData = trainA.';
testData = testA.';
Result: after running the code I have only 93 rows in total (66x14 in trainData, 27x14 in testData). I don't want to use valInd since I don't need it.
Please correct me.
1 Comment
Cyrus on 21 Dec 2016
I would work out the 70% and 30% row counts manually and use a for loop to split the data.


Accepted Answer

Jos (10584) on 21 Dec 2016
dataA = cumsum(ones(20,3)) % some test data
p = .7 % proportion of rows to select for training
N = size(dataA,1) % total number of rows
tf = false(N,1) % create logical index vector
tf(1:round(p*N)) = true
tf = tf(randperm(N)) % randomise order
dataTraining = dataA(tf,:)
dataTesting = dataA(~tf,:)
1 Comment
Ihsan Yassin on 22 Dec 2016
Edited: Stephen23 on 22 Dec 2016
Thanks, it works exactly as I wanted. May I ask another question?
tf = false(N,1) % create logical index vector
tf(1:round(p*N)) = true
tf = tf(randperm(N)) % randomise order
What are these lines for? I don't quite understand them. Thanks again for your help.
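The three lines build a random row mask. A small worked sketch, assuming N = 10 rows and p = 0.7 (values picked for illustration and not taken from this thread), shows what each step does:

```matlab
% Illustrative values only: 10 rows, 70% for training.
N = 10;
p = 0.7;

% Step 1: a column of N logical zeros, one flag per row of the data.
tf = false(N,1);

% Step 2: set the first round(p*N) flags to true (here the first 7).
tf(1:round(p*N)) = true;

% Step 3: shuffle the flags so the 7 true values land on random rows.
tf = tf(randperm(N));

% The shuffle moves the flags around but does not change how many are true.
nTrain = sum(tf);    % number of rows that dataA(tf,:) would select
nTest  = sum(~tf);   % number of rows that dataA(~tf,:) would select
fprintf('%d training rows, %d testing rows\n', nTrain, nTest);
```

Indexing with tf picks the flagged rows and indexing with ~tf picks all the others, so every row lands in exactly one of the two sets.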


More Answers (5)

Greg Heath on 22 Dec 2016
Edited: Greg Heath on 22 Dec 2016
The answer can be found in the divideblock documentation (help and/or doc). From the example in the help text:
>> clear all, help divideblock
[trainInd,valInd,testInd] = divideblock(250,0.7,0.15,0.15);
whos
  Name          Size       Bytes  Class
  testInd       1x37         296  double
  trainInd      1x176       1408  double
  valInd        1x37         296  double
>> [ 176 37 37 ]/250
ans =
0.7040 0.1480 0.1480
However, DIVIDEBLOCK (MATLAB R2016a) HAS A BUG when the validation ratio is 0:
>> clear all, clc
[trainInd,valInd,testInd] = divideblock(250,0.7,0.0,0.3);
whos
Subscript indices must either be real positive integers or logicals
Error in divideblock>divide_indices (line 108)
testInd = (1:numTest)+valInd(end);
Error in divideblock (line 65)
[out1,out2,out3] = divide_indices(in1,params);
Hope this helps.
Thank you for formally accepting my answer
Greg
P.S. What version of MATLAB do you have?
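When no validation set is needed, a contiguous block split can also be written with plain indexing, which avoids calling divideblock with a zero validation ratio. The following is a minimal sketch under stated assumptions, not a fix proposed in this thread: the random 106x14 matrix is placeholder data, and rounding the training count with round is one choice among several.

```matlab
% Placeholder data standing in for the 106x14 matrix from the question.
DataA = rand(106,14);

trainRatio = 0.7;                    % the remaining rows go to testing
N = size(DataA,1);                   % total number of rows
numTrain = round(trainRatio*N);      % assumption: round to the nearest row

trainInd = 1:numTrain;               % first block of rows
testInd  = (numTrain+1):N;           % everything after the first block

trainData = DataA(trainInd,:);
testData  = DataA(testInd,:);

% Every row ends up in exactly one of the two sets.
assert(size(trainData,1) + size(testData,1) == N);
```

With 106 rows this gives 74 training rows and 32 testing rows. Because the rows are taken in order, shuffle them first if the data are sorted in any meaningful way.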
1 Comment
Ihsan Yassin on 23 Dec 2016
I have the 2015 version. Yes, I get that error when I write that, since I don't want to use valInd.



Jaeseok Kim on 13 Nov 2017
dividerand is what you want...
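A minimal sketch of how dividerand could be used here, assuming the toolbox that ships it is installed and that it accepts a validation ratio of 0 (neither is confirmed in this thread). The exist guard and the randperm fallback are additions for illustration so that the snippet also runs without the toolbox; the fallback is a plain random split, not a copy of dividerand's internals.

```matlab
DataA = rand(106,14);                % placeholder for the 106x14 data
N = size(DataA,1);

if exist('dividerand','file')
    % Random index split: 70% training, 0% validation, 30% testing.
    [trainInd,valInd,testInd] = dividerand(N,0.7,0,0.3);
else
    % Fallback (illustrative): shuffle the row indices and cut at 70%.
    idx = randperm(N);
    numTrain = round(0.7*N);
    trainInd = idx(1:numTrain);
    valInd   = [];
    testInd  = idx(numTrain+1:end);
end

% Use the index vectors to pull out the rows of each set.
trainData = DataA(trainInd,:);
testData  = DataA(testInd,:);
```

Unlike divideblock, this assigns rows at random rather than in contiguous blocks, so the split differs from run to run unless the random number generator is seeded.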

Satyam Agarwal on 5 Aug 2018
Edited: Satyam Agarwal on 5 Aug 2018
[Trainset,Testset] = splitEachLabel(datastore,p)
where p is the proportion of files from each label assigned to Trainset, 0 < p < 1.

MUHAMMAD SAJAD on 3 Sep 2018
% Split 60% of the files from each label into ds60 and the rest into dsRest
[ds60,dsRest] = splitEachLabel(imds,0.6)
Here ds60 is the training set and dsRest is the test set. You can also split off a validation set, like this:
[TrainSet,ValidSet,TestSet] = splitEachLabel(DataStore,0.7,0.2)
In this case 70% of the files from each label go to TrainSet, 20% to ValidSet, and the remainder to TestSet.

indhumathi karuppaiya on 2 Jun 2020
Hi, my name is Indhu. I am doing a project for my studies on Parkinson's disease speech recognition in MATLAB. How do I split the data into training data and test data? I want to use only the data from patients 1 to 60. Thank you.
