Splitting a dataset into a training set and a testing set

Munshida P, 14 January 2020
Commented: Akira Agata, 7 June 2020
I have 400 images in my dataset (images). I want to split the dataset into 80% for training and 20% for testing. The code below runs, but test_idx is empty. Why?
train_idx contains 320 indices, but test_idx is empty.
% Load Image dataset
faceDatabase = imageSet('facedatabaseatt','recursive');
%splitting into training and testing sets
N = 400; % number of images
idx = 1:N ;
PD = 0.80 ;
train_idx = idx(1:round(PD*N)); % training indices
test_idx = idx(round(PD*N)+1:end,:) ; % test indices


Akira Agata, 15 January 2020
You can split your dataset using the partition function, like:
[setTrain, setTest] = partition(faceDatabase, [0.8, 0.2], 'randomized');
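As for why test_idx comes out empty in the question's code: idx is a 1-by-400 row vector, and in idx(round(PD*N)+1:end,:) the end sits in the row position, so it evaluates to 1 (the number of rows). The range 321:1 is empty, so the result is a 0-by-400 array. Below is a minimal sketch of the index-based split with that fixed, assuming N = 400 as in the question. The names nTrain and shuffled, and the randperm shuffle, are my additions for illustration and are not part of the original code.

```matlab
N   = 400;                 % number of images (taken from the question)
idx = 1:N;                 % 1-by-N row vector of image indices
PD  = 0.80;                % fraction used for training

nTrain = round(PD*N);      % 320 training indices (hypothetical helper name)

% Use a single (linear) subscript so that 'end' refers to the last element.
train_idx = idx(1:nTrain);        % indices 1..320
test_idx  = idx(nTrain+1:end);    % indices 321..400

% Optional: shuffle first so the split is not just "first 320 / last 80".
shuffled  = idx(randperm(N));     % random permutation of 1..N
train_idx_rand = shuffled(1:nTrain);
test_idx_rand  = shuffled(nTrain+1:end);

fprintf('train: %d, test: %d\n', numel(train_idx), numel(test_idx));
```

Note that this splits the 400 indices as one pool, so it does not guarantee the same number of images per person.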
Akira Agata, 7 June 2020
> The original dataset, ORL facedatabaseatt, contains 40 folders s1, s2, ..., s40 (10 images for each of 40 persons),
> so 10 x 40 = 400 images. I have to choose 8 images from each person for training and 2 images for testing.
OK. In that case, I would recommend using the imageDatastore function, like:
dataFolder = pwd; % If your 40 folders are stored in a different folder, change this path.
imgSet = imageDatastore(dataFolder,...
'IncludeSubfolders', true,...
'LabelSource', 'foldernames');
% Take the first 80% of the images in each folder (8 of 10) for training and the remaining 2 for testing
[imgSetTrain, imgSetTest] = splitEachLabel(imgSet,0.8);
% To pick the 8 training and 2 test images from each folder at random instead, add the 'randomized' option
[imgSetTrain, imgSetTest] = splitEachLabel(imgSet,0.8,'randomized');
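To confirm that each person really ends up with 8 training and 2 test images, one option is to count the labels in each datastore with countEachLabel. This is only a sketch: it assumes every folder holds exactly 10 images, as described above, and that the current folder contains only the 40 person folders. The names trainCounts and testCounts are made up for illustration.

```matlab
% Rebuild the datastore and the per-folder split (same calls as above).
dataFolder = pwd;  % assumption: the 40 person folders live here
imgSet = imageDatastore(dataFolder, ...
    'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');
[imgSetTrain, imgSetTest] = splitEachLabel(imgSet, 0.8, 'randomized');

% countEachLabel returns a table with the variables Label and Count.
trainCounts = countEachLabel(imgSetTrain);
testCounts  = countEachLabel(imgSetTest);

% These checks only hold under the 10-images-per-folder assumption.
assert(all(trainCounts.Count == 8), 'Expected 8 training images per person');
assert(all(testCounts.Count  == 2), 'Expected 2 test images per person');

fprintf('%d training and %d test images in total\n', ...
    numel(imgSetTrain.Files), numel(imgSetTest.Files));
```

If either assertion fails, displaying trainCounts and testCounts should show which folder has a different number of images.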


