How to partition data in cells for validation in machine learning model?
Hello there, I have training data for 4 trials stored in 4x1 cell arrays named "trainingdataX" and "trainingdataY", as shown here, and I am trying to pull out 15 percent of all this data for validation purposes and store it in the variables "Xval" and "Yval". How can I do this when the data is stored in cells corresponding to the trials, and how do I ensure that the corresponding Y values are partitioned out for validation too? Any help is greatly appreciated!
%Exclude Data for Val
rng('default')
n = % I'm not sure what to put here to have it pull data from each of the 4 trials
partition = cvpartition(n,'Holdout',0.15);
idxTrain = training(partition);
FinalTrainX = trainingdataX(idxTrain,:)
FinalTrainY = trainingdataY(idxTrain,:)
idxNew = test(partition);
Xval = trainingdataX(idxNew,:)
Yval = trainingdataY(idxNew,:)
Answers (2)
YERRAMADAS
1 Aug 2024
Use cross-validation to make the most of the data available for each of these sets.
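As a rough sketch of what that could look like, cvpartition also supports k-fold partitions, where every row is used for validation exactly once across the folds. The data below is random placeholder data, and the choice of 5 folds and the variable names (X, y, cvp) are assumptions for illustration only, not something from the question.

```matlab
% Minimal k-fold sketch (requires Statistics and Machine Learning Toolbox).
% X and y are placeholder data; k = 5 is an arbitrary choice.
rng('default');                          % for reproducibility
X = rand(100, 5);                        % 100 observations, 5 features
y = rand(100, 1);                        % one response per observation

k = 5;
cvp = cvpartition(size(X, 1), 'KFold', k);

for fold = 1:k
    idxTrain = training(cvp, fold);      % logical index of training rows
    idxVal   = test(cvp, fold);          % logical index of validation rows

    XtrainFold = X(idxTrain, :);
    ytrainFold = y(idxTrain, :);
    XvalFold   = X(idxVal, :);
    yvalFold   = y(idxVal, :);

    % Fit and evaluate a model on this fold here.
    fprintf('Fold %d: %d training rows, %d validation rows\n', ...
        fold, nnz(idxTrain), nnz(idxVal));
end
```

Averaging the validation metric over the folds gives an estimate that uses all of the data, at the cost of training the model k times.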
Aditya
1 Aug 2024
To partition data stored in cells for validation, first concatenate the data from all trials into single matrices. You can then partition the combined rows into training and validation sets.
Before moving forward, you need to transpose your X and Y data so that each row of X corresponds to a row of Y.
Here's some sample code for this:
% sample data
trainingdataX = cell(4, 1);
trainingdataY = cell(4, 1);
for i = 1:4
trainingdataX{i} = rand(541, 63);
trainingdataY{i} = rand(541, 1);
end
% Concatenate data
allX = vertcat(trainingdataX{:});
allY = vertcat(trainingdataY{:});
% Partition data (15% holdout for validation)
rng('default'); % For reproducibility
partition = cvpartition(size(allX, 1), 'Holdout', 0.15);
idxTrain = training(partition);
idxVal = test(partition);
% Split into training and validation sets
FinalTrainX = allX(idxTrain, :);
FinalTrainY = allY(idxTrain, :);
Xval = allX(idxVal, :);
Yval = allY(idxVal, :);
% Display results
fprintf('Training data X size: %dx%d\n', size(FinalTrainX, 1), size(FinalTrainX, 2));
fprintf('Training data Y size: %dx%d\n', size(FinalTrainY, 1), size(FinalTrainY, 2));
fprintf('Validation data X size: %dx%d\n', size(Xval, 1), size(Xval, 2));
fprintf('Validation data Y size: %dx%d\n', size(Yval, 1), size(Yval, 2));
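If you would rather hold out 15% from each trial separately, so that every trial contributes to the validation set in the same proportion, one possible variation is to partition inside a loop and concatenate afterwards. This is only a sketch: it assumes observations are already in rows (541x63 and 541x1 per cell, as in the sample data above), and the helper cell arrays trainXc, trainYc, valXc and valYc are names introduced here for illustration.

```matlab
% Sketch: hold out 15% of the rows within each trial, then concatenate.
% Requires Statistics and Machine Learning Toolbox for cvpartition.
rng('default');                               % for reproducibility

% Sample data in the assumed orientation (observations in rows)
trainingdataX = cell(4, 1);
trainingdataY = cell(4, 1);
for i = 1:4
    trainingdataX{i} = rand(541, 63);
    trainingdataY{i} = rand(541, 1);
end

numTrials = numel(trainingdataX);
trainXc = cell(numTrials, 1);  trainYc = cell(numTrials, 1);
valXc   = cell(numTrials, 1);  valYc   = cell(numTrials, 1);

for i = 1:numTrials
    n = size(trainingdataX{i}, 1);            % rows in this trial
    part = cvpartition(n, 'Holdout', 0.15);
    idxTrain = training(part);
    idxVal   = test(part);

    % The same index is applied to X and Y so the pairs stay aligned
    trainXc{i} = trainingdataX{i}(idxTrain, :);
    trainYc{i} = trainingdataY{i}(idxTrain, :);
    valXc{i}   = trainingdataX{i}(idxVal, :);
    valYc{i}   = trainingdataY{i}(idxVal, :);
end

FinalTrainX = vertcat(trainXc{:});
FinalTrainY = vertcat(trainYc{:});
Xval = vertcat(valXc{:});
Yval = vertcat(valYc{:});
```

Keeping the per-trial cells around (trainXc, valXc and so on) also lets you check later which trial each validation row came from.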
I hope this helps!
2 Comments
Aditya
1 Aug 2024
Edited: Aditya
1 Aug 2024
As mentioned in my answer, your data has the shapes 63x541 and 1x541. In that orientation, vertical concatenation would stack the 63 features rather than the 541 observations, so you need to transpose each cell first.
To transpose the cells, you can use the lines below:
% Transpose each cell using cellfun
trainingdataX = cellfun(@transpose, trainingdataX, 'UniformOutput', false);
trainingdataY = cellfun(@transpose, trainingdataY, 'UniformOutput', false);
Alternatively, you can do it manually with a for loop.
I hope this clears things up!
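For reference, a for-loop version might look like the sketch below. The sample data is random and only mimics the 63x541 and 1x541 shapes described above; the final concatenation is included just to confirm that the rows line up afterwards.

```matlab
% Sample data in the original orientation (features in rows)
trainingdataX = cell(4, 1);
trainingdataY = cell(4, 1);
for i = 1:4
    trainingdataX{i} = rand(63, 541);
    trainingdataY{i} = rand(1, 541);
end

% Transpose each cell so that observations are in rows
for i = 1:numel(trainingdataX)
    trainingdataX{i} = trainingdataX{i}.';    % 63x541 -> 541x63
    trainingdataY{i} = trainingdataY{i}.';    % 1x541  -> 541x1
end

% Each row of allX now corresponds to the same row of allY
allX = vertcat(trainingdataX{:});
allY = vertcat(trainingdataY{:});
```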