Image Regression using .mat Files and a datastore

I would like to train a CNN for image regression using a datastore. My images are stored in .mat files (not png or jpeg). This is not image-to-image regression, rather an image to single regression label problem. Is it possible to do this using a datastore, or at least some other out-of-memory approach?

Accepted Answer

luisa di monaco on 7 Dec 2019
Edited: luisa di monaco on 2 Jan 2020

8 votes

I have solved something similar.
I'm training a CNN for regression. My inputs are numeric arrays of size 32x32x2 (each input holds 2 grayscale images as two channels). My outputs are numeric vectors of length 6.
The dataset contains 500 000 samples in total.
I created 500 000 .mat files for the inputs in the folder 'inputData' and 500 000 .mat files for the targets in the folder 'targetData'. Each .mat file contains a single variable of type double called 'C'.
The size of C is 32x32x2 (for an input) or 1x6 (for a target).
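% Each file holds one variable 'C'; @load returns it wrapped in a struct.
inputData = fileDatastore(fullfile('inputData'),'ReadFcn',@load,'FileExtensions','.mat');
targetData = fileDatastore(fullfile('targetData'),'ReadFcn',@load,'FileExtensions','.mat');
% Unwrap the struct so that each read returns a 1x1 cell holding the array.
inputDatat = transform(inputData,@(data) rearrange_datastore(data));
targetDatat = transform(targetData,@(data) rearrange_datastore(data));
% Pair inputs with targets: each read now returns {input, target}.
trainData = combine(inputDatat,targetDatat);
% here I defined my network architecture
% here I defined my training options
net = trainNetwork(trainData, Layers, options);

function image = rearrange_datastore(data)
% Extract variable C from the loaded struct and wrap it in a cell.
image = data.C;
image = {image};
end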
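A quick way to check this pipeline before training is to read one observation from the combined datastore. The sketch below is only an illustration and not part of the original answer: it writes four small synthetic files to a temporary folder (the folder and file names are assumptions), builds the same fileDatastore / transform / combine chain, and inspects the result.

```matlab
% Sanity check on a tiny synthetic dataset (all names here are illustrative).
rootDir = tempname;                          % scratch folder, assumed writable
inDir   = fullfile(rootDir,'inputData');
outDir  = fullfile(rootDir,'targetData');
mkdir(inDir); mkdir(outDir);

numSamples = 4;
for k = 1:numSamples
    name = sprintf('sample_%06d.mat', k);    % same file name in both folders
    C = rand(32,32,2);  save(fullfile(inDir,name),  'C');
    C = rand(1,6);      save(fullfile(outDir,name), 'C');
end

% Same chain as in the answer, with the unwrapping step as an anonymous function
readC    = @(s) {s.C};
inputDs  = transform(fileDatastore(inDir, 'ReadFcn',@load,'FileExtensions','.mat'), readC);
targetDs = transform(fileDatastore(outDir,'ReadFcn',@load,'FileExtensions','.mat'), readC);
pairDs   = combine(inputDs, targetDs);

sample     = read(pairDs);                   % one observation as a 1x2 cell
inputSize  = size(sample{1});
targetSize = size(sample{2});
```

If sample is a 1x2 cell holding a 32x32x2 array and a 1x6 vector, each read has the two-column {predictor, response} layout that trainNetwork needs from a datastore. Note that the pairing relies on both folders listing their files in the same order, which is why the sketch uses identical file names on both sides.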

18 Comments

Matthew Fall on 10 Dec 2019
Seems like it should work! I've already moved away from this project, but I will give this a try if it comes up again. What version of MATLAB did you use?
luisa di monaco on 10 Dec 2019
R2019b, because I needed to define a custom regression layer. I don't know whether R2019b is required to use fileDatastore, combine and transform.
shi long liu on 14 Jul 2020
Edited: shi long liu on 14 Jul 2020
Thank you.
This is the best answer I have come across while handling this problem.
supriya Naik on 7 Oct 2020
Does "500 000 data" mean 500 000 images?
luisa di monaco on 7 Oct 2020
Yes!
Sofia Esteves on 11 Mar 2021
Hello Luisa,
Very valuable info!
How did you define the validation set using this scheme?
luisa di monaco on 11 Mar 2021
Hi Sofia,
this is my main file (I used it to run both generation of inputs and training).
I hope this code can answer your question =)
clear
close all
clc
%% IMAGE GENERATION
tot_imagepairs=500000; % tot_imagepairs is the total number of input (validation + training)
val_fraction=1/10; % fraction of tot_imagepairs for validation set
val_imagepairs=tot_imagepairs*val_fraction; % validation set
train_imagepairs=tot_imagepairs-val_imagepairs; % training set
mkdir inputData train
mkdir inputData val
mkdir targetData train
mkdir targetData val
% image_generator4 is a function that generates synthetic images (size 32x32x2).
% It uses a for loop and saves the images as .mat files in the folder
% passed as input ('val' or 'train').
tic
image_generator4('val', val_imagepairs,1)
toc
tic
image_generator4('train', tot_imagepairs,val_imagepairs+1)
toc
%% TRAINING
train_cnn_piv4_x2019b % script for datastore set up and training
Sofia Esteves on 13 Mar 2021
Yes, thank you for your reply! I assume you also use the fileDatastore, transform and combine functions for the validation set to later insert in the options field?
luisa di monaco on 13 Mar 2021
Yes!
Sofia Esteves on 14 Apr 2021
Edited: Sofia Esteves on 14 Apr 2021
Hello again, Luisa!
I have another question: did you use the 'Shuffle','every-epoch' training option and parallel or multi-GPU training?
When I do, I get this warning: Input datastore is not shuffleable but trainingOptions specified shuffling. Training will proceed without shuffling.
The isPartitionable and isShuffleable functions below return 1 if the datastore is partitionable/shuffleable and 0 if it is not.
tf = isPartitionable(inputDatat)
tf = 1
tf = isShuffleable(inputDatat)
tf = 0
tf = isPartitionable(trainData)
tf = 0
tf = isShuffleable(trainData)
tf = 0
Were you able to solve this problem? Thank you
luisa di monaco on 17 Apr 2021
Hi, Sofia! No, I faced this problem and found no solution. Fortunately, it was not a critical issue in my case, because I used randomly generated synthetic data and managed to train my net even without shuffling. If your data really must be shuffled, I think you can try to shuffle them before training, for example by randomizing the file order, and then train without the shuffling option.
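One way to "shuffle before training" is to randomize the list of file names once and build both datastores from that list. The sketch below is a hedged illustration, not something from the thread: it assumes that inputs and targets share file names, that each file stores a variable C, and that fileDatastore keeps the order of an explicit file list. The folder names and the tiny synthetic dataset are made up so that the example runs on its own.

```matlab
% Illustrative setup: 5 paired files whose contents encode the sample index
rootDir = tempname;
inDir   = fullfile(rootDir,'inputData');
outDir  = fullfile(rootDir,'targetData');
mkdir(inDir); mkdir(outDir);
for k = 1:5
    name = sprintf('sample_%03d.mat', k);
    C = k*ones(32,32,2);  save(fullfile(inDir,name),  'C');
    C = k*ones(1,6);      save(fullfile(outDir,name), 'C');
end

% Shuffle the file names once and reuse the same order for both folders
listing = dir(fullfile(inDir,'*.mat'));
names   = {listing.name};
names   = names(randperm(numel(names)));

readC     = @(s) {s.C};
inputDs   = transform(fileDatastore(fullfile(inDir, names),'ReadFcn',@load), readC);
targetDs  = transform(fileDatastore(fullfile(outDir,names),'ReadFcn',@load), readC);
trainData = combine(inputDs, targetDs);

% Verify that every input is still paired with its own target
pairs   = readall(trainData);                % N-by-2 cell array
aligned = all(cellfun(@(x,y) x(1) == y(1), pairs(:,1), pairs(:,2)));
```

This gives one fixed random order rather than a new order every epoch, so it is a workaround for the warning and not a replacement for 'Shuffle','every-epoch'.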
Sofia Esteves on 18 Apr 2021
Ok, thank you so much once again :)
tianliang wang on 28 Apr 2021
Edited: tianliang wang on 28 Apr 2021
Hi Luisa, I have two folders (input and target), each containing 100 .mat files (images). How can I define the validation and test datasets? With an imageDatastore one would use splitEachLabel. And how should I set the training options?
luisa di monaco on 28 Apr 2021
Edited: luisa di monaco on 28 Apr 2021
Hi.
I think the easiest way to set the training options is to separate training and validation data before creating the datastores.
I put training data and validation data into different folders (it was easy in my case because I generated synthetic data using MATLAB code). Then I defined one datastore for the training data and a different datastore for the validation data. Here is my code. I hope this can help!
%% IMAGE GENERATION
tot_imagepairs=500000; % image pairs for training
val_fraction=1/10; % validation data [fraction of tot_imagepairs]
val_imagepairs=tot_imagepairs*val_fraction;
train_imagepairs=tot_imagepairs-val_imagepairs;
mkdir inputData train
mkdir inputData val
mkdir targetData train
mkdir targetData val
image_generator4('val', val_imagepairs,1) % generation of validation dataset
image_generator4('train', tot_imagepairs,val_imagepairs+1) % generation of training dataset
%% LOAD AND REARRANGE DATA
% training data
inputData=fileDatastore(fullfile('inputData', 'train'),'ReadFcn',@load,'FileExtensions','.mat');
targetData=fileDatastore(fullfile('targetData','train'),'ReadFcn',@load,'FileExtensions','.mat');
inputDatat = transform(inputData,@(data) rearrange_datastore(data));
targetDatat = transform(targetData,@(data) rearrange_datastore(data));
trainData=combine(inputDatat,targetDatat);
% validation data
inputData=fileDatastore(fullfile('inputData', 'val'),'ReadFcn',@load,'FileExtensions','.mat');
targetData=fileDatastore(fullfile('targetData','val'),'ReadFcn',@load,'FileExtensions','.mat');
inputDatat = transform(inputData,@(data) rearrange_datastore(data));
targetDatat = transform(targetData,@(data) rearrange_datastore(data));
valData=combine(inputDatat,targetDatat);
%%
options = trainingOptions(...,
'ValidationData', valData,...
'ValidationFrequency',1000);
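When the files already sit in a single input folder and a single target folder, an alternative to moving them into train/val subfolders is to split the list of file names. The following is a minimal sketch under assumptions that do not come from the thread: inputs and targets share file names, each file stores a variable C, and the 80/10/10 split and all folder names are chosen only for illustration.

```matlab
% Illustrative setup: 10 paired files in one input and one target folder
rootDir = tempname;
inDir   = fullfile(rootDir,'input');
outDir  = fullfile(rootDir,'target');
mkdir(inDir); mkdir(outDir);
for k = 1:10
    name = sprintf('sample_%03d.mat', k);
    C = k*ones(32,32,2);  save(fullfile(inDir,name),  'C');
    C = k*ones(1,6);      save(fullfile(outDir,name), 'C');
end

% Randomly assign file names to validation, test and training sets
listing = dir(fullfile(inDir,'*.mat'));
names   = {listing.name};
names   = names(randperm(numel(names)));
numVal  = round(0.1*numel(names));
numTest = round(0.1*numel(names));
valNames   = names(1:numVal);
testNames  = names(numVal+1:numVal+numTest);
trainNames = names(numVal+numTest+1:end);

% Build one combined {input, target} datastore per split
readC  = @(s) {s.C};
makeDs = @(folder,n) transform(fileDatastore(fullfile(folder,n),'ReadFcn',@load), readC);
pairDs = @(n) combine(makeDs(inDir,n), makeDs(outDir,n));

trainData = pairDs(trainNames);
valData   = pairDs(valNames);
testData  = pairDs(testNames);
```

valData can then be passed as the 'ValidationData' option, and testData can be kept aside for evaluation after training.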
tianliang wang on 30 Apr 2021
OK! Thanks for your reply. I have solved my problem!
muhammed shames on 6 Oct 2021
@tianliang wang Hello, could I please get in touch with you? I need help solving the same problem.
Fadhurrahman on 6 Jan 2022
Edited: Fadhurrahman on 6 Jan 2022
Hello @luisa di monaco, how did you create all 50000 .mat files of size 32x32? Is there any reference for doing it?
luisa di monaco on 6 Jan 2022
Hi,
the creation process was part of my thesis work. Here you can download my thesis: http://webthesis.biblio.polito.it/id/eprint/14716. Dataset creation is described in chapter 4 (4.2, 4.3 and 4.5).
Here you can find some MATLAB code: https://github.com/lu-p/standard-PIV-image-generator
I hope this can help.

More Answers (2)

Johanna Pingel on 29 Apr 2019
Edited: Johanna Pingel on 29 Apr 2019

0 votes

I've used a .mat to imageDatastore conversion here:
imds = imageDatastore(ImagesDir,'FileExtensions','.mat','ReadFcn',@matRead);

function data = matRead(filename)
% Load the .mat file and return its first variable as the image data.
inp = load(filename);
f = fieldnames(inp);
data = inp.(f{1});
end

2 Comments

Matthew Fall on 29 Apr 2019
Thank you for your swift reply.
Unfortunately, the MATLAB regression example requires loading all of the training and validation data into memory, which I want to avoid by using a datastore.
I've tried using an imageDatastore with regression labels before, but then trainNetwork gives me the error:
Error using trainNetwork (line 150)
Invalid training data. The labels of the ImageDatastore must be a categorical vector.
tianliang wang on 28 Apr 2021
Is it more convenient to use .mat files as the training set for image-to-vector regression?

Lykke Kempfner on 16 Aug 2019

0 votes

I have the same problem.
I have many *.mat files with data that cannot fit in memory. You may consider the files as non-standard images. I have a read function for the files. I wish to create a datastore (?) where each sample is associated with two single values and not a class.
Is there any solution to this issue?
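For this case, one option is to keep the input and its two target values in the same .mat file, so that a single fileDatastore plus transform returns the {input, target} pair and no second datastore has to stay aligned with the first. This is a hedged sketch, not a confirmed solution from the thread: the variable names X and Y, the array sizes and the folder name are assumptions made for illustration.

```matlab
% Illustrative setup: three files, each holding an input X and a 1x2 target Y
dataDir = tempname;
mkdir(dataDir);
for k = 1:3
    X = rand(32,32,2);
    Y = [k, 2*k];
    save(fullfile(dataDir, sprintf('sample_%03d.mat',k)), 'X', 'Y');
end

% One datastore; each read returns a 1x2 cell {input, target}
fds    = fileDatastore(dataDir,'ReadFcn',@load,'FileExtensions','.mat');
pairDs = transform(fds, @(s) {s.X, s.Y});

sample = read(pairDs);
```

Only one file is loaded per read, so the full dataset never has to fit in memory, and pairDs can be passed to trainNetwork in place of a combined datastore.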

2 Comments

Tomer Nahshon on 22 Jan 2020
Same here
tanfeng on 12 Oct 2020
You could try this
tblTrain=table(X,Y)
net = trainNetwork(tblTrain,layers,options);


Release

R2018b
