How to implement a Siamese network whose two subnetworks do not share weights

Cloud Wind on 22 Aug 2022
Commented: Joss Knight on 10 Sep 2022
I am implementing a Siamese network using MATLAB Deep Learning Toolbox. It is easy to implement such a network when the two subnetworks share weights, following this official demo. Now I want to implement a Siamese network whose two subnetworks do not share weights. Is there an easy solution? I know we can create two "dlnetwork" objects, one for input image A and the other for input image B. But the problem is that both subnetworks must then be loaded into GPU memory, which is not possible when there is not enough memory.
Any good solution is welcome, thank you!

Answers (1)

Joss Knight on 1 Sep 2022
You can try gathering the weights back from each network after you've used it, as in net = dlupdate(@gather,net). This should save some memory.
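The round trip described above can be sketched as follows. This is only an illustration under assumptions: the small layer array is a hypothetical stand-in for the real subnetwork, and onGpu is a helper name introduced here, not a toolbox function. The GPU step is guarded with canUseGPU so the sketch also runs on a machine without a supported GPU.

```matlab
% Hypothetical stand-in subnetwork; any dlnetwork is handled the same way.
layers = [imageInputLayer([8 8 1],'Normalization','none')
          convolution2dLayer(3,4)
          reluLayer];
net = dlnetwork(layerGraph(layers));

% Helper (assumption, not a toolbox function): true if the first
% learnable parameter is stored on the GPU.
onGpu = @(n) isa(extractdata(n.Learnables.Value{1}),'gpuArray');

if canUseGPU
    % Move every learnable to the GPU just before the network is used.
    net = dlupdate(@gpuArray,net);
    disp(onGpu(net))

    % ... forward pass would go here ...

    % Bring the learnables back to host memory once the network is no longer needed.
    net = dlupdate(@gather,net);
end
disp(onGpu(net))
```

With two subnetworks, the idea is that only the one currently in use has its learnables on the GPU.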
4 Comments
Cloud Wind on 2 Sep 2022
Hi Joss, I am posting the example code. It follows the official Siamese demo, in which the two subnetworks share weights. What I want is a Siamese network (without weight sharing) that does not need much GPU memory.
clc; clear;
downloadFolder = 'F:\exps\siamese_scd\data';
dataFolderTrain = fullfile(downloadFolder,'train');
dataFolderTest = fullfile(downloadFolder,'test');
%***************************************************
net = alexnet;
layers=net.Layers(1:22);
layers=[layers(1:18,:);layers(20:21,:)];
lgraph = layerGraph(layers);
dlnet = dlnetwork(lgraph);
fcWeights = dlarray(0.01*randn(1,4352));
fcBias = dlarray(0.01*randn(1,1));
fcParams = struct(...
"FcWeights",fcWeights,...
"FcBias",fcBias);
clear net layers
%***************************************************
imdsTrain = imageDatastore(dataFolderTrain, ...
'IncludeSubfolders',true, ...
'LabelSource','none');
files = imdsTrain.Files;
parts = split(files,filesep);
labels = join(parts(:,(end-2):(end-1)),'_');
imdsTrain.Labels = categorical(labels);
imdsTest = imageDatastore(dataFolderTest, ...
'IncludeSubfolders',true, ...
'LabelSource','none');
files = imdsTest.Files;
parts = split(files,filesep);
labels = join(parts(:,(end-2):(end-1)),'_');
imdsTest.Labels = categorical(labels);
%***************************************************
numIterations =1000;
train_miniBatchSize =8;
test_minBatchSize = 4;
learningRate = 2e-5;
trailingAvgSubnet = [];
trailingAvgSqSubnet = [];
trailingAvgParams = [];
trailingAvgSqParams = [];
gradDecay = 0.9;
gradDecaySq = 0.99;
executionEnvironment = "gpu";
plots = "training-progress";
if plots == "training-progress"
    figure
    subplot(2,1,1)
    trainingPlotAxes = gca;
    % trainingPlotAxes.YLim = [0 1];
    lineLossTrain = animatedline(trainingPlotAxes);
    xlabel(trainingPlotAxes,"Iteration")
    ylabel(trainingPlotAxes,"Loss")
    title(trainingPlotAxes,"Loss During Training")
    subplot(2,1,2)
    testingPlotAxes = gca;
    % testingPlotAxes.YLim = [0 1];
    lineLosstest = animatedline(testingPlotAxes);
    xlabel(testingPlotAxes,"Iteration")
    ylabel(testingPlotAxes,"Loss")
    title(testingPlotAxes,"Loss During Testing")
end
%***************************************************
for iteration = 1:numIterations
    [X1,X2,pairLabels] = getAlexnetBatch(imdsTrain,train_miniBatchSize);
    [tX1,tX2,tpairLabels] = getAlexnetTest(imdsTest,test_minBatchSize);
    dlX1 = dlarray(single(X1),'SSCB');
    dlX2 = dlarray(single(X2),'SSCB');
    tdlX1 = dlarray(single(tX1),'SSCB');
    tdlX2 = dlarray(single(tX2),'SSCB');
    if executionEnvironment == "gpu"
        dlX1 = gpuArray(dlX1);
        dlX2 = gpuArray(dlX2);
        tdlX1 = gpuArray(tdlX1);
        tdlX2 = gpuArray(tdlX2);
    end
    [gradientsSubnet,gradientsParams,loss] = dlfeval(@modelGradients,dlnet,fcParams,dlX1,dlX2,pairLabels);
    lossValue = double(gather(extractdata(loss)));
    [~,~,tloss] = dlfeval(@modelGradients,dlnet,fcParams,tdlX1,tdlX2,tpairLabels);
    tlossValue = double(gather(extractdata(tloss)));
    clear dlX1 dlX2 tdlX1 tdlX2
    %***************************************************
    [dlnet,trailingAvgSubnet,trailingAvgSqSubnet] = ...
        adamupdate(dlnet,gradientsSubnet, ...
        trailingAvgSubnet,trailingAvgSqSubnet,iteration,learningRate,gradDecay,gradDecaySq);
    [fcParams,trailingAvgParams,trailingAvgSqParams] = ...
        adamupdate(fcParams,gradientsParams, ...
        trailingAvgParams,trailingAvgSqParams,iteration,learningRate,gradDecay,gradDecaySq);
    if plots == "training-progress"
        addpoints(lineLossTrain,iteration,lossValue);
        addpoints(lineLosstest,iteration,tlossValue);
    end
    drawnow;
    temp1=sprintf('iteration: %d ----- %d',[iteration,numIterations]);
    temp2=sprintf('loss: Training:%0.4f ----- Testing:%0.4f',[lossValue,tlossValue]);
    disp(temp1);
    disp(temp2);
end
%******************************************************************************************
% the called functions
%******************************************************************************************
function [gradientsSubnet,gradientsParams,loss] = modelGradients(dlnet,fcParams,dlX1,dlX2,pairLabels)
    % Pass the image pair through the network
    Y = forwardSiamese(dlnet,fcParams,dlX1,dlX2);
    % Calculate binary cross-entropy loss
    loss = binarycrossentropy(Y,pairLabels);
    % Calculate gradients of the loss with respect to the network learnable
    % parameters
    [gradientsSubnet,gradientsParams] = dlgradient(loss,dlnet.Learnables,fcParams);
end
function loss = binarycrossentropy(Y,pairLabels)
    % binarycrossentropy accepts the network's prediction Y and the true
    % labels pairLabels, and returns the binary cross-entropy loss value.
    % Get precision of prediction to prevent errors due to floating
    % point precision
    precision = underlyingType(Y);
    % Convert values less than floating point precision to eps.
    Y(Y < eps(precision)) = eps(precision);
    % Convert values between 1-eps and 1 to 1-eps.
    Y(Y > 1 - eps(precision)) = 1 - eps(precision);
    % Calculate binary cross-entropy loss for each pair
    loss = -pairLabels.*log(Y) - (1 - pairLabels).*log(1 - Y);
    % Sum over all pairs in minibatch and normalize.
    loss = sum(loss)/numel(pairLabels);
end
%***************************************************************************************
function Y = forwardSiamese(dlnet,fcParams,dlX1,dlX2)
    % forwardSiamese accepts the network and pair of training images, and returns a
    % prediction of the probability of the pair being similar (closer to 1) or
    % dissimilar (closer to 0). Use forwardSiamese during training.
    % Pass the first image through the twin subnetwork
    F1 = forward(dlnet,dlX1);
    F1 = sigmoid(F1);
    % Pass the second image through the twin subnetwork
    F2 = forward(dlnet,dlX2);
    F2 = sigmoid(F2);
    % Subtract the feature vectors
    Y = abs(F1 - F2);
end
%***************************************************************************************
function [X1,X2,pairLabels] = getAlexnetTest(imds,miniBatchSize)
    pairLabels = zeros(1,miniBatchSize);
    X1 = zeros([227 227 3 miniBatchSize]);
    X2 = zeros([227 227 3 miniBatchSize]);
    imdsaug = augmentedImageDatastore([227 227],imds);
    batch = readall(imdsaug);
    for i = 1:miniBatchSize
        choice = rand(1);
        if choice < 0.5
            [pairIdx1,pairIdx2,pairLabels(i)] = getSimilarPair(batch.response);
        else
            [pairIdx1,pairIdx2,pairLabels(i)] = getDissimilarPair(batch.response);
        end
        X1(:,:,:,i) = batch.input{pairIdx1};
        X2(:,:,:,i) = batch.input{pairIdx2};
    end
end
function [pairIdx1,pairIdx2,pairLabel] = getSimilarPair(classLabel)
    % getSimilarPair returns a random pair of indices for images
    % that are in the same class and the similar pair label = 1.
    % Find all unique classes.
    classes = unique(classLabel);
    % Choose a class randomly which will be used to get a similar pair.
    classChoice = randi(numel(classes));
    % Find the indices of all the observations from the chosen class.
    idxs = find(classLabel==classes(classChoice));
    % Randomly choose two different images from the chosen class.
    pairIdxChoice = randperm(numel(idxs),2);
    pairIdx1 = idxs(pairIdxChoice(1));
    pairIdx2 = idxs(pairIdxChoice(2));
    pairLabel = 1;
end
function [pairIdx1,pairIdx2,label] = getDissimilarPair(classLabel)
    % getDissimilarPair returns a random pair of indices for images
    % that are in different classes and the dissimilar pair label = 0.
    % Find all unique classes.
    classes = unique(classLabel);
    % Choose two different classes randomly which will be used to get a dissimilar pair.
    classesChoice = randperm(numel(classes),2);
    % Find the indices of all the observations from the first and second classes.
    idxs1 = find(classLabel==classes(classesChoice(1)));
    idxs2 = find(classLabel==classes(classesChoice(2)));
    % Randomly choose one image from each class.
    pairIdx1Choice = randi(numel(idxs1));
    pairIdx2Choice = randi(numel(idxs2));
    pairIdx1 = idxs1(pairIdx1Choice);
    pairIdx2 = idxs2(pairIdx2Choice);
    label = 0;
end
%***************************************************************************************
function [X1,X2,pairLabels] = getAlexnetBatch(imds,miniBatchSize)
    pairLabels = zeros(1,miniBatchSize);
    X1 = zeros([227 227 3 miniBatchSize]);
    X2 = zeros([227 227 3 miniBatchSize]);
    imageAugmenter = imageDataAugmenter('RandRotation',[90,270],'RandXReflection',true,'RandYReflection',true);
    imdsaug = augmentedImageDatastore([227 227],imds,'DataAugmentation',imageAugmenter);
    batch = readall(imdsaug);
    for i = 1:miniBatchSize
        choice = rand(1);
        if choice < 0.5
            [pairIdx1,pairIdx2,pairLabels(i)] = getSimilarPair(batch.response);
        else
            [pairIdx1,pairIdx2,pairLabels(i)] = getDissimilarPair(batch.response);
        end
        X1(:,:,:,i) = batch.input{pairIdx1};
        X2(:,:,:,i) = batch.input{pairIdx2};
    end
end
Joss Knight
Joss Knight 2022 年 9 月 10 日
I'm imagining that you would do something like this, in your forwardSiamese function:
dlnet1 = dlupdate(@gpuArray,dlnet1);
F1 = forward(dlnet1,dlX1);
F1 = sigmoid(F1);
dlnet1 = dlupdate(@gather,dlnet1);
dlnet2 = dlupdate(@gpuArray,dlnet2);
% Pass the second image through the twin subnetwork
F2 = forward(dlnet2,dlX2);
F2 = sigmoid(F2);
dlnet2 = dlupdate(@gather,dlnet2);
For this to work, you need to make sure you always pass your two networks to dlfeval as fully host-side networks, with something like:
dlnet1 = dlupdate(@gather,dlnet1);
dlnet2 = dlupdate(@gather,dlnet2);
[gradientsSubnet,gradientsParams,loss] = dlfeval(@modelGradients,dlnet1,dlnet2,fcParams,dlX1,dlX2,pairLabels);
If you don't do this then it won't make any difference what you do inside modelGradients because MATLAB will hold onto the GPU copy from the calling code.
You should also remove the fcParams part of the code, since you seem to have deleted the fullyconnect operation and therefore it's wasting space.
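Putting the pieces of this thread together, a training step with two independent subnetworks might look roughly like the sketch below. It is not the original poster's code and makes several assumptions: the tiny layer array stands in for the AlexNet-based subnetwork, twoNetGradients is a hypothetical replacement for modelGradients, and the squared-error loss on a simple similarity score is a stand-in chosen only so that the loss is a scalar. The host/GPU shuffling with dlupdate is left out to keep the focus on the separate gradients and separate Adam states.

```matlab
% Hypothetical stand-in subnetwork (the thread uses truncated AlexNet layers).
layers = [imageInputLayer([8 8 1],'Normalization','none')
          convolution2dLayer(3,4)
          reluLayer
          fullyConnectedLayer(5)];
lgraph = layerGraph(layers);

% Two separate dlnetwork objects: each has its own learnables, so no weights are shared.
dlnet1 = dlnetwork(lgraph);
dlnet2 = dlnetwork(lgraph);

% Dummy image pairs and labels (1 = similar, 0 = dissimilar).
dlX1 = dlarray(rand(8,8,1,2,'single'),'SSCB');
dlX2 = dlarray(rand(8,8,1,2,'single'),'SSCB');
pairLabels = [1 0];

% Each network needs its own Adam state.
avg1 = []; avgSq1 = [];
avg2 = []; avgSq2 = [];
iteration = 1; learningRate = 2e-5; gradDecay = 0.9; gradDecaySq = 0.99;

[grad1,grad2,loss] = dlfeval(@twoNetGradients,dlnet1,dlnet2,dlX1,dlX2,pairLabels);
[dlnet1,avg1,avgSq1] = adamupdate(dlnet1,grad1,avg1,avgSq1, ...
    iteration,learningRate,gradDecay,gradDecaySq);
[dlnet2,avg2,avgSq2] = adamupdate(dlnet2,grad2,avg2,avgSq2, ...
    iteration,learningRate,gradDecay,gradDecaySq);

function [grad1,grad2,loss] = twoNetGradients(dlnet1,dlnet2,dlX1,dlX2,pairLabels)
    % Each image goes through its own subnetwork.
    F1 = sigmoid(forward(dlnet1,dlX1));
    F2 = sigmoid(forward(dlnet2,dlX2));
    % Stand-in similarity score per pair (assumption): 1 when the features match.
    Y = stripdims(1 - mean(abs(F1 - F2),1));
    % Stand-in scalar loss (assumption): mean squared error against the pair labels.
    loss = sum((Y - pairLabels).^2,2)/numel(pairLabels);
    % One gradient table per subnetwork.
    [grad1,grad2] = dlgradient(loss,dlnet1.Learnables,dlnet2.Learnables);
end
```

To combine this with the memory-saving idea, the two forward calls would be wrapped in the dlupdate(@gpuArray,...) and dlupdate(@gather,...) calls shown earlier, with both networks gathered to the host before the call to dlfeval.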

Release

R2022a
