How to implement a Siamese network whose two subnetworks do not share weights

Question

Cloud Wind, 22 August 2022

https://jp.mathworks.com/matlabcentral/answers/1783330-how-to-implement-siamese-network-with-the-two-subnetworks-not-share-weights

Commented: Joss Knight, 10 September 2022

I am implementing a Siamese network using the MATLAB Deep Learning Toolbox. It is easy to implement such a network when the two subnetworks share weights, following the official demo. Now I want to implement a Siamese network whose two subnetworks do not share weights. Is there an easy solution? I know we can create two dlnetwork objects, one for input image A and the other for input image B. The problem is that both subnetworks must then be loaded into GPU memory, which is not possible when there is not enough memory.

Any good solution is welcome, thank you!


Answer 1

Joss Knight, 1 September 2022

https://jp.mathworks.com/matlabcentral/answers/1783330-how-to-implement-siamese-network-with-the-two-subnetworks-not-share-weights#answer_1039965

You can try gathering the weights back from each network after you've used it, as in net = dlupdate(@gather,net). This should save some memory.
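
As a rough illustration of that round trip, here is a minimal sketch. It assumes the Deep Learning Toolbox is installed; the toy layers and the variable names F and onGPU are made up for this example and are not part of the original question. Note that dlupdate applied to a dlnetwork acts on its learnable parameters, so this only shows where the weights live, not how much memory is actually saved.

```matlab
% Toy network for illustration only (not the AlexNet branch from the question).
layers = [
    imageInputLayer([8 8 1],'Normalization','none')
    convolution2dLayer(3,4)
    reluLayer
    fullyConnectedLayer(16)];
net = dlnetwork(layerGraph(layers));
dlX = dlarray(rand(8,8,1,2,'single'),'SSCB');

if canUseGPU
    net = dlupdate(@gpuArray,net);   % learnables now live in GPU memory
    F = forward(net,gpuArray(dlX));  % use the network on the GPU
    net = dlupdate(@gather,net);     % learnables copied back to host memory
else
    F = forward(net,dlX);            % CPU fallback so the sketch still runs
end

% After gathering, the stored weights are ordinary host arrays again.
onGPU = isa(extractdata(net.Learnables.Value{1}),'gpuArray');
```

With two subnetworks, the same pattern would be applied to each network in turn, so that only one set of weights is resident on the GPU at a time.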

4 comments (2 older comments hidden)

Cloud Wind, 2 September 2022

Hi Joss, here is the example code. It follows the official Siamese demo, in which the two subnetworks share weights. I want to implement a Siamese network without shared weights that does not need much GPU memory.

clc; clear;
downloadFolder = 'F:\exps\siamese_scd\data';
dataFolderTrain = fullfile(downloadFolder,'train');
dataFolderTest = fullfile(downloadFolder,'test');
%***************************************************
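% Build the subnetwork from pretrained AlexNet: take the first 22 layers,
% then keep layers 1-18 and 20-21 (this drops layers 19 and 22).
net = alexnet;
layers = net.Layers(1:22);
layers = [layers(1:18,:); layers(20:21,:)];
lgraph = layerGraph(layers);
dlnet = dlnetwork(lgraph);
% Parameters for a final fully connected operation
fcWeights = dlarray(0.01*randn(1,4352));
fcBias = dlarray(0.01*randn(1,1));
fcParams = struct(...
    "FcWeights",fcWeights,...
    "FcBias",fcBias);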
clear net layers
%***************************************************
imdsTrain = imageDatastore(dataFolderTrain, ...
    'IncludeSubfolders',true, ...
    'LabelSource','none');
files = imdsTrain.Files;
parts = split(files,filesep);
labels = join(parts(:,(end-2):(end-1)),'_');
imdsTrain.Labels = categorical(labels);
imdsTest = imageDatastore(dataFolderTest, ...
    'IncludeSubfolders',true, ...
    'LabelSource','none');
files = imdsTest.Files;
parts = split(files,filesep);
labels = join(parts(:,(end-2):(end-1)),'_');
imdsTest.Labels = categorical(labels);
%***************************************************
numIterations =1000;
train_miniBatchSize =8;
test_minBatchSize = 4;
learningRate = 2e-5;  
trailingAvgSubnet = [];
trailingAvgSqSubnet = [];
trailingAvgParams = [];
trailingAvgSqParams = [];
gradDecay = 0.9;
gradDecaySq = 0.99;
executionEnvironment = "gpu";
plots = "training-progress";
if plots == "training-progress"
    figure
    subplot(2,1,1)
    
    trainingPlotAxes = gca;
%    trainingPlotAxes.YLim = [0 1];
    lineLossTrain = animatedline(trainingPlotAxes);
    xlabel(trainingPlotAxes,"Iteration")
    ylabel(trainingPlotAxes,"Loss")
    title(trainingPlotAxes,"Loss During Training")
    subplot(2,1,2)
    
    testingPlotAxes = gca;
%    testingPlotAxes.YLim = [0 1];
    lineLosstest = animatedline(testingPlotAxes);
    xlabel(testingPlotAxes,"Iteration")
    ylabel(testingPlotAxes,"Loss")
    title(testingPlotAxes,"Loss During Testing")
end
%***************************************************
for iteration = 1:numIterations
       
    [X1,X2,pairLabels] = getAlexnetBatch(imdsTrain,train_miniBatchSize);
    [tX1,tX2,tpairLabels] = getAlexnetTest(imdsTest,test_minBatchSize);
 
    dlX1 = dlarray(single(X1),'SSCB');
    dlX2 = dlarray(single(X2),'SSCB');
    
    tdlX1 = dlarray(single(tX1),'SSCB');
    tdlX2 = dlarray(single(tX2),'SSCB');
	
   
    if executionEnvironment == "gpu"
        dlX1 = gpuArray(dlX1);
        dlX2 = gpuArray(dlX2);
        tdlX1 = gpuArray(tdlX1);
        tdlX2 = gpuArray(tdlX2);
    end
    
    [gradientsSubnet,gradientsParams,loss] = dlfeval(@modelGradients,dlnet,fcParams,dlX1,dlX2,pairLabels);
    lossValue = double(gather(extractdata(loss)));
    
    [~,~,tloss] = dlfeval(@modelGradients,dlnet,fcParams,tdlX1,tdlX2,tpairLabels);
    tlossValue = double(gather(extractdata(tloss)));
    clear dlX1 dlX2 tdlX1 tdlX2
%***************************************************
    [dlnet,trailingAvgSubnet,trailingAvgSqSubnet] = ...
        adamupdate(dlnet,gradientsSubnet, ...
        trailingAvgSubnet,trailingAvgSqSubnet,iteration,learningRate,gradDecay,gradDecaySq);
    
    [fcParams,trailingAvgParams,trailingAvgSqParams] = ...
        adamupdate(fcParams,gradientsParams, ...
        trailingAvgParams,trailingAvgSqParams,iteration,learningRate,gradDecay,gradDecaySq);
    
    if plots == "training-progress"
        addpoints(lineLossTrain,iteration,lossValue);
        addpoints(lineLosstest,iteration,tlossValue);
    end
    drawnow;
    
    temp1=sprintf('iteration: %d ----- %d',[iteration,numIterations]);
    temp2=sprintf('loss: Training：%0.4f ----- Testing：%0.4f',[lossValue,tlossValue]);
    disp(temp1);
    disp(temp2);
end
%******************************************************************************************
% the called functions
%******************************************************************************************
function [gradientsSubnet,gradientsParams,loss] = modelGradients(dlnet,fcParams,dlX1,dlX2,pairLabels)
    % Pass the image pair through the network 
    Y = forwardSiamese(dlnet,fcParams,dlX1,dlX2);
    
    % Calculate binary cross-entropy loss
    loss = binarycrossentropy(Y,pairLabels);
       
    % Calculate gradients of the loss with respect to the network learnable
    % parameters
    [gradientsSubnet,gradientsParams] = dlgradient(loss,dlnet.Learnables,fcParams);
end
function loss = binarycrossentropy(Y,pairLabels)
    % binarycrossentropy accepts the network's prediction Y and the true
    % labels pairLabels, and returns the binary cross-entropy loss value.
    
    % Get precision of prediction to prevent errors due to floating
    % point precision    
    precision = underlyingType(Y);
      
    % Convert values less than floating point precision to eps.
    Y(Y < eps(precision)) = eps(precision);
    % Convert values between 1-eps and 1 to 1-eps.
    Y(Y > 1 - eps(precision)) = 1 - eps(precision);
    
    % Calculate binary cross-entropy loss for each pair
    loss = -pairLabels.*log(Y) - (1 - pairLabels).*log(1 - Y);
    
    % Sum over all pairs in minibatch and normalize.
    loss = sum(loss)/numel(pairLabels);
end
%***************************************************************************************
function Y = forwardSiamese(dlnet,fcParams,dlX1,dlX2)
% forwardSiamese accepts the network and pair of training images, and returns a
% prediction of the probability of the pair being similar (closer to 1) or 
% dissimilar (closer to 0). Use forwardSiamese during training.
    % Pass the first image through the twin subnetwork
 
    F1 = forward(dlnet,dlX1);
    F1 = sigmoid(F1);
    
    % Pass the second image through the twin subnetwork
    F2 = forward(dlnet,dlX2);
    F2 = sigmoid(F2);
    % Take the absolute difference of the feature vectors
    Y = abs(F1 - F2);
end
%***************************************************************************************
function [X1,X2,pairLabels] = getAlexnetTest(imds,miniBatchSize)
pairLabels = zeros(1,miniBatchSize);
X1 = zeros([227 227 3 miniBatchSize]);
X2 = zeros([227 227 3 miniBatchSize]);
imdsaug = augmentedImageDatastore([227 227],imds);
batch=readall(imdsaug);
    for i = 1:miniBatchSize
        choice = rand(1);
        if choice < 0.5
            [pairIdx1,pairIdx2,pairLabels(i)] = getSimilarPair(batch.response);
        else
            [pairIdx1,pairIdx2,pairLabels(i)] = getDissimilarPair(batch.response);
        end
        
        X1(:,:,:,i) =batch.input{pairIdx1};
        X2(:,:,:,i) =batch.input{pairIdx2};
        
    end
end
function [pairIdx1,pairIdx2,pairLabel] = getSimilarPair(classLabel)
% getSimilarPair returns a random pair of indices for images
% that are in the same class and the similar pair label = 1.
    % Find all unique classes.
    classes = unique(classLabel);
    
    % Choose a class randomly which will be used to get a similar pair.
    classChoice = randi(numel(classes));
    
    % Find the indices of all the observations from the chosen class.
    idxs = find(classLabel==classes(classChoice));
    
    % Randomly choose two different images from the chosen class.
    pairIdxChoice = randperm(numel(idxs),2);
    pairIdx1 = idxs(pairIdxChoice(1));
    pairIdx2 = idxs(pairIdxChoice(2));
    pairLabel = 1;
end
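function  [pairIdx1,pairIdx2,label] = getDissimilarPair(classLabel)
% getDissimilarPair returns a random pair of indices for images
% that are in different classes and the dissimilar pair label = 0.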
    % Find all unique classes.
    classes = unique(classLabel);
    
    % Choose two different classes randomly which will be used to get a dissimilar pair.
    classesChoice = randperm(numel(classes),2);
    
    % Find the indices of all the observations from the first and second classes.
    idxs1 = find(classLabel==classes(classesChoice(1)));
    idxs2 = find(classLabel==classes(classesChoice(2)));
    
    % Randomly choose one image from each class.
    pairIdx1Choice = randi(numel(idxs1));
    pairIdx2Choice = randi(numel(idxs2));
    pairIdx1 = idxs1(pairIdx1Choice);
    pairIdx2 = idxs2(pairIdx2Choice);
    label = 0;
end
%***************************************************************************************
function [X1,X2,pairLabels] = getAlexnetBatch(imds,miniBatchSize)
pairLabels = zeros(1,miniBatchSize);
X1 = zeros([227 227 3 miniBatchSize]);
X2 = zeros([227 227 3 miniBatchSize]);
imageAugmenter = imageDataAugmenter('RandRotation',[90,270],'RandXReflection',true,'RandYReflection',true);
imdsaug = augmentedImageDatastore([227 227],imds,'DataAugmentation',imageAugmenter);
batch=readall(imdsaug);
    for i = 1:miniBatchSize
        choice = rand(1);
        if choice < 0.5
            [pairIdx1,pairIdx2,pairLabels(i)] = getSimilarPair(batch.response);
        else
            [pairIdx1,pairIdx2,pairLabels(i)] = getDissimilarPair(batch.response);
        end
        
        X1(:,:,:,i) =batch.input{pairIdx1};
        X2(:,:,:,i) =batch.input{pairIdx2};
        
    end
end

Joss Knight, 10 September 2022

I'm imagining that you would do something like this, in your forwardSiamese function:

% Move the first subnetwork to the GPU only while it is in use
dlnet1 = dlupdate(@gpuArray,dlnet1);
F1 = forward(dlnet1,dlX1);
F1 = sigmoid(F1);
% Copy its weights back to host memory before loading the second subnetwork
dlnet1 = dlupdate(@gather,dlnet1);
dlnet2 = dlupdate(@gpuArray,dlnet2);
% Pass the second image through the twin subnetwork
F2 = forward(dlnet2,dlX2);
F2 = sigmoid(F2);
dlnet2 = dlupdate(@gather,dlnet2);

For this to work, you need to ensure that you always pass your two networks to dlfeval as fully host-side networks, so something like:

dlnet1 = dlupdate(@gather,dlnet1);
dlnet2 = dlupdate(@gather,dlnet2);
[gradientsSubnet,gradientsParams,loss] = dlfeval(@modelGradients,dlnet1,dlnet2,fcParams,dlX1,dlX2,pairLabels);

If you don't do this then it won't make any difference what you do inside modelGradients because MATLAB will hold onto the GPU copy from the calling code.

You should also remove the fcParams part of the code, since you seem to have deleted the fullyconnect operation and therefore it's wasting space.
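
To make the two-network structure concrete, here is a small CPU-only sketch of one way the training step could look when each branch has its own weights. It assumes the Deep Learning Toolbox; the toy layers, the random data, the function name twoBranchGradients, and the similarity-based loss are all placeholders invented for this example, not the code from the thread. The GPU transfer steps described above are deliberately left out so that only the separate gradients and separate Adam state are shown.

```matlab
% Toy two-branch setup on the CPU. Layer sizes, data, and the loss are
% placeholders chosen for illustration, not taken from the thread.
layers = [
    imageInputLayer([8 8 1],'Normalization','none')
    convolution2dLayer(3,4)
    reluLayer
    fullyConnectedLayer(16)];
dlnet1 = dlnetwork(layerGraph(layers));   % branch for image A
dlnet2 = dlnetwork(layerGraph(layers));   % branch for image B, initialised separately

% Each branch keeps its own Adam state.
avg1 = []; avgSq1 = [];
avg2 = []; avgSq2 = [];
learningRate = 1e-3; gradDecay = 0.9; gradDecaySq = 0.99;

dlX1 = dlarray(rand(8,8,1,4,'single'),'SSCB');
dlX2 = dlarray(rand(8,8,1,4,'single'),'SSCB');
pairLabels = single([1 0 1 0]);

for iteration = 1:5
    [grads1,grads2,loss] = dlfeval(@twoBranchGradients, ...
        dlnet1,dlnet2,dlX1,dlX2,pairLabels);
    [dlnet1,avg1,avgSq1] = adamupdate(dlnet1,grads1,avg1,avgSq1, ...
        iteration,learningRate,gradDecay,gradDecaySq);
    [dlnet2,avg2,avgSq2] = adamupdate(dlnet2,grads2,avg2,avgSq2, ...
        iteration,learningRate,gradDecay,gradDecaySq);
end
lossValue = double(extractdata(loss));

function [grads1,grads2,loss] = twoBranchGradients(dlnet1,dlnet2,dlX1,dlX2,pairLabels)
    % Each image goes through its own subnetwork (no shared weights).
    F1 = stripdims(forward(dlnet1,dlX1));   % 16-by-batch features
    F2 = stripdims(forward(dlnet2,dlX2));
    % Placeholder similarity score in (0,1] and a squared-error loss.
    similarity = exp(-mean((F1 - F2).^2,1));
    loss = sum((pairLabels - similarity).^2,2)/numel(pairLabels);
    % One gradient table per subnetwork.
    [grads1,grads2] = dlgradient(loss,dlnet1.Learnables,dlnet2.Learnables);
end
```

The point of the sketch is that dlgradient returns one gradient table per subnetwork and that each network gets its own adamupdate call with its own trailing averages; the loss itself would be replaced by whatever the real model uses.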
