YOLOv3 miniBatchSize problem

Question

1 vote

Hello!

I'm trying to use the new MATLAB 2020 version of the YOLOv3 example (https://www.mathworks.com/help/vision/examples/object-detection-using-yolo-v3-deep-learning.html), which is based on the SqueezeNet network. The default miniBatchSize is 8.

///
miniBatchSize = 8;
imdsTrain.ReadSize = miniBatchSize;
bldsTrain.ReadSize = miniBatchSize;
///

With this value or a larger one, the example works well. But when I change miniBatchSize to 4, I get the following error message:

Error using bboxOverlapRatio
The value of 'bboxA' is invalid. Expected input to be finite.
Error in bboxOverlapRatio>validateAndParseInputs (line 195)
parser.parse(bboxA,bboxB,varargin{:});
Error in bboxOverlapRatio>iParseInputs (line 94)
     [bboxA, bboxB, ratioType] = validateAndParseInputs(bboxA, bboxB, varargin{:});
Error in bboxOverlapRatio (line 55)
[bboxA, bboxB, ratioType, isUsingCodeGeneration] = iParseInputs(bboxA,bboxB,varargin{:});
Error in generateTargets>getMaxIOUPredictedWithGroundTruth (line 138)
    overlap = bboxOverlapRatio(predb,truthBatch);
Error in generateTargets (line 45)
    iou = getMaxIOUPredictedWithGroundTruth(bx,by,bw,bh,groundTruth);
Error in modelGradients (line 16)
[boxTarget, objectnessTarget, classTarget, objectMaskTarget, boxErrorScale] =
generateTargets(gatheredPredictions, YTrain, inputImageSize, anchors, mask, penaltyThreshold);
Error in deep.internal.dlfeval (line 18)
[varargout{1:nout}] = fun(x{:});
Error in dlfeval (line 41)
    [varargout{1:nout}] = deep.internal.dlfeval(fun,varargin{:});
Error in YOLOV3Darknet53detector (line 178)
        [gradients,loss,state] = dlfeval(@modelGradients, net, XTrain, YTrain, anchorBoxes,
        anchorBoxMasks, penaltyThreshold, networkOutputs);

At that point, the total loss plot looks like this (note the sharp jump at the end):

Why is this question important? I am trying to build a YOLOv3 object detector based on the darknet53 network. When I start training, I get the out-of-memory error shown below. To avoid it, I reduce miniBatchSize to 4 or 2, but then I get the bboxOverlapRatio error shown above.

Error using nnet.internal.cnn.dlnetwork/forward (line 218)
Layer 'batch_norm_18': Invalid input data. Out of memory on device. To view more detail about
available memory on the GPU, use 'gpuDevice()'. If the problem persists, reset the GPU by calling
'gpuDevice(1)'.
Error in dlnetwork/forward (line 252)
                [varargout{1:nargout}] = forward(net.PrivateNetwork, x, layerIndices,
                layerOutputIndices);
Error in yolov3Forward (line 5)
[YPredictions{:}, state] = forward(net, XTrain, 'Outputs', networkOutputs);
Error in modelGradients (line 5)
[YPredCell, state] = yolov3Forward(net,XTrain,networkOutputs,mask);
Error in deep.internal.dlfeval (line 18)
[varargout{1:nout}] = fun(x{:});
Error in dlfeval (line 41)
    [varargout{1:nout}] = deep.internal.dlfeval(fun,varargin{:});
Error in YOLOV3Darknet53detector (line 178)
        [gradients,loss,state] = dlfeval(@modelGradients, net, XTrain, YTrain, anchorBoxes,
        anchorBoxMasks, penaltyThreshold, networkOutputs);      

I train on a GPU; its properties are listed below. How can I avoid these two errors?

          CUDADevice with properties:
                      Name: 'GeForce GTX 1650'
                     Index: 1
         ComputeCapability: '7.5'
            SupportsDouble: 1
             DriverVersion: 10.2000
            ToolkitVersion: 10.1000
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 4.2950e+09
           AvailableMemory: 3.0083e+09
       MultiprocessorCount: 14
              ClockRateKHz: 1665000
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 1
          CanMapHostMemory: 1
           DeviceSupported: 1
            DeviceSelected: 1

2 Comments

sweta panigrahi, 5 Oct 2020

Hello, did you solve this? I am also getting an error when using darknet53 with YOLOv3.

ahmed shahin, 10 Jun 2021

Unfortunately, a high-priced license with no value.


Answer 1

Vivek Akkala, 28 Apr 2022

0 votes

Hi,

The first error in your case might be due to the presence of zeros in 'bboxA'. A zero in any of [x, y, width, height] will cause this issue.
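One way to narrow this down is to inspect the boxes before they reach bboxOverlapRatio. The stack trace above shows that 'bboxA' is the predicted box array `predb`, and the message complains about non-finite values, so it may be worth checking for NaN and Inf as well as for zero sizes. The following is only a minimal sketch using base MATLAB functions; the `boxes` array and the variable names are made-up illustrations, not code from the example.

```matlab
% Illustrative M-by-4 array of [x y width height] boxes (made-up values).
boxes = [ 10  20  50  40;    % valid
          30  30   0  25;    % zero width
         NaN  15  20  20;    % non-finite coordinate
           5   5  10 Inf];   % non-finite height

hasNonFinite = any(~isfinite(boxes), 2);      % NaN or Inf anywhere in the row
hasBadSize   = any(boxes(:, 3:4) <= 0, 2);    % zero or negative width/height
isValid      = ~hasNonFinite & ~hasBadSize;

% Keep only the rows that pass both checks.
validBoxes = boxes(isValid, :);
fprintf('%d of %d boxes are valid\n', nnz(isValid), size(boxes, 1));
```

If the same check flags the predicted boxes rather than the ground truth, the network outputs have probably become NaN or Inf during training, which would be consistent with the sharp jump in the loss described in the question.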

The second issue is caused by a GPU memory limitation. Try the following:

Reduce the image size of the training set.
Reduce miniBatchSize.
Use a smaller base network instead of darknet-53.
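
To get a rough feel for how much the first two suggestions help, the size of a single input batch can be estimated with simple arithmetic. This is a sketch under stated assumptions: the input sizes and batch sizes are illustrative, not taken from the example, and `batchBytes` is a hypothetical helper. It counts only the input tensor. The intermediate activations of a convolutional network use far more memory, but they scale in roughly the same way with batch size and pixel count.

```matlab
% Bytes needed for one single-precision H-by-W-by-C-by-N input batch.
% The sizes below are illustrative assumptions, not values from the example.
bytesPerSingle = 4;
batchBytes = @(inputSize, miniBatchSize) ...
    prod(inputSize) * miniBatchSize * bytesPerSingle;

largeBatch = batchBytes([416 416 3], 8);   % larger images, batch of 8
smallBatch = batchBytes([224 224 3], 4);   % smaller images, batch of 4

fprintf('416x416x3, batch 8: %.1f MB\n', largeBatch / 2^20);
fprintf('224x224x3, batch 4: %.1f MB\n', smallBatch / 2^20);
fprintf('Reduction factor: %.1fx\n', largeBatch / smallBatch);
```

Halving the batch size halves this figure, while halving both image dimensions cuts it to a quarter, so reducing the image size is often the more effective of the two.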
