Train YOLO v2 Network for Vehicle Detection

Load the training data for vehicle detection into the workspace.

data = load('vehicleTrainingData.mat');
trainingData = data.vehicleTrainingData;

Specify the directory in which the training samples are stored, and prepend that absolute path to the file names in the training data.

dataDir = fullfile(toolboxdir('vision'),'visiondata');
trainingData.imageFilename = fullfile(dataDir,trainingData.imageFilename);
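
fullfile accepts a cell array of file names and prepends the folder to every entry, which is why a single call can update the whole imageFilename column. A minimal sketch with a made-up folder and made-up file names (both hypothetical, not part of the shipped data set):

```matlab
% Hypothetical folder and file names, used only to illustrate the call.
exampleDir    = fullfile('data','vehicles');
relativeNames = {'image_0001.jpg'; 'image_0002.jpg'};

% fullfile applies the folder to every element of the cell array and
% returns a cell array of the same size.
fullNames = fullfile(exampleDir, relativeNames);
disp(fullNames)
```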

Randomly shuffle the data for training. Fixing the random seed makes the shuffled order reproducible.

rng(0);
shuffledIdx = randperm(height(trainingData));
trainingData = trainingData(shuffledIdx,:);
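
Indexing the table with a permuted row index reorders whole rows, so each image file stays paired with its boxes. A small sketch on a toy table; the file names and box values are made up, and the two-column layout (file name, then [x y width height] boxes for the vehicle class) is assumed to mirror the training table:

```matlab
% Toy table: file name first, then one column of [x y width height] boxes.
% All values are made up for illustration.
toyData = table({'a.jpg'; 'b.jpg'; 'c.jpg'; 'd.jpg'}, ...
    {[1 1 10 10]; [2 2 20 20]; [3 3 30 30]; [4 4 40 40]}, ...
    'VariableNames', {'imageFilename','vehicle'});

rng(0);                            % fix the seed so the order is repeatable
idx = randperm(height(toyData));   % random permutation of the row indices
shuffled = toyData(idx,:);         % reorder whole rows, not single columns

disp(shuffled)
```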

Create an imageDatastore from the file names in the table.

imds = imageDatastore(trainingData.imageFilename);

Create a boxLabelDatastore from the label columns of the table.

blds = boxLabelDatastore(trainingData(:,2:end));

Combine the two datastores.

ds = combine(imds, blds);

Load a preinitialized YOLO v2 object detection network.

net = load('yolov2VehicleDetector.mat');
lgraph = net.lgraph

lgraph = 
  LayerGraph with properties:

         Layers: [25×1 nnet.cnn.layer.Layer]
    Connections: [24×2 table]
     InputNames: {'input'}
    OutputNames: {'yolov2OutputLayer'}

Inspect the layers in the YOLO v2 network and their properties. You can also create the YOLO v2 network by following the steps given in Create YOLO v2 Object Detection Network.

lgraph.Layers

ans = 
  25x1 Layer array with layers:

     1   'input'               Image Input                128x128x3 images
     2   'conv_1'              Convolution                16 3x3 convolutions with stride [1  1] and padding [1  1  1  1]
     3   'BN1'                 Batch Normalization        Batch normalization
     4   'relu_1'              ReLU                       ReLU
     5   'maxpool1'            Max Pooling                2x2 max pooling with stride [2  2] and padding [0  0  0  0]
     6   'conv_2'              Convolution                32 3x3 convolutions with stride [1  1] and padding [1  1  1  1]
     7   'BN2'                 Batch Normalization        Batch normalization
     8   'relu_2'              ReLU                       ReLU
     9   'maxpool2'            Max Pooling                2x2 max pooling with stride [2  2] and padding [0  0  0  0]
    10   'conv_3'              Convolution                64 3x3 convolutions with stride [1  1] and padding [1  1  1  1]
    11   'BN3'                 Batch Normalization        Batch normalization
    12   'relu_3'              ReLU                       ReLU
    13   'maxpool3'            Max Pooling                2x2 max pooling with stride [2  2] and padding [0  0  0  0]
    14   'conv_4'              Convolution                128 3x3 convolutions with stride [1  1] and padding [1  1  1  1]
    15   'BN4'                 Batch Normalization        Batch normalization
    16   'relu_4'              ReLU                       ReLU
    17   'yolov2Conv1'         Convolution                128 3x3 convolutions with stride [1  1] and padding 'same'
    18   'yolov2Batch1'        Batch Normalization        Batch normalization
    19   'yolov2Relu1'         ReLU                       ReLU
    20   'yolov2Conv2'         Convolution                128 3x3 convolutions with stride [1  1] and padding 'same'
    21   'yolov2Batch2'        Batch Normalization        Batch normalization
    22   'yolov2Relu2'         ReLU                       ReLU
    23   'yolov2ClassConv'     Convolution                24 1x1 convolutions with stride [1  1] and padding [0  0  0  0]
    24   'yolov2Transform'     YOLO v2 Transform Layer.   YOLO v2 Transform Layer with 4 anchors.
    25   'yolov2OutputLayer'   YOLO v2 Output             YOLO v2 Output with 4 anchors.
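
The sizes in the listing can be cross-checked with a little arithmetic. The three 2x2 max pooling layers with stride 2 reduce the 128x128 input to a 16x16 feature map, since the padded 3x3 convolutions keep the spatial size. The 24 filters of yolov2ClassConv are consistent with the usual YOLO v2 layout of five predictions per anchor (four box terms plus an objectness score) plus one score per class; that layout is an assumption here, not something the listing states.

```matlab
% Values read from the layer listing.
inputSize  = [128 128];
numPooling = 3;          % maxpool1, maxpool2, maxpool3
poolStride = 2;
numAnchors = 4;
numClasses = 1;          % the single 'vehicle' class

% Each stride-2 pooling layer halves the spatial size.
featureMapSize = inputSize ./ poolStride^numPooling;

% Assumed YOLO v2 layout: (4 box terms + 1 objectness + class scores) per anchor.
numFilters = numAnchors * (5 + numClasses);

fprintf('Feature map: %dx%d, class-conv filters: %d\n', ...
    featureMapSize(1), featureMapSize(2), numFilters);
```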

Configure the network training options.

options = trainingOptions('sgdm',...
          'InitialLearnRate',0.001,...
          'Verbose',true,...
          'MiniBatchSize',16,...
          'MaxEpochs',30,...
          'Shuffle','never',...
          'VerboseFrequency',30,...
          'CheckpointPath',tempdir);

Train the YOLO v2 network.

[detector,info] = trainYOLOv2ObjectDetector(ds,lgraph,options);

*************************************************************************
Training a YOLO v2 Object Detector for the following object classes:

* vehicle

Training on single CPU.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |     RMSE     |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:01 |         7.13 |         50.8 |          0.0010 |
|       2 |          30 |       00:00:14 |         1.35 |          1.8 |          0.0010 |
|       4 |          60 |       00:00:27 |         1.13 |          1.3 |          0.0010 |
|       5 |          90 |       00:00:39 |         0.64 |          0.4 |          0.0010 |
|       7 |         120 |       00:00:51 |         0.65 |          0.4 |          0.0010 |
|       9 |         150 |       00:01:04 |         0.72 |          0.5 |          0.0010 |
|      10 |         180 |       00:01:16 |         0.52 |          0.3 |          0.0010 |
|      12 |         210 |       00:01:28 |         0.45 |          0.2 |          0.0010 |
|      14 |         240 |       00:01:41 |         0.61 |          0.4 |          0.0010 |
|      15 |         270 |       00:01:52 |         0.43 |          0.2 |          0.0010 |
|      17 |         300 |       00:02:05 |         0.42 |          0.2 |          0.0010 |
|      19 |         330 |       00:02:17 |         0.52 |          0.3 |          0.0010 |
|      20 |         360 |       00:02:29 |         0.43 |          0.2 |          0.0010 |
|      22 |         390 |       00:02:42 |         0.43 |          0.2 |          0.0010 |
|      24 |         420 |       00:02:54 |         0.59 |          0.4 |          0.0010 |
|      25 |         450 |       00:03:06 |         0.61 |          0.4 |          0.0010 |
|      27 |         480 |       00:03:18 |         0.65 |          0.4 |          0.0010 |
|      29 |         510 |       00:03:31 |         0.48 |          0.2 |          0.0010 |
|      30 |         540 |       00:03:42 |         0.34 |          0.1 |          0.0010 |
|========================================================================================|
Detector training complete.
*************************************************************************
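
The log prints every 30 iterations ('VerboseFrequency'), so the Epoch column skips values. A short sketch that reproduces the Epoch column from the iteration numbers, assuming every epoch has the same number of iterations:

```matlab
maxEpochs       = 30;    % 'MaxEpochs' from the training options
totalIterations = 540;   % last iteration shown in the log

% Iterations per epoch, assuming all epochs are the same length.
itersPerEpoch = totalIterations / maxEpochs;

% Iterations that appear in the log: the first one, then every 30th.
loggedIters  = [1 30:30:540];
loggedEpochs = ceil(loggedIters / itersPerEpoch);

disp([loggedEpochs(:) loggedIters(:)])
```

With a mini-batch size of 16, 18 iterations per epoch corresponds to 288 images processed in full mini-batches per epoch.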

Inspect the properties of the detector.

detector

detector = 
  yolov2ObjectDetector with properties:

            ModelName: 'vehicle'
              Network: [1×1 DAGNetwork]
    TrainingImageSize: [128 128]
          AnchorBoxes: [4×2 double]
           ClassNames: vehicle

To check how well training converged, inspect the training loss for each iteration.

figure
plot(info.TrainingLoss)
grid on
xlabel('Number of Iterations')
ylabel('Training Loss for Each Iteration')
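
Because the first mini-batch loss in the training log (50.8) is far larger than the later values, a linear axis flattens most of the curve. The sketch below plots on a logarithmic axis instead. It uses the loss values printed in the log as a stand-in for info.TrainingLoss, so it shows only every 30th iteration:

```matlab
% Iterations and mini-batch losses copied from the training log.
loggedIters = [1 30:30:540];
loggedLoss  = [50.8 1.8 1.3 0.4 0.4 0.5 0.3 0.2 0.4 0.2 ...
               0.2 0.3 0.2 0.2 0.4 0.4 0.4 0.2 0.1];

figure
semilogy(loggedIters, loggedLoss, '-o')   % log scale on the loss axis
grid on
xlabel('Number of Iterations')
ylabel('Mini-batch Loss (log scale)')
```

To apply the same view to the full curve, pass info.TrainingLoss to semilogy in place of the logged values.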

Read a test image into the workspace.

img = imread('detectcars.png');

Run the trained YOLO v2 object detector on the test image to detect vehicles.

[bboxes,scores] = detect(detector,img);
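
detect returns one row per detection in bboxes, in [x y width height] form, with the matching confidence in scores. A sketch of keeping only the more confident detections and ordering them by score; the boxes, the scores, and the 0.6 cutoff are made up for illustration:

```matlab
% Made-up detections standing in for the outputs of detect.
exampleBoxes  = [10 20 50 40;
                 60 30 45 35;
                  5  5 20 20];
exampleScores = [0.91; 0.42; 0.67];
minScore      = 0.6;     % hypothetical cutoff

% Keep the rows whose score reaches the cutoff.
keep         = exampleScores >= minScore;
strongBoxes  = exampleBoxes(keep,:);
strongScores = exampleScores(keep);

% Order the remaining detections from most to least confident.
[strongScores, order] = sort(strongScores, 'descend');
strongBoxes = strongBoxes(order,:);

disp([strongBoxes strongScores])
```

detect also accepts a 'Threshold' name-value argument that applies a score cutoff inside the call, which may be simpler when no further filtering is needed.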

Display the detection results.

if(~isempty(bboxes))
    img = insertObjectAnnotation(img,'rectangle',bboxes,scores);
end
figure
imshow(img)