Object Detection

Label ground truth, detect objects using pretrained AI models such as YOLO and Grounding DINO, and create custom detectors using transfer learning

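Computer Vision Toolbox™ provides comprehensive tools and functions for creating, training, evaluating, and deploying object detection models using both deep learning and traditional computer vision techniques. You can start by creating labeled ground truth using the Image Labeler and Video Labeler apps. These apps support interactive and AI-assisted annotation of bounding boxes around objects in images and video frames.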

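Once you have labeled data, you can choose from a wide range of pretrained deep learning object detectors, including YOLO v2, YOLO v3, YOLO v4, YOLOX, RTMDet, SSD, and Grounding DINO. The toolbox also includes specialized detectors, such as peopleDetector and faceDetector, for people detection and face detection tasks. You can use these models directly for inference, or as a starting point for transfer learning to customize a model for your specific data set and application. For more information, see Get Started with Object Detection Using Deep Learning. For traditional object detection approaches, the toolbox includes support for aggregate channel features (ACF) and cascade (Viola-Jones) object detectors.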
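
As a rough illustration of using a pretrained detector directly for inference, the following sketch runs a YOLO v4 detector on a sample image. It assumes that the Computer Vision Toolbox Model for YOLO v4 Object Detection support package is installed and that the sample image `visionteam.jpg` is on the MATLAB path. The choice of the `"tiny-yolov4-coco"` model and the 0.5 score threshold are illustrative, not recommendations.

```matlab
% Minimal sketch: detect objects with a pretrained YOLO v4 detector.
% Assumption: the YOLO v4 support package is installed.
detector = yolov4ObjectDetector("tiny-yolov4-coco");

% Assumption: this sample image is available on the MATLAB path.
I = imread("visionteam.jpg");

% Keep only detections whose confidence score is at least 0.5.
[bboxes, scores, labels] = detect(detector, I, Threshold=0.5);

% Draw the detected boxes with their class labels and display the result.
annotated = insertObjectAnnotation(I, "rectangle", bboxes, labels);
imshow(annotated)
```

Each row of `bboxes` is one detection in `[x y width height]` format, with the matching confidence in `scores` and class in `labels`.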

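The toolbox provides functions for training object detectors using transfer learning. It also provides capabilities for managing and preprocessing training data, as well as data augmentation tools that simulate real-world variation to make model training more robust. For more information, see Get Started with Image Preprocessing and Augmentation for Deep Learning.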

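After generating detection results using a pretrained or custom model, you can use the Object Detector Analyzer app to compare the detections against ground truth. With this app, you can evaluate key performance metrics such as the confusion matrix, precision, recall, F1 score, and mean average precision (mAP) across a range of intersection over union (IoU) thresholds. Alternatively, you can use the evaluateObjectDetection function to evaluate detection performance metrics. For more information, see Evaluate Object Detector Performance and Get Started with Object Detector Analyzer App.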
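
To make the relationship between these metrics concrete, here is a small worked sketch in base MATLAB. The counts are made up for illustration and stand for the result of matching detections to ground truth at a single IoU threshold. This is not how the app or evaluateObjectDetection computes its results internally, only the standard definitions of precision, recall, and F1 score.

```matlab
% Hypothetical counts after matching detections to ground truth at one
% IoU threshold (for example, 0.5). These numbers are made up.
truePositives  = 8;   % detections matched to a ground truth box
falsePositives = 2;   % detections with no matching ground truth box
falseNegatives = 4;   % ground truth boxes that were missed

% Standard definitions of the three metrics.
precision = truePositives / (truePositives + falsePositives);
recall    = truePositives / (truePositives + falseNegatives);
f1Score   = 2 * precision * recall / (precision + recall);

fprintf("precision = %.3f, recall = %.3f, F1 = %.3f\n", ...
    precision, recall, f1Score);
% prints: precision = 0.800, recall = 0.667, F1 = 0.727
```

Raising the IoU threshold generally turns some true positives into false positives, which is why the metrics are reported across a range of thresholds.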

Three images: the first contains labeled boats, the second a diagram of a neural network, and the third the keypoints from a person detector overlaid on the image of the people it has detected.

Apps

Image Labeler	Label images for computer vision applications
Video Labeler	Label video for computer vision applications
Object Detector Analyzer	Interactively visualize and evaluate object detection results against ground truth (Since R2026a)

Functions

Detect Objects Using Pretrained AI Models

Deep Learning Detectors

`groundingDinoObjectDetector`	Detect and localize objects using Grounding DINO object detector (Since R2026a)
`rtmdetObjectDetector`	Detect objects using RTMDet object detector (Since R2024b)
`ssdObjectDetector`	Detect objects using SSD deep learning detector
`yolov2ObjectDetector`	Detect objects using YOLO v2 object detector
`yolov3ObjectDetector`	Detect objects using YOLO v3 object detector
`yolov4ObjectDetector`	Detect objects using YOLO v4 object detector (Since R2022a)
`yoloxObjectDetector`	Detect objects using YOLOX object detector (Since R2023b)
`peopleDetector`	Detect people using pretrained deep learning object detector (Since R2024b)
`faceDetector`	Detect faces using pretrained RetinaFace face detector (Since R2025a)
`detectTextCRAFT`	Detect text in images by using CRAFT deep learning model (Since R2022a)
`imfindcirclesYOLO`	Find circles using YOLOX object detector (Since R2026a)

Feature-Based Detectors

`acfObjectDetector`	Detect objects using aggregate channel features
`peopleDetectorACF`	Detect people using aggregate channel features
`vision.CascadeObjectDetector`	Detect objects using the Viola-Jones algorithm
`vision.ForegroundDetector`	Foreground detection using Gaussian mixture models
`vision.BlobAnalysis`	Properties of connected regions

Select Detected Objects

`selectStrongestBbox`	Select strongest bounding boxes from overlapping clusters using nonmaximal suppression (NMS)
`selectStrongestBboxMulticlass`	Select strongest multiclass bounding boxes from overlapping clusters using nonmaximal suppression (NMS)
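
The following sketch shows nonmaximal suppression on three made-up boxes using `selectStrongestBbox`. The box coordinates, scores, and the 0.5 overlap threshold are illustrative assumptions. The first two boxes overlap heavily, so only the higher-scoring one is expected to survive, while the third box is far away and is kept.

```matlab
% Hypothetical detections in [x y width height] format.
bboxes = [ 10  10 100 100;    % strong detection
           12  12 100 100;    % near-duplicate of the first box
          200 200  50  50];   % separate object
scores = [0.9; 0.8; 0.7];

% Suppress boxes whose overlap ratio with a stronger box exceeds 0.5.
[selectedBboxes, selectedScores, idx] = ...
    selectStrongestBbox(bboxes, scores, OverlapThreshold=0.5);

disp(idx')   % indices of the boxes that were kept
```

Here the overlap ratio of the first two boxes is about 0.92, well above the threshold, so the second box is removed and `idx` contains 1 and 3.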

Train Custom Object Detectors Using Transfer Learning

Load Training Data

`boxLabelDatastore`	Datastore for bounding box label data
`groundTruth`	Ground truth label data
`imageDatastore`	Datastore for image data
`objectDetectorTrainingData`	Create training data for an object detector
`combine`	Combine data from multiple datastores
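
As a sketch of how these pieces typically fit together, the code below builds a combined datastore from a table of image file names and bounding boxes. It assumes the small vehicle data set that ships with the toolbox (`vehicleTrainingData.mat`, with images under the toolbox `visiondata` folder) is available; substitute your own table of file names and boxes as needed.

```matlab
% Assumption: the sample vehicle data set shipped with the toolbox exists.
data = load("vehicleTrainingData.mat");
trainingData = data.vehicleTrainingData;   % table: imageFilename, vehicle

% Convert the relative image paths to full paths.
dataDir = fullfile(toolboxdir("vision"), "visiondata");
trainingData.imageFilename = fullfile(dataDir, trainingData.imageFilename);

% One datastore for the images, one for the box labels.
imds = imageDatastore(trainingData.imageFilename);
blds = boxLabelDatastore(trainingData(:, 2:end));

% Each read of the combined datastore returns {image, boxes, labels}.
ds = combine(imds, blds);
sample = read(ds);
```

The combined datastore is the form that the deep learning training functions listed in the next section accept as training data.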

Train Deep Learning Based Object Detectors

`trainSSDObjectDetector`	Train SSD deep learning object detector
`trainYOLOv2ObjectDetector`	Train YOLO v2 object detector
`trainYOLOv3ObjectDetector`	Train YOLO v3 object detector (Since R2024a)
`trainYOLOv4ObjectDetector`	Train YOLO v4 object detector (Since R2022a)
`trainYOLOXObjectDetector`	Train YOLOX object detector (Since R2023b)

Train Feature-Based Object Detectors

`trainACFObjectDetector`	Train ACF object detector
`trainCascadeObjectDetector`	Train cascade object detector model

Augment and Preprocess Training Data for Deep Learning

`balanceBoxLabels`	Balance bounding box labels for object detection
`bboxcrop`	Crop bounding boxes
`bboxerase`	Remove bounding boxes
`bboxresize`	Resize bounding boxes
`bboxwarp`	Apply geometric transformation to bounding boxes
`bbox2points`	Convert rectangle to corner points list
`blockLocationsWithROI`	Select image block locations that contain bounding box ROIs (Since R2025a)
`imwarp`	Apply geometric transformation to image
`imcrop`	Crop image
`imresize`	Resize image
`randomAffine2d`	Create randomized 2-D affine transformation
`centerCropWindow2d`	Create rectangular center cropping window
`randomWindow2d`	Randomly select rectangular region in image
`integralImage`	Calculate 2-D integral image
`transform`	Transform datastore
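
A minimal augmentation sketch, assuming a single made-up bounding box on the `peppers.png` image that ships with MATLAB: the same random affine transformation is applied to the image with `imwarp` and to the box with `bboxwarp`, so the label stays aligned with the pixels. The reflection and scale settings, the 0.25 overlap threshold, and the use of `affineOutputView` (an Image Processing Toolbox function not listed above) are illustrative choices.

```matlab
rng(0);                          % make the random transform repeatable
I = imread("peppers.png");       % sample image shipped with MATLAB
bbox = [200 150 120 100];        % hypothetical box, [x y width height]

% Random horizontal reflection and a scale factor between 1 and 1.1.
tform = randomAffine2d(XReflection=true, Scale=[1 1.1]);

% Output view with the same size as the input image.
rout = affineOutputView(size(I), tform);

% Warp the image and the box with the same transform and output view.
augmentedImage = imwarp(I, tform, OutputView=rout);
[augmentedBbox, keptIdx] = bboxwarp(bbox, tform, rout, OverlapThreshold=0.25);
```

Boxes that end up mostly outside the output view are dropped, and `keptIdx` reports which input boxes survived. Wrapping this logic in a function and passing it to `transform` is one way to apply it to a whole datastore.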

Design Deep Neural Networks for Object Detection

R-CNN (Regions with Convolutional Neural Networks)

`roiAlignLayer`	Non-quantized ROI pooling layer for Mask R-CNN
`roiMaxPooling2dLayer`	Neural network layer used to output fixed-size feature maps for rectangular ROIs
`roialign`	Non-quantized ROI pooling of `dlarray` data (Since R2021b)

YOLO v2 (You Only Look Once version 2)

`yolov2TransformLayer`	Create transform layer for YOLO v2 object detection network
`spaceToDepthLayer`	Space to depth layer

Focal Loss

`focalCrossEntropy`	Compute focal cross-entropy loss

SSD (Single Shot Detector)

`ssdMergeLayer`	Create SSD merge layer for object detection

Anchor Boxes

`estimateAnchorBoxes`	Estimate anchor boxes for deep learning object detectors

Evaluate Object Detection Results

`evaluateObjectDetection`	Evaluate object detection data set against ground truth (Since R2023b)
`objectDetectionMetrics`	Object detection quality metrics (Since R2023b)
`mAPObjectDetectionMetric`	Mean average precision (mAP) metric for object detection (Since R2024a)
`bboxOverlapRatio`	Compute bounding box overlap ratio
`bboxPrecisionRecall`	Compute bounding box precision and recall against ground truth
`drise`	Explain object detection network predictions using D-RISE (Since R2024a)
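
The overlap ratio that underlies these metrics can be checked by hand on a toy case. The sketch below uses two made-up boxes of equal size, one shifted right by half its width, and computes their overlap with `bboxOverlapRatio`. The coordinates are illustrative assumptions.

```matlab
% Two hypothetical 10-by-10 boxes in [x y width height] format.
% The second box is shifted 5 pixels to the right of the first.
groundTruthBox = [1 1 10 10];
detectedBox    = [6 1 10 10];

% Intersection over union: intersection 50, union 150.
iou = bboxOverlapRatio(groundTruthBox, detectedBox);

% Intersection over the smaller box area: 50 / 100.
minRatio = bboxOverlapRatio(groundTruthBox, detectedBox, "Min");

fprintf("IoU = %.4f, min ratio = %.4f\n", iou, minRatio);
% prints: IoU = 0.3333, min ratio = 0.5000
```

With an IoU threshold of 0.5, this detection would not count as a match for the ground truth box, even though half of each box is covered.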

Visualize Object Detection Results

`cuboid2img`	Project cuboids from 3-D world coordinates to 2-D image coordinates (Since R2022b)
`insertObjectAnnotation`	Annotate truecolor or grayscale image or video
`insertObjectMask`	Insert masks in image or video stream
`insertShape`	Insert shapes in image or video
`insertText`	Insert text in image or video
`showShape`	Display shapes on image, video, or point cloud

Blocks

Deep Learning Object Detector	Detect objects using trained deep learning object detector (Since R2021b)

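Topics

Create Ground Truth and Training Data for Object Detection

Get Started with the Image Labeler
Interactively label rectangular ROIs for object detection, pixels for semantic segmentation, polygons for instance segmentation, and scenes for image classification.
Get Started with the Video Labeler
Interactively label rectangular ROIs for object detection, pixels for semantic segmentation, polygons for instance segmentation, and scenes for image classification in a video or image sequence.
Training Data for Object Detection and Semantic Segmentation
Create training data for object detection or semantic segmentation using the Image Labeler or Video Labeler.
Get Started with Image Preprocessing and Augmentation for Deep Learning
Preprocess data for deep learning applications with deterministic operations such as resizing, or augment training data with randomized operations such as random cropping.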

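Detect Objects Using Pretrained Detectors

Get Started with Object Detection Using Deep Learning
Perform object detection using deep learning neural networks such as YOLOX, YOLO v4, and SSD.
Choose an Object Detector
Compare object detection deep learning models, such as YOLOX, YOLO v4, RTMDet, and SSD.
Get Started with Cascade Object Detector
Train a custom classifier.
Deep Learning in MATLAB (Deep Learning Toolbox)
Discover deep learning capabilities in MATLAB® using convolutional neural networks for classification and regression, including pretrained networks and transfer learning, and training on GPUs, CPUs, clusters, and clouds.
Pretrained Deep Neural Networks (Deep Learning Toolbox)
Learn how to download and use pretrained convolutional neural networks for classification, transfer learning, and feature extraction.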

Evaluate Object Detection Results

Evaluate Object Detector Performance
Evaluate object detector performance using metrics such as average precision, precision recall, and confusion matrix.
Get Started with Object Detector Analyzer App
Use Object Detector Analyzer app to evaluate pretrained object detectors or precomputed detection results against the ground truth data, and evaluate performance metrics.
Calibrate Object Detection Confidence Scores
This example shows how to calibrate the confidence scores of an object detector using Platt scaling.

Featured Examples

Automatically Search and Label Video Frames Using VLMs

Automatically search and detect objects based on natural language text queries using vision-language models (VLMs).

Since R2026a

Visualize Object Detection Results from Pretrained PyTorch Model

Detect objects using a pretrained PyTorch® model and visualize the results in Object Detector Analyzer.

Since R2026a

Automatically Label Ground Truth Using Vision-Language Model

Automatically label ground truth images for object detection using the Grounding DINO vision-language model (VLM).

Since R2026a

Detect Small Objects Using Tiled Training of YOLOX Network

Detect small objects in full-resolution images using tiled training of a you only look once version X (YOLOX) deep learning network.

Since R2024b

Object Detection in Large Satellite Imagery Using Deep Learning

Perform object detection on large satellite imagery using deep learning.

Object Detection Using YOLO v4 Deep Learning

This example shows how to detect objects in images using a You Only Look Once version 4 (YOLO v4) deep learning network.

Multiclass Object Detection Using YOLO v2 Deep Learning

Train a YOLO v2 multiclass object detector and evaluate object detector performance for selected classes and overlap thresholds.

Since R2024b

Train Object Detectors in Experiment Manager

Use the Experiment Manager app to find optimal training options for object detectors.

Object Detection in a Cluttered Scene Using Point Feature Matching

This example shows how to detect a particular object in a cluttered scene, given a reference image of the object.

Detect Cars Using Gaussian Mixture Models

This example shows how to detect and count cars in a video sequence using a foreground detector based on Gaussian mixture models (GMMs).

Import Pretrained ONNX YOLO v2 Object Detector

Import a pretrained YOLO v2 object detector from the ONNX deep learning model format.

Export YOLO v2 Object Detector to ONNX

Export a pretrained YOLO v2 object detector to the ONNX deep learning model format.

Generate Code for Detecting Objects in Images by Using ACF Object Detector

Generate code from a MATLAB® function that detects objects in images by using an acfObjectDetector object. When you intend to generate code from your MATLAB function that uses an acfObjectDetector object, you must create the object outside of the MATLAB function. The example explains how to modify the MATLAB code in Train Stop Sign Detector Using ACF Object Detector to support code generation.