[YOLOv2] image size and Input size of the network

Question

Tai on 27 Nov 2023

Direct link to this question

https://jp.mathworks.com/matlabcentral/answers/2052782-yolov2-image-size-and-input-size-of-the-network

Answered: Garmit Pant on 9 Jan 2024

Accepted Answer: Garmit Pant

I am currently working with YOLOv2 and have run into some questions about the input size.

For instance, I use ResNet50 as the feature extraction network, and its original input size is 224*224*3.

1. During training, I can define a different input size, e.g., 720*720*3, which differs from the input size of ResNet50. How can this work? Does MATLAB implicitly resize the input image before feeding it into the network? If so, does it make sense to define an input size for the network that is different from 224*224*3?
2. I have also tried to define an input size that is not square, e.g., 720*1280*3, during training. The trained network performs very badly. What is the reason? Is it required to define an input size that is proportional to the original network input size (224*224*3)?
3. After training the network with input size 720*720*3, I can feed the trained network images of a different size, e.g., 720*1280*3, and the performance is still good. What happens here? Is the input image implicitly resized?

1 Comment

Walter Roberson on 27 Nov 2023

If your pipeline uses an augmented image datastore (https://www.mathworks.com/help/deeplearning/ref/augmentedimagedatastore.html), it might contain a resize step that automatically adjusts the size of whatever comes in.

Answer 1

Garmit Pant on 9 Jan 2024

Direct link to this answer

https://jp.mathworks.com/matlabcentral/answers/2052782-yolov2-image-size-and-input-size-of-the-network#answer_1385866

Hello Tai,

I understand that you are working with YOLOv2 with ResNet50 as the feature extractor, and that you have questions about the “inputSize” property and how it affects training and inference with the YOLOv2 object detector.

To address your queries:

1. ResNet50 can have an “inputSize” other than 224x224x3. When you set a different “inputSize”, the input layer of the network is replaced with a layer that accepts input images of the specified size.
2. The network works best on square images. The “inputSize” need not be proportional to 224x224x3, but it is better to use square images.
3. After training, a “yolov2ObjectDetector” object can make predictions with the “detect” function on any image that is the same size as or larger than the training image size. The “detect” function automatically resizes any image that is larger than the training image size.
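
As a rough illustration of the size arithmetic behind these points, here is a minimal Python sketch. It is not MATLAB's implementation. It assumes the feature layer downsamples by a total factor of 16 (the linked YOLO v2 example states this for the ResNet50 layer it uses), and both function names, `feature_grid` and `scale_box`, are hypothetical helpers introduced only for this example.

```python
# Sketch only: plain arithmetic, not the MATLAB implementation.
# ASSUMPTION: the feature extraction layer has a total stride of 16.

def feature_grid(height, width, stride=16):
    """Approximate (rows, cols) of the feature map for a given input size.

    Floor division is a simplification; the exact rounding depends on
    the padding used by each layer.
    """
    return height // stride, width // stride


def scale_box(box, from_size, to_size):
    """Map a box [x, y, w, h] between two image sizes given as (H, W).

    This is the bookkeeping any resize step needs so that boxes found on
    the resized image line up with the original image again.
    """
    sy = to_size[0] / from_size[0]  # vertical scale factor
    sx = to_size[1] / from_size[1]  # horizontal scale factor
    x, y, w, h = box
    return [x * sx, y * sy, w * sx, h * sy]


if __name__ == "__main__":
    # The convolutional layers accept any of these sizes; only the size
    # of the output grid changes.
    for size in [(224, 224), (720, 720), (720, 1280)]:
        print(size, "->", feature_grid(*size))

    # A box found on a 720x720 resized copy, mapped back to 720x1280.
    print(scale_box([100, 50, 200, 100], (720, 720), (720, 1280)))
```

Under the stride assumption, a 224x224 input gives a 14x14 grid, 720x720 gives 45x45, and 720x1280 gives 45x80. Note that the horizontal and vertical scale factors differ whenever the aspect ratio changes, so resizing a 720x1280 image to a square input stretches objects unevenly.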

To learn more about YOLOv2, you can refer to the following MATLAB documentation:

Example script for training and detection with YOLOv2: https://www.mathworks.com/help/deeplearning/ug/object-detection-using-yolo-v2.html
Documentation for the “detect” function: https://www.mathworks.com/help/vision/ref/yolov2objectdetector.detect.html

I hope you find the above explanation and suggestions useful!
