How does the OverlapThreshold argument in balanceBoxLabels get passed along to the resulting blockLocationSet?

If balanceBoxLabels simply creates a blockLocationSet ("bSet"), how exactly does the OverlapThreshold argument/value specified in balanceBoxLabels get 'passed along' to the subsequent step of creating a boxLabelDatastore ("blds") based on the blockLocationSet, given that a blockLocationSet only contains the properties (1) Image number(s), (2) Block origin(s), (3) Block size and (4) Resolution level(s)?
For example, if I do:
bSet = balanceBoxLabels(boxLabels,blockedImages,blockSize,numObservations,OverlapThreshold=0.5)
This is supposed to specify that any bounding box that partially overlaps a block by >50% of the bounding box's size gets cropped to the block boundary, while any bounding box that partially overlaps a block by <50% of the bounding box's size gets discarded. So far, so good. But then when I do:
blds = boxLabelDatastore(boxLabels,bSet)
How does the OverlapThreshold value of 0.5 specified when creating "bSet" using balanceBoxLabels get passed along to the creation of "blds" using boxLabelDatastore?
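To make the rule described in the question concrete, here is a minimal sketch of the per-box decision for a single block, written with base MATLAB only. The block window, the three boxes, and the variable names are hypothetical values chosen for illustration, and the handling of a box that sits exactly at the threshold is an assumption. It models the behavior the question expects, not what balanceBoxLabels is confirmed to do internally.

```matlab
% Hypothetical values for illustration; none of these come from the question.
blockWindow = [0 0 100 100];   % one block as [x y width height]
boxes = [10 10 20 20;          % entirely inside the block
         85 40 20 20;          % 75% of its area inside the block
         95 40 40 20];         % 12.5% of its area inside the block
threshold = 0.5;               % plays the role of OverlapThreshold

% Fraction of each box's own area that falls inside the block.
boxArea = boxes(:,3) .* boxes(:,4);
overlapFraction = rectint(boxes, blockWindow) ./ boxArea;

% Assumption: a box exactly at the threshold is kept.
isKept = overlapFraction >= threshold;

% Clip the kept boxes to the block boundary.
x1 = max(boxes(:,1), blockWindow(1));
y1 = max(boxes(:,2), blockWindow(2));
x2 = min(boxes(:,1) + boxes(:,3), blockWindow(1) + blockWindow(3));
y2 = min(boxes(:,2) + boxes(:,4), blockWindow(2) + blockWindow(4));
clippedBoxes = [x1 y1 x2-x1 y2-y1];
clippedBoxes = clippedBoxes(isKept, :);

disp(table(overlapFraction, isKept))
disp(clippedBoxes)
```

The sketch treats boxes as continuous rectangles, so pixel-edge conventions may shift the fractions slightly in a real image.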
  1 Comment
Dominique on 6 May 2024
As a follow-up, after running some tests, as far as I can tell the OverlapThreshold value is not passed on in any way to the resulting block location set and subsequent box label datastore; meaning that as far as I can tell it's a pointless feature that doesn't actually do what it's supposed to.
For example, even if I set the OverlapThreshold to 1, which is supposed to result in discarding any and all bounding boxes that overlap block boundaries rather than cropping them, the box label datastore I create from the resulting block location set still contains cropped bounding boxes, even though it is expected to contain only whole bounding boxes.
So either it's an impotent feature, or I'm somehow not utilizing it correctly.
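For comparison, the expectation described above for a threshold of 1 can be tried on a single window with bboxcrop from Computer Vision Toolbox, which accepts its own OverlapThreshold option. This is only a sketch with hypothetical boxes and window; nothing in this thread says that balanceBoxLabels or boxLabelDatastore call bboxcrop with the value passed to balanceBoxLabels.

```matlab
% Requires Computer Vision Toolbox. All values are hypothetical.
window = [1 1 100 100];        % cropping window as [x y width height]
boxes  = [10 10 20 20;         % fully inside the window
          86 40 20 20;         % about 75% inside the window
          96 40 40 20];        % mostly outside the window

% Threshold of 1: only boxes fully enclosed by the window should survive.
[strictBoxes, strictIdx] = bboxcrop(boxes, window, 'OverlapThreshold', 1);

% Threshold of 0.5: boxes with enough overlap are kept and clipped.
[looseBoxes, looseIdx] = bboxcrop(boxes, window, 'OverlapThreshold', 0.5);

disp(strictIdx)   % indices of the input boxes kept at threshold 1
disp(looseIdx)    % indices of the input boxes kept at threshold 0.5
```

Comparing the two index vectors shows which partially overlapping boxes survive at each threshold, which is the box-level behavior the question expects from the datastore.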


Answers (1)

Aneela on 22 May 2024
Hi Dominique,
The “blockLocationSet”, “bSet”, returned by “balanceBoxLabels” does not explicitly carry “OverlapThreshold” as one of its properties.
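One way to check this is to build a small blockLocationSet by hand and list its properties. The sketch below assumes Image Processing Toolbox is installed; the image numbers, origins, and block size are hypothetical, and the name hasThreshold is introduced only for this example.

```matlab
% Requires Image Processing Toolbox. All values are hypothetical.
imageNumber = [1; 1; 2];               % which image each block comes from
blockOrigin = [1 1; 257 1; 1 513];     % one origin per block
blockSize   = [256 256];               % the same size for every block
bSet = blockLocationSet(imageNumber, blockOrigin, blockSize);

% List the public properties, then test for a threshold property by name.
disp(properties(bSet))
hasThreshold = isprop(bSet, 'OverlapThreshold');
disp(hasThreshold)
```

If the property list matches the four properties named in the question, the set has no place to store a threshold, so any effect of the threshold can only show up in which blocks the set contains.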
When the “OverlapThreshold” in “balanceBoxLabels” is specified, the function uses this threshold to decide which bounding boxes to include or exclude in each block.
bSet = balanceBoxLabels(boxLabels,blockedImages,blockSize, numObservations,OverlapThreshold=0.5);
  • This decision-making and adjustment process happens during the execution of “balanceBoxLabels”.
  • The outcome is a “blockLocationSet” object, “bSet”, that contains information about image numbers, block origins, block sizes, and the information about which bounding boxes to include.
When a “boxLabelDatastore” is created using:
blds = boxLabelDatastore(boxLabels,bSet);
  • The “bSet” already incorporates the effects of the “OverlapThreshold” through its content, which blocks were created and which bounding boxes they include.
  • The “boxLabelDatastore” function then uses this information to create a datastore of the cropped images and their corresponding adjusted bounding boxes.
For more information on “balanceBoxLabels”, refer to the following MathWorks documentation: https://www.mathworks.com/help/vision/ref/balanceboxlabels.html
  4 Comments
Dominique on 24 May 2024
Hi Corey,
Thank you very much for chiming in and confirming my observations.
I agree with you that it's generally good for a model to be trained on partial objects because it's likely to subsequently encounter some during inference.
Regarding the documentation, I would say it's definitely not clearly stated that OverlapThreshold specifically impacts the process of deciding which randomly sampled blocks to keep vs. discard, and moreover that it solely impacts that specific process. As currently stated, you really get the impression that it gives full control over the extent of clipping of all boxes that end up in your eventual box label datastore:
"When the overlap between a bounding box and a cropping window is greater than the threshold, boxes in the boxLabels input are clipped to the image block window border. When the overlap is less than the threshold, the boxes are discarded."
As a suggestion, I think revising it as follows would make it clearer:
"When the overlap between at least one bounding box and a cropping window is greater than the threshold, all boxes overlapping the cropping window are clipped to the image block window border. When no box overlaps the cropping window by more than the threshold, the block is discarded."
Beyond that, since I've got your attention, I would add that I think the balanceBoxLabels function has the potential to be further developed and presented as a more versatile and powerful function. For example, despite its name and implicit focus on balancing imbalanced training sets, I've actually found it to be a generally useful function to sample network-input-size blocks from larger annotated images even in cases where there's just a single object class.
Furthermore, what would make it truly powerful is if it could be refashioned to work like a sort of object detection version of randomPatchExtractionDatastore (which is used for semantic segmentation tasks); whereby instead of randomly extracting blocks prior to model training and then being stuck with that finite set of blocks throughout training, it extracts an all new random sampling of blocks at each epoch, during training. I find this to be a great approach to counteract overfitting, which even in a sense redefines the very meaning of an "epoch", since the model virtually never sees any given randomly extracted image more than once throughout training.
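The block-level reading in the suggested documentation wording above can be sketched as a toy model in base MATLAB. It is an interpretation of the behavior discussed in this thread, not the implementation of balanceBoxLabels. The window, boxes, and variable names are hypothetical, and treating a box exactly at the threshold as passing is an assumption.

```matlab
% A toy model of the behavior described in this thread, NOT the actual
% implementation of balanceBoxLabels. All values are hypothetical.
blockWindow = [0 0 100 100];   % one candidate block as [x y width height]
boxes = [10 10 20 20;          % entirely inside the block
         85 40 20 20];         % 75% of its area inside the block
boxArea = boxes(:,3) .* boxes(:,4);
overlapFraction = rectint(boxes, blockWindow) ./ boxArea;

for threshold = [0.5 1]
    % Block-level decision: keep the block if any box meets the threshold.
    keepBlock = any(overlapFraction >= threshold);
    % Box-level outcome inside a kept block: every box that touches the
    % block is included (and clipped), regardless of the threshold.
    numBoxesInBlock = keepBlock * nnz(overlapFraction > 0);
    fprintf('threshold %.1f: keepBlock=%d, boxes in block=%d\n', ...
        threshold, keepBlock, numBoxesInBlock);
end
```

Under this model both thresholds keep the block with two boxes, because the fully enclosed box is enough to keep the block and the partially overlapping box is then clipped and included anyway. That is consistent with the cropped boxes observed earlier in the thread at a threshold of 1.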
Corey Kurowski on 30 May 2024
Hey Dominique,
Thanks for this feedback. Both of your noted additional functionalities are definitely something for us to consider in near future releases to help with a more robust deep learning training workflow. While we will likely keep balanceBoxLabels as a dedicated function, we may very well introduce additional functions that can assist in each of these cases.
