Network doesn't work on test image

Mario on 20 Aug 2024
Answered: Vivek Akkala on 1 Oct 2024
Hi all.
I'm trying to train a pretrained network (YOLO v2) on my own dataset (80 images).
I divided my dataset into training, validation, and test sets, but the network works correctly only on these images and not on images outside these sets.
What can I do? Do I have to test my network only on the images that I have labeled?
Thanks

Answers (4)

arushi on 20 Aug 2024
Hi Mario,
When using a pretrained network like YOLO v2 on your own dataset, it's important to ensure that your model generalizes well to new, unseen data. If your network performs well only on the images in your training, validation, and test sets, but not on other images, there are several factors and strategies you can consider to improve generalization:
1. Increase Dataset Size:
- Data Augmentation: Use techniques such as rotation, scaling, flipping, cropping, and color jittering to artificially increase the size and diversity of your dataset. This can help the model learn more robust features.
- Collect More Data: If possible, gather more labeled images that cover a wider variety of scenarios and conditions.
2. Improve Labels and Annotations:
- Ensure that your labels are accurate and consistent. Poor labeling can lead to poor model performance.
3. Fine-Tuning the Model:
- Adjust Learning Rate: Experiment with different learning rates. A learning rate that is too high might cause the model to converge too quickly to a suboptimal solution.
- More Epochs: Train for more epochs to allow the model to learn better representations, but watch out for overfitting.
4. Regularization Techniques:
- Use techniques like dropout, weight decay, or early stopping to prevent overfitting.
5. Evaluate and Adjust the Model:
- Validation: Regularly validate the model on a separate validation set to monitor its performance and adjust hyperparameters accordingly.
- Error Analysis: Analyze where the model is failing on new images. This can give insights into what features or scenarios the model is not capturing well.
6. Test on New Data:
- Ideally, you should test your model on a completely separate dataset that was not used during training or validation. This can give you a better indication of how well your model generalizes to new data.
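To make points 5 and 6 concrete, here is a minimal sketch of a random hold-out split in MATLAB. The 70/15/15 ratio, the fixed seed, and the variable names are assumptions for illustration; only the count of 80 images comes from the question.

```matlab
% Minimal sketch: random split of image indices into train/val/test.
% Assumptions: 80 images, 70/15/15 ratio, fixed seed for repeatability.
rng(0);                              % make the shuffle repeatable
numImages = 80;                      % dataset size from the question
shuffled  = randperm(numImages);     % random order of the indices 1..80

numTrain = round(0.70 * numImages);  % 56 images
numVal   = round(0.15 * numImages);  % 12 images

trainIdx = shuffled(1:numTrain);
valIdx   = shuffled(numTrain+1 : numTrain+numVal);
testIdx  = shuffled(numTrain+numVal+1 : end);   % remaining 12 images

% The three subsets must not share any image.
assert(isempty(intersect(trainIdx, valIdx)));
assert(isempty(intersect(trainIdx, testIdx)));
assert(isempty(intersect(valIdx, testIdx)));
```

The images in testIdx should stay out of every training and tuning decision, so that the test score says something about images the network has never seen.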
By applying these strategies, you should be able to improve the generalization of your YOLO v2 model and achieve better performance on new, unseen images.
Hope this helps.
  2 comments
Mario on 20 Aug 2024
Thank you. I will do other tests soon.
Mario on 20 Aug 2024
Do you think there are too few images? How many images do I have to use? Labeling images is time-consuming!



Saurav on 21 Aug 2024
Edited: Saurav on 21 Aug 2024
Hi Mario,
As I understand it, you are using a pretrained network (YOLO v2) on your dataset in MATLAB; however, it fails to work for images not in the dataset.
Training a YOLO v2 model on a limited dataset, consisting of only 80 photos, can be a problem due to the potential for overfitting. To enhance your model's performance using MATLAB, consider the following major step:
  • Data Augmentation:
  1. Data augmentation refers to the process of artificially increasing the diversity and size of a training dataset by applying various transformations to the existing data, especially when dealing with limited data.
  2. Augmentation effectively increases the size of the dataset without the need for additional data collection. Refer to the following documentation to learn more about this concept: https://www.mathworks.com/help/deeplearning/ref/imagedataaugmenter.html
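For object detection, the bounding boxes have to be transformed together with the image. The following is a minimal sketch of one augmentation, a horizontal flip, written in plain MATLAB. The image, the box, and the convention that a box [x y w h] covers pixel columns x through x+w-1 are assumptions for illustration, not taken from the question.

```matlab
% Minimal sketch: horizontal flip of an image and its bounding box.
% Assumption: boxes are [x y w h] in 1-based pixel units, covering
% columns x through x+w-1 and rows y through y+h-1.
I    = zeros(100, 200, 3, 'uint8');    % stand-in for a 100x200 RGB image
bbox = [30 40 50 20];                   % hypothetical box: columns 30..79, rows 40..59
I(40:59, 30:79, :) = 255;               % paint the "object" white

imgWidth = size(I, 2);
Iflip    = flip(I, 2);                  % mirror the columns

bboxFlip    = bbox;
bboxFlip(1) = imgWidth - bbox(1) - bbox(3) + 2;   % new left edge after mirroring

% The flipped box should cover exactly the white object in the flipped image.
rows = bboxFlip(2) : bboxFlip(2) + bboxFlip(4) - 1;
cols = bboxFlip(1) : bboxFlip(1) + bboxFlip(3) - 1;
assert(all(Iflip(rows, cols, :) == 255, 'all'));
assert(sum(Iflip(:) == 255) == sum(I(:) == 255));
```

Checking a few augmented images by drawing the transformed boxes on them is a cheap way to confirm that the labels still line up before starting a long training run.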
Additional steps that can be addressed include:
  • Label your Dataset:
  1. Ensure that your dataset is accurately labeled. You can also use MATLAB's Image Labeler app to create bounding box annotations for each image. https://www.mathworks.com/help/vision/ug/get-started-with-the-image-labeler.html
  • Dividing the Dataset:
  1. A common split of the dataset is 70% training, 15% validation, and 15% test. However, with a small dataset, you might need to adjust these ratios to ensure enough data for training. https://www.mathworks.com/help/deeplearning/ug/divide-data-for-optimal-neural-network-training.html
  • Modify & Train the Network:
  1. Configure training options to prevent overfitting. Use a lower learning rate and consider using dropout if available. See: https://www.mathworks.com/help/deeplearning/ref/trainingoptions.html
  2. Train the model using the modified network and augmented data. Experiment with different learning rates, batch sizes, and augmentation techniques.
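As a rough illustration of the training-options step, the sketch below builds a conservative configuration with trainingOptions from Deep Learning Toolbox. Every numeric value is an assumption chosen as a starting point to tune, not a recommendation from the documentation.

```matlab
% Minimal sketch: conservative training options for a small dataset.
% Requires Deep Learning Toolbox. All values below are assumptions to tune.
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-4, ...          % lower learning rate
    'L2Regularization', 1e-4, ...          % weight decay against overfitting
    'MiniBatchSize',    8, ...             % small batches for a small dataset
    'MaxEpochs',        30, ...
    'Shuffle',          'every-epoch', ... % reshuffle the data each epoch
    'Verbose',          true);

disp(options.InitialLearnRate)
```

The resulting options object would then be passed to trainYOLOv2ObjectDetector together with your training data and network. Changing one setting at a time makes it easier to see which change actually helps.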
By following these steps and iteratively refining your approach, you can improve the accuracy of your YOLO v2 model on new, unseen images.
Let me know if this works or if you need further help!
  1 comment
Mario on 23 Aug 2024
I've defined an augment data function (which I found on this website). I don't know if it works, because the training time is very long: I started it about 40 minutes ago, and it's still loading. I will publish my code soon.



Image Analyst on 23 Aug 2024
Make sure your new images are resized to the required size for your model (the same size as your training, validation, and test set images). Are you sure they're all the same size?
What exactly does "doesn't work correctly" mean? Is an error thrown so that you get no result at all, or do you get a result that you just don't believe or like?
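The size check suggested above can be sketched as follows. The 224x224 training size and the blank stand-in image are assumptions for illustration; substitute the size your own network was trained with and an image read with imread.

```matlab
% Minimal sketch: bring a new image to the size used during training.
% Assumption: the network was trained on 224x224 images (hypothetical value).
trainSize = [224 224];
Inew = zeros(480, 640, 3, 'uint8');     % stand-in for a new, unlabeled image

% Compare rows and columns only; the channel count is left unchanged.
if ~isequal([size(Inew, 1) size(Inew, 2)], trainSize)
    Inew = imresize(Inew, trainSize);   % resize rows and columns to the training size
end
```

Resizing to a fixed [rows cols] target changes the aspect ratio when the new image has a different shape, so objects may look stretched compared with the training images.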
  9 comments
Mario on 25 Aug 2024
These are my folder and my gTruth file:
f = fullfile("gTruth80.mat")
f = "gTruth80.mat"
folder = fullfile("images.zip")
folder = "images.zip"
Mario on 26 Aug 2024
Hi, I ran more tests, but it still doesn't work. The labels and bboxes are always empty. I don't know what else to try.



Vivek Akkala on 1 Oct 2024
Hi Mario,
Training YOLO v2 with fewer than 80 images (considering you are dividing the total into training, validation, and test sets) is not feasible. I suggest increasing the size of your training dataset. An ideal number of images cannot be determined precisely, because it depends on factors like object size, noise, and lighting, but if you plan to train YOLO v2 to detect a single class, between 300 and 400 images should give good results. As mentioned in Arushi's answer, it's good to have validation data. Make sure you have enough of it (around 100 images) and monitor validation performance regularly to understand how the model performs on unseen data. Ideally, you can use the trained YOLO v2 network for inference once the validation accuracy exceeds 95%.


Release

R2024a
