
Ground Truth Images and Video

Interactively label images and videos using AI-assisted automation, create training data for AI models, and manage collaborative team labeling for large data sets

Computer Vision Toolbox™ provides a complete workflow for generating ground truth data from images and videos to train AI models for tasks such as object detection, semantic segmentation, instance segmentation, text recognition, and image or video classification. You can start by using the Image Labeler and Video Labeler apps to interactively annotate data with a wide range of label types. These include rectangles, polygons, polylines, scene labels, and pixel-level labels. To get started labeling a collection of images, see Get Started with the Image Labeler. To get started labeling a video or sequence of images, see Get Started with the Video Labeler.

The Image Labeler and Video Labeler apps support manual, AI-assisted, and automated annotation, so you can accelerate labeling with built-in AI models such as the Segment Anything Model (SAM) and Grounding DINO. For more information, see Get Started with AI-Assisted and Automated Labeling. You can also integrate custom automation algorithms to tailor the labeling process to your specific needs. For more details, see Create Custom Automation Algorithm for Labeling.

Once labeling is complete, you can export the annotated data and postprocess it to create training data sets for AI models. The toolbox supports workflows for organizing and managing labeled data, enabling seamless integration with training pipelines for classification, detection, and segmentation tasks.
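As a rough illustration of the export and postprocessing step for an object detection task, the following minimal sketch converts exported labels into datastores that a training function can read. It assumes that `gTruth` is a `groundTruth` object already exported from the Image Labeler to the workspace, that its data source is an image collection, and that it contains at least one rectangle ROI label. The variable names are illustrative only.

```matlab
% Assumption: gTruth is a groundTruth object exported from the Image Labeler
% app, backed by an image collection and containing rectangle ROI labels.

% Split the ground truth into an image datastore and a box label datastore.
[imds, blds] = objectDetectorTrainingData(gTruth);

% Pair each image with its bounding boxes and labels.
trainingData = combine(imds, blds);

% Read one sample to inspect it. For a combined image and box label
% datastore, each read returns a cell array of {image, boxes, labels}.
sample = read(trainingData);
img    = sample{1};
boxes  = sample{2};
labels = sample{3};

% Rewind so that a subsequent training call starts from the first image.
reset(trainingData);
```

For pixel-level labels, `pixelLabelTrainingData` plays the analogous role. For a video data source, `objectDetectorTrainingData` writes the sampled frames to disk, so the call typically needs additional options.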

For collaborative projects, the Image Labeler app includes features to manage team-based labeling, enabling you to distribute labeling tasks, review annotations, provide feedback, and track progress across multiple contributors. This makes it easier to scale labeling efforts and maintain consistency across large data sets. For more details, see Get Started with Team-Based Labeling.

Montage of two labeled images: the left image shows rectangle and projected cuboid bounding boxes, and the right image shows semantic pixel labels and polygon ROI labels.

Categories

Featured Examples