Create Datastores for Medical Image Semantic Segmentation

Semantic segmentation deep learning networks segment a medical image by assigning a class label, such as tumor or lung, to every pixel in the image. To train a semantic segmentation network, you must have a collection of images, or data sources, and a collection of label images that contain labels for the pixels in the data source images. You can manage this training data by using datastores.

Data source and label image for a 2-D chest CT slice

Medical Image Ground Truth Data

You can use the Medical Image Labeler app to label 2-D or 3-D medical images to generate training data for semantic segmentation networks. The app stores labeling results in a groundTruthMedical object, which specifies the filenames of data source and pixel label images in its DataSource and LabelData properties, respectively. The table shows how a groundTruthMedical object formats the data source and label image information for 2-D versus 3-D data.

Type of Data: 2-D medical images or multiframe 2-D image series

Data Source Format: The DataSource property contains an ImageSource object that specifies 2-D images or image series in one of these formats:

  • Single DICOM file.

  • Single NIfTI file.

Note

A groundTruthMedical object can specify a combination of 2-D DICOM and NIfTI data sources.

Label Data Format: The LabelData property contains a string array. Each element specifies the filename of the label image for the corresponding data source.

  • 2-D label images are MAT files, regardless of the data source file format.

  • If a data source has no labels, the corresponding element of LabelData is an empty string, "".

Type of Data: 3-D medical image volumes

Data Source Format: The DataSource property contains a VolumeSource object that specifies 3-D image volumes in one of these formats:

  • Directory of DICOM files corresponding to one volume.

  • Single DICOM file.

  • Single NIfTI file.

  • Single NRRD file.

Note

A groundTruthMedical object can specify a combination of 3-D DICOM, NIfTI, and NRRD data sources.

Label Data Format: The LabelData property contains a string array. Each element specifies the filename of the label image for the corresponding data source.

  • 3-D label images are NIfTI files, regardless of the data source file format.

  • If a data source has no labels, the corresponding element of LabelData is an empty string, "".
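The LabelData convention described above (one element per data source, with "" marking a source that has no labels) suggests filtering out unlabeled sources before pairing images with labels for training. The following is a minimal, language-agnostic sketch written in plain Python, not toolbox code. The function name pair_labeled_sources and all filenames are hypothetical and chosen only for illustration.

```python
def pair_labeled_sources(data_sources, label_data):
    """Pair each data source with its label file, skipping unlabeled sources.

    Assumes both lists are index-aligned, mirroring how each LabelData
    element corresponds to one data source. An empty string marks a
    data source that has no labels.
    """
    if len(data_sources) != len(label_data):
        raise ValueError("data_sources and label_data must have the same length")
    return [(src, lbl) for src, lbl in zip(data_sources, label_data) if lbl != ""]


# Hypothetical filenames, for illustration only.
sources = ["chest_ct_01.nii", "chest_ct_02.nrrd", "chest_ct_03.nii"]
labels = ["chest_ct_01_labels.nii", "", "chest_ct_03_labels.nii"]

pairs = pair_labeled_sources(sources, labels)
for src, lbl in pairs:
    print(src, "->", lbl)
```

In this sketch the second source is dropped because its label entry is empty, so only the labeled sources would reach training.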

Datastores for Semantic Segmentation

You can perform medical image semantic segmentation using 2-D or 3-D deep learning networks. A 2-D network accepts 2-D input images and predicts segmentation labels using 2-D convolution kernels. The input images can come from any of these sources:

  • Images from 2-D modalities, such as X-ray.

  • Individual frames extracted from a multiframe 2-D image series, such as an ultrasound video.

  • Individual slices extracted from a 3-D image volume, such as a CT or MRI scan.

A 3-D network accepts 3-D input images and predicts segmentation labels using 3-D convolution kernels. The input images are 3-D medical volumes, such as entire CT or MRI volumes.

The benefits of 2-D networks include faster prediction speeds and lower memory requirements. Additionally, you can generate many 2-D training images from one image volume or series. Therefore, training a 2-D network that segments a volume slice by slice requires fewer scans than training a fully 3-D network. The major benefit of 3-D networks is that they use information from adjacent slices or frames to predict segmentation labels, which can produce more accurate results.
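To illustrate why one volume can yield many 2-D training samples, here is a minimal sketch in plain Python, not toolbox code. It assumes a volume is stored as a nested list indexed as [slice][row][column] and that the label volume has the same number of slices. The helper name volume_to_slice_pairs and the synthetic pixel values are hypothetical.

```python
def volume_to_slice_pairs(volume, label_volume):
    """Split a 3-D volume and its label volume into per-slice (image, label) pairs.

    Assumes both are nested lists indexed as [slice][row][column]
    with the same number of slices.
    """
    if len(volume) != len(label_volume):
        raise ValueError("volume and label_volume must have the same number of slices")
    return list(zip(volume, label_volume))


# Tiny synthetic volume: 4 slices of 2-by-3 pixels (values are arbitrary).
num_slices, rows, cols = 4, 2, 3
volume = [[[s * 10 + r + c for c in range(cols)] for r in range(rows)]
          for s in range(num_slices)]
# Matching synthetic label volume: label 1 on even slices, 0 on odd slices.
label_volume = [[[1 - s % 2 for _ in range(cols)] for _ in range(rows)]
                for s in range(num_slices)]

pairs = volume_to_slice_pairs(volume, label_volume)
print(len(pairs))  # number of 2-D samples obtained from one 3-D volume
```

A 3-D network would instead consume the whole volume as a single sample, which is the trade-off described above: fewer samples per scan, but access to context from adjacent slices.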
