partitionByIndex

Class: matlab.io.datastore.PartitionableByIndex
Namespace: matlab.io.datastore

(Not recommended) Partition datastore according to indices

partitionByIndex is not recommended. For more information, see Compatibility Considerations.

Syntax

ds2 = partitionByIndex(ds,ind)

Description

ds2 = partitionByIndex(ds,ind) partitions a subset of observations in a datastore, ds, into a new datastore, ds2. The desired observations are specified by indices, ind.
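For illustration, a call might look like the following. The custom datastore class name is hypothetical; partitionByIndex applies to any datastore whose class inherits from matlab.io.datastore.PartitionableByIndex.

```matlab
% Hypothetical custom datastore whose class implements
% matlab.io.datastore.PartitionableByIndex (the name is illustrative).
ds = myMiniBatchDatastore(trainingFiles);

% Create a new datastore containing only observations 1, 3, 5, and 7.
ds2 = partitionByIndex(ds, [1 3 5 7]);
```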

Input Arguments

ds — Input datastore
Input datastore, specified as a Datastore object.

ind — Indices of observations
Indices of observations, specified as a vector of positive integers.

Output Arguments

ds2 — Partitioned datastore
Partitioned datastore, returned as a Datastore object.

Attributes

Abstract: true
Access: Public

To learn about attributes of methods, see Method Attributes.

Version History

Introduced in R2018a

R2019a: partitionByIndex is not recommended

Before R2018a, to perform custom image preprocessing for training deep learning networks, you had to specify a custom read function using the ReadFcn property of imageDatastore. However, reading files using a custom read function was slow because imageDatastore did not prefetch files.

In R2018a, four classes including matlab.io.datastore.MiniBatchable and matlab.io.datastore.PartitionableByIndex were introduced as a solution to perform custom image preprocessing with support for prefetching, shuffling, and parallel training. Implementing a custom mini-batch datastore using matlab.io.datastore.MiniBatchable has several challenges and limitations.

  • In addition to specifying the preprocessing operations, you must also define properties and methods to support reading data in batches, reading data by index, and partitioning and shuffling data.

  • You must specify a value for the NumObservations property, but this value may be ill-defined or difficult to define in real-world applications.

  • Custom mini-batch datastores are not flexible enough to support common deep learning workflows, such as deployed workflows using GPU Coder™.

Starting in R2019a, datastores natively support prefetching, shuffling, and parallel training when reading batches of data. The transform function is the preferred way to perform custom data preprocessing or transformation. The combine function is the preferred way to concatenate read data from multiple datastores, including transformed datastores. The concatenated data can serve as the network inputs and expected responses for training deep learning networks. The transform and combine functions have several advantages over matlab.io.datastore.MiniBatchable and matlab.io.datastore.PartitionableByIndex.

  • The functions enable data preprocessing and concatenation for all types of datastores, including imageDatastore.

  • The transform function only requires you to define the data processing pipeline.

  • When used on a deterministic datastore, the functions support tall data types and MapReduce.

  • The functions support deployed workflows.
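The recommended workflow can be sketched as follows; the folder paths, resize target, and image-to-image pairing are assumptions for illustration.

```matlab
% Hypothetical input and response image folders.
imdsNoisy = imageDatastore("pathToNoisyImages");
imdsClean = imageDatastore("pathToCleanImages");

% transform applies custom preprocessing to each read.
tds = transform(imdsNoisy, @(img) im2double(imresize(img, [224 224])));

% combine pairs inputs with responses; each read returns one row
% containing the preprocessed image and its corresponding response.
cds = combine(tds, imdsClean);
```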

Note

The recommended solution to transform data with basic image preprocessing operations, including resizing, rotation, and reflection, is augmentedImageDatastore. For more information, see Preprocess Images for Deep Learning.
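A minimal sketch of that approach, assuming an image folder path, output size, and augmentation parameters chosen purely for illustration:

```matlab
% Hypothetical image folder.
imds = imageDatastore("pathToImages");

% Random horizontal reflection and rotation in the range [-10, 10] degrees.
augmenter = imageDataAugmenter("RandXReflection", true, "RandRotation", [-10 10]);

% Resize every image to 224-by-224 and apply the augmentations at read time.
auds = augmentedImageDatastore([224 224], imds, "DataAugmentation", augmenter);
```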

There are no plans to remove partitionByIndex at this time.