partition
Class: matlab.io.datastore.Partitionable
Namespace: matlab.io.datastore
Partition a datastore
Description
subds = partition(ds,n,index) partitions the datastore ds into the number of parts specified by n and returns the partition corresponding to index.
Input Arguments
ds — Input datastore
matlab.io.Datastore object
Input datastore, specified as a matlab.io.Datastore object. To create a Datastore object, see matlab.io.Datastore.
n — Number of partitions
positive integer
Number of partitions, specified as a positive integer. To get a reasonable value for n, use the numpartitions function.
When you specify a value of n that is not in the range of partitions available for the datastore, the partition method returns an empty datastore. For more information, see Empty Datastores. For instance, if a datastore can hold up to 10 partitions, then the output of the partition method depends on the value of n.
- If the specified value of n is less than or equal to 10, then the partition method returns the partition specified by the index. For example, partition(ds,10,1) returns a copy of the first partition of the original datastore ds.
- If the specified value of n is greater than 10, then the partition method returns an empty datastore. For example, partition(ds,100,11) returns an empty datastore.
Example: 3
Data Types: double
index — Index
positive integer
Index of the partition to return, specified as a positive integer.
Example: 1
Data Types: double
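The following is a minimal sketch of how n and index are typically used together: ask numpartitions for a reasonable n, then request each partition in turn. It assumes that arrayDatastore (available in R2020b and later) is an acceptable stand-in for any partitionable datastore. The data values and the variable name total are illustrative only.

```matlab
% Assumption: arrayDatastore stands in for any partitionable datastore.
% 'OutputType','same' makes read return numeric data instead of a cell.
ds = arrayDatastore((1:10)','OutputType','same');

n = numpartitions(ds);      % a reasonable number of partitions for ds
total = 0;
for index = 1:n
    % Each call returns a datastore that holds only one part of ds.
    subds = partition(ds,n,index);
    while hasdata(subds)
        total = total + sum(read(subds));
    end
end
```

Because the partitions are expected to cover the data without overlap, total should match the sum over the unpartitioned data.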
Examples
Build Datastore with Parallel Processing Support
Build a datastore with parallel processing support and use it to bring your custom or proprietary data into MATLAB®. Then, process the data in a parallel pool.
Create a .m class definition file that contains the code implementing your custom datastore. You must save this file in your working folder or in a folder that is on the MATLAB® path. The name of the .m file must be the same as the name of your object constructor function. For example, if you want your constructor function to have the name MyDatastorePar, then the name of the .m file must be MyDatastorePar.m. The .m class definition file must contain the following steps:
Step 1: Inherit from the datastore classes.
Step 2: Define the constructor and the required methods.
Step 3: Define your custom file reading function.
In addition to these steps, define any other properties or methods that you need to process and analyze your data.
%% STEP 1: INHERIT FROM DATASTORE CLASSES
classdef MyDatastorePar < matlab.io.Datastore & ...
        matlab.io.datastore.Partitionable

    properties(Access = private)
        CurrentFileIndex double
        FileSet matlab.io.datastore.DsFileSet
    end

    % Property to support saving, loading, and processing of
    % datastore on different file system machines or clusters.
    % In addition, define the methods get.AlternateFileSystemRoots()
    % and set.AlternateFileSystemRoots() in the methods section.
    properties(Dependent)
        AlternateFileSystemRoots
    end

    %% STEP 2: DEFINE THE CONSTRUCTOR AND THE REQUIRED METHODS
    methods
        % Define your datastore constructor
        function myds = MyDatastorePar(location,altRoots)
            myds.FileSet = matlab.io.datastore.DsFileSet(location,...
                'FileExtensions','.bin', ...
                'FileSplitSize',8*1024);
            myds.CurrentFileIndex = 1;

            if nargin == 2
                myds.AlternateFileSystemRoots = altRoots;
            end

            reset(myds);
        end

        % Define the hasdata method
        function tf = hasdata(myds)
            % Return true if more data is available
            tf = hasfile(myds.FileSet);
        end

        % Define the read method
        function [data,info] = read(myds)
            % Read data and information about the extracted data
            % See also: MyFileReader()
            if ~hasdata(myds)
                msgII = ['Use the reset method to reset the datastore ',...
                    'to the start of the data.'];
                msgIII = ['Before calling the read method, ',...
                    'check if data is available to read ',...
                    'by using the hasdata method.'];
                error('No more data to read.\n%s\n%s',msgII,msgIII);
            end

            fileInfoTbl = nextfile(myds.FileSet);
            data = MyFileReader(fileInfoTbl);
            info.Size = size(data);
            info.FileName = fileInfoTbl.FileName;
            info.Offset = fileInfoTbl.Offset;

            % Update CurrentFileIndex for tracking progress
            if fileInfoTbl.Offset + fileInfoTbl.SplitSize >= ...
                    fileInfoTbl.FileSize
                myds.CurrentFileIndex = myds.CurrentFileIndex + 1;
            end
        end

        % Define the reset method
        function reset(myds)
            % Reset to the start of the data
            reset(myds.FileSet);
            myds.CurrentFileIndex = 1;
        end

        % Define the partition method
        function subds = partition(myds,n,ii)
            subds = copy(myds);
            subds.FileSet = partition(myds.FileSet,n,ii);
            reset(subds);
        end

        % Getter for AlternateFileSystemRoots property
        function altRoots = get.AlternateFileSystemRoots(myds)
            altRoots = myds.FileSet.AlternateFileSystemRoots;
        end

        % Setter for AlternateFileSystemRoots property
        function set.AlternateFileSystemRoots(myds,altRoots)
            try
                % The DsFileSet object manages AlternateFileSystemRoots
                % for your datastore
                myds.FileSet.AlternateFileSystemRoots = altRoots;

                % Reset the datastore
                reset(myds);
            catch ME
                throw(ME);
            end
        end
    end

    methods (Hidden = true)
        % Define the progress method
        function frac = progress(myds)
            % Determine percentage of data read from datastore
            if hasdata(myds)
                frac = (myds.CurrentFileIndex-1)/...
                    myds.FileSet.NumFiles;
            else
                frac = 1;
            end
        end
    end

    methods(Access = protected)
        % If you use the FileSet property in the datastore,
        % then you must define the copyElement method. The
        % copyElement method allows methods such as readall
        % and preview to remain stateless
        function dscopy = copyElement(ds)
            dscopy = copyElement@matlab.mixin.Copyable(ds);
            dscopy.FileSet = copy(ds.FileSet);
        end

        % Define the maxpartitions method
        function n = maxpartitions(myds)
            n = maxpartitions(myds.FileSet);
        end
    end
end

%% STEP 3: IMPLEMENT YOUR CUSTOM FILE READING FUNCTION
function data = MyFileReader(fileInfoTbl)
    % create a reader object using FileName
    reader = matlab.io.datastore.DsFileReader(fileInfoTbl.FileName);

    % seek to the offset
    seek(reader,fileInfoTbl.Offset,'Origin','start-of-file');

    % read fileInfoTbl.SplitSize amount of data
    data = read(reader,fileInfoTbl.SplitSize);
end
Your custom datastore is now ready. Use your custom datastore to read and process the data in a parallel pool.
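As a rough sketch of that last step, the code below reads each partition of a MyDatastorePar datastore inside a parfor loop. It assumes that MyDatastorePar.m from this example is on the MATLAB path. The temporary folder, the file name sample.bin, and the variable names counts and c are hypothetical and exist only to make the sketch self-contained.

```matlab
% Hypothetical test data: one small .bin file in a fresh temporary folder.
folder = tempname;
mkdir(folder);
fid = fopen(fullfile(folder,'sample.bin'),'w');
fwrite(fid,uint8(1:100));
fclose(fid);

% Assumes MyDatastorePar.m (defined in this example) is on the path.
ds = MyDatastorePar(folder);
n = numpartitions(ds);

counts = zeros(n,1);
parfor ii = 1:n
    subds = partition(ds,n,ii);   % independent copy for this iteration
    c = 0;
    while hasdata(subds)
        data = read(subds);
        c = c + numel(data);      % replace with your own processing
    end
    counts(ii) = c;
end
```

If Parallel Computing Toolbox is not installed, the parfor loop runs serially, so the same sketch can be used to check the datastore before moving to a pool.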
More About
Empty Datastores
An empty datastore is a datastore object that does not contain any records. For an empty datastore, your custom datastore methods must satisfy these conditions:
- hasdata must return false.
- read must return an error.
- numpartitions and maxpartitions must return 0.
- partition must return an empty datastore.
- preview and readall must return empty data that preserves the non-tall dimensions. For example, if the read method on a nonempty datastore returns data that is of size 5-by-15-by-25, then the preview and readall methods must return empty data of size 0-by-15-by-25.
Non-Tall Dimensions
Dimensions other than the first dimension of the array. For an array of size 5-by-15-by-25, the tall dimension is 5 and the non-tall dimensions are 15 and 25.
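As a small illustration of the sizes used above, the following sketch builds empty data that keeps the non-tall dimensions of a 5-by-15-by-25 array. The variable names are illustrative, and rand is only a stand-in for the output of a read method.

```matlab
data = rand(5,15,25);        % stand-in for data returned by read
emptyData = data([],:,:);    % remove every row, keep the other dimensions
sz = size(emptyData);        % 0 rows, non-tall dimensions 15 and 25
disp(sz)
```

Indexing with an empty row selector is one simple way to get a correctly shaped empty result. Calling zeros(0,15,25) directly is an alternative when the non-tall sizes are known in advance.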
Tips
- In your implementation of the partition method, you must include these steps.
  - Before creating a partitioned datastore subds, create a deep copy of the original datastore ds.
  - At the end of the partition method, reset the partitioned datastore subds.
  For a sample implementation of the partition method, see Add Support for Parallel Processing.
- When a partition of a datastore contains no readable record, the read method must return empty data. The non-tall dimensions of this empty data must match the non-tall dimensions of the read method output on a partition with readable records. This requirement ensures that the behavior of the readall method matches the behavior of the gather function.
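One way to see why the empty data should keep its non-tall dimensions is to combine per-partition results by hand. The sketch below is not how readall or gather are implemented. It only uses hypothetical per-partition outputs, stored in the illustrative variable partData, to show that vertical concatenation gives a consistently shaped result when the empty partition is 0-by-15-by-25.

```matlab
% Hypothetical read results for three partitions; the second partition
% has no readable records but preserves the non-tall dimensions.
partData = {ones(5,15,25), zeros(0,15,25), ones(3,15,25)};

% Combine all partitions along the tall (first) dimension.
allData = vertcat(partData{:});
szAll = size(allData);

% Even if every partition is empty, the non-tall dimensions survive.
emptyOnly = vertcat(partData{2},partData{2});
szEmpty = size(emptyOnly);
```

Here szAll has 8 rows with non-tall dimensions 15 and 25, and szEmpty is 0-by-15-by-25, so downstream code can rely on the same trailing dimensions in both cases.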
Version History
Introduced in R2017b