bioma.data.DataMatrix
Data structure encapsulating data and metadata from microarray experiment
Description
A bioma.data.DataMatrix object is a data structure encapsulating measurement data and feature metadata from a microarray experiment so that it can be indexed by gene or probe identifiers and by sample identifiers.
A bioma.data.DataMatrix object stores experimental data in a matrix, with rows typically corresponding to gene names or probe identifiers, and columns typically corresponding to sample identifiers. A DataMatrix object also stores metadata, such as the gene names or probe identifiers and sample identifiers, in row names and column names.
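As a minimal sketch of indexing by identifiers, the following builds a small object from made-up data and selects one element by name and by position. The gene and sample names (geneA, s1, and so on) are hypothetical, and the example assumes that Bioinformatics Toolbox is installed.

```matlab
% Made-up measurements: 3 hypothetical genes (rows) by 2 hypothetical samples (columns).
data = [1 2; 3 4; 5 6];
geneIDs = {'geneA','geneB','geneC'};
sampleIDs = {'s1','s2'};
DMobj = bioma.data.DataMatrix(data, geneIDs, sampleIDs);

% Select an element by row name and column name. The result is a DataMatrix object.
sub = DMobj({'geneB'}, {'s2'});

% Convert the selection to a plain numeric value.
val = double(sub);

% Select the same element by numeric position for comparison.
valByPos = double(DMobj(2, 2));
```

Indexing with parentheses keeps the row and column names attached to the result, which is why the sketch converts with double before using the value as an ordinary number.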
Creation
Syntax
Description
DMobj = bioma.data.DataMatrix(Matrix) creates a DataMatrix object from measurement data and feature metadata from a microarray experiment.
DMobj = bioma.data.DataMatrix(Matrix,RowNames,ColumnNames) specifies row and column names. RowNames are typically gene names or probe identifiers. ColumnNames are typically sample identifiers.
DMobj = bioma.data.DataMatrix('File',FileName) creates a bioma.data.DataMatrix object from a tab-delimited TXT or XLS file that contains table-oriented data and metadata.
DMobj = bioma.data.DataMatrix('File',FileName,Name,Value) creates a bioma.data.DataMatrix object from a tab-delimited TXT or XLS file according to the Name,Value arguments.
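The two matrix-based syntaxes can be sketched as follows. The data values and gene names are invented for illustration, and the numeric column names mimic time points; the example assumes that Bioinformatics Toolbox is installed.

```matlab
% Made-up expression values: 3 rows by 2 columns.
data = [0.1 0.2; 0.3 0.4; 0.5 0.6];

% Syntax 1: matrix only, with no row or column names supplied.
DMplain = bioma.data.DataMatrix(data);

% Syntax 2: matrix plus row names (cell array) and column names (numeric vector).
genes = {'geneA'; 'geneB'; 'geneC'};   % hypothetical identifiers
times = [0 9.5];                       % hypothetical time points
DMnamed = bioma.data.DataMatrix(data, genes, times);

% Both objects report the size of the underlying matrix.
nRowsPlain = DMplain.NRows;
nColsNamed = DMnamed.NCols;
```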
Input Arguments
Matrix — Measurement data and feature metadata from microarray experiment
two-dimensional numeric array | two-dimensional logical array | bioma.data.DataMatrix object
Measurement data and feature metadata from a microarray experiment, specified as a two-dimensional numeric or logical array or a bioma.data.DataMatrix object.
RowNames — Row names for bioma.data.DataMatrix object
numeric vector | character array | string vector | cell array of character vectors
Row names for the bioma.data.DataMatrix object, specified as a numeric vector, character array, string vector, or cell array of character vectors. The number of elements in RowNames must equal the number of rows in Matrix. RowNames are typically gene names or probe identifiers from a microarray experiment. Row names do not need to be unique.
Data Types: double | char | string | cell
ColumnNames — Column names for bioma.data.DataMatrix object
numeric vector | character array | string vector | cell array of character vectors
Column names for the bioma.data.DataMatrix object, specified as a numeric vector, character array, string vector, or cell array of character vectors. The number of elements in ColumnNames must equal the number of columns in Matrix. ColumnNames are typically sample identifiers from a microarray experiment. Column names do not need to be unique.
Data Types: double | char | string | cell
FileName — File name or path and file name
character vector | string
File name or a path and file name of a tab-delimited TXT or XLS file that contains table-oriented data and metadata, specified as a character vector or string.
Typically, the first row of the table contains column names, the first column contains row names, and the numeric data starts at the (2,2) position. The bioma.data.DataMatrix function detects when the first column does not contain row names and, in that case, reads data starting from the first column. However, if the first row does not contain header text (column names), set HLine to 0.
Data Types: char | string
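A minimal sketch of the file-based syntax, under stated assumptions: it first writes a small object to a temporary tab-delimited text file with dmwrite (listed under Object Functions) so that the file has the layout described above, then reads it back with 'File'. The data values, names, and temporary file are invented for illustration.

```matlab
% Build a small object from made-up data with hypothetical names.
DMorig = bioma.data.DataMatrix([1 2; 3 4], {'geneA','geneB'}, {'s1','s2'});

% Write it to a temporary text file (tab-delimited by default).
fname = [tempname '.txt'];
dmwrite(DMorig, fname);

% Read the file back into a new DataMatrix object.
DMfile = bioma.data.DataMatrix('File', fname);

% Remove the temporary file.
delete(fname);

% The object read from the file should have the same dimensions as the original.
sameSize = (DMfile.NRows == DMorig.NRows) && (DMfile.NCols == DMorig.NCols);
```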
Name-Value Arguments
RowNames — Row names for bioma.data.DataMatrix object
false (default) | numeric vector | character array | string vector | cell array of character vectors | character vector | string | true
Row names for the bioma.data.DataMatrix object, specified as one of these values:
Numeric vector, character array, string vector, or cell array of character vectors, whose number of elements equals the number of rows of numeric data in the input matrix.
A character vector or string, which is used as a prefix for row names. Numbers are appended to the prefix.
true — Unique row names are assigned using the format row1, row2, row3, and so on.
false — No row names are assigned.
Row names do not need to be unique.
Data Types: double | logical | char | string | cell
ColNames — Column names for bioma.data.DataMatrix object
false (default) | numeric vector | character array | string vector | cell array of character vectors | character vector | string | true
Column names for the bioma.data.DataMatrix object, specified as one of these values:
Numeric vector, character array, string vector, or cell array of character vectors, whose number of elements equals the number of columns of numeric data in the input matrix.
A character vector or string, which is used as a prefix for column names. Numbers are appended to the prefix.
true — Unique column names are assigned using the format col1, col2, col3, and so on.
false — No column names are assigned.
Column names do not need to be unique.
Data Types: double | logical | char | string | cell
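The prefix and true options for RowNames and ColNames can be sketched as follows. The prefixes 'gene' and 'sample' and the data values are hypothetical; the example assumes that these name-value arguments can follow a matrix input, and that Bioinformatics Toolbox is installed.

```matlab
% Made-up data: 3 rows by 2 columns.
data = [1 2; 3 4; 5 6];

% Prefix form: numbers are appended to each prefix to build the names.
DMprefix = bioma.data.DataMatrix(data, 'RowNames', 'gene', 'ColNames', 'sample');

% true form: names are generated automatically (row1, row2, ... and col1, col2, ...).
DMauto = bioma.data.DataMatrix(data, 'RowNames', true, 'ColNames', true);

% Retrieve the generated names for inspection.
prefixRowNames = get(DMprefix, 'RowNames');
autoColNames = get(DMauto, 'ColNames');
```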
Name — Name for bioma.data.DataMatrix object
'' (default) | character vector | string
Name for the bioma.data.DataMatrix object, specified as a character vector or string.
Data Types: char | string
Delimiter — Delimiter symbol to use for input file
'\t' (default) | character vector | string
Delimiter symbol to use for the input file, specified as a character vector or string. Typical choices are:
' '
'\t' (default)
','
';'
'|'
Data Types: char | string
HLine — Row of input file that contains column header text
1 (default) | positive integer
Row of the input file that contains the column header text (column names), specified as a positive integer. When creating the DataMatrix object, the bioma.data.DataMatrix function loads data from row HLine + 1 to the end of the file. If the input file does not contain column header text (column names), set HLine to 0.
Data Types: double
Rows — Subset of row names in file
cell array of character vectors | character array | string vector | numeric vector | logical vector
Subset of row names in the input file for the bioma.data.DataMatrix function to use when creating the bioma.data.DataMatrix object, specified as a cell array of character vectors, character array, string vector, or a numeric or logical vector.
Data Types: logical | char | string | cell
Columns — Subset of column names in file
cell array of character vectors | character array | string vector | numeric vector | logical vector
Subset of column names in the input file for the bioma.data.DataMatrix function to use when creating the bioma.data.DataMatrix object, specified as a cell array of character vectors, character array, string vector, or a numeric or logical vector.
Data Types: logical | char | string | cell
Properties
Name — Name of bioma.data.DataMatrix object
'' (default) | character vector
Name of the bioma.data.DataMatrix object, stored as a character vector.
Data Types: char
RowNames — Row names
'' | cell array of character vectors
Row names (typically gene names or probe identifiers), stored as an empty array or a cell array of character vectors. The number of elements in the cell array must equal the number of rows in the matrix.
Data Types: cell
ColNames — Column names
'' | cell array of character vectors
Column names (typically sample identifiers), stored as an empty array or a cell array of character vectors. The number of elements in the cell array must equal the number of columns in the matrix.
Data Types: cell
NRows — Number of rows
positive number
This property is read-only.
Number of rows in the matrix, stored as a positive number. You cannot modify this property directly. You can access it using the get method.
Data Types: double
NCols — Number of columns
positive number
This property is read-only.
Number of columns in the matrix, stored as a positive number. You cannot modify this property directly. You can access it using the get method.
Data Types: double
NDims — Number of dimensions
positive number
This property is read-only.
Number of dimensions in the matrix, stored as a positive number. You cannot modify this property directly. You can access it using the get method.
Data Types: double
ElementClass — Class type of the elements in bioma.data.DataMatrix object
character vector
This property is read-only.
Class type of the elements in the bioma.data.DataMatrix object, stored as a character vector, such as single or double. You cannot modify this property directly. You can access it using the get method.
Data Types: char
Object Functions
General Methods
colnames | Retrieve or set column names of DataMatrix object |
disp | Display DataMatrix object |
dmwrite | Write DataMatrix object to text file |
double | Convert DataMatrix object to double-precision array |
get | Retrieve information about DataMatrix object |
isempty | Determine whether array is empty |
isfinite | Determine which array elements are finite |
isinf | Determine which array elements are infinite |
isnan | Determine which array elements are NaN |
isscalar | Determine whether input is scalar |
isequal | Test DataMatrix objects for equality |
isequaln | Test DataMatrix objects for equality, treating NaNs as equal |
isvector | Determine whether input is vector |
length | Length of largest array dimension |
ndims | Return number of dimensions in DataMatrix object |
numel | Return number of elements in DataMatrix object |
pdist | Pairwise distance between pairs of observations |
plot | Draw 2-D line plot of DataMatrix object |
rownames | Retrieve or set row names of DataMatrix object |
set | Set property of DataMatrix object |
single | Convert DataMatrix object to single-precision array |
size | Array size |
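A few of the general methods listed above can be sketched on a small object. The data and names are made up for illustration, and the example assumes that Bioinformatics Toolbox is installed.

```matlab
% Small object built from made-up data with hypothetical names.
DMobj = bioma.data.DataMatrix([1 2; 3 4], {'geneA','geneB'}, {'s1','s2'});

% Retrieve the row and column names.
rn = rownames(DMobj);
cn = colnames(DMobj);

% Convert the object to a plain double-precision array (names are dropped).
raw = double(DMobj);

% Query the number of elements.
n = numel(raw);
```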
Methods for Manipulating Data
Descriptive Statistics and Statistical Learning Methods
kmeans | k-means clustering |
max | Return maximum values in DataMatrix object |
mean | Return average or mean values in DataMatrix object |
median | Return median values in DataMatrix object |
min | Return minimum values in DataMatrix object |
nanmax | (Not recommended) Maximum, ignoring NaN values |
nanmean | (Not recommended) Mean, ignoring NaN values |
nanmedian | (Not recommended) Median, ignoring NaN values |
nanmin | (Not recommended) Minimum, ignoring NaN values |
nanstd | (Not recommended) Standard deviation, ignoring NaN values |
nansum | (Not recommended) Sum, ignoring NaN values |
nanvar | (Not recommended) Variance, ignoring NaN values |
pca | Principal component analysis of raw data |
pdist | Pairwise distance between pairs of observations |
std | Return standard deviation values in DataMatrix object |
sum | Return sum of elements in DataMatrix object |
var | Return variance values in DataMatrix object |
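As a hedged sketch of the descriptive statistics methods, the following computes column-wise summaries on a small object built from made-up data. The results are wrapped in double so that the later comparisons work on plain numeric arrays regardless of the returned type; the example assumes that Bioinformatics Toolbox is installed.

```matlab
% Small object built from made-up data with hypothetical names.
DMobj = bioma.data.DataMatrix([1 2; 3 4], {'geneA','geneB'}, {'s1','s2'});

% Column-wise mean, maximum, and sum of the stored values.
colMeans = double(mean(DMobj));
colMax = double(max(DMobj));
colSums = double(sum(DMobj));
```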
Unary Methods — Exponential
Unary Methods — Integer
Unary Methods — Custom
dmarrayfun | Apply function to each element in DataMatrix object |
Binary Methods — Arithmetic Operator
Binary Methods — Relational Operator
Binary Methods — Custom
dmbsxfun | Apply element-by-element binary operation to two DataMatrix objects with singleton expansion enabled |
Examples
Determine Properties and Property Values of DataMatrix Object
Load the file containing yeast data. This file includes three variables: yeastvalues, a 614-by-7 matrix of gene expression data, genes, a cell array of 614 GenBank® accession numbers for labeling the rows in yeastvalues, and times, a 1-by-7 vector of time values for labeling the columns in yeastvalues.
load filteredyeastdata
Create variables to contain a subset of the data, specifically the first five rows and first four columns of the yeastvalues matrix, the genes cell array, and the times vector.
yeastvalues = yeastvalues(1:5,1:4);
genes = genes(1:5,:);
times = times(1:4);
Import the microarray object package.
import bioma.data.*
Create a DataMatrix object from the gene expression data.
DMobj = DataMatrix(yeastvalues,genes,times)
DMobj =
                   0       9.5      11.5      13.5
    SS DNA    -0.131     1.699    -0.026     0.365
    YAL003W    0.305     0.146    -0.129    -0.444
    YAL012W    0.157     0.175     0.467    -0.379
    YAL026C    0.246     0.796     0.384     0.981
    YAL034C   -0.235     0.487    -0.184    -0.669
Display all properties of a DataMatrix object and their current values.
get(DMobj)
    Name: ''
    RowNames: {5x1 cell}
    ColNames: {' 0' ' 9.5' '11.5' '13.5'}
    NRows: 5
    NCols: 4
    NDims: 2
    ElementClass: 'double'
Return all properties of the DataMatrix object and their current values in a scalar structure, where each field name is a property of the DataMatrix object and each field contains the value of that property.
DMstruct = get(DMobj)
DMstruct = struct with fields:
Name: ''
RowNames: {5x1 cell}
ColNames: {' 0' ' 9.5' '11.5' '13.5'}
NRows: 5
NCols: 4
NDims: 2
ElementClass: 'double'
Return the value of a specific property of the DataMatrix object. For example, return the value of RowNames.
NamesOfRows = get(DMobj,'RowNames')
NamesOfRows = 5x1 cell
{'SS DNA' }
{'YAL003W'}
{'YAL012W'}
{'YAL026C'}
{'YAL034C'}
Now return the value of NRows, this time using dot notation.
NumberOfRows = DMobj.NRows
NumberOfRows = 5
Determine Possible Values of DataMatrix Object Properties
Load the file containing yeast data. This file includes three variables: yeastvalues, a 614-by-7 matrix of gene expression data, genes, a cell array of 614 GenBank® accession numbers for labeling the rows in yeastvalues, and times, a 1-by-7 vector of time values for labeling the columns in yeastvalues.
load filteredyeastdata
Create variables to contain a subset of the data, specifically the first five rows and first four columns of the yeastvalues matrix, the genes cell array, and the times vector.
yeastvalues = yeastvalues(1:5,1:4);
genes = genes(1:5,:);
times = times(1:4);
Import the microarray object package.
import bioma.data.*
Create a DataMatrix object from the gene expression data.
DMobj = DataMatrix(yeastvalues,genes,times)
DMobj =
                   0       9.5      11.5      13.5
    SS DNA    -0.131     1.699    -0.026     0.365
    YAL003W    0.305     0.146    -0.129    -0.444
    YAL012W    0.157     0.175     0.467    -0.379
    YAL026C    0.246     0.796     0.384     0.981
    YAL034C   -0.235     0.487    -0.184    -0.669
Display possible values for all properties that have a fixed set of property values in the DataMatrix object.
set(DMobj)
    Name: 'A DataMatrix's 'Name' property does not have a fixed set of values.'
    RowNames: 'Empty, a cell array of strings or a numeric vector.'
    ColNames: 'Empty, a cell array of strings or a numeric vector.'
Display possible values for a specific property that has a fixed set of property values in the DataMatrix object. For example, display possible values for RowNames.
set(DMobj,'RowNames')
Empty, a cell array of strings or a numeric vector.
Specify Properties of DataMatrix Object
Load the file containing yeast data. This file includes three variables: yeastvalues, a 614-by-7 matrix of gene expression data, genes, a cell array of 614 GenBank® accession numbers for labeling the rows in yeastvalues, and times, a 1-by-7 vector of time values for labeling the columns in yeastvalues.
load filteredyeastdata
Create variables to contain a subset of the data, specifically the first five rows and first four columns of the yeastvalues matrix, the genes cell array, and the times vector.
yeastvalues = yeastvalues(1:5,1:4);
genes = genes(1:5,:);
times = times(1:4);
Import the microarray object package.
import bioma.data.*
Create a DataMatrix object from the gene expression data.
DMobj = DataMatrix(yeastvalues)
DMobj =
              1         2         3         4
    1    -0.131     1.699    -0.026     0.365
    2     0.305     0.146    -0.129    -0.444
    3     0.157     0.175     0.467    -0.379
    4     0.246     0.796     0.384     0.981
    5    -0.235     0.487    -0.184    -0.669
Set the Name property of the DataMatrix object.
DMobj = set(DMobj,'Name','YeastData');
DMobj.Name
ans = 'YeastData'
Set multiple properties, for example, set the RowNames and ColNames properties.
DMobj = set(DMobj,'RowNames',genes,'ColNames',times)
DMobj =
                   0       9.5      11.5      13.5
    SS DNA    -0.131     1.699    -0.026     0.365
    YAL003W    0.305     0.146    -0.129    -0.444
    YAL012W    0.157     0.175     0.467    -0.379
    YAL026C    0.246     0.796     0.384     0.981
    YAL034C   -0.235     0.487    -0.184    -0.669
Version History
Introduced in R2008b
R2017b: princomp method has been renamed
The princomp method of bioma.data.DataMatrix has been renamed. Replace instances of princomp with pca.