unstack
(Not Recommended) Unstack dataset array from single variable into multiple variables
The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
Syntax
A = unstack(B,datavar,indvar)
[A,iB] = unstack(B,datavar,indvar)
A = unstack(B,datavar,indvar,'Parameter',value)
Description
A = unstack(B,datavar,indvar) unstacks a single variable in dataset array B into multiple variables in A. In general, A contains more variables, but fewer observations, than B.
datavar specifies the data variable in B to unstack. indvar specifies an indicator variable in B that determines which variable in A each value in datavar is unstacked into. unstack treats the remaining variables in B as grouping variables. Each unique combination of their values defines a group of observations in B that will be unstacked into a single observation in A.
unstack creates m data variables in A, where m is the number of group levels in indvar. The values in indvar indicate which of those m variables receive which values from datavar. The j-th data variable in A contains the values from datavar that correspond to observations whose indvar value was the j-th of the m possible levels. Elements of those m variables for which no corresponding data value in B exists contain a default value.
datavar is a positive integer, a character vector, a string scalar, or a logical vector containing a single true value. indvar is a positive integer, a variable name, or a logical vector containing a single true value.
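As a minimal sketch of the behavior described above, the following builds a small dataset array and unstacks it. The variable names Store, Quarter, and Sales and their values are made up for illustration and do not come from any shipped data set.

```matlab
% Hypothetical "tall" data: one row per store and quarter.
Store   = {'S1';'S1';'S2';'S2';'S2'};
Quarter = {'Q1';'Q2';'Q1';'Q2';'Q3'};
Sales   = [10; 20; 30; 40; 50];
B = dataset(Store, Quarter, Sales);

% Unstack Sales, using Quarter as the indicator variable.
% Store is the only remaining variable, so it acts as the grouping variable.
A = unstack(B, 'Sales', 'Quarter');

% A has one observation per store and one data variable per level of
% Quarter (Q1, Q2, Q3). Store S1 has no Q3 observation in B, so that
% element of A.Q3 holds a default value.
disp(A)
```

Here m is 3 because Quarter has three levels, and the five observations in B collapse into two observations in A.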
[A,iB] = unstack(B,datavar,indvar) returns an index vector iB indicating the correspondence between observations in A and those in B. For each observation in A, iB contains the index of the first in the corresponding group of observations in B.
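The following sketch shows one way to use the second output, assuming a small made-up dataset array (the names Store, Quarter, and Sales are hypothetical). Indexing B with iB retrieves the first observation of each group.

```matlab
% Hypothetical data: two stores, several quarters each.
Store   = {'S1';'S1';'S2';'S2';'S2'};
Quarter = {'Q1';'Q2';'Q1';'Q2';'Q3'};
Sales   = [10; 20; 30; 40; 50];
B = dataset(Store, Quarter, Sales);

% Request the index vector along with the unstacked array.
[A, iB] = unstack(B, 'Sales', 'Quarter');

% Each element of iB points at the first observation in B that belongs
% to the group represented by the matching observation in A.
firstRows = B(iB, :);
disp(firstRows)
```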
For more information on grouping variables, see Grouping Variables.
Input Arguments
A = unstack(B,datavar,indvar,'Parameter',value) uses the following parameter name/value pairs to control how unstack converts variables in B to variables in A:
'GroupVars' | Grouping variables in B that define groups of observations. groupvars is a positive integer, a vector of positive integers, a character vector, a string array, a cell array of character vectors, or a logical vector. The default is all variables in B not listed in datavar or indvar. |
'NewDataVarNames' | A string array or cell array of character vectors containing names for the data variables unstack should create in A. Default is the group names of the grouping variable specified in indvar. |
'AggregationFun' | A function handle that accepts a subset of values from datavar and returns a single value. unstack applies this function to observations from the same group that have the same value of indvar. The function must aggregate the data values into a single value, and in such cases it is not possible to recover B from A using stack. The default is @sum for numeric data variables. For non-numeric variables, there is no default, and you must specify 'AggregationFun' if multiple observations in the same group have the same values of indvar. |
'ConstVars' | Variables in B to copy to A without unstacking. The values for these variables in A are taken from the first observation in each group in B, so these variables should typically be constant within each group. ConstVars is a positive integer, a vector of positive integers, a character vector, a string array, a cell array of character vectors, or a logical vector. The default is no variables. |
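The parameters above can be sketched on a small made-up dataset array in which one store has two observations for the same quarter. The variable names (Store, Region, Quarter, Sales) and the new variable names passed to 'NewDataVarNames' are hypothetical and chosen only for illustration.

```matlab
% Hypothetical data: store S1 has two Q1 observations.
Store   = {'S1';'S1';'S1';'S2'};
Region  = {'East';'East';'East';'West'};   % constant within each store
Quarter = {'Q1';'Q1';'Q2';'Q1'};
Sales   = [10; 15; 20; 30];
B = dataset(Store, Region, Quarter, Sales);

% Group by Store only, carry Region across unchanged, and average
% duplicate observations instead of using the default @sum.
A = unstack(B, 'Sales', 'Quarter', ...
    'GroupVars', 'Store', ...
    'ConstVars', 'Region', ...
    'AggregationFun', @mean);

% Same call, but naming the new data variables explicitly. One name is
% supplied for each level of Quarter.
A2 = unstack(B, 'Sales', 'Quarter', ...
    'GroupVars', 'Store', ...
    'ConstVars', 'Region', ...
    'AggregationFun', @mean, ...
    'NewDataVarNames', {'FirstQuarter','SecondQuarter'});

disp(A)
disp(A2)
```

Because the two Q1 values for S1 are averaged into one element, B cannot be recovered from A with stack in this case.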
You can also specify more than one data variable in B, each of which becomes a set of m variables in A. In this case, specify datavar as a vector of positive integers, a string array or cell array containing variable names, or a logical vector. You may specify only one variable with indvar. The names of each set of data variables in A are the name of the corresponding data variable in B concatenated with the names specified in 'NewDataVarNames'. The function specified in 'AggregationFun' must return a value with a single row.
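A minimal sketch of unstacking two data variables at once, assuming a made-up dataset array with hypothetical variables Sales and Units. The exact form of the concatenated names is not spelled out above, so the sketch prints them rather than assuming them.

```matlab
% Hypothetical data with two data variables to unstack.
Store   = {'S1';'S1';'S2';'S2'};
Quarter = {'Q1';'Q2';'Q1';'Q2'};
Sales   = [10; 20; 30; 40];
Units   = [1; 2; 3; 4];
B = dataset(Store, Quarter, Sales, Units);

% Each data variable becomes its own set of m variables in A, where m is
% the number of levels of Quarter (here, 2).
A = unstack(B, {'Sales','Units'}, 'Quarter');

% Inspect the generated variable names.
disp(get(A, 'VarNames'))
```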
Examples
Combine several variables for estimated influenza rates into a single variable. Then unstack the estimated influenza rates by date.
load flu

% FLU has a 'Date' variable, and 10 variables for estimated influenza rates
% (in 9 different regions, estimated from Google searches, plus a
% nationwide estimate from the CDC). Combine those 10 variables into an
% array that has a single data variable, 'FluRate', and an indicator
% variable, 'Region', that says which region each estimate is from.
[flu2,iflu] = stack(flu, 2:11, 'NewDataVarName','FluRate', ...
    'IndVarName','Region')

% The second observation in FLU is for 10/16/2005. Find the observations
% in FLU2 that correspond to that date.
flu(2,:)
flu2(iflu==2,:)

% Use the 'Date' variable from that array to split 'FluRate' into 52
% separate variables, each containing the estimated influenza rates for
% each unique date. The new array has one observation for each region. In
% effect, this is the original array FLU "on its side".
dateNames = cellstr(datestr(flu.Date,'mmm_DD_YYYY'));
[flu3,iflu2] = unstack(flu2, 'FluRate', 'Date', ...
    'NewDataVarNames',dateNames)

% Since observations in FLU3 represent regions, IFLU2 indicates the first
% occurrence in FLU2 of each region.
flu2(iflu2,:)