
unstack

(Not Recommended) Unstack dataset array from single variable into multiple variables

The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.

Syntax

A = unstack(B,datavar,indvar)
[A,iB] = unstack(B,datavar,indvar)
A = unstack(B,datavar,indvar,'Parameter',value)

Description

A = unstack(B,datavar,indvar) unstacks a single variable in dataset array B into multiple variables in A. In general, A contains more variables but fewer observations than B.

datavar specifies the data variable in B to unstack. indvar specifies an indicator variable in B that determines which variable in A each value in datavar is unstacked into. unstack treats the remaining variables in B as grouping variables. Each unique combination of their values defines a group of observations in B that will be unstacked into a single observation in A.

unstack creates m data variables in A, where m is the number of group levels in indvar. The values in indvar indicate which of those m variables receive which values from datavar. The j-th data variable in A contains the values from datavar that correspond to observations whose indvar value was the j-th of the m possible levels. Elements of those m variables for which no corresponding data value in B exists contain a default value.

datavar is a positive integer, a character vector, a string scalar, or a logical vector containing a single true value. indvar is a positive integer, a variable name, or a logical vector containing a single true value.
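As a minimal sketch of the basic syntax, the following uses a small made-up dataset array (the variable names Subject, Time, and Score are hypothetical and not part of any shipped data set). It assumes Statistics and Machine Learning Toolbox is installed, since the dataset type requires it.

```matlab
% Hypothetical toy data: two subjects, each scored before and after a treatment.
Subject = [1; 1; 2; 2];
Time    = {'pre'; 'post'; 'pre'; 'post'};
Score   = [10; 12; 20; 25];
B = dataset(Subject, Time, Score);   % 4 observations, 3 variables

% Unstack Score using Time as the indicator variable.
% Subject, the only remaining variable, acts as the grouping variable.
A = unstack(B, 'Score', 'Time');     % one observation per subject

A.pre    % Score values whose Time was 'pre'
A.post   % Score values whose Time was 'post'
```

Here A has one observation for each unique value of Subject and one new data variable for each level of Time, so the four observations in B become two observations in A.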

[A,iB] = unstack(B,datavar,indvar) returns an index vector iB indicating the correspondence between observations in A and those in B. For each observation in A, iB contains the index of the first observation in the corresponding group of observations in B.
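The index output can be illustrated with a small made-up dataset array (all variable names here are hypothetical). This is a sketch under the assumption that Statistics and Machine Learning Toolbox is available.

```matlab
% Hypothetical toy data: observations 1-2 belong to subject 1,
% observations 3-4 belong to subject 2.
Subject = [1; 1; 2; 2];
Time    = {'pre'; 'post'; 'pre'; 'post'};
Score   = [10; 12; 20; 25];
B = dataset(Subject, Time, Score);

[A, iB] = unstack(B, 'Score', 'Time');

iB          % index in B of the first observation of each group
B(iB, :)    % the observations in B that start each group
```

Because each group is defined by Subject, iB points at observations 1 and 3 of B, the first observation for each subject.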

For more information on grouping variables, see Grouping Variables.

Input Arguments

A = unstack(B,datavar,indvar,'Parameter',value) uses the following parameter name/value pairs to control how unstack converts variables in B to variables in A:

'GroupVars': Grouping variables in B that define groups of observations. The value is a positive integer, a vector of positive integers, a character vector, a string array, a cell array of character vectors, or a logical vector. The default is all variables in B not listed in datavar or indvar.
'NewDataVarNames': A string array or cell array of character vectors containing names for the data variables unstack should create in A. The default is the group names of the grouping variable specified in indvar.
'AggregationFun': A function handle that accepts a subset of values from datavar and returns a single value. unstack applies this function to observations from the same group that have the same value of indvar. The function must aggregate the data values into a single value, and in such cases it is not possible to recover B from A using stack. The default is @sum for numeric data variables. For non-numeric variables, there is no default, and you must specify 'AggregationFun' if multiple observations in the same group have the same values of indvar.
'ConstVars': Variables in B to copy to A without unstacking. The values for these variables in A are taken from the first observation in each group in B, so these variables should typically be constant within each group. The value is a positive integer, a vector of positive integers, a character vector, a string array, a cell array of character vectors, or a logical vector. The default is no variables.
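The following sketch combines three of these parameters on a small made-up dataset array (Subject, Site, Time, and Score are hypothetical names). It assumes Statistics and Machine Learning Toolbox is available, and it uses @mean only as one possible aggregation function.

```matlab
% Hypothetical toy data: subject 1 has two 'pre' scores, so some
% aggregation is needed. Site is constant within each subject.
Subject = [1; 1; 1; 2; 2];
Site    = {'X'; 'X'; 'X'; 'Y'; 'Y'};
Time    = {'pre'; 'pre'; 'post'; 'pre'; 'post'};
Score   = [10; 14; 12; 20; 25];
B = dataset(Subject, Site, Time, Score);

% Group by Subject only, copy Site to A without unstacking it, and
% average duplicate scores instead of applying the default @sum.
A = unstack(B, 'Score', 'Time', ...
    'GroupVars', 'Subject', ...
    'ConstVars', 'Site', ...
    'AggregationFun', @mean);

A.pre     % subject 1: mean of 10 and 14
A.post
A.Site    % taken from the first observation in each group
```

With the default @sum, the 'pre' value for subject 1 would instead be 24. In either case the two original 'pre' scores cannot be recovered from A.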

You can also specify more than one data variable in B, each of which becomes a set of m variables in A. In this case, specify datavar as a vector of positive integers, a string array or cell array containing variable names, or a logical vector. You may specify only one variable with indvar. The name of each variable in a set is the name of the corresponding data variable in B concatenated with one of the names specified in 'NewDataVarNames'. The function specified in 'AggregationFun' must return a value with a single row.
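As a rough sketch of unstacking two data variables at once, the following uses a made-up dataset array (Subject, Time, Score, and Weight are hypothetical names) and assumes Statistics and Machine Learning Toolbox is available. The exact form of the generated variable names is not asserted here; the code simply displays them.

```matlab
% Hypothetical toy data with two data variables, Score and Weight.
Subject = [1; 1; 2; 2];
Time    = {'pre'; 'post'; 'pre'; 'post'};
Score   = [10; 12; 20; 25];
Weight  = [70; 71; 80; 79];
B = dataset(Subject, Time, Score, Weight);

% Unstack both data variables using the single indicator variable Time.
A = unstack(B, {'Score', 'Weight'}, 'Time');

size(A)                  % 2 observations; 1 grouping + 2*2 data variables
A.Properties.VarNames    % inspect the names unstack generated
```

Each of the two data variables yields one new variable per level of Time, so A has four data variables in addition to Subject.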

Examples

Combine several variables for estimated influenza rates into a single variable. Then unstack the estimated influenza rates by date.

load flu
 
% FLU has a 'Date' variable, and 10 variables for estimated influenza rates
% (in 9 different regions, estimated from Google searches, plus a
% nationwide estimate from the CDC). Combine those 10 variables into an
% array that has a single data variable, 'FluRate', and an indicator
% variable, 'Region', that says which region each estimate is from.
[flu2,iflu] = stack(flu, 2:11, 'NewDataVarName','FluRate', ...
    'IndVarName','Region')
 
% The second observation in FLU is for 10/16/2005.  Find the observations
% in FLU2 that correspond to that date.
flu(2,:)
flu2(iflu==2,:)
 
% Use the 'Date' variable from that array to split 'FluRate' into 52
% separate variables, each containing the estimated influenza rates for
% each unique date.  The new array has one observation for each region.  In
% effect, this is the original array FLU "on its side".
dateNames = cellstr(datestr(flu.Date,'mmm_DD_YYYY'));
[flu3,iflu2] = unstack(flu2, 'FluRate', 'Date', ...
    'NewDataVarNames',dateNames)
 
% Since observations in FLU3 represent regions, IFLU2 indicates the first
% occurrence in FLU2 of each region.
flu2(iflu2,:)