# Split Table Data Variables and Apply Functions

This example shows how to split power outage data from a table into groups by region and cause of the power outages. Then it shows how to apply functions to calculate statistics for each group and collect the results in a table.

The sample file, outages.csv, contains data representing electric utility outages in the United States. The file contains six columns: Region, OutageTime, Loss, Customers, RestorationTime, and Cause. Read outages.csv into a table.

Convert Region and Cause to categorical arrays, and OutageTime and RestorationTime to datetime arrays. Display the first five rows.

T.Region = categorical(T.Region);
T.Cause = categorical(T.Cause);
T.OutageTime = datetime(T.OutageTime);
T.RestorationTime = datetime(T.RestorationTime);
T(1:5,:)
ans=5×6 table
Region         OutageTime        Loss     Customers     RestorationTime          Cause
_________    ________________    ______    __________    ________________    _______________

SouthWest    2002-02-01 12:18    458.98    1.8202e+06    2002-02-07 16:50    winter storm
SouthEast    2003-01-23 00:49    530.14    2.1204e+05                 NaT    winter storm
SouthEast    2003-02-07 21:15     289.4    1.4294e+05    2003-02-17 08:14    winter storm
West         2004-04-06 05:44    434.81    3.4037e+05    2004-04-06 06:10    equipment fault
MidWest      2002-03-16 06:18    186.44    2.1275e+05    2002-03-18 23:23    severe storm

### Calculate Maximum Power Loss

Determine the greatest power loss due to a power outage in each region. The findgroups function returns G, a vector of group numbers created from T.Region. The splitapply function uses G to split T.Loss into five groups, corresponding to the five regions. splitapply applies the max function to each group and concatenates the maximum power losses into a vector.

G = findgroups(T.Region);
maxLoss = splitapply(@max,T.Loss,G)
maxLoss = 5×1
104 ×

2.3141
2.3418
0.8767
0.2796
1.6659

Calculate the maximum power loss due to a power outage by cause. To specify that Cause is the grouping variable, use table indexing. Create a table that contains the maximum power losses and their causes.

T1 = T(:,'Cause');
[G,powerLosses] = findgroups(T1);
powerLosses.maxLoss = splitapply(@max,T.Loss,G)
powerLosses=10×2 table
Cause          maxLoss
________________    _______

attack              582.63
earthquake          258.18
energy emergency     11638
equipment fault      16659
fire                872.96
severe storm        8767.3
thunder storm        23418
unknown              23141
wind                  2796
winter storm        2883.7

powerLosses is a table because T1 is a table. You can append the maximum losses as another table variable.

Calculate the maximum power loss by cause in each region. To specify that Region and Cause are the grouping variables, use table indexing. Create a table that contains the maximum power losses and display the first 15 rows.

T1 = T(:,{'Region','Cause'});
[G,powerLosses] = findgroups(T1);
powerLosses.maxLoss = splitapply(@max,T.Loss,G);
powerLosses(1:15,:)
ans=15×3 table
Region           Cause          maxLoss
_________    ________________    _______

MidWest      attack                   0
MidWest      energy emergency    2378.7
MidWest      equipment fault     903.28
MidWest      severe storm        6808.7
MidWest      thunder storm        15128
MidWest      unknown              23141
MidWest      wind                2053.8
MidWest      winter storm        669.25
NorthEast    attack              405.62
NorthEast    earthquake               0
NorthEast    energy emergency     11638
NorthEast    equipment fault     794.36
NorthEast    fire                872.96
NorthEast    severe storm        6002.4
NorthEast    thunder storm        23418

### Calculate Number of Customers Impacted

Determine power-outage impact on customers by cause and region. Because T.Loss contains NaN values, wrap sum in an anonymous function to use the 'omitnan' input argument.

osumFcn = @(x)(sum(x,'omitnan'));
powerLosses.totalCustomers = splitapply(osumFcn,T.Customers,G);
powerLosses(1:15,:)
ans=15×4 table
Region           Cause          maxLoss    totalCustomers
_________    ________________    _______    ______________

MidWest      attack                   0                0
MidWest      energy emergency    2378.7       6.3363e+05
MidWest      equipment fault     903.28       1.7822e+05
MidWest      severe storm        6808.7       1.3511e+07
MidWest      thunder storm        15128       4.2563e+06
MidWest      unknown              23141       3.9505e+06
MidWest      wind                2053.8       1.8796e+06
MidWest      winter storm        669.25       4.8887e+06
NorthEast    attack              405.62           2181.8
NorthEast    earthquake               0                0
NorthEast    energy emergency     11638       1.4391e+05
NorthEast    equipment fault     794.36       3.9961e+05
NorthEast    fire                872.96       6.1292e+05
NorthEast    severe storm        6002.4       2.7905e+07
NorthEast    thunder storm        23418       2.1885e+07

### Calculate Mean Durations of Power Outages

Determine the mean durations of all U.S. power outages in hours. Add the mean durations of power outages to powerLosses. Because T.RestorationTime has NaT values, omit the resulting NaN values when calculating the mean durations.

D = T.RestorationTime - T.OutageTime;
H = hours(D);
omeanFcn = @(x)(mean(x,'omitnan'));
powerLosses.meanOutage = splitapply(omeanFcn,H,G);
powerLosses(1:15,:)
ans=15×5 table
Region           Cause          maxLoss    totalCustomers    meanOutage
_________    ________________    _______    ______________    __________

MidWest      attack                   0                0        335.02
MidWest      energy emergency    2378.7       6.3363e+05        5339.3
MidWest      equipment fault     903.28       1.7822e+05        17.863
MidWest      severe storm        6808.7       1.3511e+07        78.906
MidWest      thunder storm        15128       4.2563e+06        51.245
MidWest      unknown              23141       3.9505e+06        30.892
MidWest      wind                2053.8       1.8796e+06        73.761
MidWest      winter storm        669.25       4.8887e+06        127.58
NorthEast    attack              405.62           2181.8        5.5117
NorthEast    earthquake               0                0             0
NorthEast    energy emergency     11638       1.4391e+05        77.345
NorthEast    equipment fault     794.36       3.9961e+05        87.204
NorthEast    fire                872.96       6.1292e+05        4.0267
NorthEast    severe storm        6002.4       2.7905e+07        2163.5
NorthEast    thunder storm        23418       2.1885e+07        46.098