Combine rows and average contents in table

Question

DavidL88 2021 年 4 月 7 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/795702-combine-rows-and-average-contents-in-table

回答済み: Peter Perkins 2022 年 3 月 3 日

I have a 17,700x4 table T with 4 columns containing the results of multiple tests. First column is the name of a test. There are 149 rows for each test. The fourth column contains 149 time points with values ranging from 0.085 seconds to 0.9 seconds for each test. The second and third columns are voltage values assessed at each time point. I have four time ranges ([0.085-0.125, 0.17-0.23, 0.27-0.45, 0.6-0.9) within which I want to average the voltage values for each named test.

Example of a few rows for one test 'MLD_CG_11_FCL' in current table

TEST                        V1         V2       Time
MLD_CG_11_FCL             -0.68      -0.73      0.086
MLD_CG_11_FCL             -0.98      -0.89      0.090
MLD_CG_11_FCL             -1.22      -0.97      0.094
MLD_CG_11_FCL             -1.37      -0.98      0.098
MLD_CG_11_FCL             -1.41      -0.93      0.102
MLD_CG_11_FCL             -1.33      -0.85      0.105
MLD_CG_11_FCL             -1.16      -0.75      0.109
MLD_CG_11_FCL             -0.93      -0.64      0.113
MLD_CG_11_FCL             -0.66      -0.48      0.117
MLD_CG_11_FCL             -0.39      -0.28      0.121
MLD_CG_11_FCL             -0.14      -0.03      0.125

I'm looking to get this. Where V1 and V2 are now the average of the values within the time range thus collapsing the 149 rows per test into 4 rows per test.

TEST                        V1         V2       Time range
MLD_CG_11_FCL             -0.93      -0.69      0.085-0.125
MLD_CG_11_FCL             -0.50      -0.90      0.17-0.23
MLD_CG_11_FCL             -1.28      -0.95      0.27-0.45
MLD_CG_11_FCL             -1.40      -0.33      0.6-0.9

Can anyone advise how I do this?

2 件のコメント
なしを表示なしを非表示

dpb 2021 年 4 月 7 日

編集済み: dpb 2021 年 4 月 7 日

Be easier to illustrate with an actual data file covering the ranges, but discretize and findgroups along with rowfun will solve the problem -- if.

The IF on that solution is whether the time bins are indeed disjoint as given and there are data possibly between the group boundaries defined as given above.

If there are, one will have to calculate the range differently; to use discretize one would set the boundaries at the midpoint between the upper and lower boundaries for the middle ranges on the assurance there's not data in the cracks being included that shouldn't be.

MATLAB doesn't have a builtin function that handles disjoint bins altho I suppose one could create an additional bin between the regions above now that think about it.

Anyway, pick a subset of your table and attach it as a .mat file -- altho the whole thing is probably only a few MB???

DavidL88 2021 年 4 月 7 日

Sample_table_voltage.mat

Thanks. I attach the table here. I removed times outside the ranges given above.

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

dpb 2021 年 4 月 8 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/795702-combine-rows-and-average-contents-in-table#answer_669847

MATLAB Online で開く

% smoosh edges of disjoint regions to midpoints between...
edges=[0.085 0.125; 0.17 0.23; 0.27 0.45; 0.6 0.9];
e=edges.'; e=e(:);
e=[e(1) mean(reshape(e(2:end-1),2,[])) e(end)];
load voltage
Tfinish.Properties.VariableNames(1)={'Test'};
Tfinish.Test=categorical(Tfinish.Test);
Tfinish.TimeBin=discretize(Tfinish.Var4,e);         %  create the variable for grouping by time 
tMeanByTest=rowfun(@(x,y)deal(mean(x,1),mean(y,1)),Tfinish, ...
                    'InputVariables',{'Var2','Var3'}, ...
                    'GroupingVariables',{'Test','TimeBin'}, ...
                    'SeparateInputs',1, ...
                    'OutputVariableNames',{'MeanV2','MeanV3'});

provides

>> head(tMeanByTest)
ans =
  8×5 table
            Test            TimeBin    GroupCount    MeanV2     MeanV3 
    ____________________    _______    __________    _______    _______
    CG_HGH_CG_LOW_11_CPL       1           11        0.91667    0.23436
    CG_HGH_CG_LOW_11_CPL       2           15         2.1104     1.0282
    CG_HGH_CG_LOW_11_CPL       3           46         4.5008     3.2523
    CG_HGH_CG_LOW_11_CPL       4           77         2.7981     2.0317
    CG_HGH_CG_LOW_11_CPR       1           11        0.54696    0.53518
    CG_HGH_CG_LOW_11_CPR       2           15         1.6762     1.0518
    CG_HGH_CG_LOW_11_CPR       3           46         2.4635     1.0401
    CG_HGH_CG_LOW_11_CPR       4           77         1.4723     1.1045
>> 

You can turn the TimeBin variable to categorical with the bin names/ids you've outlined above for labels.

If the above approximation for the bin edges for discretize is not allowable, two ways to go at it--

add extra edges for bins between those defined and then remove the unwanted bins, or
use interp1 with a 'nearest' interpolation scheme to do the discretization process in which you can dump all those in the unwanted regions into one bin and remove it.

Or, of course, you can use explicit coding with case statement or if...elseif... construct.

10 件のコメント
8 件の古いコメントを表示8 件の古いコメントを非表示

DavidL88 2021 年 4 月 8 日

That code worked thanks! It classified the results exactly as I wanted. I didn't know how to use a conditional based on two variables.

ERP is event-related potential from electroencephalography. I made a mistake in extracting the data for this table so POR/POL weren't included with other regions of interest. Sorry about that!

dpb 2021 年 4 月 9 日

MATLAB Online で開く

We didn't have any ERPs in NucE, so thanks for explanation! :)

The above syntax/MATLAB coding idiom is use of "logical indexing"; a very powerful and important feature in MATLAB. Anywhere you can use a numeric indexing expression you can also use a logical one; the result will be to address those elements of the addressing expression containing TRUE and also recognize that one can use the NOT operator "~" to change the sense for things like

isOK=~cellfun(@isempty,somecellstrarray);

サインインしてコメントする。

Answer 2

Bob Thompson 2021 年 4 月 7 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/795702-combine-rows-and-average-contents-in-table#answer_669782

I'm not sure how you're defining the time time ranges but here's some psuedo code that I thought about for the rest of the process.

1) Import data into table; readtable

2) Determine names of different tests; unique

3) Loop through each test; for with ismember to select relevant rows

4) Determine time ranges; Not sure if you want to use user inputs or what for this

5) Loop through each time range; for with ismember

6) Should now have specific dataset for test and time, do average; mean

7) Append test name, averages, and time (two columns to keep as numbers?) to output array.

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

DavidL88 2021 年 4 月 8 日

Thank you. The above script worked.

サインインしてコメントする。

Answer 3

Peter Perkins 2022 年 3 月 3 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/795702-combine-rows-and-average-contents-in-table#answer_908345

MATLAB Online で開く

In addition to dpb's solution that uses rowfun, here's a similar sol'n that uses varfun. The difference is that varfun works on one var at a time, while rowfun allows you to apply a function to multiple vars at once. So in that sense, rowfun is overkill here.

It looks like there are times that are exactly .125, but none that are exactly .23 or .45. That's what the wrangling of categories is all about.

>> load('Sample_table_voltage.mat')
>> Tfinish.Properties.VariableNames = ["Test" "V1" "V2" "Time"];
>> Tfinish.Test = categorical(Tfinish.Test);
>> Tfinish.Time = seconds(Tfinish.Time)
>> edges = seconds([0.085 0.125 0.17 0.23 0.27 0.45 0.6 0.9]);
>> catNames = ["0.085-0.125" "UpperEdge1" "0.17-0.23" "Discard1" "0.27-0.45" "Discard2" "0.6-0.9"];
>> Tfinish.TimeBin = discretize(Tfinish.Time,edges,'categorical',catNames);
>> Tfinish.TimeBin = mergecats(Tfinish.TimeBin,["0.085-0.125" "UpperEdge1"]);
>> Tfinish.TimeBin = removecats(Tfinish.TimeBin,["Discard1" "Discard2"])
Tfinish =
  17731×5 table
            Test                V1          V2            Time          TimeBin  
    _____________________    ________    _________    ____________    ___________
    MLD_CG_11_FCL            -0.67827     -0.73036    0.085938 sec    0.085-0.125
    MLD_CG_11_FCL             -0.9777     -0.89304    0.089844 sec    0.085-0.125
    MLD_CG_11_FCL             -1.2218     -0.97293     0.09375 sec    0.085-0.125
[snip]
    MOD_HGH_CG_HGH_22_CPL      1.2733      0.64462     0.89062 sec    0.6-0.9    
    MOD_HGH_CG_HGH_22_CPL      1.2546      0.62261     0.89453 sec    0.6-0.9    
    MOD_HGH_CG_HGH_22_CPL       1.192      0.59208     0.89844 sec    0.6-0.9    
>> varfun(@mean,Tfinish,"GroupingVariables",["Test" "TimeBin"])
ans =
  472×6 table
            Test               TimeBin      GroupCount     mean_V1      mean_V2      mean_Time 
    _____________________    ___________    __________    _________    _________    ___________
    CG_HGH_CG_LOW_11_CPL     0.085-0.125        11          0.91667      0.23436    0.10547 sec
    CG_HGH_CG_LOW_11_CPL     0.17-0.23          15           2.1104       1.0282    0.19922 sec
    CG_HGH_CG_LOW_11_CPL     0.27-0.45          46           4.5008       3.2523    0.36133 sec
[snip]
    MOD_LOW_CG_LOW_22_FCR    0.17-0.23          15           1.3824       1.3781    0.19922 sec
    MOD_LOW_CG_LOW_22_FCR    0.27-0.45          46         -0.38971    -0.025518    0.36133 sec
    MOD_LOW_CG_LOW_22_FCR    0.6-0.9            77         -0.60483     -0.47444       0.75 sec

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

サインインしてコメントする。

Combine rows and average contents in table

2 件のコメント
なしを表示なしを非表示

採用された回答

10 件のコメント
8 件の古いコメントを表示8 件の古いコメントを非表示

その他の回答 (2 件)

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

参考

カテゴリ

タグ

Community Treasure Hunt

Combine rows and average contents in table

2 件のコメント なしを表示なしを非表示

採用された回答

10 件のコメント 8 件の古いコメントを表示8 件の古いコメントを非表示

その他の回答 (2 件)

1 件のコメント -1 件の古いコメントを表示-1 件の古いコメントを非表示

0 件のコメント -2 件の古いコメントを表示-2 件の古いコメントを非表示

参考

カテゴリ

タグ

Community Treasure Hunt

2 件のコメント
なしを表示なしを非表示

10 件のコメント
8 件の古いコメントを表示8 件の古いコメントを非表示

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示