Is it possible to aggregate data in a timetable with a weighted daily mean?

1 回表示 (過去 30 日間)
Hannes Schenk
Hannes Schenk 2021 年 11 月 9 日
回答済み: Seth Furman 2021 年 11 月 10 日
I have a timetable T with one or more entries corresponding to a single date:
With the retime function it is possible to aggregate the data with a daily mean like this:
T_agg = retime(T,'daily','mean');
However, what I really want do compute is the weighted daily average like this:
WeightedMean = sum(Measurement.*Weight)./sum(Weight);
where the variable "Measurement" and "Weight" are nx1 column vectors, where n is the number of data points in a single day.
The correct result for example for the 02-11-2021 is 374.06, opposed to the 494.58 resulting from the conventional mean.
My question is now, if it is possible to compute the weighted mean with the retime function by using a custom function handle of the aggregation method? If yes, how? Unfortunately, the number of data entries per day is not always 6. Unlike the figure suggests, the number of data entries per day varies.
I imagine a solution of this type:
WeightMean = @(Mea,Weight)sum(Mea.*Weight)./sum(Weight);
T_agg = retime(T,'daily',WeightMean,Mea,Weight);
Thank you for your help, I can't seem to figure this out myself.
EDIT: Attatched you can find the timetable as .mat-file
  3 件のコメント
Adam Danz
Adam Danz 2021 年 11 月 9 日
One approach would be to add a 4th column containing the weighted measurments and then use retime on the new column.
Hannes Schenk
Hannes Schenk 2021 年 11 月 10 日
Thank you both!
@dpb I attached the .mat file. I had the feeling, that retime is not going to work on this one. splitapply could work, I have not looked into it
@Adam Danz Thank you, this is actually what pointed me in the right direction. I solved the problem now by adding a new column containing the multiplied values of the two existing columns, calculated the daily sum with the retimes function and than divided the column with the summed multiplied values by the summed weighted values. It's actually a simple workaround. Problem solved :)

サインインしてコメントする。

採用された回答

Seth Furman
Seth Furman 2021 年 11 月 10 日
Another way to approach this is to
1) Copy the timetable.
2) Shift the copied timetable's row times to the start of the day.
3) Use rowfun with the weighted mean function handle you defined above, grouping by the row-times (i.e. by day).
load timetable.mat
T_agg = T;
T_agg.Properties.RowTimes = dateshift(T_agg.Properties.RowTimes, "start", "day");
weightedMean = @(M, W) sum(M .* W) ./ sum(W);
T_agg = rowfun(weightedMean, T_agg, "GroupingVariables", T_agg.Properties.DimensionNames{1}, "OutputVariableNames", "WeightedMean");
head(T_agg)
%
% ans =
%
% 8×2 timetable
%
% sample_date GroupCount WeightedMean
% ___________ __________ ____________
%
% 27-07-2020 1 313.55
% 03-08-2020 2 493.64
% 10-08-2020 2 234.58
% 17-08-2020 2 33.334
% 24-08-2020 2 63.589
% 31-08-2020 2 102.61
% 07-09-2020 2 492.55
% 14-09-2020 2 279.11

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeData Preprocessing についてさらに検索

製品


リリース

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by