Retime data aggregation for ID

1 回表示 (過去 30 日間)
Gian23
Gian23 2021 年 4 月 26 日
回答済み: Eric Sofen 2021 年 5 月 4 日
Hello everyone,
I'm going to calculate the monthly average of daily temperatures grouping by sensor using retime function. At the moment I'm trying to select each sensor with a loop and then apply the retime function, but I have to calculate six million rows so I would like to avoid a loop to speed up the calculation
I give an example:
Time = datetime({'18-feb-2021';'19-feb-2021';...
'01-mar-2021';'04-mar-2021';'18-feb-2021';...
'19-feb-2021';'01-mar-2021';'04-mar-2021'});
Temp = [56.82;62.72;64.52;63.81;63.45;59.7;60.27;61.32];
Sensor = [12;12;12;12;13;13;13;13];
TT = timetable(Time,Sensor,Temp);
Current code:
uni_sensor = unique(Sensor);
monthly_ds = timetable;
parfor kk = 1:length(uni_sensor)
index_retime = find(TT.Sensor == uni_sensor(kk));
sensor_retime = TT(index_retime,:);
monthly_data = retime(sensor_retime(:,2), 'monthly', 'mean');
data_sensor_retime = array2table(repmat(uni_sensor(kk), size(monthly_data, 1),1), 'VariableNames', "Sensor" );
monthly_sub_id = [monthly_data, data_sensor_retime];
monthly_ds = [monthly_ds; monthly_sub_id];
end
Desired output:
Time = datetime({'feb-2021';'feb-2021';...
'mar-2021';'mar-2021'});
Temp = [59.77;61.575;64.165;60.795];
Sensor = [12;12;13;13];
TT_out = timetable(Time,Sensor,Temp);
Thanks in advance,
Gianluca

採用された回答

Marco Riani
Marco Riani 2021 年 4 月 27 日
I think in this example it is unnecessary to use retime.
I would proceed as follows.
Time1=char(Time);
Time2=Time1(:,4:end);
TT = table(findgroups(string(Time2)),Sensor,Temp);
groupvars={'Sensor' 'Var1'};
datavars='Temp';
groupsummary(TT,groupvars,'mean',datavars)
Instead of using groupsummary it is possible to use grpstats. Please let us know which between groupsummary and grpstats is faster.
  1 件のコメント
Gian23
Gian23 2021 年 4 月 27 日
Very interesting approach and usefull overall! I tried yor solution and it works very well.
On the evidence of one test, grpstats is faster than groupsummary.
Many thanks again!

サインインしてコメントする。

その他の回答 (1 件)

Eric Sofen
Eric Sofen 2021 年 5 月 4 日
Another approach is to unstack the timetable based on the sensor ID, so you'd have a wide timetable with separate variables temp_12, temp_13, ..., then apply retime to that without a need for grouping. I don't know if it would be faster than Marco's findgroups approach (which is quite clever), and having the sensor IDs embedded in the table variable names may or may not be useful in the long run, but it's yet another way to tackle this problem.

カテゴリ

Help Center および File ExchangeTables についてさらに検索

タグ

製品

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by