combine data into hourly-based data

ahmad Saad, 21 Aug 2023
Commented: ahmad Saad, 21 Aug 2023
Col2 is a time column, and I need to group the attached data into hourly bins.
For example:
for 0 < col2 <= 1, get the median of the corresponding values of col3 (and col4)
for 1 < col2 <= 2, get the median of the corresponding values of col3 (and col4)
...
for 23 < col2 <= 24, get the median of the corresponding values of col3 (and col4)
So I get a matrix of three columns:
c4 = [i median(col3) median(col4)];
where i = 1:24.
My attempt:
for i=1:24
id=find(data(:,2)>i-1 & data(:,2)>i)
m1(i)= median(data(id,3));
m2(i)= median(data(id,4));
c4(i,[1:3])=[i m1(i) m2(i)];
end
Any help would be appreciated.

Accepted Answer

Voss, 21 Aug 2023
load data.mat
First, your approach, modified (the second comparison must be < i, not > i):
for i=1:24
id=find(data(:,2)>=i-1 & data(:,2)<i);
m1(i)= median(data(id,3));
m2(i)= median(data(id,4));
c4(i,[1:3])=[i m1(i) m2(i)];
end
disp(c4)
    1.0000    2.0049    2.8200
    2.0000    1.6326    3.1700
    3.0000    0.9196    3.1550
    4.0000       NaN       NaN
    5.0000       NaN       NaN
    6.0000    0.9596    1.2100
    7.0000       NaN       NaN
    8.0000    1.9756    4.3400
    9.0000       NaN       NaN
   10.0000       NaN       NaN
   11.0000       NaN       NaN
   12.0000    4.6718   10.4350
   13.0000       NaN       NaN
   14.0000       NaN       NaN
   15.0000       NaN       NaN
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   18.0000       NaN       NaN
   19.0000       NaN       NaN
   20.0000    2.1452    5.4050
   21.0000    2.2494    4.8300
   22.0000    1.9169    3.5600
   23.0000    2.0172    2.9000
   24.0000       NaN       NaN
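Note that the loop above bins with i-1 <= col2 < i, while the question asks for i-1 < col2 <= i. If the right-inclusive rule matters, a minimal sketch could look like the following. The toy matrix and the 3-hour range are made up for illustration and stand in for data.mat:

```matlab
% Hypothetical toy data (not data.mat): column 2 is time in hours,
% columns 3 and 4 are the values to summarize.
data = [ 1 0.5 10 100
         2 1.0 20 200
         3 1.5 30 300
         4 2.0 40 400
         5 2.5 50 500 ];

nHours = 3;                 % 3 bins for the toy example; use 24 for a full day
c4 = NaN(nHours,3);         % preallocate so hours without data stay NaN
c4(:,1) = 1:nHours;
for i = 1:nHours
    % Right-inclusive bin, as stated in the question: i-1 < col2 <= i
    id = data(:,2) > i-1 & data(:,2) <= i;
    if any(id)
        c4(i,2:3) = median(data(id,[3 4]),1);   % column-wise medians
    end
end
disp(c4)
```

Here the samples at exactly 1.0 and 2.0 hours fall into bins 1 and 2 respectively, rather than bins 2 and 3 as they would with the left-inclusive comparison.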
Another approach to calculate the medians for each hour:
hr = discretize(data(:,2),0:24);
[g,g_id] = findgroups(hr);
meds = splitapply(@(x)median(x,1),data(:,[3 4]),g);
disp(meds);
    2.0049    2.8200
    1.6326    3.1700
    0.9196    3.1550
    0.9596    1.2100
    1.9756    4.3400
    4.6718   10.4350
    6.1635    6.9350
    4.5366    6.6450
    2.1452    5.4050
    2.2494    4.8300
    1.9169    3.5600
    2.0172    2.9000
Then, if you don't want the final result to include medians for hours where there is no data:
c4 = [g_id meds];
disp(c4);
    1.0000    2.0049    2.8200
    2.0000    1.6326    3.1700
    3.0000    0.9196    3.1550
    6.0000    0.9596    1.2100
    8.0000    1.9756    4.3400
   12.0000    4.6718   10.4350
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   20.0000    2.1452    5.4050
   21.0000    2.2494    4.8300
   22.0000    1.9169    3.5600
   23.0000    2.0172    2.9000
Or if you do want to include those NaN medians:
c4 = NaN(24,3);
c4(:,1) = 1:24;
c4(g_id,[2 3]) = meds;
disp(c4);
    1.0000    2.0049    2.8200
    2.0000    1.6326    3.1700
    3.0000    0.9196    3.1550
    4.0000       NaN       NaN
    5.0000       NaN       NaN
    6.0000    0.9596    1.2100
    7.0000       NaN       NaN
    8.0000    1.9756    4.3400
    9.0000       NaN       NaN
   10.0000       NaN       NaN
   11.0000       NaN       NaN
   12.0000    4.6718   10.4350
   13.0000       NaN       NaN
   14.0000       NaN       NaN
   15.0000       NaN       NaN
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   18.0000       NaN       NaN
   19.0000       NaN       NaN
   20.0000    2.1452    5.4050
   21.0000    2.2494    4.8300
   22.0000    1.9169    3.5600
   23.0000    2.0172    2.9000
   24.0000       NaN       NaN
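To make the intermediate variables easier to follow, here is the same discretize / findgroups / splitapply chain on a small made-up matrix. The toy data and the 3-hour range are assumptions for illustration only, and every time value is kept inside the bin edges:

```matlab
% Hypothetical toy data (not data.mat): column 2 is time in hours.
data = [ 1 0.2 10 100
         2 0.8 20 200
         3 2.1 30 300
         4 2.5 40 400
         5 2.9 50 500 ];

edges = 0:3;                                  % use 0:24 for a full day
hr = discretize(data(:,2), edges);            % bin number per row: [1;1;3;3;3]
[g, g_id] = findgroups(hr);                   % g = [1;1;2;2;2], g_id = [1;3]

% One row of column-wise medians per non-empty bin
meds = splitapply(@(x) median(x,1), data(:,[3 4]), g);

% Scatter the medians back into a full table; empty bins remain NaN
nBins = numel(edges) - 1;
c4 = NaN(nBins,3);
c4(:,1) = 1:nBins;
c4(g_id,[2 3]) = meds;
disp(c4)
```

Because g numbers only the non-empty bins, g_id is what maps each row of meds back to its hour; bin 2 has no samples here, so its row stays NaN.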

More Answers (1)

Dyuman Joshi, 21 Aug 2023
There's no need to use find in the for loop; the logical index can be used directly.
load('data.mat')
for i=1:24
%Comparison was incorrect
id=data(:,2)>=i-1 & data(:,2)<i;
m1(i)= median(data(id,3));
m2(i)= median(data(id,4));
c4(i,[1:3])=[i m1(i) m2(i)];
end
disp(c4)
    1.0000    2.0049    2.8200
    2.0000    1.6326    3.1700
    3.0000    0.9196    3.1550
    4.0000       NaN       NaN
    5.0000       NaN       NaN
    6.0000    0.9596    1.2100
    7.0000       NaN       NaN
    8.0000    1.9756    4.3400
    9.0000       NaN       NaN
   10.0000       NaN       NaN
   11.0000       NaN       NaN
   12.0000    4.6718   10.4350
   13.0000       NaN       NaN
   14.0000       NaN       NaN
   15.0000       NaN       NaN
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   18.0000       NaN       NaN
   19.0000       NaN       NaN
   20.0000    2.1452    5.4050
   21.0000    2.2494    4.8300
   22.0000    1.9169    3.5600
   23.0000    2.0172    2.9000
   24.0000       NaN       NaN
%Now the MATLAB/vectorized approach
%Data
vec=data(:,2);
%Specify the bins
bins = 0:24;
%Discretize into bins with inclusion of the right side
%as described in the problem statement i.e. loweredge < data <= upperedge
idx=discretize(vec,0:24,'IncludedEdge','right');
%Accumulate according to the indices obtained by discretization
%and apply median function to the data
%Specify the output size as a column vector as indices are a column vector as well
%And the number of sets will be 1 less than the number of bins
fun = @(x) accumarray(idx,data(:,x),[numel(bins)-1 1],@median);
%Desired output
out = [(1:24)' fun(3) fun(4)];
disp(out)
    1.0000    2.0049    2.8250
    2.0000    1.6326    3.1750
    3.0000    0.9196    3.1550
    4.0000         0         0
    5.0000         0         0
    6.0000    0.9596    1.2100
    7.0000         0         0
    8.0000    1.9756    4.3400
    9.0000         0         0
   10.0000         0         0
   11.0000         0         0
   12.0000    4.6718   10.4350
   13.0000         0         0
   14.0000         0         0
   15.0000         0         0
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   18.0000         0         0
   19.0000         0         0
   20.0000    2.1452    5.4000
   21.0000    2.2494    4.8200
   22.0000    1.9169    3.5500
   23.0000    2.0172    2.8900
   24.0000         0         0
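The main difference is that the for loop approach yields NaN, whereas the accumarray approach gives 0, for bins that contain no values. A few medians in the last column also differ slightly between the two outputs (for example 2.8200 vs. 2.8250 in the first row); the two versions use different bin-edge conventions (left edge included in the loop, right edge included here), which may account for this.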
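If NaN is preferred for empty bins, accumarray accepts a fill value as its fifth argument. The following is a minimal sketch on made-up data; the toy matrix and the 3-hour range are assumptions, and every time value lies inside the bin edges so that discretize returns no NaN indices:

```matlab
% Hypothetical toy data (not data.mat): column 2 is time in hours.
data = [ 1 0.5 10 100
         2 1.0 20 200
         3 2.5 30 300 ];

bins = 0:3;                                   % use 0:24 for a full day
% Right-inclusive bins: loweredge < value <= upperedge
idx = discretize(data(:,2), bins, 'IncludedEdge', 'right');   % [1;1;3]

nBins = numel(bins) - 1;
% The fifth argument (NaN) is the fill value used for bins with no entries
fun = @(col) accumarray(idx, data(:,col), [nBins 1], @median, NaN);

out = [(1:nBins)' fun(3) fun(4)];
disp(out)
```

With this change the second row, which has no samples, shows NaN instead of 0, matching the for loop output.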
1 Comment
ahmad Saad, 21 Aug 2023
Dyuman Joshi: Thanks for your response.
