I need to break down the flow data into groups of data with months for every year.
5 ビュー (過去 30 日間)
古いコメントを表示
For example: All the entries for the month of Jan for all the years (suppose 2010-2022) are stored in one group. Similarly, all the entries for the month of Feb for all the years.(suppose 2010-2022) are stored in second group and so on. My end goal is to calculate the percentile of monthly data for all the years.
Date Flow
12/1/2010 100
12/2/2010 200
12/3/2010 150
12/4/2010 176
.....
.....
12/1/2022 145
I tried to create the vlookup function to create the matrix but i wasn't able to. So, I decided to group all months together and then perform operations on them.
0 件のコメント
採用された回答
Cris LaPierre
2022 年 6 月 20 日
6 件のコメント
Cris LaPierre
2022 年 6 月 25 日
I think the best approach here is to create a second table that is a subset of the full data set, but just contains the date ranges of interest. Then you can use groupsummary to compute the desired stats grouped by year. I don't know what you want to use for percentile, so I use 42.
T = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1043800/Example.xlsx',...
'ReadVariableNames',false)
% Use 'day of year' to identify dates within the desired range for all years
drng = day(datetime(2004,9,27):caldays(1):datetime(2004,10,27),'dayofyear');
d = day(T.Var1,'dayofyear');
% Create subtable
idx= ismember(d,drng);
Tsub = T(idx,:)
% compute stats for each year
Tstats = groupsummary(Tsub,'Var1','year',{@(A) prctile(A,42),'max','min'},'Var2');
Tstats.Properties.VariableNames = ["Tyr","GroupCnt","Tmin","Tmax","Tpct"]
Cris LaPierre
2022 年 6 月 26 日
Leap years are going to cause some issues with my previous answer. Unfortunately, I couldn't find a slick way to do this, so here's a brute force approach.
T = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1043800/Example.xlsx',...
'ReadVariableNames',false);
% Define range of interest
startD = datetime(2004,9,27);
stopD = datetime(2004,10,27);
% find data for all years between the indicated month/days
y = unique(year(T.Var1));
idx = zeros(height(T),1);
for i = 1:length(y)
idx = idx + isbetween(T.Var1,datetime(y(i),month(startD),day(startD)),...
datetime(y(i),month(stopD),day(stopD)));
end
Tsub = T(logical(idx),:)
% Compute summary stats
Tstats = groupsummary(Tsub,'Var1','year',{@(A) prctile(A,42),'max','min'},'Var2');
Tstats.Properties.VariableNames = ["Tyr","GroupCnt","Tmin","Tmax","Tpct"]
その他の回答 (0 件)
参考
カテゴリ
Help Center および File Exchange で Dates and Time についてさらに検索
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!