Improve speed to max calculation for max daily output
Hi there,
I have tried a good few variations of this, and this code is the simplest but not the most time-efficient. The max calculation seems to be the slowest part.
[nrow,ncol] = size(test_data);
test_output = cell(nrow,ncol);
for i = 1:nrow
    for j = 1:ncol
        inpdata = double(test_data{i,j});
        % Build array of time components
        dv = datevec(inpdata(:,1));
        % Find all timestamps where the HH value (col 4) is 0 or 6
        time_markers = find(dv(:,4)==6 | dv(:,4)==0);
        % Preallocate output array
        daily_max = zeros(length(test_data),4);
        for n = 1:2:length(time_markers)-2
            daily_max(n,1) = inpdata(time_markers(n+1,1),1);
            daily_max(n,2) = mean(inpdata(time_markers(n,1):time_markers(n+1,1),2));
            daily_max(n,3) = mean(inpdata(time_markers(n,1):time_markers(n+1,1),3));
            daily_max(n,4) = max(inpdata(time_markers(n,1):time_markers(n+1,1),4));
        end
        % Remove any unused (all-zero) rows from the output array
        daily_max(daily_max(:,1)==0,:) = [];
        % Plug this into the final output
        test_output{i,j} = daily_max;
    end
end
Any ideas for improving its performance? I know there are some unnecessary lines that I need to fine-tune, but the main bottleneck is the calls to max and, particularly, mean. The first 10 rows of the dv variable look like this, if it helps to understand what I want done:
1980  1  1  6  0  0
1980  1  1  12  0  0
1980  1  1  18  0  0
1980  1  2  0  0  0
1980  1  2  6  0  0
1980  1  2  12  0  0
1980  1  2  18  0  0
1980  1  3  0  0  0
1980  1  3  6  0  0
1980  1  3  12  0  0
I want everything from hour 6 through the following hour 0 to count as one day. The data has already been formatted so that there are no missing timesteps.
2 Comments
Peter Perkins on 6 Aug 2015
Mashtine, you're using a triple-nested loop. That almost certainly is not the way to go.
You should attach a short example of your input data, what you want as the result, and an explanation of the calculations to create that result.
Accepted Answer
Peter Perkins on 6 Aug 2015
There are lots of ways to do this. Here's one that assumes you have R2014b or later. If you only have R2013b or later, you can still use a table, but you'd have to use datenum and datestr rather than datetime, which was added in R2014b.
First load your numeric matrix and create a table, and then convert the datenum to a datetime:
>> load test_data2.mat
>> test_data = array2table(test_data,'VariableNames',{'Time' 'X' 'Y' 'Z'});
>> test_data.Time = datetime(test_data.Time,'ConvertFrom','datenum')
test_data = 
            Time              X          Y         Z   
    ____________________    ______    _______    ______
    01-Jan-1980 06:00:00    3.2872    0.34067    5.4056
    01-Jan-1980 12:00:00     1.268    0.20843    2.9019
    01-Jan-1980 18:00:00    2.8944    0.22515    4.5896
    02-Jan-1980 00:00:00    7.9143    0.57301    10.884
    02-Jan-1980 06:00:00    14.369     1.0058    18.924
    02-Jan-1980 12:00:00    17.886     1.2894     23.48
 [snip]
Next, create a variable that defines the way you want to group the rows of that table. On the minus side, you want to group midnight of tomorrow with 6am, 12pm, and 6pm of today, so you can't just get the day number. On the plus side, your data are completely regular, so you can just take each consecutive group of four rows:
>> n = height(test_data);
>> test_data.Day = repelem(1:(n/4),4)'
test_data = 
            Time              X          Y         Z       Day
    ____________________    ______    _______    ______    ___
    01-Jan-1980 06:00:00    3.2872    0.34067    5.4056     1 
    01-Jan-1980 12:00:00     1.268    0.20843    2.9019     1 
    01-Jan-1980 18:00:00    2.8944    0.22515    4.5896     1 
    02-Jan-1980 00:00:00    7.9143    0.57301    10.884     1 
    02-Jan-1980 06:00:00    14.369     1.0058    18.924     2 
    02-Jan-1980 12:00:00    17.886     1.2894     23.48     2 
 [snip]
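A note on versions: as far as I know, repelem was only added in R2015a, so the grouping line above may not run on R2014b. Here is a minimal sketch of two alternatives. The sample timestamps in t are made up for illustration, and the names dayIdx, dayStart and dayIdx2 are mine, not part of the original data. The second alternative does not rely on there being exactly four rows per day.

```matlab
% Hypothetical stand-in for the real data: 8 six-hourly rows starting at 06:00.
t = datetime(1980,1,1,6,0,0) + hours(6)*(0:7)';
n = numel(t);

% Option 1: same grouping as repelem(1:(n/4),4)', without calling repelem.
dayIdx = ceil((1:n)'/4);

% Option 2: shift each timestamp back 6 hours, then truncate to the day,
% so that 00:00 lands on the same day as the preceding 06:00, 12:00, 18:00.
dayStart = dateshift(t - hours(6),'start','day');
[~,~,dayIdx2] = unique(dayStart);   % consecutive group numbers 1,2,...
```

Either dayIdx or dayIdx2 could then be stored as test_data.Day and used as the grouping variable.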
Finally, do the grouped calculation on the table, and pretty up the result:
>> dailyStats = @(x,y,z) deal(mean(x),mean(y),max(z));
>> dailies = rowfun(dailyStats,test_data, ...
    'GroupingVariable','Day', 'InputVariables',{'X' 'Y' 'Z'}, ...
    'OutputVariableNames',{'meanX' 'meanY' 'maxZ'});
>> dailies.Properties.RowNames = {}; % don't need these
>> dailies.Day = dateshift(test_data.Time(1:4:n),'start','day');
>> dailies.Day.Format = 'dd-MMM-yyyy'
dailies = 
        Day        GroupCount    meanX      meanY      maxZ 
    ___________    __________    ______    _______    ______
    01-Jan-1980    4              3.841    0.33681    10.884
    02-Jan-1980    4             16.812     1.1289    24.982
    03-Jan-1980    4             9.4298    0.62444    17.041
    04-Jan-1980    4             14.185    0.97222    24.212
    05-Jan-1980    4             12.899    0.99925    21.861
    06-Jan-1980    4             6.2882    0.53728    10.743
  [snip]
Hope this helps.
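If you would rather stay with plain numeric matrices, the same grouped calculation can be sketched with reshape instead of a table. This is a different technique from the rowfun approach above, and it assumes the data really are gap-free with four rows per day starting at 06:00. The input values below are invented for illustration.

```matlab
% Hypothetical input: columns are [datenum, X, Y, Z], one row every 6 hours,
% starting at 06:00 with no gaps, so each 4 consecutive rows form one day.
t = datenum(1980,1,1,6,0,0) + (0:7)'/4;
data = [t, (1:8)', (11:18)', [5 2 9 1 4 8 3 7]'];

nDays = floor(size(data,1)/4);        % ignore any incomplete trailing day
blk   = data(1:4*nDays,:);            % rows that make up complete days

% Reshape each column to 4-by-nDays so that every column is one day.
meanX = mean(reshape(blk(:,2),4,nDays),1)';
meanY = mean(reshape(blk(:,3),4,nDays),1)';
maxZ  = max(reshape(blk(:,4),4,nDays),[],1)';

% Label each day with its closing 00:00 timestamp, as the question's loop does.
daily = [blk(4:4:end,1), meanX, meanY, maxZ];
```

This replaces the innermost loop with three vectorized calls per data set; whether it beats the table version on your data is something you would need to time.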
2 Comments
Peter Perkins on 14 Sep 2015
I think you mean, "I have to do the same grouped calculation on 121*97 sets of data." If that's the case, it seems like you have two options:
- Loop over the data sets and do the calculation 121*97 times, or
- Somehow combine the separate data sets into one

I can't say how to do the latter, since I don't really know anything about your data.
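For the first option, a rough sketch of applying one per-data-set calculation to every cell might look like this. It assumes each cell holds a [datenum, X, Y, Z] matrix made up of complete four-row days, which I am guessing from the original question. The small 2-by-3 cell array and the helper names col and stat are hypothetical.

```matlab
% Hypothetical 2-by-3 cell array standing in for the 121-by-97 one; each cell
% holds a [datenum, X, Y, Z] matrix with 4 rows per day, starting at 06:00.
t = datenum(1980,1,1,6,0,0) + (0:7)'/4;
test_data = repmat({[t, (1:8)', (11:18)', (21:28)']}, 2, 3);

% Per-data-set daily statistics via reshape (assumes complete days only).
col  = @(m,k) reshape(m(:,k), 4, []);
stat = @(m) [m(4:4:end,1), mean(col(m,2),1)', mean(col(m,3),1)', max(col(m,4),[],1)'];

% Apply it to every cell; the result has the same shape as test_data.
test_output = cellfun(stat, test_data, 'UniformOutput', false);
```

cellfun is not necessarily faster than an explicit for loop over the cells. Any gain here would come from the vectorized calculation inside each cell.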