Replace nested loops?

Question

Andy, 7 Dec 2011

Direct link to this question: https://jp.mathworks.com/matlabcentral/answers/23329-replace-nested-loops

Is it possible to replace four nested for loops, of the form for / if / for / for / if / for / if ..., with something more efficient?

Some of the data I run have millions of variables, and I have lots of data sets to run, so it takes a few days to finish them. Anything that would lower the run time of this section would be greatly appreciated. Thanks.

[EDITED: Code copied from the comments, Jan Simon]

for i=1:length(starts)
   counter = 0;
   if isempty(starts{i}) == 0
      for j = 1: length(starts{i})     
         for k = 1: length(starts)
            if isempty(starts{k}) ==0
               for m = 1:length (starts{k})
                  if stops{i}(j) >= starts{k}(m) && stops{i}(j)< stops{k}(m) && isempty(peak_loc3{k})==0 && peak_loc3{i}(j)~= peak_loc3{k}(m)
                     counter = counter +1;
                     overlap{1,i}(counter) = peak_loc3{k}(m);
                     overlap{2,i}(counter) = peak_loc3{i}(j);
                  end
               end
            end
         end
      end
   end
end

1 Comment

Sean de Wolski, 7 Dec 2011

We really need to see the operations to figure out if it's possible. A well orchestrated for-loop should be fairly fast in newer versions.

Answer 1

Sven, 7 Dec 2011

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/23329-replace-nested-loops#answer_30623

One small time drain is that the overlap variable is constantly resized inside the loop. In the MATLAB editor, such variables have a little orange line under them. If you hover over that line, it will warn you about this potential problem.
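As a rough illustration of the resizing issue described above, here is a minimal sketch that fills the same vector twice, once by growing it and once after preallocating it. The size n and the variable names grown and prealloc are made up for this example and are not part of the original code.

```matlab
% Toy illustration only: n, grown and prealloc are hypothetical names.
n = 1e4;

% Growing: the array is resized on every iteration.
grown = [];
for t = 1:n
    grown(t) = t^2; %#ok<AGROW>
end

% Preallocated: the memory is reserved once, then filled in place.
prealloc = zeros(1, n);
for t = 1:n
    prealloc(t) = t^2;
end

% Both loops produce the same values; only the memory handling differs.
same = isequal(grown, prealloc);
```

In the overlap case the final size is not known in advance, so straightforward preallocation would need an upper bound on the number of matches plus a trim at the end.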

Here's a first attempt that will reduce the time needed. Note I've also replaced "isempty(x)==0" with "~isempty(x)" (for simplicity) and replaced some of the nested if statements with continue statements, just to have less nesting (which can get confusing).

overlap = cell(2, length(starts));
nonEmptyStarts = find(~cellfun(@isempty,starts));
for i=nonEmptyStarts
    counter = 0;
    thisStart = starts{i};
    thisStop = stops{i};
    for j = 1: length(thisStart)
        for k = nonEmptyStarts
            if isempty(peak_loc3{k}), continue; end
            thatStart = starts{k};
            thatStop = stops{k};
            thisMask = thisStop(j)>=thatStart & thisStop(j)<thatStop & peak_loc3{i}(j)~=peak_loc3{k}(1:length(thatStart))';
            for m = find(thisMask)
                counter = counter +1;
                overlap{1,i}(counter) = peak_loc3{k}(m);
                overlap{2,i}(counter) = peak_loc3{i}(j);
            end
        end
    end
end
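To show what the nonEmptyStarts line in the code above does, here is a small sketch on made-up input. The toy starts cell and the visited variable are assumptions for illustration only, and the sketch assumes starts is a row (1-by-N) cell array.

```matlab
% Hypothetical toy input: a 1-by-4 cell array with two empty cells.
starts = {[1 5], [], 3, []};

% cellfun(@isempty, ...) returns a logical row; find gives the non-empty indices.
nonEmptyStarts = find(~cellfun(@isempty, starts));

% A for loop iterates over the columns of its argument, so with a row
% vector each pass sees one index.
visited = [];
for i = nonEmptyStarts
    visited(end+1) = i; %#ok<AGROW>
end
```

If starts were a column cell array, find would return a column vector and the loop would run only once, with i holding the whole vector. In that case, transposing the index vector first would restore the intended behaviour.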

Unfortunately, there is still a big culprit of "variable size adjustment" sitting inside a loop, which will really slow down the code. If you look at the line starting with overlap{1,i}(counter) =, you'll notice that every time this line is run, the array stored in the cell overlap{1,i} grows by one element. If this happens a lot, MATLAB has to work really hard to keep finding new space in memory that fits the new size.

This updated code currently runs approximately 10 times faster than the original.

UPDATE

overlap = cell(2, length(starts));
nonEmptyStarts = find(~cellfun(@isempty,starts));
for i=nonEmptyStarts
    counter = 0;
    % Get column vectors of the first start/stop pairs
    startA = starts{i}';    stopA = stops{i}';
    for k = nonEmptyStarts
        % Get row vectors of the second start/stop pairs
        startB = starts{k};    stopB = stops{k};
        % Get a mask of all A-B pairs that match requirements
        ABMask = bsxfun(@ge,stopA,startB) & ...
            bsxfun(@lt,stopA,stopB) & ...
            bsxfun(@ne,peak_loc3{i}(1:numel(startA)), peak_loc3{k}(1:numel(startB))');
        [j,m] = find(ABMask);
        numToAdd = length(m);
        if ~numToAdd, continue; end
        % Append them to "overlap"
        indsToInsert = (1:numToAdd) + counter;
        counter = counter + numToAdd;
        overlap{1,i}(indsToInsert) = peak_loc3{k}(m);
        overlap{2,i}(indsToInsert) = peak_loc3{i}(j);
    end
end
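The mask step in the code above may be easier to follow on a toy case. This is a minimal sketch with made-up values (stopA, startB and stopB here are hypothetical numbers, not the poster's data) that builds the pairwise mask from a column vector and two row vectors.

```matlab
% Hypothetical toy values.
stopA  = [4; 9];       % column vector: stop positions of set A
startB = [1 5 8];      % row vector: start positions of set B
stopB  = [6 7 12];     % row vector: stop positions of set B

% bsxfun expands the column against the row, giving a 2-by-3 matrix.
% Entry (j,m) is true when startB(m) <= stopA(j) < stopB(m).
mask = bsxfun(@ge, stopA, startB) & bsxfun(@lt, stopA, stopB);

% find returns the row (j) and column (m) index of every true entry.
[j, m] = find(mask);
```

Here stopA(1) = 4 falls inside the first B interval and stopA(2) = 9 inside the third, so find pairs j = 1 with m = 1 and j = 2 with m = 3. In MATLAB R2016b and later, implicit expansion allows the same mask to be written as (stopA >= startB) & (stopA < stopB) without bsxfun.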

This update should make significant improvements on a large dataset. There is still room for improvement, depending on the type and sizes of data you have. You can get a good view of which parts of the code take the most time by replacing tic and toc with profile on and profile viewer.

I have a feeling that the assignment into overlap will still be the biggest area for possible improvement.
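One possible way to reduce that cost, offered only as an untested sketch and not as something measured in this thread, is to store each batch of matches in its own cell and concatenate once at the end. The names found, chunks and result are hypothetical, and found merely stands in for the per-k matches computed inside the loop.

```matlab
% Hypothetical stand-in for the matches produced at each inner iteration.
found = {[10 11], [], 12, [13 14 15]};

% One slot per iteration, allocated up front.
chunks = cell(1, numel(found));
for k = 1:numel(found)
    chunks{k} = found{k};   % store the piece; no numeric array grows here
end

% Single concatenation at the end; empty pieces simply drop out.
result = [chunks{:}];
```

Whether this beats the indexed assignment into overlap depends on how many matches each pass produces, so it would be worth checking with the profiler on real data.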

12 Comments (10 older comments not shown)

Andy, 7 Dec 2011

wow your first update made my code go from 160 seconds to 11 seconds!!

Andy, 7 Dec 2011

the second update took it down to 1 second! thanks! let me try running my slowest set of data :D
