Optimize the ordering of nested loops for speed

Alessandro D on 19 Jan 2019
Commented: Bruno Luong on 28 Jul 2021
Suppose I have the following code. Which will be faster, version 1 or version 2? The only difference is the order of the two nested loops.
VERSION 1
% bigArray has dim: [npolv,nz,nsv]
% npolv=68961 > nsv=200 > nz=81
% zgrid is [nz,1], kgrid is [nsv,1]
for j=1:nz
    for qq=1:nsv
        % the output of fun is a vector with dim npolv
        bigArray(:,j,qq) = fun(zgrid(j),kgrid(qq));
    end
end
or VERSION 2
% bigArray has dim: [npolv,nz,nsv]
% npolv=68961 > nsv=200 > nz=81
% zgrid is [nz,1], kgrid is [nsv,1]
for qq=1:nsv
    for j=1:nz
        % the output of fun is a vector with dim npolv
        bigArray(:,j,qq) = fun(zgrid(j),kgrid(qq));
    end
end
  4 Comments
Bruno Luong on 20 Jan 2019
I suppose that if you are asking such a question, then the bottleneck has nothing to do with the looping itself but with calling the function inside the loop.
If you want a speedup, you need to vectorize the function FUN so that it accepts ND-arrays. Changing the loop order won't do much.
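As a rough illustration of what vectorizing FUN could look like, here is a minimal sketch. The real `fun` is not shown in the question, so the one below is a hypothetical stand-in that is elementwise in a fixed vector `x`; `x`, the grid values, and the reduced `npolv` are all assumptions made for the demo. It relies on implicit expansion, which needs R2016b or later.

```matlab
% Hypothetical stand-in for the real fun (not from the original post)
npolv = 100; nz = 81; nsv = 200;         % npolv reduced from 68961 to keep the demo small
x     = linspace(0, 1, npolv).';         % [npolv,1], assumed data
zgrid = linspace(0.5, 1.5, nz).';        % [nz,1], assumed grid values
kgrid = linspace(1, 10, nsv).';          % [nsv,1], assumed grid values
fun   = @(z, k) z .* sqrt(x) + k .* x;   % returns [npolv,1] for scalar z and k

% Loop version, as in the question (with preallocation)
bigLoop = zeros(npolv, nz, nsv);
for qq = 1:nsv
    for j = 1:nz
        bigLoop(:, j, qq) = fun(zgrid(j), kgrid(qq));
    end
end

% Vectorized version: put z along dim 2 and k along dim 3, then let
% implicit expansion build the full [npolv,nz,nsv] result in one expression
z3 = reshape(zgrid, 1, nz, 1);
k3 = reshape(kgrid, 1, 1, nsv);
bigVec = z3 .* sqrt(x) + k3 .* x;

% Largest difference between the two results
maxDiff = max(abs(bigLoop(:) - bigVec(:)));
```

Whether this is possible for the actual `fun` depends entirely on what it computes; if it calls a solver or branches on its inputs, this kind of rewrite may not apply.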
Image Analyst on 20 Jan 2019
Or equally FAST. Like you said Bruno, the speed has nothing to do with looping - it's the insides that count. Look at how much time the looping alone spends:
nsv=200;
nz=81;
tic
for j=1:nz
    for qq=1:nsv
        ;
    end
end
toc
tic
for qq=1:nsv
    for j=1:nz
        ;
    end
end
toc
and you get times for the for loops alone that are so fast they're not even noticeable:
Elapsed time is 0.000221 seconds.
Elapsed time is 0.000075 seconds.
They're in the microseconds range. There is no way a person would notice those absolute elapsed times, much less a difference between those two times. They're just too fast!

Accepted Answer

Image Analyst on 19 Jan 2019
Why not use tic before the loop, and toc after the loop?
Because MATLAB is column major, the leftmost index varies fastest in memory, so you'll find it's best to loop over the leftmost index in the innermost loop and the rightmost index in the outermost loop:
for slice = 1 : slices
    for col = 1 : numColumns
        for row = 1 : numRows
            array(row, col, slice) = ......
        end
    end
end
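To make the memory-layout argument concrete, here is a small sketch using `sub2ind` on a made-up 2-by-3-by-4 size (the size is only an example). It shows the stride in linear index for a step along each subscript, and checks that the slice/col/row loop nest visits elements in sequential memory order.

```matlab
sz = [2 3 4];    % [numRows numColumns slices], example size only

% Step in linear (memory) index when one subscript increases by 1
rowStep   = sub2ind(sz, 2, 1, 1) - sub2ind(sz, 1, 1, 1);   % 1
colStep   = sub2ind(sz, 1, 2, 1) - sub2ind(sz, 1, 1, 1);   % numRows = 2
sliceStep = sub2ind(sz, 1, 1, 2) - sub2ind(sz, 1, 1, 1);   % numRows*numColumns = 6

% Record the linear index visited at each step of the loop nest
visited = zeros(1, prod(sz));
n = 0;
for slice = 1:sz(3)
    for col = 1:sz(2)
        for row = 1:sz(1)
            n = n + 1;
            visited(n) = sub2ind(sz, row, col, slice);
        end
    end
end

% True when the nest walks memory as 1, 2, 3, ... without jumps
isSequential = isequal(visited, 1:prod(sz));
```

With row innermost, consecutive iterations touch adjacent elements; any other nesting makes the inner loop jump by `colStep` or `sliceStep` elements each iteration.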
  5 Comments
Matthew Kehoe on 27 Jul 2021
Edited: Matthew Kehoe on 27 Jul 2021
@Bruno Luong It appears to make a difference when the loop bounds are large.
% array is size (#Rows,#Slices,#Columns); array2 is size (#Rows,#Columns,#Slices)
% Method 1: slice and col subscripts are swapped, which decreases performance
slices = 50;
numColumns = 20;
numRows = 30;
array = zeros(numRows,slices,numColumns);
array2 = zeros(numRows,numColumns,slices);
tic
for ntests = 1:1000
    for slice = 1 : slices
        for col = 1 : numColumns
            for row = 1 : numRows
                array(row, slice, col) = array(row, slice, col) + 5;
            end
        end
    end
end
toc
tic
% Method 2: leftmost index innermost, rightmost index outermost
for ntests = 1:1000
    for slice = 1 : slices
        for col = 1 : numColumns
            for row = 1 : numRows
                array2(row, col, slice) = array2(row, col, slice) + 5;
            end
        end
    end
end
toc
% Elapsed time of method 1 is 0.053500 seconds.
% Elapsed time of method 2 is 0.041477 seconds.
and
% array is size (#Rows,#Slices,#Columns); array2 is size (#Rows,#Columns,#Slices)
% Method 1: slice and col subscripts are swapped, which decreases performance
slices = 50;
numColumns = 400;
numRows = 300;
array = zeros(numRows,slices,numColumns);
array2 = zeros(numRows,numColumns,slices);
tic
for ntests = 1:1000
    for slice = 1 : slices
        for col = 1 : numColumns
            for row = 1 : numRows
                array(row, slice, col) = array(row, slice, col) + 5;
            end
        end
    end
end
toc
tic
% Method 2: leftmost index innermost, rightmost index outermost
for ntests = 1:1000
    for slice = 1 : slices
        for col = 1 : numColumns
            for row = 1 : numRows
                array2(row, col, slice) = array2(row, col, slice) + 5;
            end
        end
    end
end
toc
% Elapsed time of method 1 is 12.153343 seconds.
% Elapsed time of method 2 is 8.756736 seconds.
Bruno Luong on 28 Jul 2021
Your code, run on the TMW server:
% array is size (#Rows,#Slices,#Columns); array2 is size (#Rows,#Columns,#Slices)
% Method 1: slice and col subscripts are swapped, which decreases performance
slices = 50;
numColumns = 400;
numRows = 300;
array = zeros(numRows,slices,numColumns);
array2 = zeros(numRows,numColumns,slices);
tic
for ntests = 1:1000
    for slice = 1 : slices
        for col = 1 : numColumns
            for row = 1 : numRows
                array(row, slice, col) = array(row, slice, col) + 5;
            end
        end
    end
end
toc
Elapsed time is 13.637798 seconds.
tic
% Method 2: leftmost index innermost, rightmost index outermost
for ntests = 1:1000
    for slice = 1 : slices
        for col = 1 : numColumns
            for row = 1 : numRows
                array2(row, col, slice) = array2(row, col, slice) + 5;
            end
        end
    end
end
toc
Elapsed time is 12.603883 seconds.
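Note that both tests above keep `row`, the leftmost subscript, in the innermost loop, so the inner loop walks contiguous memory in both cases. A sharper contrast might be to compare row innermost against row outermost on the same array. The sketch below does that with a single pass per measurement and keeps the best of a few repeats to damp timing noise; the repeat count and variable names are my own choices, and the timings will depend on the machine and MATLAB release.

```matlab
numRows = 300; numColumns = 400; slices = 50;
nRep = 5;                       % assumed repeat count; keep the best time
tInner = inf; tOuter = inf;

for rep = 1:nRep
    % Row (leftmost subscript) innermost: sequential memory access
    a = zeros(numRows, numColumns, slices);
    t0 = tic;
    for slice = 1:slices
        for col = 1:numColumns
            for row = 1:numRows
                a(row, col, slice) = a(row, col, slice) + 5;
            end
        end
    end
    tInner = min(tInner, toc(t0));

    % Row outermost: the inner loop jumps numRows*numColumns elements per step
    b = zeros(numRows, numColumns, slices);
    t0 = tic;
    for row = 1:numRows
        for col = 1:numColumns
            for slice = 1:slices
                b(row, col, slice) = b(row, col, slice) + 5;
            end
        end
    end
    tOuter = min(tOuter, toc(t0));
end

fprintf('row innermost: %.4f s, row outermost: %.4f s\n', tInner, tOuter);
```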

More Answers (1)

Mark McBroom on 19 Jan 2019
  1. Use the profile tool to determine the hot spots in the code.
  2. Pre-allocate bigArray. On my computer this reduced execution time by 70%.
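The two suggestions above can be sketched together as follows. The `fun` here is a hypothetical placeholder (the real one is not shown in the question), and the sizes are reduced so the demo runs quickly.

```matlab
npolv = 1000; nz = 81; nsv = 20;           % reduced sizes, for the demo only
zgrid = linspace(0.5, 1.5, nz).';          % assumed grid values
kgrid = linspace(1, 10, nsv).';            % assumed grid values
fun   = @(z, k) z + k .* ones(npolv, 1);   % hypothetical placeholder for the real fun

profile on                                 % start collecting timing data

bigArray = zeros(npolv, nz, nsv);          % pre-allocate once, before the loops
for qq = 1:nsv
    for j = 1:nz
        bigArray(:, j, qq) = fun(zgrid(j), kgrid(qq));
    end
end

profile off                                % stop collecting
stats = profile('info');                   % struct with the collected data
% "profile viewer" opens the interactive report, which shows where the time is spent
```

Without the `zeros` call, MATLAB has to grow `bigArray` as the loops assign into new slices, which is the overhead that pre-allocation avoids.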
  1 Comment
Alessandro D on 19 Jan 2019
Dear Mark, thanks for your answer. I didn't show it in the sample code in my question but, yes, I had pre-allocated bigArray.
