Efficient way of Vectorization

Nadeem Ahmed on 29 Nov 2022
Commented: Matt J on 30 Nov 2022
Hello, I have searched everywhere for a clear explanation of efficient vectorization. I would like to know how to use vectorization efficiently for a problem like this:
clc
clear
close all
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
tic
for i=2:n-1
for j=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
toc   % Elapsed time is 0.049266 seconds.
%VECTORIZATION
tic
C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
toc   % Elapsed time is 0.014871 seconds.
norm(C1-C2)   % ans = 0
This is a very basic example. It shows some improvement from vectorization, but not much. If I add more divisions and multiplications to the same expression, the vectorized version becomes even slower than the for loop. If anybody has any suggestions, they would be very helpful to me.
8 Comments
Mike Croucher on 30 Nov 2022
Thanks. So for N,M=50, the code runs in 0.01 seconds on my machine.
Increasing to N,M=100, the code runs in 0.22 seconds
Trying N,M=200, I run out of memory on my 32 GB laptop.
What values of N and M are you interested in and how fast do you need the code to be?
Nadeem Ahmed on 30 Nov 2022
I will use this function for N,M>80, and I need to call it many thousands of times, so each call should take negligible time. Any suggestions are welcome.

Accepted Answer

Matt J on 30 Nov 2022
Edited: Matt J on 30 Nov 2022
Unfortunately, this is a situation where the for loop is the fastest option. This is because the vectorized solution does much more memory allocation than it should. I have raised this issue with MathWorks staff, but I am not sure what is being done about it.
function test
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
timeit(@()method1)   % ans = 0.0161
timeit(@()method2)   % ans = 0.0210
function method1
for i=2:n-1
for j=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
end
function method2
C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
end
end
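To get a rough feel for the allocation cost mentioned above, here is a back-of-the-envelope sketch that counts the memory taken by the six indexed operands in method2. It assumes (this is not confirmed by MathWorks anywhere in this thread) that each indexing expression such as A(2:n-1,2:n-1) is materialized as its own temporary double array. The variable names are illustrative only.

```matlab
% Rough estimate of temporary memory for the six sliced operands in method2.
% Assumption: each indexing expression creates its own (n-2)-by-(n-2) copy.
n = 1000;
bytesPerDouble = 8;                          % size of one double-precision value
numSlices = 6;                               % A and B are each indexed three times
sliceBytes = (n-2)^2 * bytesPerDouble;       % bytes in one sliced operand
totalMiB = numSlices * sliceBytes / 2^20;    % all six operands together, in MiB
fprintf('One slice: %d bytes, six slices: %.1f MiB\n', sliceBytes, totalMiB);
```

The estimate ignores the intermediate products and sums, which would need further temporaries of the same size, so under the stated assumption it undercounts the real traffic. The loop version reads individual elements and needs none of these arrays.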
9 Comments
Bruno Luong on 30 Nov 2022
Edited: Bruno Luong on 30 Nov 2022
I don't think the problem is allocating memory, but actually indexing with truncation index, which requires elements in memory to be rearranged.
I'm not surprised that making vectorized code as fast as the for-loop would require major development of the internal engine (for instance, using metadata that describes a subarray of an array without copying the data).
Indexing is always the bottleneck of MATLAB.
Matt J on 30 Nov 2022
"I don't think the problem is allocating memory, but actually indexing with truncation index"
Not sure what a "truncation index" refers to here. In any case, the subsref operations (the indexing expressions such as A(2:n-1,2:n-1)) are definitely to blame: when we revise the test so that the indexing is done ahead of time, outside the timed functions, the vectorized version is much more competitive with the loops:
function test
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
[Q1,Q2,Q3,Q4,Q5,Q6]=...
deal( A(2:n-1,2:n-1) , B(2:n-1,1:n-2), A(1:n-2,2:n-1),...
B(3:n,1:n-2), A(2:n-1,3:n), B(3:n,2:n-1) );
timeit(@()method1)   % ans = 0.0149
timeit(@()method2)   % ans = 0.0051
function method1
for i=2:n-1
for j=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
end
function method2
C2(2:n-1,2:n-1)=(Q1.*Q2 + Q3.*Q4)./(Q5.*Q6);
end
end
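As a sanity check on the pre-sliced form, the sketch below confirms on a small random case that the Q1..Q6 expression reproduces the loop result. The index vector r, the size n = 6 and the fixed seed are choices made for this illustration, not part of the test above.

```matlab
% Check that the pre-sliced expression matches the double loop on a small case.
n = 6;
rng(0);                      % fixed seed so the check is repeatable
A = rand(n);  B = rand(n);
C1 = zeros(n);  C2 = zeros(n);

% Reference: the original element-by-element loop.
for j = 2:n-1
    for i = 2:n-1
        C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
    end
end

% Pre-sliced operands, written with a shared interior index vector r.
r  = 2:n-1;
Q1 = A(r,r);     Q2 = B(r,r-1);
Q3 = A(r-1,r);   Q4 = B(r+1,r-1);
Q5 = A(r,r+1);   Q6 = B(r+1,r);
C2(r,r) = (Q1.*Q2 + Q3.*Q4)./(Q5.*Q6);

maxDiff = max(abs(C1(:) - C2(:)));   % largest absolute disagreement
disp(maxDiff)
```

Note that the slices only stay valid while A and B are unchanged, so if the inputs differ on every call, the slicing cost comes back on each call.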

More Answers (1)

Mike Croucher on 30 Nov 2022
Switch the order of the loops around. It will be faster because you'll be operating on the matrix column-wise: MATLAB stores arrays in column-major order, so making i the inner loop visits consecutive memory locations.
test
loops
ans = 0.0884
loops 2
ans = 0.0203
vector
ans = 0.0928
function test
n=2000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
disp('loops')
timeit(@()loops)
disp('loops 2')
timeit(@()loops2)
disp('vector')
timeit(@()vector)
function loops
for i=2:n-1
for j=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
end
function loops2
for j=2:n-1
for i=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
end
function vector
C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
end
end
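The column-wise point can be seen on a tiny array. This sketch uses only built-in functions (reshape, sub2ind); the 2-by-3 matrix M is a made-up example, not part of the test above.

```matlab
% MATLAB stores arrays column by column (column-major order).
M = reshape(1:6, 2, 3);              % fills down each column: [1 3 5; 2 4 6]
colOrder = M(:).';                   % elements in memory order: 1 2 3 4 5 6

% Moving down a column advances the linear index by 1 ...
idxDown  = sub2ind(size(M), 2, 1);   % element (2,1) has linear index 2
% ... while moving along a row jumps by the number of rows.
idxRight = sub2ind(size(M), 1, 2);   % element (1,2) has linear index 3

disp(colOrder)
```

In loops2, the inner loop over i therefore steps through adjacent elements of each column, whereas in loops the inner loop over j strides across memory by n elements at a time.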
4 Comments
Nadeem Ahmed on 30 Nov 2022
Yes, you are right, because I changed all my for loops but still didn't get any improvement.
Dyuman Joshi on 30 Nov 2022
This is neat, @Mike Croucher! Learned something new today :D

Release: R2022a
