Improve the speed of nested for loops through vectorization or similar methods

Matthew Kehoe on 17 Jul 2021
Edited: DGM on 18 Jul 2021
My MATLAB code has a function that is called between 10^3 and 10^7 times. I'm curious whether I can improve the speed of the function through vectorization or a similar method.
clc; clear all;
% Test data for function
u = rand(32,33);
Nx = 32;
Nz = 32;
Dz = rand(Nx+1,Nz+1);
u_z = zeros(Nx,Nz+1);
u_z_2 = zeros(Nx,Nz+1);
g = zeros(Nz+1,1);
% Method 1 - Original Implementation with double for loop
tic
for j=1:Nx
    for ell=0:Nz
        g(ell+1) = u(j,ell+1);
    end
    u_z(j,:) = (2.0)*Dz*g;
end
toc
% Method 2 - Remove one for loop
tic
for j=1:Nx
    g=u(j,:)';
    u_z_2(j,:) = (2.0)*Dz*g;
end
diff = norm(u_z - u_z_2,inf);
toc
Repeating these for loops 10,000 times gives
clc; clear all;
u = rand(32,33);
Nx = 32;
Nz = 32;
Dz = rand(Nx+1,Nz+1);
u_z = zeros(Nx,Nz+1);
u_z_2 = zeros(Nx,Nz+1);
g = zeros(Nz+1,1);
tic
for rep=1:10000
    for j=1:Nx
        for ell=0:Nz
            g(ell+1) = u(j,ell+1);
        end
        u_z(j,:) = (2.0)*Dz*g;
    end
end
toc
tic
for rep=1:10000
    for j=1:Nx
        g=u(j,:)';
        u_z_2(j,:) = (2.0)*Dz*g;
    end
end
toc
diff = norm(u_z - u_z_2,inf);
Here the original implementation is slightly faster; the code above returns
Elapsed time is 0.771755 seconds.
Elapsed time is 1.079783 seconds.
Could the speed be improved by implementing vectorization or a similar method?

Accepted Answer

DGM on 18 Jul 2021
Edited: DGM on 18 Jul 2021
One big speed improvement is to move the scalar multiplication of Dz outside the loop, so that 2*Dz is computed only once. If you eliminate the loop entirely, though, it doesn't really matter.
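As a minimal sketch of the hoisting idea, assuming the same array sizes as in the question: `(2.0)*Dz*g` is evaluated left to right, so the matrix `2*Dz` is rebuilt on every pass through the loop. The names `Dz2`, `u_z_3` and `max_err` are hypothetical and do not appear in the original post.

```matlab
% Sketch: hoist the loop-invariant scalar multiplication out of the loop.
% Dz2, u_z_3 and max_err are hypothetical names used for illustration.
Nx = 32;
Nz = 32;
u  = rand(Nx,Nz+1);
Dz = rand(Nx+1,Nz+1);

% Reference: Method 2 from the question, which rescales Dz on every pass
u_z_2 = zeros(Nx,Nz+1);
for j = 1:Nx
    g = u(j,:)';
    u_z_2(j,:) = (2.0)*Dz*g;
end

% Hoisted version: form 2*Dz once, then reuse it inside the loop
Dz2   = 2*Dz;
u_z_3 = zeros(Nx,Nz+1);
for j = 1:Nx
    u_z_3(j,:) = Dz2*u(j,:)';
end

% Largest absolute difference between the two results
max_err = max(abs(u_z_2(:) - u_z_3(:)));
```

This only removes redundant work inside the loop; it does not remove the loop itself.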
% Test data for function (I'm using bigger arrays)
Nx = 320;
Nz = 320;
% Note: u is all zeros here, so the immse checks below pass trivially;
% use rand(Nx,Nz+1) for a meaningful comparison
u = zeros(Nx,Nz+1);
ntests = 100; % number of test iterations to average exec time
Dz = rand(Nx+1,Nz+1);
u_z = zeros(Nx,Nz+1);
u_z_2 = zeros(Nx,Nz+1);
g = zeros(Nz+1,1);
% Method 1 - Original Implementation with double for loop
tic
for N = 1:ntests
    for j=1:Nx
        for ell=0:Nz
            g(ell+1) = u(j,ell+1);
        end
        u_z(j,:) = (2.0)*Dz*g;
    end
end
toc/ntests
ans = 0.0257
% Method 2 - Remove one for loop
tic
for N = 1:ntests
    for j=1:Nx
        g=u(j,:)';
        u_z_2(j,:) = (2.0)*Dz*g;
    end
end
toc/ntests
ans = 0.0227
immse(u_z,u_z_2) % result is identical
ans = 0
% simplified
tic
for N = 1:ntests
    uuu = (2*Dz*u.').';
end
toc/ntests
ans = 9.1907e-04
immse(u_z,uuu) % result is identical
ans = 0
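The reason the loop collapses to one product: row j of the result is `(2*Dz*u(j,:).').'`, which equals `2*u(j,:)*Dz.'`, so stacking all rows gives `2*u*Dz.'`. The sketch below checks this with random input. It assumes Nx equals Nz, as in the question (otherwise the row assignment in the loop would not fit), and the names `u_loop`, `u_mat`, `u_alt`, `rel_err_mat` and `rel_err_alt` are hypothetical.

```matlab
% Sketch: check the loop against two equivalent single-product forms.
% u_loop, u_mat, u_alt and rel_err_* are hypothetical names.
Nx = 32;
Nz = 32;                  % Nx == Nz is assumed, as in the question
u  = rand(Nx,Nz+1);       % random input so the comparison is not trivial
Dz = rand(Nx+1,Nz+1);

% Row-by-row reference, as in Method 2 of the question
u_loop = zeros(Nx,Nz+1);
for j = 1:Nx
    u_loop(j,:) = (2.0)*Dz*u(j,:)';
end

u_mat = (2*Dz*u.').';     % form used in the answer
u_alt = 2*(u*Dz.');       % same product without the outer transpose

% Summation order can differ between matrix-vector and matrix-matrix
% products, so compare relative errors instead of testing exact equality
rel_err_mat = norm(u_loop - u_mat,inf)/norm(u_loop,inf);
rel_err_alt = norm(u_loop - u_alt,inf)/norm(u_loop,inf);
```

With random data the results may differ in the last few bits, which is why a tolerance-based comparison is a safer check than expecting an error of exactly zero.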
When you're trying to find out how to make things fast, how you scale the test can matter, because the scale changes which costs dominate the measured execution time. Increasing the number of iterations or the size of the inputs may reveal different things. It all depends on what you expect to do.
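One way to explore the scaling is to time both forms at several sizes. The sketch below uses `timeit`, which calls a function handle repeatedly and returns a typical (median) time, instead of a hand-written tic/toc average. The sizes and the names `loop_version`, `sizes`, `t_loop` and `t_mat` are assumptions for illustration. Local functions in a script need R2016b or later and, in releases before R2024a, must sit at the end of the file.

```matlab
% Sketch: time the loop and the single-product form at several sizes.
% loop_version, sizes, t_loop and t_mat are hypothetical names.
sizes  = [32 128 320];
t_loop = zeros(size(sizes));
t_mat  = zeros(size(sizes));

for k = 1:numel(sizes)
    N  = sizes(k);              % Nx == Nz == N, as in the examples
    u  = rand(N,N+1);
    Dz = rand(N+1,N+1);

    % timeit runs each handle several times and returns a typical time
    t_loop(k) = timeit(@() loop_version(u,Dz));
    t_mat(k)  = timeit(@() (2*Dz*u.').');
end

% One row per size: N, loop time, matrix-product time
fprintf('%6s %12s %12s\n','N','loop (s)','matrix (s)');
fprintf('%6d %12.3e %12.3e\n',[sizes; t_loop; t_mat]);

function u_z = loop_version(u,Dz)
    % Row-by-row product, as in Method 2 of the question
    u_z = zeros(size(u));
    for j = 1:size(u,1)
        u_z(j,:) = 2*Dz*u(j,:)';
    end
end
```

The timings depend on the machine and MATLAB release, so the numbers are best read as a comparison between the two forms rather than as absolute figures.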

More Answers (0)

Release

R2019a
