Performance impact of using package folders
I have been doing a major refactoring of my code and thought I would use the package system (i.e. +folders) to organize things better.
After noticing a slowdown in my application, I did some benchmarking and found a large difference in run time between calling the same function inside a package and outside one. To give you a sense of the impact, using a package increased the run time of one of my functions by 50%. To put that in perspective, that function is 100 lines of code, uses 48 variables, and does 72 variable assignments, 76 additions, 36 multiplications, and 9 divisions (it is an incredibly fast algorithm to calculate the basis matrix for a spline of degree 4, which I am very proud of). So that is a lot of computation.
After doing some digging, it appears that the package system is implemented as a series of objects/classes (which makes sense). I therefore assume that the slowdown comes from that. So I have the following questions/observations.
1.) I am using R2012b. Are there any improvements or better integration of the package system on the horizon, such that if I take the hit now, the performance impact will be negligible in the future?
2.) The performance impact seems to be the same regardless of package depth; it has more to do with the number of items in that package (i.e. more files = more impact).
3.) Using a function handle to one of those packaged functions seems to incur the package reference penalty every time the handle is called, instead of just when the function handle is created.
Please let me know if you have any thoughts about what I might be doing wrong.
2 comments
per isakson
12 Apr 2013
I fail to reproduce your results with R2012a 64-bit on Windows 7. With a package I see only a very small penalty. Could you provide some example code?
Answers (3)
Ray
18 Jan 2018
A similar attempt using R2017a on Windows.
I created the foo function (the y = a.*x+b example from Sean de Wolski's answer) and placed it in:
- onPathDir/foo.m
- onPathDir/+pack1/foo.m
- onPathDir/+pack1/+pack2/foo.m
I then added onPathDir to my MATLAB path and changed directory away from it. The following test was then saved in a different directory:
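function [] = testFoo()
    a = rand(1);
    b = rand(1);
    nRun = 1000000;

    % Preallocate one result vector per variant
    y0 = zeros(1, nRun);
    y1 = zeros(1, nRun);
    y2 = zeros(1, nRun);

    % Plain function on the path
    tic
    for x = 1 : nRun
        y0(x) = foo(x, a, b);
    end
    t0 = toc;

    % Function in a single package
    tic
    for x = 1 : nRun
        y1(x) = pack1.foo(x, a, b);
    end
    t1 = toc;

    % Function in a nested package
    tic
    for x = 1 : nRun
        y2(x) = pack1.pack2.foo(x, a, b);
    end
    t2 = toc;

    % Display the three elapsed times in seconds
    [t0 t1 t2]
end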
If run as a function, I saw no overhead from the use of packages on the search path. In fact, on some repeats, the package times were actually lower. However, if I comment out the function definition line and the final end, turning testFoo.m into a script, there is an extreme overhead for using packages: ~800x for a single package and ~1000x for nested packages.
It looks like the function compiler resolves the path once, after which code execution is fast, whereas in a script there may be path resolution overhead on each loop iteration!
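One way to act on this observation is to keep the loop that calls the package function inside a function and have the script call that function once. The following is only a sketch: runPackLoop is a hypothetical helper name, and it assumes the +pack1/foo.m file from the setup above is on the path. Whether this recovers the full function-level speed on a given release would need to be measured.

```matlab
% runPackLoop.m
% Hypothetical helper (not from the thread). Assumes pack1.foo(x, a, b)
% from the setup above is on the MATLAB path.
function y = runPackLoop(nRun, a, b)
    y = zeros(1, nRun);                % preallocate the result vector
    for x = 1:nRun
        % The package call now sits in function code rather than script code
        y(x) = pack1.foo(x, a, b);
    end
end
```

A script would then call it once, for example tic; y1 = runPackLoop(1000000, rand(1), rand(1)); toc, so that the script itself contains no package-qualified call inside a loop.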
Sean de Wolski
18 Apr 2013
Edited: Sean de Wolski
18 Apr 2013
There is unfortunately a bit more overhead in the function call when calling packages. Here is the timing I did:
With this function both in and not in a package (+foopack):
function y = foo(x,a,b)
% I create awesome lines!
%
    y = a.*x+b;
end
And this timing function:
function timeit
%Time foo v. foopack.foo calls
%
%
%SCd - 735262
%
    %Some values:
    [t1, t2] = deal(0);
    a = 1;
    b = 2;
    x = 3;

    %Sum their times over 1000 function calls:
    for ii = 1:1000
        tic
        y = foo(x,b,a);
        t1 = t1+toc;
        tic
        yp = foopack.foo(x,b,a);
        t2 = t2+toc;
    end

    %Display results:
    fprintf(1,'\nfoo regular: %fs\nfoo package: %fs\n',t1,t2);
    fprintf(2,'\nSlowdown of Package: %f\n\n', t2./t1);
end
It is my understanding that this is pretty much the worst-case scenario, since the call overhead dominates the computation time. I really like packages and use them a fair amount. But for speed-critical applications, where a function will be called many times, it might pay to pull those computations outside. It's also important to realize that even though it's slower, as far as total time is concerned, it's still pretty quick.
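A related option, which is a different technique from moving code out of the package, is to reduce the number of calls. Because foo uses element-wise operations, a whole vector of inputs can be passed in a single call, so any per-call overhead is paid once instead of once per element. The sketch below compares the two; batchVsLoop is a hypothetical helper name, and f is assumed to be a handle to an element-wise function with the signature f(x, a, b), such as @foopack.foo.

```matlab
% batchVsLoop.m
% Hypothetical helper (not from the thread). f must accept vector x,
% e.g. f = @foopack.foo with y = a.*x+b as defined above.
function [tLoop, tVec, yLoop, yVec] = batchVsLoop(f, x, a, b)
    % One call per element: pays the call overhead numel(x) times
    yLoop = zeros(size(x));
    tic
    for ii = 1:numel(x)
        yLoop(ii) = f(x(ii), a, b);
    end
    tLoop = toc;

    % One call for the whole vector: pays the call overhead once
    tic
    yVec = f(x, a, b);
    tVec = toc;
end
```

For example, [tLoop, tVec] = batchVsLoop(@foopack.foo, 1:1000, 1, 2) returns both elapsed times in seconds. This only applies when the packaged function can work on arrays.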
Joel Fischer
21 Feb 2022
Since I'm also refactoring a code base, I tried this again on R2021a and can confirm what @Ray found:
When calling package functions from a script, there is considerable overhead; when calling them from a function, however, there doesn't seem to be any difference in performance. Additional depth (calls to a function in a sub-package) seems to slightly increase the overhead as well.
In my case, the overhead when calling from a script was ~10 µs per call (run on an Intel i7-9750H @ 2.6 GHz, 64 GB DDR4 @ ~2600 MHz).
My test function was:
function [C] = foo(n)
    A = rand(n);
    B = rand(n);
    C = A\B;
end
Three copies of which were located at (relative to the current working directory):
- foo.m
- +test/foo.m
- +test/+test2/foo.m
As suggested by @Ray, the benchmark was run once as a function and once as a script (by just commenting out the first and the last line):
function [t,fig] = foo_test()
    %% setup
    n = 20;
    t = zeros(3*n,3);   % columns: foo, test.foo, test.test2.foo

    %% benchmark
    % Each repetition runs the three variants in three different orders
    for j = 1:n
        A = 0; a = tic(); for i=1:100000; A = A + foo(5); end; t((j-1)*3+1,1) = toc(a);
        A = 0; a = tic(); for i=1:100000; A = A + test.foo(5); end; t((j-1)*3+1,2) = toc(a);
        A = 0; a = tic(); for i=1:100000; A = A + test.test2.foo(5); end; t((j-1)*3+1,3) = toc(a);

        A = 0; a = tic(); for i=1:100000; A = A + test.foo(5); end; t((j-1)*3+2,2) = toc(a);
        A = 0; a = tic(); for i=1:100000; A = A + test.test2.foo(5); end; t((j-1)*3+2,3) = toc(a);
        A = 0; a = tic(); for i=1:100000; A = A + foo(5); end; t((j-1)*3+2,1) = toc(a);

        A = 0; a = tic(); for i=1:100000; A = A + test.test2.foo(5); end; t((j-1)*3+3,3) = toc(a);
        A = 0; a = tic(); for i=1:100000; A = A + foo(5); end; t((j-1)*3+3,1) = toc(a);
        A = 0; a = tic(); for i=1:100000; A = A + test.foo(5); end; t((j-1)*3+3,2) = toc(a);
    end

    %% plotting
    fig = figure();
    %title('script');
    title('function');
    hold on;
    C = colororder();
    for i=1:3
        histogram(t(:,i),'FaceColor',C(i,:),'FaceAlpha',0.5);
    end
    hold off;
    legend({'foo(5)','test.foo(5)','test.test2.foo(5)'},'Location','North');
    ylabel('counts [-]');
    xlabel('t/100k calls [s]');
    grid on;
    box on;
end
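To turn the batch timings into a per-call figure like the one quoted above, the time of the plain foo column can be subtracted from each package column and divided by the number of calls per batch. The sketch below shows that arithmetic; overheadPerCall is a hypothetical helper name, and it assumes t has the layout returned by foo_test (seconds per batch, with column 1 holding the plain foo timings).

```matlab
% overheadPerCall.m
% Hypothetical helper (not from the thread).
%   t      : timing matrix, one column per variant, seconds per batch;
%            column 1 is assumed to be the plain (non-package) call
%   nCalls : number of calls per batch (100000 in foo_test)
function ovh_us = overheadPerCall(t, nCalls)
    tMed = median(t, 1);                        % median batch time per column [s]
    ovh_us = (tMed - tMed(1)) ./ nCalls * 1e6;  % extra time per call vs column 1 [microseconds]
end
```

With the benchmark above this would be called as t = foo_test(); overheadPerCall(t, 100000). The first element of the result is zero by construction, and the median is used so that a few slow batches do not skew the estimate.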