Unexpected slowdown using () indexing

format long g
NULL = 0; counter = 0; start = tic; while toc(start) < 10; NULL; counter = counter + 1; end; counter/10
ans =
3699475.2
NULL = @()0; counter = 0; start = tic; while toc(start) < 10; NULL(); counter = counter + 1; end; counter/10
ans =
2326610.7
NULL = 0; counter = 0; start = tic; while toc(start) < 10; NULL(); counter = counter + 1; end; counter/10
ans =
588245.3
NULL = 0; counter = 0; start = tic; while toc(start) < 10; NULL(1); counter = counter + 1; end; counter/10
ans =
2540740.2
Observe that a loop that merely refers to a variable iterates millions of times per second, and that calling an anonymous function to retrieve a value still manages (fewer) millions of times per second. Taking a plain scalar and using empty () to dereference it, however, drops to the hundreds-of-thousands range, roughly a factor of 6 slower than the plain reference. Using (1) indexing is somewhat slower than using the variable with no (), but it is still in the same range as no indexing.
So there is something about using the empty index on a scalar that triggers much worse performance.

7 Comments

Bruno Luong on 30 Jan 2023
Edited: Bruno Luong on 30 Jan 2023
I didn't know what this indexing (empty reference) should return:
NULL = 0; NULL();
Is there any doc that describes it? Obviously NULL() returns the same as NULL. What is the use case of empty indexing (besides slowing down MATLAB)?
Walter Roberson on 30 Jan 2023
My inspiration for this was a question where someone wanted to execute something 1000-ish times per second, and I was wondering what the maximum execution rate was for simple expressions. I had the anonymous function, which needed () to execute it, and I thought it would be best to put () on the variable too so that in both cases there would be the same parsing challenge.
I already knew from previous experience that you can () a variable. If I recall correctly there was a release notes item about it somewhere roughly a decade ago. I do not recall at the moment what the change in behaviour was however.
Walter Roberson on 30 Jan 2023
My older iMac shows a slightly different pattern of relative speeds:
3906823 (variable)
2998333 (anonymous function)
588344 (variable() )
3474287 (variable(1))
The difference is that here on MATLAB Answers, the (1) index runs at pretty close to the same speed as the anonymous function call, whereas on my iMac, the (1) index runs at a speed approaching, but a little slower than, plain variable access.
Walter Roberson on 30 Jan 2023
format long g
N = 1e6;
NULL = 0; start = tic; for counter = 1:N; NULL; end; stop = toc(start); N/stop
ans =
76687116.5644172
NULL = @()0; start = tic; for counter = 1:N; NULL(); end; stop = toc(start); N/stop
ans =
13668297.7501982
NULL = 0; start = tic; for counter = 1:N; NULL(); end; stop = toc(start); N/stop
ans =
890096.255009017
NULL = 0; start = tic; for counter = 1:N; NULL(1); end; stop = toc(start); N/stop
ans =
30163182.8190511
These numbers are executions-per-second -- using a fixed number of cycles allows us to eliminate the per-cycle use of toc() which could potentially be a considerable drag.
In this case we see plain variable reference is over 80 times faster than variable with empty index; on my iMac the ratio is closer to 110 times.
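To reduce the influence of a single noisy run, the fixed-count loop can be repeated and summarized. The following is only a sketch: the number of trials (5) and the names trials, rate and median_rate are arbitrary choices for illustration, not something measured above.

```matlab
% Sketch: repeat the fixed-count loop and report the median rate.
N = 1e6;                  % iterations per trial, same as above
trials = 5;               % assumption: arbitrary number of repetitions
rate = zeros(trials, 1);  % executions per second for each trial
NULL = 0;
for t = 1:trials
    start = tic;
    for counter = 1:N
        NULL();           % the expression being timed
    end
    rate(t) = N / toc(start);
end
median_rate = median(rate)   % median is less sensitive to one slow trial
```

Swapping NULL() for NULL or NULL(1) in the inner loop gives the corresponding rates for the other cases.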
Walter Roberson on 30 Jan 2023
Bruno Luong on 31 Jan 2023
Edited: Bruno Luong on 31 Jan 2023
I think the only justification for this strange empty indexing is its use with a comma-separated list that comes from an empty cell:
A=magic(2);
c=cell(0);
Aemptyidexing = A(c{:}) % equal to A() i.e. to A
%
But to me the value of A() == A is just an arbitrary design choice. I don't see any logical pattern connecting it to what happens when the comma list is not empty.
To me this "feature" can be ignored, for good reason.
Personally I would prefer that an error be thrown when empty indexing is encountered.
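For comparison, here is a small sketch of what the comma-separated list produces as the cell grows. The matrix values and the names c0, c1, c2, r0, r1, r2 are made up for illustration, and the empty case simply reproduces the behaviour shown above rather than anything documented.

```matlab
A = [10 20; 30 40];   % illustrative matrix, not from the thread

c0 = {};              % empty list:   A(c0{:}) is A(), i.e. all of A
c1 = {3};             % one index:    A(c1{:}) is A(3), linear (column-major)
c2 = {2, 1};          % two indices:  A(c2{:}) is A(2,1)

r0 = A(c0{:});
r1 = A(c1{:});
r2 = A(c2{:});

disp(isequal(r0, A))  % true: the empty list returns the whole array
disp(r1)              % 20
disp(r2)              % 30
```

So one index selects by linear position and two select by row and column, while zero indices return everything, which is the odd one out.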
Walter Roberson on 31 Jan 2023
Looks like it triggers a copy!
format debug
NULL = 0
NULL =
Structure address = 7f077e274ee0
m = 1
n = 1
pr = 7f4605e12160
     0
NULL
NULL =
Structure address = 7f077e412e00
m = 1
n = 1
pr = 7f076cc60240
     0
NULL()
ans =
Structure address = 7f077e3376a0
m = 1
n = 1
pr = 7f077bc2e320
     0


Answers (1)

James Lebak on 1 Feb 2023

3 votes

Paren-reference in many cases creates a copy. This is expected behavior.
NULL(1) is nearly as fast as NULL on a scalar because it's been specially optimized. NULL() is indexing with no indices (empty paren-reference). We allow this syntax because when NULL is a function handle, NULL() means to call the function with no arguments. But we don't consider it common usage, and we haven't optimized it. As you observed, it therefore creates an expensive copy. We could consider optimizing it in a future release. If you have a need for this operation to be performant we'd be interested to hear of it.
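If the goal is to accept either a plain value or a function handle without paying for the empty paren-reference, one possible workaround is to test the class first. This is a minimal sketch, not an official recommendation; the cell array sources, the value 7 and the other variable names are made up for illustration.

```matlab
sources = {7, @() 7};            % a plain scalar and a handle returning the same value
values = zeros(size(sources));   % collected results

for k = 1:numel(sources)
    x = sources{k};
    if isa(x, 'function_handle')
        values(k) = x();         % call the handle with no arguments
    else
        values(k) = x;           % plain reference, no () on a non-handle
    end
end

disp(values)
```

The isa check has its own cost, so whether this beats NULL() in a hot loop would need to be timed rather than assumed.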

5 Comments

Bruno Luong on 1 Feb 2023
"We could consider optimizing it in a future release. If you have a need for this operation to be performant we'd be interested to hear of it."
Nah, you have better things to do, IMO.
Walter Roberson on 2 Feb 2023
It was more surprising than important. I would have expected the execution engine to optimize the construct already. It does lead, though, to some questions about relative execution rates.
In the results below, each number is how many times faster that computation ran than the slowest computation, so higher is faster.
format long g
N = 1e6;
NULL = 0; start = tic; for counter = 1:N; NULL; end; stop = toc(start); rates(1) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(); end; stop = toc(start); rates(2) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(1); end; stop = toc(start); rates(3) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(1:1); end; stop = toc(start); rates(4) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(end); end; stop = toc(start); rates(5) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(1:end); end; stop = toc(start); rates(6) = N/stop;
start = tic; for counter = 1:N; 1:1; end; stop = toc(start); rates(7) = N/stop;
start = tic; for counter = 1:N; 1:1:1; end; stop = toc(start); rates(8) = N/stop;
script_rates = rates.' / min(rates)
script_rates = 8×1
102.761950641073
1
42.1692466774125
2.24343263393608
34.7789455202167
2.20961922761883
7.29261622127669
7.37390513569138
rates(1:2).'
ans = 2×1
1.0e+00 *
74112502.7792189
721205.682812298
do_timing_function(N)
function_rates = 8×1
37.993390636192
1
34.3770511120931
1.91270048107262
33.2856694620343
1.89901723540296
6.29798797534622
5.99011852635145
ans = 2×1
1.0e+00 *
32558442.4041154
856950.165777009
function do_timing_function(N)
NULL = 0; start = tic; for counter = 1:N; NULL; end; stop = toc(start); rates(1) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(); end; stop = toc(start); rates(2) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(1); end; stop = toc(start); rates(3) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(1:1); end; stop = toc(start); rates(4) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(end); end; stop = toc(start); rates(5) = N/stop;
NULL = 0; start = tic; for counter = 1:N; NULL(1:end); end; stop = toc(start); rates(6) = N/stop;
start = tic; for counter = 1:N; 1:1; end; stop = toc(start); rates(7) = N/stop;
start = tic; for counter = 1:N; 1:1:1; end; stop = toc(start); rates(8) = N/stop;
function_rates = rates.' / min(rates)
rates(1:2).'
end
So () is the slowest, but indexing with 1:1 and with 1:end is surprisingly slow (the two are statistically tied). How much of that time goes into computing 1:1, and how does that compare to 1:1:1? Those are a lot slower than I expected. I suspected at first that the overhead of end might be what was slowing down 1:end, but the rate for end indexing is only a little worse than the rate for (1) indexing, which tells us that end must have been special-cased as well. I would have expected the execution engine to notice that the variable is not changing, so that it could replace 1:1 or 1:end indexing with (1) indexing.
Of course, the execution engine is still permitted to behave differently for a script than for a function. If you compare the two rate tables above, they are similar enough in nearly all rows for the differences to be plausibly just timing noise. But the case of a simple variable access with no index is much faster in the script than in the function.
We could hypothesize that maybe everything is much faster in a function, but the extra rate information shows that inside the function we only got about 33 million raw variable accesses per second, whereas in the script we got about 74 million. This tells us that the special case of doing nothing with a variable is about twice as fast in a script as it is inside a function!!
James Lebak on 2 Feb 2023
Regarding end: what's special-cased is the scalar indexing into a scalar. Any of 1:n, 1:end, 1:1 are treated as a vector index into a scalar which takes the path through the most general but overall slowest code. NULL(end) is faster than NULL(1:end) because after end resolves it's a scalar index.
I was surprised at the difference between referencing a scalar variable in a script versus a function. More optimizations are enabled in functions than in scripts, and so our expectation, as you said, is that functions are faster. It turns out that the case of assigning a whole variable to another variable is faster in a script because it reduces to a trivial pointer copy. I still think the guidance that functions are faster is generally correct, because these kinds of operations should not be a high percentage of the overall execution time for most functions.
[You may ask, wait, what's being assigned? In this case, because there's no right-hand side, there's an implicit assign to ans, in both a script and a function. But I tried it with an explicit assignment to a left-hand side variable too, and it was faster in the script in that case as well.]
Walter Roberson on 2 Feb 2023
Interesting.
You mention that assigning a whole variable to another variable is faster in a script because it reduces to a trivial pointer copy. A trivial pointer copy makes sense in itself. But what extra work is being done in the function case? I could see a difference if the variable name was one of the parameters (in which case you have to worry about copy-on-write), but ....
Oh wait... is ans effectively a global variable?? (Mumble, mumble... No, I still can't explain the difference.)
James Lebak on 2 Feb 2023
Among other things, in functions we're handling an uninitialized LHS differently than we do in scripts. We may be able to do it faster than we are now. Like the empty paren-reference case, I don't think this is a very high priority case though.


Release: R2022b
Asked: 30 Jan 2023
Commented: 2 Feb 2023
