Unexpected slowdown using () indexing

Question

3 votes

format long g
NULL = 0; counter = 0; start = tic; while toc(start) < 10; NULL; counter = counter + 1; end; counter/10
ans = 
                 3699475.2
NULL = @()0; counter = 0; start = tic; while toc(start) < 10; NULL(); counter = counter + 1; end; counter/10
ans = 
                 2326610.7
NULL = 0; counter = 0; start = tic; while toc(start) < 10; NULL(); counter = counter + 1; end; counter/10
ans = 
                  588245.3
NULL = 0; counter = 0; start = tic; while toc(start) < 10; NULL(1); counter = counter + 1; end; counter/10
ans = 
                 2540740.2

Observe that referring to a variable in a loop can iterate millions of times per second, and executing an anonymous function to retrieve a value also runs at (fewer) millions of times per second. But taking a plain scalar and using empty () to dereference it slows down to the hundreds-of-thousands range (roughly a factor of 6 slower than the bare variable). Using (1) indexing is somewhat slower than using the variable with no (), but is still comparable to no indexing in speed.

So there is something about using the empty index on a scalar that causes much worse performance.

7 Comments

Bruno Luong on 31 Jan 2023

Edited: Bruno Luong on 31 Jan 2023


I think the only justification for this strange empty indexing is its use with a comma-separated list that comes from an empty cell:

A=magic(2);
c=cell(0);
Aemptyindexing = A(c{:}) % equal to A(), i.e. to A

But to me the value of A() == A is just an arbitrary design choice. I don't see any logical pattern connecting it with what happens when the comma-separated list is not empty.

To me this "feature" can be ignored, and for good reason.

Personally I would prefer that an error be thrown when empty indexing is encountered.
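
For readers less familiar with comma-separated lists, here is a minimal sketch of the expansion described in this comment. The matrix A and the cell names are illustrative values, not taken from the thread; the point is only that each cell element becomes one subscript, and an empty cell leaves no subscripts at all.

```matlab
% Sketch: a cell array expanded with {:} supplies the subscripts of A(...).
% A and the cell names below are illustrative, not from the thread.
A = [10 20; 30 40];

twoSubs = {2, 1};   % expands to A(2,1)
oneSub  = {3};      % expands to A(3), a linear (column-major) index
noSubs  = {};       % expands to A(), the empty paren-reference

r2 = A(twoSubs{:}); % element in row 2, column 1
r1 = A(oneSub{:});  % third element in column-major order
r0 = A(noSubs{:});  % no subscripts at all: the whole array comes back

disp(r2); disp(r1); disp(r0);
```

With one or two subscripts a single element is selected, while with zero subscripts the whole array is returned, which is the behaviour the comment calls an arbitrary design choice.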

Walter Roberson on 31 Jan 2023


Looks like it triggers a copy!

format debug
NULL = 0
NULL = 
Structure address = 7f077e274ee0
m = 1
n = 1
pr = 7f4605e12160

     0
NULL
NULL = 
Structure address = 7f077e412e00
m = 1
n = 1
pr = 7f076cc60240

     0
NULL()
ans = 
Structure address = 7f077e3376a0
m = 1
n = 1
pr = 7f077bc2e320

     0


Answer 1

James Lebak on 1 Feb 2023

3 votes

Paren-reference in many cases creates a copy. This is expected behavior.

NULL(1) is nearly as fast as NULL on a scalar because it's been specially optimized. NULL() is indexing with no indices (empty paren-reference). We allow this syntax because when NULL is a function handle, NULL() means to call the function with no arguments. But we don't consider it common usage, and we haven't optimized it. As you observed, it therefore creates an expensive copy. We could consider optimizing it in a future release. If you have a need for this operation to be performant we'd be interested to hear of it.
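
To illustrate the two meanings of () mentioned above, here is a small sketch. The names getZero, zeroVal and src are hypothetical, and the loop at the end is only one possible pattern for code that may receive either a value or a handle; it is not something the answer prescribes.

```matlab
% Sketch: the same () syntax is a call for a handle and an index for a value.
getZero = @() 0;   % function handle: getZero() calls it with no arguments
zeroVal = 0;       % plain scalar: zeroVal() is an empty paren-reference

fromHandle = getZero();
fromValue  = zeroVal();

% Hypothetical pattern: accept either a value or a handle, and only use ()
% when the input really is a handle.
src = {getZero, zeroVal};
out = zeros(1, numel(src));
for k = 1:numel(src)
    v = src{k};
    if isa(v, 'function_handle')
        v = v();   % call the handle
    end
    out(k) = v;    % plain values are used directly, with no () at all
end
disp(out);
```

Checking with isa once, outside any hot loop, avoids applying the unoptimized empty paren-reference to a plain value on every iteration.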

5 Comments

Walter Roberson on 2 Feb 2023


It was more surprising than important. I would have expected the execution engine to optimize the construct already. It does lead, though, to some questions about relative execution rates.

In the results below, each number is how many times faster that computation is relative to the slowest computation (so higher is faster).

    format long g
    N = 1e6;
    NULL = 0; start = tic; for counter = 1:N; NULL; end; stop = toc(start); rates(1) = N/stop;
    NULL = 0;  start = tic; for counter = 1:N; NULL(); end; stop = toc(start); rates(2) = N/stop;
    NULL = 0; start = tic; for counter = 1:N; NULL(1); end; stop = toc(start); rates(3) = N/stop;
    NULL = 0; start = tic; for counter = 1:N; NULL(1:1); end; stop = toc(start); rates(4) = N/stop;
    NULL = 0; start = tic; for counter = 1:N; NULL(end); end; stop = toc(start); rates(5) = N/stop;
    NULL = 0; start = tic; for counter = 1:N; NULL(1:end); end; stop = toc(start); rates(6) = N/stop;
    start = tic; for counter = 1:N; 1:1; end; stop = toc(start); rates(7) = N/stop;
    start = tic; for counter = 1:N; 1:1:1; end; stop = toc(start); rates(8) = N/stop;
    
    script_rates = rates.' / min(rates)
script_rates = 8×1
          102.761950641073
                         1
          42.1692466774125
          2.24343263393608
          34.7789455202167
          2.20961922761883
          7.29261622127669
          7.37390513569138
    rates(1:2).'
ans = 2×1
1.0e+00 *

          74112502.7792189
          721205.682812298
    
    do_timing_function(N)
function_rates = 8×1
           37.993390636192
                         1
          34.3770511120931
          1.91270048107262
          33.2856694620343
          1.89901723540296
          6.29798797534622
          5.99011852635145
ans = 2×1
1.0e+00 *

          32558442.4041154
          856950.165777009
function do_timing_function(N)
    NULL = 0; start = tic; for counter = 1:N; NULL; end; stop = toc(start); rates(1) = N/stop;
    NULL = 0;  start = tic; for counter = 1:N; NULL(); end; stop = toc(start); rates(2) = N/stop;
    NULL = 0; start = tic; for counter = 1:N; NULL(1); end; stop = toc(start); rates(3) = N/stop;
    NULL = 0; start = tic; for counter = 1:N; NULL(1:1); end; stop = toc(start); rates(4) = N/stop;
    NULL = 0; start = tic; for counter = 1:N; NULL(end); end; stop = toc(start); rates(5) = N/stop;
    NULL = 0; start = tic; for counter = 1:N; NULL(1:end); end; stop = toc(start); rates(6) = N/stop;
    start = tic; for counter = 1:N; 1:1; end; stop = toc(start); rates(7) = N/stop;
    start = tic; for counter = 1:N; 1:1:1; end; stop = toc(start); rates(8) = N/stop;
    
    function_rates = rates.' / min(rates)
    rates(1:2).'
end

So () is the slowest, but indexing 1:1 and indexing 1:end are surprisingly slow (and statistically tied). How much of that time is in computation of 1:1, and how does that compare to 1:1:1? Those are a lot slower than I expected. I suspected at first that the overhead of end might be what was slowing down 1:end, but the rate for end indexing is only a little worse than the rate for (1) indexing, which tells us that end must have been special-cased as well. I would have expected the execution engine to notice that the variable is not changing and so it could replace 1:1 or 1:end indexing with (1) indexing.

Of course, the execution engine is still permitted to behave differently for a script than for a function. If you compare the two rate tables above you can see that they are similar enough in nearly all positions for the differences to be plausibly just due to timing noise. But the case of a simple variable access with no index is relatively much faster in the script than in the function.

We could hypothesize that maybe everything is much faster in a function, but if you look at the extra rate information (the raw rates for the first two cases), we can see that inside the function we only got about 33 million raw variable accesses per second, whereas in the script we got about 74 million. This tells us that the special case of doing nothing with a variable is about twice as fast in a script as it is inside a function!!
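
As a quick check of the arithmetic in the comparison above, this sketch recomputes the ratios from the raw rates printed by the two rates(1:2) outputs. The variable names are mine; the numbers are copied from the output above.

```matlab
% Raw rates (iterations per second) copied from the rates(1:2) outputs above.
% Row 1 is the bare variable access NULL; row 2 is the empty reference NULL().
scriptRates   = [74112502.7792189; 721205.682812298];
functionRates = [32558442.4041154; 856950.165777009];

% Bare variable access relative to the slowest case, NULL().
scriptRelative   = scriptRates(1)   / scriptRates(2);
functionRelative = functionRates(1) / functionRates(2);

% Script rate divided by function rate, for each of the two cases.
scriptOverFunction = scriptRates ./ functionRates;

fprintf('relative to NULL(): script %.2f, function %.2f\n', ...
    scriptRelative, functionRelative);
fprintf('script/function: NULL %.2f, NULL() %.2f\n', ...
    scriptOverFunction(1), scriptOverFunction(2));
```

The first two values reproduce the top entries of the two rate tables (about 102.76 and 37.99), and the bare variable access comes out roughly 2.3 times faster in the script, while NULL() itself is slightly slower in the script than in the function.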

James Lebak on 2 Feb 2023

Regarding end: what's special-cased is the scalar indexing into a scalar. Any of 1:n, 1:end, 1:1 are treated as a vector index into a scalar which takes the path through the most general but overall slowest code. NULL(end) is faster than NULL(1:end) because after end resolves it's a scalar index.
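
A small sketch of the distinction drawn here, assuming the same scalar NULL as in the tests above: all of these forms return the same value, so any speed difference comes from which indexing path is taken, not from the result. The iteration count N is arbitrary, and the timings will vary by release and machine, so no expected numbers are shown.

```matlab
% Sketch: every indexing form returns the same scalar; only the path differs.
NULL = 0;
N = 1e5;   % arbitrary iteration count, for illustration only

results = [NULL(1), NULL(end), NULL(1:1), NULL(1:end), NULL()];

% Time a scalar index (after end resolves) against a vector index.
t = zeros(1, 2);
start = tic; for k = 1:N; NULL(end);   end; t(1) = toc(start);
start = tic; for k = 1:N; NULL(1:end); end; t(2) = toc(start);

fprintf('NULL(end):   %.3g s for %d iterations\n', t(1), N);
fprintf('NULL(1:end): %.3g s for %d iterations\n', t(2), N);
```

Note that 1:1 evaluates to a 1-by-1 value (isscalar(1:1) is true), yet according to the comment above it is still handled as a vector index.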

I was surprised at the difference between referencing a scalar variable in a script versus a function. More optimizations are enabled in functions than in scripts and so our expectation, as you said, is that functions are faster. It turns out that the case of assigning a whole variable to another variable is faster in a script because it reduces to a trivial pointer copy. I still think the guidance that functions are faster is generally correct, because these kinds of operations should not be a high percentage of the overall execution time for most functions.

[You may ask, wait, what's being assigned? In this case, because there's no left-hand side, there's an implicit assignment to ans, in both a script and a function. But I tried it with an explicit assignment to a left-hand side variable too, and it was faster in the script in that case as well.]

Walter Roberson on 2 Feb 2023

Interesting.

You mention that assigning a whole variable to another variable is faster in a script because it reduces to a trivial pointer copy. A trivial pointer copy makes sense in itself. But what extra work is being done in the function case? I could see a difference if the variable name was one of the parameters (in which case you have to worry about copy-on-write), but ....

Oh wait... is ans effectively a global variable?? (Mumble, mumble... No, I still can't explain the difference.)

James Lebak on 2 Feb 2023

Among other things, in functions we're handling an uninitialized LHS differently than we do in scripts. We may be able to do it faster than we are now. Like the empty paren-reference case, I don't think this is a very high priority case though.
