Why is x(:) so much slower than reshape(x,N,1) with complex arrays?

Question

Matt J, 27 July 2021

https://jp.mathworks.com/matlabcentral/answers/887219-why-is-x-so-much-slower-than-reshape-x-n-1-with-complex-arrays

Edited: Matt J, 26 May 2022

The two for loops below differ only in the flattening operation used to obtain A_1D. Why is the run time so much worse with A_3D(:) than with a call to reshape()?

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(N,1);
tic
for k = 1:20
    B = reshape( A0, [Nz,Ny,Nx] ) ;
    A_3D = fftn(B);
    A_1D = reshape( A_3D, N,1); %<--- Version 1
end
toc
Elapsed time is 3.770859 seconds.
tic
for k = 1:20    
    B = reshape( A0, [Nz,Ny,Nx] ) ;
    A_3D = fftn(B);
    A_1D = A_3D(:); %<--- Version 2
end
toc
Elapsed time is 5.056827 seconds.

7 comments (5 older comments hidden)

Stephen23, 28 July 2021

Edited: Stephen23, 28 July 2021

@Bruno Luong: does RESHAPE also copy the data?

If not, then does this mean that one array in memory can be linked to two or more meta-headers (with different array sizes)?

Bruno Luong, 28 July 2021

I must admit that why and when MATLAB makes a data copy has become obscure to me over the last few years. I have not reached a full understanding of how it works.
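As a small illustration of what is at stake in the question above: whatever MATLAB does internally with shared data buffers, a write to the reshaped array is never visible through the original. The sketch below only checks that observable behavior. The comments about buffer sharing are an assumption based on MATLAB's documented copy-on-write behavior, not something this code can prove.

```matlab
A = magic(4);            % A(1) is 16
B = reshape(A, [], 1);   % B may share A's data buffer at this point (assumption)
B(1) = -1;               % writing to B forces B to get its own data
disp(A(1))               % the original is untouched
disp(B(1))
```

If the two arrays really were two views of one buffer in the aliasing sense, the first disp would show the modified value, which it does not.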


Answer 1

Matt J, 28 July 2021

https://jp.mathworks.com/matlabcentral/answers/887219-why-is-x-so-much-slower-than-reshape-x-n-1-with-complex-arrays#answer_755564

The following simple test seems to support @Bruno Luong's conjecture that (:) results in data copying. The data of B1 resulting from reshape() has the same data pointer location as A, but B2 generated with (:) points to different data.

format debug
A=complex(rand(2),rand(2))
A = 
Structure address = 7f3f47f4e0e0
m = 2
n = 2
pr = 7f3fcb0112e0

   0.5114 + 0.6181i   0.5881 + 0.4450i
   0.5713 + 0.9018i   0.3682 + 0.8103i
B1=reshape(A,4,1),
B1 = 
Structure address = 7f3fcf1f4be0
m = 4
n = 1
pr = 7f3fcb0112e0

   0.5114 + 0.6181i
   0.5713 + 0.9018i
   0.5881 + 0.4450i
   0.3682 + 0.8103i
B2=A(:)
B2 = 
Structure address = 7f3f47e45a20
m = 4
n = 1
pr = 7f3faff0b980

   0.5114 + 0.6181i
   0.5713 + 0.9018i
   0.5881 + 0.4450i
   0.3682 + 0.8103i
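Since format debug is an undocumented display mode, it may help to separate what is guaranteed from what is being inferred. A minimal sketch, assuming only documented behavior: both flattening operations return the same values in the same column-major order, so any difference between them is confined to memory handling and speed.

```matlab
A  = complex(rand(2), rand(2));
B1 = reshape(A, 4, 1);
B2 = A(:);
sameValues = isequal(B1, B2);          % identical contents and size
columnMajor = (B1(3) == A(1,2));       % third element is row 1, column 2
disp([sameValues, columnMajor])
```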

8 comments (6 older comments hidden)

Matt J, 28 July 2021

Edited: Matt J, 28 July 2021

MathWorks tech support got back to me. As @Bruno Luong predicted, they claim this to be a feature since R2015b. Apparently, because subsref indexing operations generally result in data copying (paraphrasing), it was decided this would be true for A0(:) as a special case as well. Why this is only true for complex A0 and not real A0, I did not get a clear answer on. Their reply:

I understand that you are observing the differences in performances between reshape and colon operation.

Since MATLAB R2015b, the colon operator, A0(:) is an indexing operation. For the provided code, MATLAB is going through every row and column, which is not computationally fast.

On the other hand, the ‘reshape’ command will only change the property of the created array, which is a rather fast process.

For your interest, I have also timed the code across different releases of MATLAB. The results are documented below:

Release                                   Colon operator   Reshape
MATLAB 8.3.0.85671 (R2014a)               7.1884e-07       1.0690e-06
MATLAB 8.5.0.204617 (R2015a)              6.4574e-07       1.0706e-06
MATLAB 8.6.0.267246 (R2015b)              0.0487           5.1078e-07
MATLAB 9.0.0.341360 (R2016a)              0.0493           5.6105e-07
MATLAB 9.6.0.1072779 (R2019a)             0.041046         1.0141e-06
MATLAB 9.9.0.1467703 (R2020b)             0.040691         7.0104e-07
MATLAB 9.10.0.1684407 (R2021a) Update 3   0.040806         5.7803e-07

You can see the changes happened since MATLAB R2015b. If you would like further details on what has been altered under the hood, please feel free to reach out. Otherwise, I will close the case for now. Please do not hesitate to let me know if you have further questions on the matter.
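The reply does not include the code that produced these numbers. A rough sketch of how a comparable per-release measurement could be taken is shown below. The array size is borrowed from the question, and the use of timeit is an assumption about the method, not a reconstruction of what tech support actually ran.

```matlab
% Hypothetical reproduction script: run it once in each installed release.
A0 = complex(rand(256, 256, 128), rand(256, 256, 128));
N  = numel(A0);
tColon   = timeit(@() A0(:));             % flatten by colon indexing
tReshape = timeit(@() reshape(A0, N, 1)); % flatten by reshape
fprintf('MATLAB %s\nColon operator: %g\nReshape: %g\n', version, tColon, tReshape);
```

Printing the version string next to the timings makes it easier to collect results from several installations into one list.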

G A, 14 August 2021

Walter, I am discussing complex-valued arrays. It can be

max(A,[],'all')

but in any case, for complex A, max selects the element with the largest magnitude, so abs(max(A)) equals max(abs(A))
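A quick check of how max treats complex input may be useful here. Per the documentation of max, complex elements are compared by magnitude and the complex element itself is returned, so the two expressions agree only after taking abs of the result. The values below are made up for illustration.

```matlab
A = [3+4i, 1-1i, -2];     % magnitudes are 5, sqrt(2) and 2
m = max(A);               % returns the complex element of largest magnitude
disp(m)                   % the element 3+4i, not the real number 5
disp(abs(m) == max(abs(A)))
```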

Walter Roberson, 14 August 2021

The (:) options are the slowest. reshape(abs(A),N,1) might possibly be the fastest -- there is notable variation in different runs.

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = complex(randn(Nx, Ny, Nz), randn(Nx, Ny, Nz));
t(1) = timeit(@() use_abs_all(A0, N), 0)
t = 0.0937
t(2) = timeit(@() use_abs_colon(A0, N), 0)
t = 1×2
    0.0937    0.1727
t(3) = timeit(@() use_abs_reshape_null(A0, N), 0)
t = 1×3
    0.0937    0.1727    0.0994
t(4) = timeit(@() use_abs_reshape_N(A0, N), 0)
t = 1×4
    0.0937    0.1727    0.0994    0.0935
t(5) = timeit(@() use_all(A0, N), 0)
t = 1×5
    0.0937    0.1727    0.0994    0.0935    0.1012
t(6) = timeit(@() use_colon(A0, N), 0)
t = 1×6
    0.0937    0.1727    0.0994    0.0935    0.1012    0.1802
t(7) = timeit(@() use_reshape_null(A0, N), 0)
t = 1×7
    0.0937    0.1727    0.0994    0.0935    0.1012    0.1802    0.1013
t(8) = timeit(@() use_reshape_N(A0, N), 0)
t = 1×8
    0.0937    0.1727    0.0994    0.0935    0.1012    0.1802    0.1013    0.1018
cats = categorical({'abs(all)', 'abs(:)', 'reshape(abs,[])','reshape(abs,N)', 'all', '(:)', 'reshape([])', 'reshape(N)'});
bar(cats, t)
function B = use_abs_all(A, N)
    B = max(abs(A), [], 'all');
end
function B = use_abs_colon(A, N)
    B = max(abs(A(:)));
end
function B = use_abs_reshape_null(A, N)
    B = max(reshape(abs(A), [], 1));
end
function B = use_abs_reshape_N(A, N)
    B = max(reshape(abs(A), N, 1));
end
function B = use_all(A, N)
    B = max(A, [], 'all');
end
function B = use_colon(A, N)
    B = max(A(:));
end
function B = use_reshape_null(A, N)
    B = max(reshape(A, [], 1));
end
function B = use_reshape_N(A, N)
    B = max(reshape(A, N, 1));
end

Answer 2

Walter Roberson, 28 July 2021

https://jp.mathworks.com/matlabcentral/answers/887219-why-is-x-so-much-slower-than-reshape-x-n-1-with-complex-arrays#answer_755289

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(Nx, Ny, Nz);
timeit(@() use_colon(A0, N), 0)
ans = 8.3490e-06
timeit(@() use_reshape_null(A0, N), 0)
ans = 6.5490e-06
timeit(@() use_reshape_N(A0, N), 0)
ans = 6.0925e-06
function use_colon(A, N)
   B = A(:);
end
function use_reshape_null(A, N)
    B = reshape(A, [], 1);
end
function use_reshape_N(A, N)
   B = reshape(A, N, 1);
end

In this particular test, the timing is close enough that we can speculate some reasons:

Using an explicit size to reshape to is faster than reshape([]) because reshape([]) has to spend time calculating the size based upon dividing numel() by the size of the known parameters.

Using (:) versus reshape() is not immediately as clear. The model for (:) is that it invokes subsref() with struct('type', {'()'}, 'subs', {':'}) and then subsref() has to invoke reshape(). I point out "model" because potentially the Execution Engine could optimize all of this, and one would tend to think that optimization of (:) should be especially good.
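The "model" described above can be exercised directly by calling subsref by hand. This is only a sketch of the conceptual dispatch: it uses substruct to build the indexing structure (with the subscript wrapped in a cell), and it says nothing about what the Execution Engine actually does for a literal A(:) in code.

```matlab
A = magic(3);
S = substruct('()', {':'});   % indexing structure for parenthesis indexing with ':'
B = subsref(A, S);            % explicit call, conceptually what A(:) stands for
viaColon   = isequal(B, A(:));
viaReshape = isequal(B, reshape(A, [], 1));
disp([viaColon, viaReshape])
```

All three routes return the same 9-by-1 column, so the question in this thread is purely about the cost of getting there.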

10 comments (8 older comments hidden)

Adam Danz, 10 August 2021

Edited: Adam Danz, 10 August 2021

When I run your example (modified to store and plot values) using the run feature (first plot) and using Matlab online (second plot) I get conflicting results.

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(Nx, Ny, Nz);
T = nan(1,3);
T(1) = timeit(@() use_colon(A0, N), 0);
T(2) = timeit(@() use_reshape_null(A0, N), 0);
T(3) = timeit(@() use_reshape_N(A0, N), 0);
bar(categorical({'colon','reshapeNull','reshape'}),T)
title('Run feature')
function use_colon(A, N)
    B = A(:);
end
function use_reshape_null(A, N)
    B = reshape(A, [], 1);
end
function use_reshape_N(A, N)
    B = reshape(A, N, 1);
end

Results of the exact same code using Matlab Online (same platform and Matlab release)

When I run it on my local copy of Matlab (same release, Windows 10 Pro), the first time the colon method was slower but on subsequent runs, it was faster than the reshape methods. There were also some warnings that the measured time may be inaccurate due to fast execution. Using the tic/toc method with repeated measures to measure variability, on my system the colon method with real numbers is fastest.

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(Nx, Ny, Nz);
n = numel(A0); 
nIterations = 500;  % number of iterations to include within the timer
nReps = 100;        % number of times to repeat the process to measure variability
durations = nan(nReps,3);
for i = 1:nReps
    
    T = tic; 
    for j = 1:nIterations
        y = A0(:); 
    end
    durations(i,1) = toc(T); 
    
     T = tic; 
    for j = 1:nIterations
        y = reshape(A0,[],1); 
    end
    durations(i,2) = toc(T); 
    
     T = tic; 
    for j = 1:nIterations
        y = reshape(A0,n,1); 
    end
    durations(i,3) = toc(T);
end
figure
boxplot(durations, 'labels',{'colon','reshapeNull','reshape'})
grid on
ylabel(sprintf('Duration of %d iterations (sec)',nIterations))
xlabel('Method')
title(sprintf('Summary of tic/toc timing repeated %d times (real numbers).',nReps))
subtitle('Win 10 Pro; R2021A update 4')

Walter Roberson, 10 August 2021

Edited: Walter Roberson, 11 August 2021

I took your earlier plot version and ran it on my desktop, and on the Run feature here, and in LiveScript on my desktop. I modified it to scale the plot relative to the slowest, to make it easier to compare relative rates. I also modified it to return values from the functions, to avoid the possibility that Execution Engine might optimize away the work because of the variable not being returned.

On the desktop, with both .m and .mlx, colon was fastest in all tests.

The time requirements did not vary much for the .m version. reshape([]) was typically pretty much 2.5 times slower than colon.

The time requirements for the colon test for the .mlx varied quite a lot, sometimes taking twice as long. The reshape() timings did not vary nearly as much. Because of that, the relative ratios between colon and reshape([]) varied quite a bit, from about 1.5 to 4.

Bringing the code over to the Run feature here, colon was almost always slowest. Furthermore, the minimum timings (for reshape(N)) were pretty much 10 times slower than what I was seeing on my desktop -- where that reshape would take about 6e-7 on desktop, it takes about 6e-6 here in the Run feature.
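The modified script is not shown in this comment. A guess at its general shape, for readers who want to repeat the comparison: each timed handle returns its result, and the bar heights are divided by the slowest time. The variable names T and relT and the axis label are assumptions made for this sketch.

```matlab
A0 = rand(256, 256, 128);
N  = numel(A0);
% Anonymous functions return their value, so timeit measures each with one output.
T = [timeit(@() A0(:)), ...
     timeit(@() reshape(A0, [], 1)), ...
     timeit(@() reshape(A0, N, 1))];
relT = T ./ max(T);     % the slowest variant is scaled to exactly 1
bar(categorical({'colon','reshapeNull','reshape'}), relT)
ylabel('Time relative to slowest')
```

Scaling by the slowest entry removes the absolute speed of the machine from the picture, which makes the desktop and Run-feature plots easier to compare side by side.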

Walter Roberson, 11 August 2021

@Adam Danz, I could use another pair of eyes in looking at this.

I noticed when I was running your code on my desktop, that every time I had a large timing outlier on colon. My tests showed that it was always the very first run. When I poked around, I realized that there had to be some kind of internal optimization going on. To reduce the effects of "premature optimization", I moved the operative code into functions, and I added recreation of A0 for each repetition.

Please run the code below with seperate set to true and then to false, and notice the substantial difference in rates between the runs.

To try to deal with the initial spike in timings for colon, I decided that I would call the work functions once, "prime the pump". That was not enough, so now I loop calling them several times, warm up the system, get all the Execution Engine optimization of the functions out of the way. But... with separate = false, I am still seeing the spike on duration(1,1) !!

The only thing I have been able to think of at the moment is that when I prime the pump, I am not saving the output of the calls to a variable, and that might be affecting the timing ??

By the way, have a look at the recorded d2 values -- the timing of the priming cycles. They are notably different than the other timings... and I see unexpected spikes early on, optimized times mixed with unoptimized times.

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
nIterations = 500;  % number of iterations to include within the timer
nReps = 100;        % number of times to repeat the process to measure variability
durations = nan(nReps,3);
d2 = nan(nReps,3);
seperate = false;
if ~seperate; A0 = rand(Nx, Ny, Nz); end
for i = 1:nReps
    
    if seperate; A0 = rand(Nx, Ny, Nz); end
    tic; for j = 1 : 5; A0_colon(A0,N); end; d2(i,1) = toc; %prime the pump
    T = tic; 
    for j = 1:nIterations
        y = A0_colon(A0,N); 
    end
    durations(i,1) = toc(T); 
    
    if seperate; A0 = rand(Nx, Ny, Nz); end
    tic; for j = 1 : 5; A0_reshape_null(A0,N); end; d2(i,2) = toc; %prime the pump
    T = tic; 
    for j = 1:nIterations
        y = A0_reshape_null(A0,N); 
    end
    durations(i,2) = toc(T); 
    
    if seperate; A0 = rand(Nx, Ny, Nz); end
    tic; for j = 1 : 5; A0_reshape_N(A0, N); end; d2(i,3) = toc;   %prime the pump
    T = tic; 
    for j = 1:nIterations
        y = A0_reshape_N(A0,N); 
    end
    durations(i,3) = toc(T);
end
figure
boxplot(durations, 'labels',{'colon','reshapeNull','reshape'})
grid on
ylabel(sprintf('Duration of %d iterations (sec)',nIterations))
xlabel('Method')
title(sprintf('Summary of tic/toc timing repeated %d times (real numbers).',nReps))
function y = A0_colon(A0,~)
    y = A0(:);
end
function y = A0_reshape_null(A0,~)
    y = reshape(A0, [], 1);
end
function y = A0_reshape_N(A0,N)
    y = reshape(A0, N, 1);
end

Walter Roberson, 11 August 2021

I had the hypothesis that the 5 might have to do with my having 4 cores, or might have to do with the number of priming iterations I did, so I tested on my system that has more cores, and I did more priming iterations. The result was the same: duration(1,1) still had the major peak, and duration(5,1) was reliably a secondary peak.

Adam Danz, 12 August 2021

I noticed that when I re-run it within a script without clearing variables, the second peak at x=5 vanishes. Still curious but out of ideas.

Answer 3

Matt J, 26 May 2022

https://jp.mathworks.com/matlabcentral/answers/887219-why-is-x-so-much-slower-than-reshape-x-n-1-with-complex-arrays#answer_972250

Edited: Matt J, 26 May 2022

I was just told by Tech Support that the issue was fixed in R2022a, but it doesn't appear that way:

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(Nx, Ny, Nz);
A0=complex(A0,A0);
timeit(@() A0(:), 0)
ans = 0.0530
timeit(@() use_reshape_null(A0, N), 0)
ans = 6.5199e-06
timeit(@() use_reshape_N(A0, N), 0)
ans = 6.8033e-06
function use_reshape_null(A, N)
    B = reshape(A, [], 1);
end
function use_reshape_N(A, N)
   B = reshape(A, N, 1);
end
