How to sum parts of a matrix of different sizes, without using for loops?
ER2018
27 Jun 2018
I have a relatively large N×N matrix (N ≈ 20,000) and an N×1 vector identifying the indices that must be grouped together. I want to sum together parts of the matrix, which in principle can contain different numbers of elements, including non-adjacent ones. I quickly wrote a double for-loop that works correctly but is of course inefficient; the profiler identified these loops as one of the bottlenecks in my code. I tried to find a smart vectorization method: I explored the arrayfun, cellfun, and bsxfun functions and looked for solutions to similar problems, but I haven't found one yet. I'll be grateful for any help! ER
This is the test code with the two for-loops:
M=rand(10); % test matrix
idxM=[1 2 2 3 4 4 4 1 4 2]; % each element indicates to which group each row/column of M belongs
nT=max(idxM);
sumM=zeros(nT,nT);
for t1=1:nT
for t2=1:nT
sumM(t1,t2)=sum(sum(M(idxM==t1,idxM==t2)));
end
end
PS: Long-time reader, first-time poster.
5 comments
Ameer Hamza
27 Jun 2018
Is this line correct:
sumM(t1,t2)=sum(sum(M(idxM==t1,idxM==2)));
or should it be idxM==t2? If it is correct as written, the inner for-loop is useless.
Accepted Answer
Matt J
27 Jun 2018
Edited: Matt J, 27 Jun 2018
S=sparse(1:N,idxM,1); % N-by-nT indicator matrix: S(k,j)=1 iff row/column k belongs to group j
sumM=S.'*(M*S);       % sum M's columns by group, then the rows
13 comments
ER2018
27 Jun 2018
Edited: ER2018, 27 Jun 2018
This works nicely! I tested it for values of N in the range 1,000 to 20,000, and for values of nT in the range 10 to 1,000.
With respect to the initial for-loops:
- for nT=10, it runs 3-6 times faster
- for nT=100, it runs ~10 times faster
- for nT=1,000, it runs ~300 times faster (!)
...while using only 5-10% more RAM.
ER2018
27 Jun 2018
Sure, I just wanted to test it a bit to provide more complete feedback. I can also confirm that the solution worked when integrated into the real cases. Thanks a lot!
Now, to complete the "lesson", would you mind explaining a bit what the second line of the code does? It remains a bit obscure to me.
sumM=S.'*(M*S);
Matt J
28 Jun 2018
Edited: Matt J, 28 Jun 2018
If you study the structure of S in small examples, you should be able to see that A=M*S has elements
A(k,j)= sum( M(k, idxM==j) ,2);
Similarly, B=S.'*A does the same kind of summation along the columns of A,
B(i,j)= sum( A(idxM==i,j) ,1);
Combining these operations leads to
B(t1,t2) = sum( A(idxM==t1,t2) ,1)
         = sum( sum( M(idxM==t1, idxM==t2) ,2) ,1)
which is the desired result, with sumM=B.
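To make the identity concrete outside MATLAB, here is a small NumPy sketch of the same two-sided grouping (the matrix, groups, and variable names are made up for illustration; a dense 0/1 indicator stands in for the sparse S):

```python
import numpy as np

# Made-up 5x5 example; groups are 0-based here, unlike MATLAB's 1-based idxM.
M = np.arange(25, dtype=float).reshape(5, 5)
idx = np.array([0, 1, 0, 1, 1])                    # group of each row/column
nT = idx.max() + 1

# Dense analogue of S = sparse(1:N, idxM, 1): S[k, j] = 1 iff element k is in group j
S = (idx[:, None] == np.arange(nT)).astype(float)  # shape (5, 2)

A = M @ S            # A[k, j] = sum of M[k, idx == j]  (grouped column sums)
B = S.T @ A          # B[i, j] = sum of A[idx == i, j]  (grouped row sums)

# Reference: the explicit double loop over group pairs
ref = np.array([[M[np.ix_(idx == t1, idx == t2)].sum() for t2 in range(nT)]
                for t1 in range(nT)])
assert np.allclose(B, ref)
```

Every element of M is counted exactly once across the blocks, so B.sum() equals M.sum().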
ER2018
28 Jun 2018
Clear now. So basically one product sums along the columns, the other sums along the rows.
I'm trying to apply this elegant sparse-matrix and product approach to other problems. For example, 1D sum would only need one product. 1D average would need one product and one scalar division (by a vector describing the number of elements in each summed group).
What about other generic functions, not requiring sums? Can we profit from this method to apply a function @fun to parts of a matrix or vector?
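As a 1D sanity check of that idea (a NumPy sketch with made-up numbers; the helper names are mine):

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
idx = np.array([0, 1, 0, 1, 1])                    # group of each element (0-based)
nT = idx.max() + 1
S = (idx[:, None] == np.arange(nT)).astype(float)  # indicator matrix, shape (5, 2)

group_sums = v @ S                     # one product: [1+3, 2+4+5] = [4, 11]
group_sizes = S.sum(axis=0)            # elements per summed group: [2, 3]
group_means = group_sums / group_sizes # elementwise division for the 1D average

assert np.allclose(group_sums, [4.0, 11.0])
assert np.allclose(group_means, [2.0, 11.0 / 3.0])
```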
Matt J
28 Jun 2018
Edited: Matt J, 28 Jun 2018
The expression S.'*M*S is linear in M, so you couldn't use it to do anything non-linear to M just by selecting a different S. However, you could do things like
exp(S.'*log(M)*S)
This would be the same as taking products of all the group combinations instead of sums.
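The log/exp trick can be checked the same way (a NumPy sketch; the example values are mine, and M must be strictly positive for log to be safe):

```python
import numpy as np

M = np.array([[2.0, 3.0, 1.0],
              [4.0, 5.0, 2.0],
              [1.0, 2.0, 3.0]])
idx = np.array([0, 0, 1])                          # groups of rows/columns (0-based)
S = (idx[:, None] == np.arange(2)).astype(float)   # 0/1 indicator matrix

prodM = np.exp(S.T @ np.log(M) @ S)                # blockwise products instead of sums

# Block (0, 0) covers rows {0, 1} x cols {0, 1}: 2 * 3 * 4 * 5 = 120
assert np.isclose(prodM[0, 0], 120.0)
```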
Also, in my comment here, I pointed out how you could make the for-loops more efficient. You could adapt this to other separable reduction operations, e.g., maximization instead of summation:
for t2=1:nT
partials=max( M(:,idxM==t2) ,[],2);
for t1=1:nT
maxM(t1,t2)=max(partials(idxM==t1));
end
end
Finally, in the special case where all the sub-matrices are equal-sized tiles, there are lots of separable reductions you can do in a fully vectorized way; see SEPBLOCKFUN.
ER2018
29 Jun 2018
I'll answer from the bottom.
The sub-matrices are not necessarily equal-sized, so I didn't look into sepblockfun.
I implemented your improved version of for-loops, and it does improve the speed, but only by 10-20%.
My preferred solution remains the first one you proposed. And the example you presented, with the exp and log functions acting in combination with the sparse matrix S, is really cool!
Matt J
29 Jun 2018
Edited: Matt J, 29 Jun 2018
"I implemented your improved version of for-loops, and it does improve the speed, but only by 10-20%."
You have a weird computer. I generally see at least a factor of 2 speed-up, as in the tests below. I admit I expected more, but the main point was that for operations besides summation it gives you a bit more flexibility.
N=8000;
nT=400;
M=rand(N); % test matrix
idxM=randi(nT,1,N);
sumM=zeros(nT,nT);
tic
for t1=1:nT
for t2=1:nT
sumM(t1,t2)=sum(sum(M(idxM==t1,idxM==t2)));
end
end
toc
%Elapsed time is 6.343936 seconds.
tic;
for t2=1:nT
partials=sum( M(:,idxM==t2) ,2);
for t1=1:nT
sumM(t1,t2)=sum(partials(idxM==t1)); %1D sums
end
end
toc;
%Elapsed time is 3.067782 seconds.
Matt J
29 Jun 2018
Edited: Matt J, 29 Jun 2018
Here's an application of the same test to maximization, as opposed to summation. I optimized the second approach a bit more, and see a factor of 10 speed-up:
N=8000;
nT=400;
M=rand(N);
idxM=randi(nT,N,1);
maxM=zeros(nT,nT);
tic
for t1=1:nT
for t2=1:nT
maxM(t1,t2)=max(max(M(idxM==t1,idxM==t2)));
end
end
toc
%Elapsed time is 6.505527 seconds.
tic;
S=sparse(1:N,idxM,true);
for t2=1:nT
partials=max(M(:,S(:,t2)),[],2);
for t1=1:nT
maxM(t1,t2)=max(partials(S(:,t1))); %1D maxs
end
end
toc;
%Elapsed time is 0.585687 seconds.
ER2018
29 Jun 2018
I'm not sure if it affects the performance, but you might want to reinitialize maxM in the second test.
maxM=zeros(nT,nT);
That said, I do see the value of this alternative approach. As you said, it makes the approach more flexible when you don't need to sum.
ER2018
29 Jun 2018
Edited: ER2018, 29 Jun 2018
I'm finally in front of a PC and I ran a few tests.
Long story short: calculating the indices of the elements in each group once, before entering the for-loops, also has a very significant impact on speed. With this improvement, even the solution with two simple for-loops performs surprisingly well. Pre-calculating the indices also improves the performance of the "improved for-loops". But for summing, your very first two-liner remains the best by far!
For the case of the sum:
%%Test - Sum
N=8000;
nT=400;
M=rand(N); % test matrix
idxM=randi(nT,1,N);
sumM0=zeros(nT,nT);
sumM1=zeros(nT,nT);
sumM2=zeros(nT,nT);
sumM3=zeros(nT,nT);
sumM4=zeros(nT,nT);
% Test 0 - Reference
tic
for t1=1:nT
for t2=1:nT
sumM0(t1,t2)=sum(sum(M(idxM==t1,idxM==t2)));
end
end
toc
% Elapsed time is 10.232443 seconds.
% Code 1 - Test
tic;
for t2=1:nT
partials=sum( M(:,idxM==t2) ,2);
for t1=1:nT
sumM1(t1,t2)=sum(partials(idxM==t1)); %1D sums
end
end
toc;
% Elapsed time is 5.098394 seconds.
isequal(sumM0,sumM1)
% 0
% => Solution matrix not identical to the reference
mean(mean(abs(sumM0-sumM1)))
% Average error: 2.8107e-14
% Code 2 - Test
tic
% Calculate once the indices
idxT=cell(nT,1);
for t=1:nT
idxT{t}=find(idxM==t);
end
for t1=1:nT
for t2=1:nT
sumM2(t1,t2)=sum(sum(M(idxT{t1},idxT{t2})));
end
end
toc
% Elapsed time is 1.512766 seconds.
isequal(sumM0,sumM2)
% 1
% => Solution matrix identical to the reference
% Code 3 - Test
tic;
% Calculate once the indices
idxT=cell(nT,1);
for t=1:nT
idxT{t}=find(idxM==t);
end
for t2=1:nT
partials=sum( M(:,idxT{t2}) ,2);
for t1=1:nT
sumM3(t1,t2)=sum(partials(idxT{t1})); %1D sums
end
end
toc;
% Elapsed time is 0.487134 seconds.
isequal(sumM0,sumM3)
% 0
% => Solution matrix not identical to the reference
mean(mean(abs(sumM0-sumM3)))
% Average error: 2.8107e-14
% Code 4 - Test
tic;
S=sparse(1:N,idxM,1);
sumM4=S.'*(M*S);
toc;
% Elapsed time is 0.070906 seconds.
isequal(sumM0,sumM4)
% 0
% => Solution matrix not identical to the reference
mean(mean(abs(sumM0-sumM4)))
% Average error: 2.8107e-14
Reference: unoptimized double for-loop.
Test 1 (improved for-loops): ~2x faster.
Test 2 (basic for-loops with pre-calculated indices): ~7x faster.
Test 3 (improved for-loops with pre-calculated indices): ~21x faster.
Test 4 (matrix products with the sparse index matrix): ~140x faster.
Tests 1, 3, 4: the resulting matrix is almost identical to the reference, with a small floating-point difference (on the order of 1e-14 on average). Test 2: the resulting matrix is identical to the reference.
For the case of the max:
%%Test - Max
N=8000;
nT=400;
M=rand(N); % test matrix
idxM=randi(nT,1,N);
maxM0=zeros(nT,nT);
maxM1=zeros(nT,nT);
maxM2=zeros(nT,nT);
maxM3=zeros(nT,nT);
maxM4=zeros(nT,nT);
% Test 0 - Reference
tic
for t1=1:nT
for t2=1:nT
maxM0(t1,t2)=max(max(M(idxM==t1,idxM==t2)));
end
end
toc
% Elapsed time is 10.302640 seconds.
% Code 1 - Test
tic;
for t2=1:nT
partials=max( M(:,idxM==t2) ,[],2);
for t1=1:nT
maxM1(t1,t2)=max(partials(idxM==t1));
end
end
toc;
% Elapsed time is 4.789534 seconds.
isequal(maxM0,maxM1)
% 1
% => Solution matrix identical to the reference
% Code 2 - Test
tic;
S=sparse(1:N,idxM,true);
for t2=1:nT
partials=max(M(:,S(:,t2)),[],2);
for t1=1:nT
maxM2(t1,t2)=max(partials(S(:,t1))); %1D maxs
end
end
toc;
% Elapsed time is 0.615749 seconds.
isequal(maxM0,maxM2)
% 1
% => Solution matrix identical to the reference
% Code 3 - Test
tic
% Calculate once the indices
idxT=cell(nT,1);
for t=1:nT
idxT{t}=find(idxM==t);
end
for t1=1:nT
for t2=1:nT
maxM3(t1,t2)=max(max(M(idxT{t1},idxT{t2})));
end
end
toc
% Elapsed time is 1.639348 seconds.
isequal(maxM0,maxM3)
% 1
% => Solution matrix identical to the reference
% Code 4 - Test
tic;
% Calculate once the indices
idxT=cell(nT,1);
for t=1:nT
idxT{t}=find(idxM==t);
end
for t2=1:nT
partials=max( M(:,idxT{t2}) ,[],2);
for t1=1:nT
maxM4(t1,t2)=max(partials(idxT{t1}));
end
end
toc;
% Elapsed time is 0.569280 seconds.
isequal(maxM0,maxM4)
% 1
% => Solution matrix identical to the reference
Reference: unoptimized double for-loop.
Test 1 (improved for-loops): ~2x faster.
Test 2 (improved for-loops, using the sparse index matrix): ~17x faster.
Test 3 (basic for-loops with pre-calculated indices): ~6x faster.
Test 4 (improved for-loops with pre-calculated indices): ~18x faster.
For Tests 1-4, the resulting matrix is always identical to the reference.
ER2018
29 Jun 2018
I agree... the first time I tried it, it seemed like magic. And that is why I was interested in trying the approach with other functions.
More Answers (1)
Matt J
29 Jun 2018
Edited: Matt J, 29 Jun 2018
Here's another method. It isn't as fast as the sparse-matrix approach, but it is loop-free and adaptable to other operations (min, max, etc.):
tmp=splitapply(@(z)sum(z,2),M,idxM.');
sumM=splitapply(@(z)sum(z,1),tmp,idxM);
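For comparison, the same two-pass grouped reduction can be sketched in NumPy with argsort plus reduceat (this is my own translation of the idea, not a direct splitapply equivalent; the example data is made up):

```python
import numpy as np

M = np.arange(16, dtype=float).reshape(4, 4)
idx = np.array([0, 1, 0, 1])                       # group label of each row/column
order = np.argsort(idx, kind="stable")             # bring each group's members together
starts = np.searchsorted(idx[order], np.arange(idx.max() + 1))

tmp = np.add.reduceat(M[:, order], starts, axis=1)     # pass 1: reduce columns by group
sumM = np.add.reduceat(tmp[order, :], starts, axis=0)  # pass 2: reduce rows by group

assert np.allclose(sumM, [[20.0, 24.0], [36.0, 40.0]])
```

Swapping np.add for np.maximum (or another ufunc) gives the other separable reductions discussed above.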
11 comments
ER2018
29 Jun 2018
Running this code with the max function gives me an error:
tmp=splitapply(@(z)max(z,2),M,idxM.');
Error using splitapply (line 130) The function '@(z)max(z,2)' returned a non-scalar value when applied to the 1st group of data.
To compute nonscalar values for each group, create an anonymous function to return each value in a scalar cell:
@(z){max(z,2)}
---
I then tried to define an anonymous function, but I must have done something wrong, since I get another error:
anonymFun=@(z){max(z,2)};
tmp=splitapply(@(z)anonymFun,M,idxM.');
Error using horzcat Nonscalar arrays of function handles are not allowed; use cell arrays instead.
Error in splitapply>localapply (line 228) finalOut{j} = horzcat(funOut{j,:});
Error in splitapply (line 130) varargout = localapply(fun,splitdata,gdim,nargout);
ER2018
29 Jun 2018
The speed changes considerably for different nT.
T1: Two basic for-loops
T2: Two optimized for-loops, with sparse matrix of indices
T3: splitapply-based
N=20,000, nT= 4, T1/T2/T3: 2.2/1.6/4.4 s
N=20,000, nT= 40, T1/T2/T3: 4.4/1.7/4.8 s
N=20,000, nT=400, T1/T2/T3: 27.0/2.1/5.8 s
T2 solution is always fastest though.
N=20000;
nT=400;
M=rand(N);
idxM=randi(nT,N,1);
maxM1=zeros(nT,nT);
maxM2=zeros(nT,nT);
tic
for t1=1:nT
for t2=1:nT
maxM1(t1,t2)=max(max(M(idxM==t1,idxM==t2)));
end
end
toc
tic;
S=sparse(1:N,idxM,true);
for t2=1:nT
partials=max(M(:,S(:,t2)),[],2);
for t1=1:nT
maxM2(t1,t2)=max(partials(S(:,t1))); %1D maxs
end
end
toc;
isequal(maxM1,maxM2)
tic;
tmp=splitapply(@(z)max(z,[],2),M,idxM.');
maxM3=splitapply(@(z)max(z,[],1),tmp,idxM);
toc;
isequal(maxM1,maxM3)
ER2018
29 Jun 2018
I ran another test for another application: summing the logarithm of the elements of each group.
The sparse-matrix double product remains the fastest, especially for large nT (try 400...).
N=20000;
nT=40;
M=rand(N);
idxM=randi(nT,N,1);
% Code 0 - Reference
tic
logM1=zeros(nT,nT);
for t1=1:nT
for t2=1:nT
logM1(t1,t2)=sum(sum(log(M(idxM==t1,idxM==t2))));
end
end
toc
% Elapsed time is 8.881048 seconds.
% Code 2 - Test
tic;
logM2=zeros(nT,nT);
% Calculate once the indices
idxT=cell(nT,1);
for t=1:nT
idxT{t}=find(idxM==t);
end
tempM=log(M);
for t2=1:nT
partials=sum( tempM(:,idxT{t2}) ,2);
for t1=1:nT
logM2(t1,t2)=sum(partials(idxT{t1})); %1D sums
end
end
toc;
% Elapsed time is 5.128909 seconds.
isequal(logM1,logM2)
% 0 --> not identical
mean(mean(abs((logM1-logM2)./logM1)))
% Code 4 - Test
tic;
S=sparse(1:N,idxM,true);
logM4=S.'*log(M)*S;
toc;
% Elapsed time is 4.526294 seconds.
isequal(logM1,logM4)
I wasn't able to make the splitapply version work. Honestly, I haven't studied the function yet.
% Code 3 - Test
% NOT WORKING
tic;
tmp=splitapply(@(z)log(z),M,idxM.');
logM3=splitapply(@(z)log(z),tmp,idxM);
toc;
isequal(logM1,logM3)
Matt J
29 Jun 2018
"T2 solution is always fastest though."
I don't think that's to be trusted. With N=20000, nT=1000, I get T2=3.666187 s, T3=2.886430 s.
ER2018
29 Jun 2018
It seems you have a faster PC. Does that mean you run T2 for N=20000, nT=400 in much less than 2.1 s?
ER2018
29 Jun 2018
We're talking about the case with log, right? It's strange that T2 runs slower for you than for me, while T3 runs faster.
By the way, I'm using MATLAB R2015b, and I have 32 GB of RAM.
Matt J
29 Jun 2018
"We're talking about the case with log, right?"
No. When you wrote
N=20,000, nT=400, T1/T2/T3: 27.0/2.1/5.8 s
I assumed you were talking about the max() case, since your code for that appears right afterward.
Matt J
29 Jun 2018
Edited: Matt J, 29 Jun 2018
My fastest version of sum(log), and also the most RAM-efficient, is as follows,
% Code 3 - Test
tic;
tmp=splitapply(@(z)log(prod(z,2)),M,idxM.');
logM3=splitapply(@(z)sum(z,1),tmp,idxM);
T3=toc
It's potentially dangerous because, depending on nT and the magnitudes of M(i,j), the prod could overflow or underflow. However, with N=2e4, nT=400, and M=rand(N) this does not happen, and I get T3=2.54 sec.