Can the efficiency of this code be improved, either computationally or just in terms of lines of code?

Question

James Akula, December 5, 2022

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/1871562-can-the-efficienty-of-this-code-be-improved-either-computationally-or-just-in-terms-of-lines-of-cod

Edited: Jan, December 5, 2022

Dumb question for a smart person who has a moment to kill.

Let's say I have data that will come in from n groups, and I know a priori those groups will be numbered 1 through n in some variable, A. I will have a second variable, B, that contains the data. Then, I want to get (for example) the mean of the data in each group. It is easy to pull off with a loop, but is there better code I could be using for this procedure? For a small example dataset, I might have

A = [2; 3; 1; 2; 2; 3; 1; 2; 2; 3];
B = [4.10047; 7.44549; 3.62159; 6.56964; 2.87221; 4.51231; 4.01697; 5.60534; 5.5440; 7.07802];
tic
%%% Can this be done better or in one line of code? %%%
C = NaN(max(A), 1);
for ii = 1:numel(C)
    C(ii) = mean(B(A == ii));
end
%%% Can this be done better or in one line of code? %%%
toc
Elapsed time is 0.004956 seconds.
disp(C)
    3.8193
    4.9383
    6.3453
bar(C)

Is there a better way to do this?


Answer 1 (accepted answer)

Jan, December 5, 2022

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/1871562-can-the-efficienty-of-this-code-be-improved-either-computationally-or-just-in-terms-of-lines-of-cod#answer_1120477

Edited: Jan, December 5, 2022

A0 = [2; 3; 1; 2; 2; 3; 1; 2; 2; 3];
B0 = [4.10047; 7.44549; 3.62159; 6.56964; 2.87221; 4.51231; 4.01697; 5.60534; 5.5440; 7.07802];
A = repmat(A0, 1e6, 1);  % Let Matlab work with more than tiny data
B = repmat(B0, 1e6, 1);
tic
C = NaN(max(A), 1);
for ii = 1:numel(C)
    m = A == ii;          % logical mask of the elements in group ii
    C(ii) = mean(B(m));   % mean, not sum, to match the comparisons below
end
toc
Elapsed time is 0.132737 seconds.

Shorter but slower:

tic
D = accumarray(A, B, [], @mean);
toc
Elapsed time is 0.775913 seconds.
isequal(C, D)
ans = logical
   1

Another approach:

tic
S = zeros(max(A), 1);
N = zeros(size(S));
for k = 1:numel(A)
    m    = A(k);
    S(m) = S(m) + B(k);
    N(m) = N(m) + 1;
end
E = S ./ N;
toc
Elapsed time is 0.091502 seconds.
isequal(C, E)  % Not equal!!!
ans = logical
   0
% But the differences are caused by rounding only:
(C - E) ./ C
ans = 3×1
1.0e-10 *

   -0.0674
    0.2422
   -0.1365

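The difference is caused by rounding: floating-point addition is not associative, so the methods, which add the values in a different order, can differ in the last digits. Comparing the results with the group means of the small data set (A0 and B0) shows that all methods have comparable accuracy.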
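The sensitivity of sums to the order of the additions can be reproduced without any grouping. The following sketch uses made-up values (not the data from the question), chosen so that the effect is large enough to see directly; the names x, s1 and s2 are illustrative only.

```matlab
x = [1e16; 1; -1e16; 1];

% The same four numbers, added in two different orders
s1 = ((x(1) + x(2)) + x(3)) + x(4);   % the first 1 is absorbed by 1e16
s2 = ((x(1) + x(3)) + x(2)) + x(4);   % the large terms cancel first

fprintf('s1 = %g, s2 = %g\n', s1, s2)   % prints "s1 = 1, s2 = 2"
```

With the well-scaled values of the benchmark the effect is far smaller, which is consistent with the relative differences on the order of 1e-11 shown above.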

Locally under R2018b I get these timings:

Elapsed time is 0.205890 seconds.  % Original
Elapsed time is 0.512173 seconds.  % ACCUMARRAY
Elapsed time is 0.061097 seconds.  % Loop over inputs
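As a possible middle ground between the one-liner and the explicit loop over the inputs, the same sum-and-count idea can be written with two accumarray calls that use the built-in default summation instead of a function handle. This is only a sketch: it was not timed in this thread, so whether it beats the loops on a given release is an assumption to check. The names S, N and E mirror the loop above, and the small data set is used so the result is easy to verify.

```matlab
A = [2; 3; 1; 2; 2; 3; 1; 2; 2; 3];
B = [4.10047; 7.44549; 3.62159; 6.56964; 2.87221; 4.51231; 4.01697; 5.60534; 5.5440; 7.07802];

S = accumarray(A, B);   % per-group sums (the default function is sum)
N = accumarray(A, 1);   % per-group counts
E = S ./ N;             % per-group means; NaN (0/0) for an empty group

% Reference result from the original loop
C = NaN(max(A), 1);
for ii = 1:numel(C)
    C(ii) = mean(B(A == ii));
end
disp(max(abs(E - C) ./ abs(C)))   % expected to be at rounding level
```

As with the loop over the inputs, the result may differ from mean in the last digits, so compare with a tolerance rather than with isequal.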

2 Comments

James Akula, December 5, 2022

Edited: Torsten, December 5, 2022

Thanks. Looks like there are multiple ways to trim the code, but no way to do it faster.

I took your repmat modification and added Steven Lord's answer, below, and the original loop looks like the clear winner.

A = [2; 3; 1; 2; 2; 3; 1; 2; 2; 3];
B = [4.10047; 7.44549; 3.62159; 6.56964; 2.87221; 4.51231; 4.01697; 5.60534; 5.5440; 7.07802];
A = repmat(A, 1e6, 1);  % Let Matlab work with more than tiny data
B = repmat(B, 1e6, 1);
tic
C = NaN(max(A), 1);
for ii = 1:numel(C)
    C(ii) = mean(B(A == ii));
end
toc
Elapsed time is 0.157308 seconds.
tic
D = accumarray(A, B, [], @mean);
toc
Elapsed time is 0.873463 seconds.
tic
[E] = groupsummary(B, A, @mean)
E = 3×1
    3.8193
    4.9383
    6.3453
toc
Elapsed time is 2.450524 seconds.
tic
F = arrayfun(@(i)mean(B(A == i)),1:max(A)).';
toc
Elapsed time is 0.156121 seconds.
isequal(C, D, E, F)
ans = logical
   1
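Since a single tic/toc measurement includes one-off costs and varies from run to run, the comparison could also be repeated with timeit, which calls a function handle several times and returns the median of the measurements. The following is a sketch, not a measurement from this thread: the handle names fArrayfun and fAccum are made up for illustration, and the data size is reduced (1e5 instead of 1e6 repetitions) to keep the run short.

```matlab
A0 = [2; 3; 1; 2; 2; 3; 1; 2; 2; 3];
B0 = [4.10047; 7.44549; 3.62159; 6.56964; 2.87221; 4.51231; 4.01697; 5.60534; 5.5440; 7.07802];
A = repmat(A0, 1e5, 1);
B = repmat(B0, 1e5, 1);

% Wrap each candidate in a zero-argument function handle
fArrayfun = @() arrayfun(@(i) mean(B(A == i)), 1:max(A)).';
fAccum    = @() accumarray(A, B, [], @mean);

% timeit runs each handle repeatedly and reports a typical run time
tArrayfun = timeit(fArrayfun);
tAccum    = timeit(fAccum);
fprintf('arrayfun:   %.4f s\naccumarray: %.4f s\n', tArrayfun, tAccum)
```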

Torsten, December 5, 2022

"I took your repmat modification and added Steven Lord's answer, below, and the original loop looks like the clear winner."

Or "arrayfun" (see above).

Answer 2

Steven Lord, December 5, 2022

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/1871562-can-the-efficienty-of-this-code-be-improved-either-computationally-or-just-in-terms-of-lines-of-cod#answer_1120487

A = [2; 4; 1; 2; 2; 4; 1; 2; 2; 4];
B = [4.10047; 7.44549; 3.62159; 6.56964; 2.87221; 4.51231; 4.01697; 5.60534; 5.5440; 7.07802];
[C, groupnumbers] = groupsummary(B, A, @mean)
C = 3×1
    3.8193
    4.9383
    6.3453
groupnumbers = 3×1
     1
     2
     4

The groupnumbers output can help if some elements in 1:n don't appear in A (as is the case using the modified A I used in this example where all the 3's are replaced by 4's.)
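For comparison, here is a sketch of how the same gap shows up with accumarray, using the modified A from this answer. By default accumarray fills groups without data with 0, which is easy to mistake for a real mean, so a fill value is passed as the fifth argument. The names D and counts are illustrative.

```matlab
A = [2; 4; 1; 2; 2; 4; 1; 2; 2; 4];   % group 3 never occurs
B = [4.10047; 7.44549; 3.62159; 6.56964; 2.87221; 4.51231; 4.01697; 5.60534; 5.5440; 7.07802];

% The fifth argument is the fill value for groups without data (default 0)
D = accumarray(A, B, [], @mean, NaN);

% Count the members of each group to recover which group numbers are present
counts = accumarray(A, 1);
groupnumbers = find(counts > 0);

disp(D.')              % the third entry is NaN
disp(groupnumbers.')   % groups 1, 2 and 4
```

Here D has one entry per value in 1:max(A), while the groupsummary result above only has entries for the groups that occur.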

1 Comment

James Akula, December 5, 2022

I knew there had to be one line of code that did this. Thanks!
