How can I speed up this indexing code?

Question

bhousden on 28 Oct 2021


Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/1574113-how-can-i-speed-up-this-indexing-code

Commented: bhousden on 29 Oct 2021

Accepted answer: Ive J


I have a cell array in which each cell contains a string, e.g.:

a={'AAAAAA';'BBBBBB';'CCCCCC';'AAAAAA';'DDDDDD'};

Each cell in array a is associated with a row in a numerical array that contains 10 columns, e.g.:

b=[0,0,0,1,1,0,0,0,0,0;
   0,1,0,0,0,0,0,0,0,0;
   1,0,1,0,1,0,2,0,0,0;
   3,0,0,0,0,0,0,0,0,1;
   0,0,0,0,0,0,2,0,0,1];

Some strings in a are repeated, such as 'AAAAAA' above. What I need to do is find all repeated cells in a and sum the associated rows of b into a single row. This should result in two new arrays (unia and bnew) that have the same number of rows, with every string in unia unique.

Easy enough to do with a loop such as:

unia=unique(a);
bnew=zeros(numel(unia),10);
for n=1:numel(unia)
    pos=find(strcmp(a,unia{n}));
    bnew(n,:)=sum(b(pos,:),1);
end

This works fine for small arrays but I have a case where a has 6 million cells and unia has 300,000 cells so I need something much faster. Any ideas?

Thanks!


Answer 1

Ive J on 28 Oct 2021


Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/1574113-how-can-i-speed-up-this-indexing-code#answer_819083


Avoid comparing strings within the loop and instead take advantage of the index vector from unique:

a = ["A", "B2", "A", "C", "AA", "B2", "B2"]; % use a string array instead of a cell array of character vectors; strings are much more efficient to work with
b = randi([0 2], numel(a), 3)
b = 7×3
     1     0     1
     0     1     2
     0     1     2
     0     2     1
     2     0     2
     2     2     0
     0     2     1
[anew, ~, idx] = unique(a);
bnew = arrayfun(@(x) sum(b(x == idx, :), 1), 1:numel(anew), 'UniformOutput', false);
bnew = vertcat(bnew{:})
bnew = 4×3
     1     1     3
     2     0     2
     2     5     3
     0     2     1
anew
anew = 1×4 string array
    "A"    "AA"    "B2"    "C"

Also, you can use tall arrays when dealing with large arrays.
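
For the 6-million-row case, note that the arrayfun call above still scans all of idx once per unique string. One possible alternative, offered as a sketch rather than a benchmarked solution, is to pass the same idx vector from unique to accumarray, which sums each column of b by group. The example uses the data from the question; the column loop and the variable name ncols are illustrative choices, not part of the original answer.

```matlab
% Data from the question
a = {'AAAAAA';'BBBBBB';'CCCCCC';'AAAAAA';'DDDDDD'};
b = [0,0,0,1,1,0,0,0,0,0;
     0,1,0,0,0,0,0,0,0,0;
     1,0,1,0,1,0,2,0,0,0;
     3,0,0,0,0,0,0,0,0,1;
     0,0,0,0,0,0,2,0,0,1];

% idx(i) is the row of unia that a{i} maps to
[unia, ~, idx] = unique(a);

ncols = size(b, 2);                  % illustrative name; 10 in the question
bnew  = zeros(numel(unia), ncols);
for k = 1:ncols
    % Sum column k of b within each group defined by idx
    bnew(:, k) = accumarray(idx, b(:, k), [numel(unia) 1]);
end

% Row 1 of bnew is the sum of rows 1 and 4 of b (both 'AAAAAA'):
% 3 0 0 1 1 0 0 0 0 1
```

The loop here runs once per column (10 iterations in the question) rather than once per unique string, so its cost should scale with the number of rows of b instead of rows times unique strings. Timing on the real data is still needed to confirm the gain.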

1 Comment

bhousden on 29 Oct 2021

Perfect! Thanks for your help.
