How to vectorize this for maximum efficiency?

tensorisation on 31 Aug 2019
Commented: Guillaume on 1 Sep 2019
In my calculation I had to go over non-zero elements of a 3 dimensional array (label). Instead of making 3 loops over each index of the array (which would be inefficient), I ended up doing something like this:
highest_label=max(max(max(label)));
clusters_size=zeros(1,highest_label);
label_lin=nonzeros(label(:));
for q=1:length(label_lin)
clusters_size(label_lin(q))=clusters_size(label_lin(q))+1;
end
Is there a way to get rid of this loop as well, by vectorizing, and make this even more efficient? I tried stuff like:
clusters_size(label_lin)=clusters_size(label_lin)+1;
But that doesn't give a correct result.
To further clarify what I need, I will give an example:
Say:
label_lin=[1;1;2;3;4;5;5;5];
clusters_size=zeros(1,5)
Now I want to count the different values of label_lin in clusters_size (this is what the for loop does), so that:
clusters_size=[2,1,1,1,3]
If I do something like:
clusters_size(label_lin)=clusters_size(label_lin)+1;
I will instead get:
clusters_size=[1,1,1,1,1]
Which is obviously incorrect.

Answers (1)

dpb on 31 Aug 2019
>> histc(label_lin,unique(label_lin))
ans =
2
1
1
1
3
>>
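Note that `histc(label_lin,unique(label_lin))` returns counts only for the labels that actually occur, so the result is shorter than `clusters_size` whenever some label between 1 and `highest_label` is absent. The `histc` documentation also lists it as not recommended. A minimal sketch of two alternatives that return one count per label, assuming the labels are positive integers as in the question:

```matlab
% Example data from the question
label_lin = [1;1;2;3;4;5;5;5];
highest_label = max(label_lin);

% accumarray adds the value 1 into the slot indexed by each label;
% the size argument keeps a zero for any label that does not occur
clusters_size = accumarray(label_lin, 1, [highest_label 1]).';

% histcounts with one unit-wide bin centred on each integer label
clusters_size_hc = histcounts(label_lin, 0.5:1:highest_label+0.5);

disp(clusters_size)      % 2 1 1 1 3
disp(clusters_size_hc)   % 2 1 1 1 3
```

Both avoid the problem with `clusters_size(label_lin)=clusters_size(label_lin)+1`, where the right-hand side is evaluated once and repeated indices simply overwrite the same element.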
  14 Comments
dpb on 1 Sep 2019
Yeah, has often been noted that histc is not as efficient as could be--I just wish TMW had chosen to work on it instead or at least not changed binning behavior in introducing histcounts. This incessant introduction of overlapping functionality and inconsistent syntax and behavior is really a pain and source of both confusion and error.
As far as "large", a double array of 512^3 elements is 1 GB (128 Mi elements at 8 bytes each). Once upon a time, that would have taxed the largest machines; now "entry-level" is 4 GB of physical memory. The key item is that the operation is inside a looping construct. However, as you note above, that's still not a significant fraction of the time in the loop, so even if it were cut to nil it wouldn't make all that much difference. The need to optimize brings along the need to find out what areas offer real gains overall, not just "peephole" optimization.
Looking at CC2periodic you could also profile it...not sure which output form you're using; notice that one branch uses accumarray and sort both...there might be some opportunity there.
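One way to act on the "measure first" advice is to time the counting step in isolation before deciding whether it is worth optimizing. The sketch below is illustrative only: the array size, label range, and variable names are made up, and it uses `tic`/`toc` on synthetic data rather than the actual `label` array. For a whole function, `profile on` followed by `profile viewer` gives a per-line breakdown instead.

```matlab
% Synthetic stand-in for the labelled volume (0 = background); sizes are assumptions
rng(0);
label = randi([0 1000], 64, 64, 64);
label_lin = nonzeros(label(:));
highest_label = max(label_lin);

% Loop version from the question
tic
count_loop = zeros(1, highest_label);
for q = 1:numel(label_lin)
    count_loop(label_lin(q)) = count_loop(label_lin(q)) + 1;
end
t_loop = toc;

% Vectorized version using accumarray
tic
count_acc = accumarray(label_lin, 1, [highest_label 1]).';
t_acc = toc;

fprintf('loop: %.4f s, accumarray: %.4f s\n', t_loop, t_acc);
```

If both timings are a small fraction of the total time per iteration of the outer loop, the counting step is not the place to look for gains.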
Whether it would suit your need or not, noticed a C source on GitHub that might be of some use if wanted to write a mex file.
Guillaume on 1 Sep 2019
It's not too hard to write your own version of bwconncomp or bwlabel so you could make it wrap around the edges. Whether or not an m file implementation will gain you anything over what you're using now remains to be seen. Certainly, if it were coded as a mex you could see a gain.



Release

R2018b
