Matlab find unique column-combinations in matrix and respective index

Question

Benvaulter, 22 March 2017

https://jp.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index

Edited: Jan, 23 March 2017

I have a large matrix with many rows and a limited number of columns (more than 1) containing values between 0 and 9. I would like to find an efficient way to identify the unique row-wise combinations and their indices, so that I can then build sums (somewhat like pivot logic). Here is an example of what I am trying to achieve:

a =

   1     2     3
   2     2     3
   3     2     1
   1     2     3
   3     2     1

uniqueCombs =

   1     2     3
   2     2     3
   3     2     1

numOccurrences =

 2
 1
 2

indices =

[1;4]
[2]
[3;5]

From matrix a, I first want to identify the unique row-wise combinations, then count the number of occurrences and identify the row indices of each combination.

I have achieved this by generating strings with num2str and strcat, but this method appears to be very slow. Along the same lines, I have tried to form a new unique number by concatenating the values horizontally, but MATLAB does not seem to support this directly (e.g. build 123 from [1 2 3]). Sums won't work, because different combinations can share the same sum, so the unique combinations could no longer be told apart. Any suggestions on how best to achieve this? Thanks!

Answer 1

Guillaume, 22 March 2017

https://jp.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259890

More or less the same as Jan's, using accumarray instead of splitapply (I'm still old school!):

A = [ 1     2     3
      2     2     3
      3     2     1
      1     2     3
      3     2     1];
[B, ~, ib] = unique(A, 'rows');     % B: unique rows; ib(k): row of B that equals A(k, :)
numoccurences = accumarray(ib, 1);  % number of rows of A in each group
indices = accumarray(ib, find(ib), [], @(rows){rows});  % find(ib) simply generates (1:size(A,1))'
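For the sums mentioned in the question (the pivot-style step), the same group index can be reused. This is a minimal sketch, assuming a hypothetical value vector v with one entry per row of A; v and groupSums are illustrative names that do not appear in the original question:

```matlab
A = [1 2 3; 2 2 3; 3 2 1; 1 2 3; 3 2 1];
v = [10; 20; 30; 40; 50];          % hypothetical values, one per row of A

[B, ~, ib] = unique(A, 'rows');    % ib(k) is the row of B that equals A(k, :)
groupSums = accumarray(ib, v);     % sum of v over the rows of each group

disp([B, groupSums]);              % each unique row followed by its sum
```

Other aggregations work the same way by passing a function handle as the fourth argument of accumarray, for example @max or @mean.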

Guillaume, 23 March 2017

Edited: Guillaume, 23 March 2017

I suspect that accumarray will be faster, as it is built-in compiled code whereas splitapply is M-code, but I haven't run any tests.

Note: for the indices,

indices = accumarray(ib, (1:numel(ib))', [], @(rows){rows});

is probably slightly faster, just not as concise.
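Both forms produce the same row numbers because the group index never contains a zero, so find returns every position. A small check of that equivalence, using the group index that the example matrix yields (the variable names are only illustrative):

```matlab
ib = [1; 2; 3; 1; 3];              % group index of the example matrix A
rowsViaFind  = find(ib);           % ib has no zeros, so every position is returned
rowsViaColon = (1:numel(ib))';     % the explicit row numbers 1..n
same = isequal(rowsViaFind, rowsViaColon);
disp(same);
```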

Jan, 23 March 2017

Edited: Jan, 23 March 2017

@Guillaume: Compare this with cellfun. Older versions of MATLAB shipped the C source of this MEX function. There, calling a function handle is very expensive, because the call has to go back to the MATLAB level. That is why the built-in methods selected by a string, such as 'length' or 'isclass', are much faster.

So using a compiled MEX function is not a real benefit in that case, because mexCallMATLAB has some overhead. This might apply to accumarray as well. I guess that your accumarray approach is faster than the loop, but I know that it looks very cryptic ;-)
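To illustrate the two cellfun calling styles referred to above, here is a small sketch. The cell C is a made-up test input, and any speed difference depends on the MATLAB release, so no timings are claimed here:

```matlab
C = {1:3, 'abcd', [], {1, 2}};                 % hypothetical test cell

lenByName   = cellfun('length', C);            % legacy string syntax, handled inside cellfun
lenByHandle = cellfun(@length, C);             % function handle, called once per element
isDbl       = cellfun('isclass', C, 'double'); % true where the element is of class double

disp(isequal(lenByName, lenByHandle));         % both styles return the same lengths
```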

But now I can leave the speculations and run a test: With

A = randi([1, 100], 1e5, 3); % Test data

my loop takes 14.75 seconds, your accumarray approach takes 0.44 seconds. The results differ in the order of the indices. So perhaps this is wanted:

[B, iB, iA] = unique(A, 'rows');
indices     = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});

The result is clear: @Benvaulter, please unaccept my answer and select Guillaume's, and of course use it also to save time and energy.
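A comparison along these lines can be sketched as follows. This is an assumed harness, not the exact script used for the timings above; it uses fewer rows so that the loop finishes quickly, and the variable names are illustrative. The sort inside the anonymous function matters because accumarray does not guarantee the order in which values reach the function when the subscripts are unsorted.

```matlab
A = randi([1, 100], 1e4, 3);            % smaller test data than in the comment above
[B, ~, iA] = unique(A, 'rows');

tic;
idxAccum = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});
tAccum = toc;

tic;
n = size(B, 1);
idxLoop = cell(n, 1);
for k = 1:n
    idxLoop{k} = find(iA == k);         % find returns ascending row numbers
end
tLoop = toc;

fprintf('accumarray: %.3f s, loop: %.3f s\n', tAccum, tLoop);
sameResult = isequal(idxAccum, idxLoop);
```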

Answer 2

Jan, 22 March 2017

https://jp.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259879

Edited: Jan, 23 March 2017

A = [ 1     2     3; ...
      2     2     3; ...
      3     2     1; ...
      1     2     3; ...
      3     2     1];
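[B, iB, iA] = unique(A, 'rows');              % iA(k) is the row of B that equals A(k, :)
G = unique(iA);                               % group numbers 1..size(B, 1)
numOccurrences = splitapply(@numel, iA, iA);  % count the members of each group (the grouping vector must have one entry per row of A)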

I cannot test a method to obtain the index lists as wanted. I assume this also works with splitapply. Here is at least a simple loop approach:

n = length(G);
indices = cell(1, n);
for k = 1:n
  indices{k} = find(iA == G(k));
end
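The splitapply variant mentioned above could look like the following sketch. It assumes a release that provides splitapply (R2015b or later), and rowNumbers is an illustrative name. The anonymous function wraps each group's row numbers in a cell, since splitapply expects one output element per group:

```matlab
A = [1 2 3; 2 2 3; 3 2 1; 1 2 3; 3 2 1];
[B, ~, iA] = unique(A, 'rows');

rowNumbers = (1:numel(iA)).';                    % one row number per row of A
indices = splitapply(@(r){r}, rowNumbers, iA);   % cell array with one index list per unique row

celldisp(indices);
```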

[EDITED] The code is tested now. Use Guillaume's much faster solution for production work.

Benvaulter, 23 March 2017

Perfect solution to my problem - thanks a lot!
