MATLAB Answers


Finding the index of duplicate rows in a matrix

Asked by Danielle Leblance on 6 Feb 2017
Latest activity: edited by Stephen Cobeldick on 6 Feb 2017
I have a 275935x2 matrix M. I want to remove duplicate rows. I tried two methods:
Method 1)
M2=unique(M,'rows') % and it gave M2 179109x2 double
Method 2)
x0=find(hist(M,unique(M))>1); % it gave only 8301 duplicate values.
Which method is correct? I want to find the indices of the duplicate rows, not simply remove them. Any help is appreciated.



1 Answer

Stephen Cobeldick
Answer by Stephen Cobeldick on 6 Feb 2017
Edited by Stephen Cobeldick on 6 Feb 2017

>> A = randi(1e4,275935,2);
>> [B,~,Y] = unique(A,'rows','stable');
>> [C,X] = hist(Y,unique(Y));
>> Z = ismember(Y,X(C>1)); % logical mask: true for rows of A that occur more than once
For example this random data set had
>> nnz(Z)
ans =
rows that occur more than once. To get the indices of the duplicate rows (every occurrence after the first), try this:
[U,W] = unique(A,'rows','stable');
D = setdiff(1:size(A,1),W); %indices of duplicate rows.
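
The difference between the two results above (all rows that belong to a repeated group versus only the later occurrences) may be easier to see on a tiny matrix. The following is a minimal sketch, not part of the original answer: the 6x2 matrix is made up for illustration, and accumarray is swapped in for hist to do the counting.

```matlab
% Hypothetical 6x2 example matrix (not from the thread).
A = [1 2; 3 4; 1 2; 5 6; 3 4; 1 2];

% B: distinct rows in order of first appearance.
% W: index of the first occurrence of each distinct row.
% Y: group number of every row of A; identical rows share a number.
[B,W,Y] = unique(A,'rows','stable');

% Count how often each distinct row occurs (used here instead of hist).
counts = accumarray(Y,1);

% Logical mask: true for every row whose content occurs more than once.
Z = counts(Y) > 1;

% Indices of the second and later occurrences only.
D = setdiff(1:size(A,1),W);

disp(find(Z).')   % rows belonging to any repeated group
disp(D)           % rows to delete so that only first occurrences remain
```

Here find(Z) also lists the first occurrence of each repeated row, while D leaves it out, so D is the set to delete when one copy of each row should be kept.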

  2 Comments

I am sure there is something wrong. I am attaching the data. The unique function gives a matrix B that is different from the one I obtain if I remove the duplicates Z.
Stephen Cobeldick
6 Feb 2017
This works for me:
>> load matlab.mat
>> [B,W] = unique(M,'rows','stable');
>> D = setdiff(1:size(M,1),W); % indices of duplicate rows.
And now compare:
>> M(D,:) = [];
>> isequal(M,B)
ans =
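
The same comparison can be tried without the attached file. This is only a sketch on a made-up 5x2 matrix standing in for the data in matlab.mat; the copy M2 is an added name so that the original M is left untouched.

```matlab
% Hypothetical small matrix standing in for the attached data.
M = [7 1; 2 9; 7 1; 2 9; 4 4];

% B: distinct rows in original order, W: first-occurrence indices.
[B,W] = unique(M,'rows','stable');

% Indices of every occurrence after the first.
D = setdiff(1:size(M,1),W);

% Delete the later occurrences from a copy of M.
M2 = M;
M2(D,:) = [];

% True when deleting by index and calling unique give the same matrix.
tf = isequal(M2,B);
disp(tf)
```

The 'stable' option matters for this check: without it, unique returns the rows in sorted order, so the two matrices would generally contain the same rows in a different order and isequal would report a mismatch.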

