find duplicated rows in matlab without for loop

Question

0 votes

Hello Friends,

I have a very large matrix with 2 columns, and I need to find the positions of its duplicated rows. I don't want to use a for loop, because I have already tried that (see the attached code) and it takes a long time. I'm looking for an alternative approach and would be grateful for any suggestions.

Best,

Mina

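x     = File(:,1:2);                % the two columns that identify a row
Grid  = unique(x,'rows');           % distinct rows
Total = [];                         % results accumulate here (assumed to start empty)
% DD and day_of_year are defined elsewhere in the original script
for j = 1:numel(DD)
    idx   = day_of_year == DD(j);   % rows recorded on day DD(j)
    File2 = File(idx,:);
    for g = 1:size(Grid,1)          % size(Grid,1), because length() returns the longest dimension
        idx2  = ismember(File2(:,[1 2]),Grid(g,:),'rows');   % rows equal to this distinct row
        Total = [Total; Grid(g,:) DD(j) mean(File2(idx2,3:6),1)];
    end
end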
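
The nested loop above can, in principle, be replaced by a single grouping step. The following is a minimal sketch rather than the asker's own code: it assumes File has six columns and that day_of_year has one entry per row of File, and the sample values are made up for illustration. Unlike the loop, it returns only the combinations of row and day that actually occur (so no NaN rows for absent combinations), sorted by the key columns.

```matlab
% Hypothetical sample data standing in for File and day_of_year.
File = [10 20 1 2 3 4;
        10 20 3 4 5 6;
        30 40 5 5 5 5;
        10 20 7 7 7 7];
day_of_year = [1; 1; 1; 2];

% One group per distinct (column 1, column 2, day) combination.
keys = [File(:,1:2), day_of_year];
[groups, ~, ic] = unique(keys, 'rows');

% Average columns 3 to 6 within each group in a single accumarray call.
% Each value is addressed by (group number, output column).
n     = size(File, 1);
subs  = [repmat(ic, 4, 1), repelem((1:4).', n)];
vals  = reshape(File(:,3:6), [], 1);
means = accumarray(subs, vals, [size(groups,1) 4], @mean);

Total = [groups, means];
```

Each row of Total holds the two key columns, the day, and the four group means, which is the same layout the loop builds one row at a time.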


Answer 1

Matt J, 21 July 2023

Edited: Matt J, 21 July 2023

0 votes

[~,I]=unique(x,'rows');
locations=setdiff(1:height(x),I) %locations of duplicate rows

2 comments

Mina Mino, 21 July 2023

Hi Matt J,

Many thanks for your help! It seems that for each line your code is only finding the position of one of the duplicated lines. However, there is more than one duplicate of each row. How can I find all the duplicates of each row?

Thanks in advance for your answer and time.

Matt J, 21 July 2023

Edited: Matt J, 21 July 2023

It seems that for each line your code is only finding the position of one of the duplicated lines.

I don't think so. It should return the indices of all rows that have been seen before. As you can see below, the final locations list includes all rows except for 1 and 3, which is where a new row is encountered.

x=[ 1 2;
    1 2;
    0 4;
    1 2;
    0 4;
    0 4];
[~,I]=unique(x,'rows');
locations=setdiff(1:height(x),I) %locations of duplicate rows

locations = 1×4
     2     4     5     6
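
If the first occurrence of each repeated row should be reported as well, one possible variant counts how often each distinct row occurs and keeps every row whose count exceeds one. This is only a sketch: the variable names counts, isRepeated, and allLocations are illustrative, and the extra row [7 8] is added to the example so that there is one row without duplicates.

```matlab
x = [1 2;
     1 2;
     0 4;
     1 2;
     0 4;
     0 4;
     7 8];

% ic maps each row of x to the index of its distinct row.
[~, ~, ic] = unique(x, 'rows');

% Number of occurrences of each distinct row.
counts = accumarray(ic, 1);

% True for every row whose content occurs more than once, first occurrence included.
isRepeated = counts(ic) > 1;

allLocations = find(isRepeated).';
```

Here allLocations contains rows 1 to 6 and leaves out row 7, whereas the setdiff approach leaves out rows 1 and 3 as well.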


Answer 2

Walter Roberson, 21 July 2023

Moved: Matt J, 21 July 2023

2 votes

The third output of unique gives the "group number" for each entry. There are different ways of handling that. One way is:

[unique_rows, ~, ic] = unique(x,'rows');
appears_in_rows = accumarray(ic, (1:size(x,1)).', [], @(v) {v});
T = table(unique_rows, appears_in_rows);

This creates a table in which the first variable holds each unique row and the second variable is a cell array listing the indices of the rows of x that equal that unique row. Each cell always contains at least one index, and contains more than one when the row is duplicated.
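
To keep only the rows that really are duplicated, the table can be filtered by the number of indices in each cell. The sketch below is one way to do that under a few assumptions: the sample matrix and the names n_occurrences and duplicated are illustrative, and sort is applied inside the accumarray function because accumarray does not promise the order in which it passes values to the function.

```matlab
x = [1 2;
     1 2;
     0 4;
     1 2;
     0 4;
     0 4;
     7 8];

[unique_rows, ~, ic] = unique(x, 'rows');

% Collect the row indices of each group into a cell, sorted ascending.
appears_in_rows = accumarray(ic, (1:size(x,1)).', [], @(v) {sort(v)});
T = table(unique_rows, appears_in_rows);

% Keep only the unique rows that occur more than once in x.
n_occurrences = cellfun(@numel, T.appears_in_rows);
duplicated    = T(n_occurrences > 1, :);
```

In this example duplicated has two rows: [0 4], found at rows 3, 5 and 6 of x, and [1 2], found at rows 1, 2 and 4.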

1 comment

Mina Mino, 21 July 2023

Moved: Matt J, 21 July 2023

@Walter Roberson Thanks for your help! This is exactly what I need :)

Best,
