MATLAB Answers

0

Find all lines of a matrix which have a matching pair of values

Florent Hannard さんによって質問されました 2019 年 5 月 29 日
最新アクティビティ Luna
さんによって コメントされました 2019 年 6 月 10 日
Hi everyone,
Given a matrix, I would like to find all pairs of lines which have a matching pair of values.
So for :
A = [1 2 3 4;
2 5 7 8;
4 5 7 9;
4 6 7 13;
6 8 13 15]
I would like to get a matrix with similar values :
Pairs = [5 7;
4 7;
6 13]
And for which lines these values are found :
Index = [2 3;
3 4;
4 5]
In other words : intersection of A(Index(i,1),:) and A(Index(i,2),:) is Pairs(i,:)
One simple way to do it would be to use the intersect function :
for i=1:length(A(:,1))
for j=i+1:length(A(:,1))
intersect(A(i,:),A(j,:))
end
end
But this is not efficient for a very large matrix A. Could someone suggest a more efficient way to do it ?
Thanks!
Florent

  3 件のコメント

Guillaume
2019 年 5 月 29 日
Note that:
length(A(:, 1))
is more simply written as
size(A, 1)
which will probably execute faster as well (since it doesn't need to allocate memory for the temporary A(:, 1)).
Guillaume
2019 年 5 月 29 日
What if a pair is common to more than two rows? How should Index be constructed in that case?
"But this is not efficient for a very large matrix A"
What is considered large in this context?
Florent Hannard 2019 年 5 月 29 日
That's right thanks, but what I hope is to get rid of (at least) one of the loop (with something similar to ismember).
Large could be something like [500x10] or more, and I will need to perform this operation many times (which is why I am trying to optimize it).
EDIT : I know that a pair will never be common to more than two rows.

サインイン to comment.

4 件の回答

Jan
回答者: Jan
2019 年 5 月 31 日

Sort the input array to avoid some overhead in intersect:
A = [1 2 3 4;
2 5 7 8;
4 5 7 9;
4 6 7 13;
6 8 13 15]
As = sort(A, 2);
nA = size(A, 1);
m = nA * (nA - 1) / 2;
Pairs = zeros(m, 2); % Pre-allocate!
Index = zeros(m, 2);
count = 0;
for i1 = 1:nA
Ai1 = A(i1, :);
for i2 = i1 + 1:nA
match = ismembc(Ai1, A(i2, :)); % C-Mex function, undocumented
if any(match)
count = count + 1;
Pairs(count, :) = Ai1(match);
Index(count, 1) = i1;
Index(count, 2) = i2;
end
end
end
Pairs = Pairs(1:count, :);
Index = Index(1:count, :);

  1 件のコメント

Luna
2019 年 6 月 10 日
+1

サインイン to comment.


Kevin Phung
回答者: Kevin Phung
2019 年 5 月 29 日
編集済み: Kevin Phung
2019 年 5 月 29 日

A = [1 2 3 4; 2 5 7 8; 4 5 7 9; 4 6 7 13; 6 8 13 15];
Pairs = [5 7; 4 7; 6 13];
index =zeros(size(Pairs,1),1);
for i = 1:size(Pairs,1)
loc = ismember(A,Pairs(i,:));
index(i,:)=find(sum(loc,2)==numel(Pairs(i,:)))';
end

  5 件のコメント

Guillaume
2019 年 5 月 29 日
Is the question about identifying/ the pairs (the hard part) or just the location of already identified pairs?
Jan
2019 年 5 月 29 日
Pairs is wanted as output of the function.
Florent Hannard 2019 年 5 月 29 日
Indeed, Pairs and index are wanted as outputs and not just the location of already indentified paires.

サインイン to comment.


Luna
回答者: Luna
2019 年 5 月 29 日
編集済み: Luna
2019 年 5 月 29 日

Basically you can use this:
Finds the first row of the pairs' elements seperately as logical indexes. Then adds them together.
If it matches at the same time on that row, sum of rows will be 2. So these are your new indexes that are found. Repeats them for 2nd and 3rd row also.
IndexesNew = zeros(size(Pairs,1),2);
for i = 1:size(Pairs,1)
IndexesNew(i,:) = find(sum(ismember(A,Pairs(i,1)) + ismember(A,Pairs(i,2)),2) == 2)';
end

  1 件のコメント

Florent Hannard 2019 年 5 月 29 日
Thank you but this is not what I need.
Pairs and index are wanted as outputs and not just the location of already indentified paires.

サインイン to comment.


Guillaume
回答者: Guillaume
2019 年 5 月 29 日

I don't see how you can avoid doing size(A, 1) * (size(A, 1)-1) / 2 comparisons of the rows one way or another. The problem with using intersect in a loop is that you're repeatedly sorting the rows that you may already have sorted previously so certainly taking that out of the loops would be useful.
The following may or may not be faster than using intersect in a loop. I haven't tested.
%If size(A, 1) is fixed, this part can be precomputed before applying the rest to each A
%It simply precomputes row indices for comparison
[row2, row1] = ndgrid(1:size(A, 1));
row1 = nonzeros(tril(row1, -1));
row2 = nonzeros(tril(row2, -1));
%finding pairs and locations
[uA, ~, destcol] = unique(A); %only one call to unique on the whole matrix
hasnumber = zeros(size(A, 1), numel(uA)); %columns of hasnumber correspond to values in uA, rows to rows of A. Will contain one when the corresponding row contains uA
hasnumber(sub2ind(size(hasnumber), repmat(1:size(A, 1), 1, size(A, 2))', destcol)) = 1;
ismatch = hasnumber(row1, :) + hasnumber(row2, :) == 2; %find which numbers are common to both row1 and row2
haspair = sum(ismatch, 2) == 2; %pair when exactly two numbers are common
uA = repmat(uA, 1, sum(haspair));
RowIndices = [row1(haspair), row2(haspair)] %indices of row that have a matching pair
PairValues = reshape(uA(ismatch(haspair, :)'), 2, []).' %Actual values of pair

  0 件のコメント

サインイン to comment.



Translated by