Very slow loop trying to find any intersection
Hi Everyone,
I am trying to determine whether a pair of observations has any overlap in the partners they have worked with. jaccard_dyadic is the dyadic table in which the first two columns identify the observations (i.e. the pair that makes up the unique identifier). I want to set column m of a row to 1 whenever both observations in that row have worked with at least one of the same inventors. assignee_inventors is a matrix in which the observations are the rows and the inventors are the columns, with a 1 wherever the observation of that row has worked with the inventor of that column. The loop structure below does exactly that, but it is very slow. Any help with speeding this up would be much appreciated (I suspect there is a much simpler way of doing this).
for i = 1:(find(jaccard_dyadic(:,1)==0, 1, 'first')-1)
    for l=1:p(2)
        if any(assignee_inventors(jaccard_dyadic(i,1),l)==assignee_inventors(jaccard_dyadic(i,2),l) && assignee_inventors(jaccard_dyadic(i,2),l)==1)
            jaccard_dyadic(i,m)=1;
        end
    end
end
EDIT:
This is the whole code I am using. I have added some sample data. Given that the results are quite sparse, I hope the sample contains some instances of what I am looking for. I haven't uploaded the desired output, but essentially it is just the last column of the jaccard_dyadic matrix (currently filled with zeros) that should take the value 1 if there is any overlap as described above.
%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%Any Same Inventors
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
load('jaccard_dyadic_test.mat')
B = readmatrix('Inventor_copy.csv');
assignees = B(:,1);
inventors = B(:,2);
assignee_inventors=zeros(max(unique(B(:,1))), max(unique(B(:,2))));
empty_dim = size(B);
%%
for i=1:empty_dim(1)
    assignee_inventors(assignees(i),inventors(i))=1;
end
%%
% actual code for what I need
p = size(assignee_inventors);
m = find(all(jaccard_dyadic==0), 1, 'first');
tic  % start the timer; toc below needs a preceding tic
for i = 1:(find(jaccard_dyadic(:,1)==0, 1, 'first')-1)
    for l=1:p(2)
        if any(assignee_inventors(jaccard_dyadic(i,1),l)==assignee_inventors(jaccard_dyadic(i,2),l) && assignee_inventors(jaccard_dyadic(i,2),l)==1)
            jaccard_dyadic(i,m)=1;
        end
    end
end
fprintf('After Inventors ');toc
Answers (1)
A simplified version to get the overview:
dy = jaccard_dyadic;
in = assignee_inventors;
n = find(dy(:,1) == 0, 1, 'first') - 1;
for i = 1:n
    for k = 1:p(2)  % k is less confusing than l
        if any(in(dy(i,1), k) == in(dy(i,2), k) && in(dy(i,2), k) == 1)
            jaccard_dyadic(i, m) = 1;
        end
    end
end
What is m? What is the purpose of the any()? For a scalar input you can omit the any() and write:
if in(dy(i,1), k) == in(dy(i,2), k) && in(dy(i,2), k) == 1
Isn't this the same as:
if in(dy(i,1), k) == 1 && in(dy(i,2), k) == 1
Which values can "in" contain? If it is only 0 or 1:
if in(dy(i,1), k) && in(dy(i,2), k)
Then your loop might be equivalent to:
jaccard_dyadic = assignee_inventors(dy(:, 1), 1:p(2)) & ...
                 assignee_inventors(dy(:, 2), 1:p(2));
Here I am guessing that "m" is the inner loop counter. You may need to add "== 1" to both operands. Replace "1:p(2)" with a simple ":" if this matches your needs.
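Whether the shortened condition is safe depends on the matrix containing only 0 and 1. A quick check along the following lines could confirm that. This is only a sketch: the small matrix in is made-up sample data, not the data from the question.

```matlab
% Made-up sample data (assumption): 3 assignees x 4 inventors, entries 0 or 1
in = [1 0 1 0; ...
      0 0 1 1; ...
      1 1 0 0];

% True only if every entry is exactly 0 or 1
isBinary = all(in(:) == 0 | in(:) == 1);

% Compare the original condition with the shortened one for rows 1 and 2
a = in(1, :);
b = in(2, :);
original   = (a == b) & (b == 1);   % element-wise form of the loop condition
shortened  = a & b;                 % valid only for 0/1 data
sameResult = isequal(original, shortened);
```

If isBinary is false, the "== 1" comparisons should stay in place, because "&" treats every nonzero value as true.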
3 Comments
John Kirk
5 June 2019
John Kirk
6 June 2019
Now that you have explained that m is a constant, the inner loop can be omitted:
in = assignee_inventors;  % Shorter names for nicer code
dy = jaccard_dyadic;
p = size(in);
m = find(all(dy==0), 1, 'first');
ja = dy;  % Assumption: ja is the result table, starting as a copy of dy
for i = 1:500
    ja(i, m) = any(in(dy(i,1), :) & in(dy(i,2), :), 2);
end
The outer loop can be vectorized also:
ja(:, m) = any(in(dy(:, 1), :) & in(dy(:, 2), :), 2);
I'd prefer to test the code before posting. Therefore it is better to post some input data, e.g. created by rand.
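Such a test could look roughly like the sketch below. All sizes, the density and the variable names nAssignee, nInventor, nPairs and ref are arbitrary assumptions for illustration. It builds random input and compares the double loop against the vectorized line.

```matlab
% Synthetic input (assumption: sizes and density are arbitrary illustration values)
rng(0);                                  % reproducible random numbers
nAssignee = 20;
nInventor = 50;
nPairs    = 100;
in = rand(nAssignee, nInventor) < 0.1;   % sparse logical assignee-by-inventor matrix
dy = [randi(nAssignee, nPairs, 2), zeros(nPairs, 1)];  % pair ids plus one empty result column
m  = find(all(dy == 0), 1, 'first');     % first all-zero column, here 3

% Reference result from the double loop
ref = dy;
for i = 1:nPairs
    for k = 1:nInventor
        if in(dy(i, 1), k) && in(dy(i, 2), k)
            ref(i, m) = 1;
        end
    end
end

% Vectorized result
ja = dy;
ja(:, m) = any(in(dy(:, 1), :) & in(dy(:, 2), :), 2);

assert(isequal(ref, ja), 'Loop and vectorized results differ');
```

The synthetic table has no zero-padded rows, so every id is a valid row index. Real data with padding needs the rows restricted first.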
"I get an error that the index in position 1 is invalid."
Please post a copy of the complete error message, not a rephrased version. Which index is meant? Which code did you try exactly? Post it, because it might contain a typo. Maybe your jaccard_dyadic has more elements than assignee_inventors and some elements are zero. You can check this easily.
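One way to run this check is sketched below. It assumes, as the loop bound in the question suggests, that unused rows of jaccard_dyadic are padded with zeros. The tiny matrices and the names ids, bad and ok are made up for illustration.

```matlab
% Made-up sample data (assumption): 3 assignees, 4 inventors, 2 valid pairs + 2 padding rows
in = logical([1 0 1 0; 0 0 1 1; 1 1 0 0]);
dy = [1 2 0; ...
      2 3 0; ...
      0 0 0; ...
      0 0 0];
m  = find(all(dy == 0), 1, 'first');     % first all-zero column, here 3

% Rows whose ids cannot be used as row indices into "in"
ids = dy(:, 1:2);
bad = any(ids < 1 | ids > size(in, 1) | ids ~= fix(ids), 2);
fprintf('%d of %d rows have invalid ids\n', nnz(bad), numel(bad));

% Apply the vectorized test to the valid rows only
ok = ~bad;
ja = dy;
ja(ok, m) = any(in(dy(ok, 1), :) & in(dy(ok, 2), :), 2);
```

Zero ids in the padding rows would explain an "index in position 1 is invalid" message when the vectorized line is applied to all rows at once.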