Finding the most orthogonal set of n vectors from dataset of unit vectors

8 ビュー (過去 30 日間)
Z
Z 2020 年 8 月 8 日
編集済み: Z 2020 年 10 月 13 日
I am attempting to find the set of determine a subset of vectors in a matrix that are the most dissimilar. Each row in this matrix is a unit vector which is unique.
idx = nchoosek(1:size(dataset, 1), 5); % Produce all possible combinations of indices
score = zeros(size(idx, 1), 1); % Store similarity score of unit vectors into the corresponding row of score
for I = 1 : size(idx, 1)
score(I) = sum(prod(dataset(idx(I, :), :), 1)); % Element-wise multiply this combination's vectors and add
end
[best_scores, best_idx] = mink(score, 10)
  1 件のコメント
David Goodmanson
David Goodmanson 2020 年 8 月 8 日
Hi Alex
cosider
a = [ 1 1 1 1 1 1 1 1 -2
2 2 2 2 2 2 2 2 -4
3 3 3 3 3 3 3 3 -6]
Since these three vectors are scalar multiples of each other, they are parallel and as mutually nonorthogonal as you can get. But sum(prod (a)) = 0. So that test is not goiing to work.
[the use of 1 in prod(a,1) is superfluous since 1 is the default]

サインインしてコメントする。

採用された回答

Bruno Luong
Bruno Luong 2020 年 8 月 8 日
編集済み: Bruno Luong 2020 年 8 月 8 日
I propose this
% random test A, which represents 100 normalized vector in R^5
n = 5;
m = 100;
A = randn(n,m);
A = A ./ sqrt(sum(A.^2,1));
% Want to select k columns of A, most mutually "orthogonal"
k = 3; % requirement: k <= rank(A) <= n
n = size(A,2);
c = nchoosek(1:n,k);
d = arrayfun(@(r) detsub(A(:,c(r,:))), 1:size(c,1));
[~,imax] = max(d);
% selection of best k columns of A
col = c(imax,:);
B = A(:,col)
% check scalar products, it must have small off-diagonal terms
B'*B
function d = detsub(A)
[~,R] = qr(A,0);
d =prod(abs(diag(R)));
end
  8 件のコメント
Bruno Luong
Bruno Luong 2020 年 8 月 25 日
Yes
Let me explain to you more. When you do qr on A where each column of A is unit vector:
[Q,R] = qr(A)
you have the following properties
  • span {Q(:,1:i)} contains span {A(:,1:i)} for all i.
  • the coefficient R(i,j) is dot(Q(:,i),A(:,j)) for i <= j.
  • Since norm(A(:,j))=1 and R is triangular we have norm(R(1:j,j)) = 1.
Now doing some math. For i < j.
abs(dot(A(:,i),A(:j)) = norm Projection A(:,j) on span{A(:,i)}
<= norm Projection A(:,j) on span{Q(:,1:i)}
= norm(R(1:i;j))
= sqrt(R(1,j)^2 + R(2,j)^2 + ... R(i,j)^2)
You want abs(dot(A(:,i),A(:j)) small (most orthogonal) meanning that you want
R(1,j)^2 + R(2,j)^2 + ... R(i,j)^2
to be small, for all i < j.
Since from the third property
R(1,j)^2 + R(2,j)^2 + ... R(j,j)^2 = 1
you just want the last term R(j,j)^2 is as large as possible (therefore the partial sum is small).
And that you want for all j.
I chose det(A) = prod(R(j,j)) as metric, since it has some geometrical interpretation as the volume of the parallelepipied formed by columns of A.
Argually you could use other metric such as "Trace"(A) = sum (abs (R(j,j)) as metric.
My own intuition tell me the det choice is somewhat better.
Z
Z 2020 年 8 月 25 日
That is a valuable explanation. Thank you so much for your input!

サインインしてコメントする。

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeSpline Postprocessing についてさらに検索

タグ

製品


リリース

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by