Why should the covariance matrix be symmetric and positive definite in Mahalanobis distance?

15 views (last 30 days)
Mustafa Al-Nasser on 31 Jul 2022
Edited: John D'Errico on 31 Jul 2022
Dear All,
I am trying to use the Mahalanobis distance in the pdist command, but sometimes I get the error that the covariance matrix must be symmetric and positive definite. To force symmetry we can multiply the matrix by its transpose, but if the original covariance matrix is only positive semi-definite, the product is still semi-definite. My question is: why does the matrix need to be positive definite and symmetric in the first place?

Answers (1)

John D'Errico on 31 Jul 2022
Edited: John D'Errico on 31 Jul 2022
SIGH. Multiplying a covariance matrix by its transpose is NOT what you want to do! If it is already a covariance matrix, that operation will SQUARE the eigenvalues. So that is completely incorrect. You will no longer have the same covariance matrix, or anything reasonably close to what you started with!
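To see why, here is a quick NumPy sketch (not part of the original answer, just an illustration of the same claim): for a symmetric matrix C, the eigenvalues of C*C' are exactly the squares of the eigenvalues of C, so the matrix you end up with is a different covariance matrix entirely.

```python
import numpy as np

# Build an ordinary sample covariance matrix (symmetric, positive semi-definite).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))
C = np.cov(A, rowvar=False)           # 3x3 covariance matrix

evals_C = np.sort(np.linalg.eigvalsh(C))
evals_CC = np.sort(np.linalg.eigvalsh(C @ C.T))  # C is symmetric, so C @ C.T = C^2

# The eigenvalues of C*C' are the squares of the eigenvalues of C,
# so multiplying by the transpose distorts the matrix rather than fixing it.
print(np.allclose(evals_CC, evals_C**2))
```

Note in particular that squaring eigenvalues makes a nearly singular matrix even closer to singular, the opposite of what the question was trying to achieve.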
Perhaps more to the point: why does the Mahalanobis distance computation require a POSITIVE DEFINITE AND SYMMETRIC matrix?
The reason is that the distance computation uses a Cholesky decomposition. That requires a symmetric matrix which is at least positive semi-definite. But the distance computation then uses the inverse of the Cholesky factor, and that won't exist if your matrix is singular.
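The failure is easy to reproduce. This NumPy sketch (an illustration, not the answer's original MATLAB code) shows the Cholesky factorization refusing an exactly singular covariance matrix, just as MATLAB's chol would:

```python
import numpy as np

# A rank-1 "covariance" matrix: eigenvalues are 2 and 0, so it is only
# positive SEMI-definite. The Cholesky factorization hits a zero pivot.
C_singular = np.array([[1.0, 1.0],
                       [1.0, 1.0]])
try:
    np.linalg.cholesky(C_singular)
    print("chol succeeded")
except np.linalg.LinAlgError:
    print("chol failed: matrix is not positive definite")
```

Even if the factorization did succeed for a semi-definite matrix, the factor would have a zero on its diagonal, so it could not be inverted for the distance computation.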
I suppose you might ask why it needs to use a matrix factorization at all? That gets into the meaning of Mahalanobis distance, and for this I would probably need to teach an entire class on the subject, with a deep explanation of the linear algebra. But think of Mahalanobis distance as a variable ruler. The ruler varies in length, depending on which direction you point it in. (A strange, anisotropic ruler at that.) And the various directions in turn depend on the eigenvectors of your covariance matrix. If we look in the direction of an eigenvector with a zero eigenvalue, then the ruler is infinitely short. And that means any distance then computed with an infinitely short ruler will appear to be infinitely large as a distance.
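The "variable ruler" picture can be written down directly. In the eigenbasis of the covariance matrix, the squared Mahalanobis distance is a sum of squared components, each divided by its eigenvalue, so a near-zero eigenvalue makes any component in that direction look enormous. A NumPy sketch of that decomposition (again an illustration, not the original answer's code):

```python
import numpy as np

def mahalanobis_sq(x, mu, C):
    """Squared Mahalanobis distance via the eigendecomposition of C:
    d^2(x) = sum_k (v_k . (x - mu))^2 / lambda_k.
    Each eigenvalue lambda_k is the squared 'ruler length' along
    eigenvector v_k; lambda_k -> 0 makes that direction infinitely short."""
    lam, V = np.linalg.eigh(C)
    proj = V.T @ (x - mu)          # coordinates in the eigenbasis
    return np.sum(proj**2 / lam)   # blows up when some lam is ~0

# Nearly singular covariance: almost no variance along the third axis.
C = np.diag([4.0, 1.0, 1e-12])
mu = np.zeros(3)
x = np.array([1.0, 1.0, 1.0])

# The 1/1e-12 term dominates: the distance is effectively infinite.
print(mahalanobis_sq(x, mu, C))
```

A point with no component along the degenerate direction would still get a sensible distance; it is any excursion off the data's hyperplane that the shrunken ruler measures as essentially infinite, which is exactly what the pdist example below demonstrates.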
Is there a solution? Perhaps. I say that because you can use the tool I posted on the File Exchange to find the NEAREST positive definite matrix to a given matrix. It will adjust your matrix so that the result is a minimally perturbed matrix that is now positive definite. However, will that really help you? It may, or it may not.
You can find nearestSPD here for free download:
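For intuition only, here is a minimal NumPy sketch of the nearest-SPD idea: symmetrize, then floor the eigenvalues at a small positive value. This is NOT John D'Errico's nearestSPD code; the File Exchange version is more careful (it uses Higham's polar-decomposition construction plus a jitter loop to guarantee chol succeeds).

```python
import numpy as np

def nearest_spd_sketch(A, eps=1e-8):
    """Rough sketch of a nearest-SPD repair (assumed helper, not nearestSPD):
    enforce symmetry, then clip eigenvalues up to a small positive floor."""
    B = (A + A.T) / 2                  # symmetrize
    lam, V = np.linalg.eigh(B)
    lam = np.clip(lam, eps, None)      # push eigenvalues <= 0 up to eps
    return V @ np.diag(lam) @ V.T

# Symmetric but indefinite: eigenvalues are 3 and -1, so chol would fail.
C = np.array([[1.0, 2.0],
              [2.0, 1.0]])
Chat = nearest_spd_sketch(C)
np.linalg.cholesky(Chat)               # now succeeds
```

The perturbation is small when the offending eigenvalues are tiny, which is the common floating-point case shown in the example below.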
Still, you need to recognize that a distance is meaningless for a singular covariance matrix, and even for a nearly singular matrix it will give you meaningless results, where the predicted distance is essentially infinite. Perhaps an example will be best.
First, some numbers. I'll start with a set that lies entirely in one plane in 3-d.
T = randn(5,9);
xyz = randn(100,5)*T + randn(1,9);
Now, if I compute the covariance matrix of that set, it will be singular.
C = cov(xyz)
C = 9×9
    7.1437   -2.7642   -3.2300    5.5161   -0.5471    2.1555    0.6802   -0.2952   -3.8048
   -2.7642    3.1467    0.3381   -2.0715    3.1604   -2.0812   -1.5819   -0.3788    1.7973
   -3.2300    0.3381    3.3430   -3.0227   -1.9160   -0.7774    0.8666    1.3295    2.3502
    5.5161   -2.0715   -3.0227    4.7846    0.1201    2.1454    0.7556   -0.6177   -2.7985
   -0.5471    3.1604   -1.9160    0.1201    4.8446   -1.6881   -1.9552   -1.3806    0.2587
    2.1555   -2.0812   -0.7774    2.1454   -1.6881    2.4747    0.9769    0.3535   -0.2348
    0.6802   -1.5819    0.8666    0.7556   -1.9552    0.9769    2.0841    0.2934   -0.3868
   -0.2952   -0.3788    1.3295   -0.6177   -1.3806    0.3535    0.2934    1.1205    1.2145
   -3.8048    1.7973    2.3502   -2.7985    0.2587   -0.2348   -0.3868    1.2145    4.2462
format long g
eig(C)
ans = 9×1
     -1.68087761675556e-15
     -1.02105998091962e-15
      7.16925857609911e-17
      3.36349347888726e-15
          1.28911346822909
          1.40442982472531
          2.92559062275443
          9.61030197193199
           17.958662993204
rank(C)
ans =
5
As you should see, there are some negative eigenvalues. They are negative only by a tiny amount, as is common. But chol will fail, as would then using C to compute a Mahalanobis distance.
The problem is, you DO NOT want to multiply C by itself. That would be completely inappropriate here. But we can find a matrix close to C that IS both symmetric and positive definite.
Chat = nearestSPD(C);
Is it symmetric?
(Chat - Chat') == 0
ans = 9×9 logical array
   1   1   1   1   1   1   1   1   1
   1   1   1   1   1   1   1   1   1
   1   1   1   1   1   1   1   1   1
   1   1   1   1   1   1   1   1   1
   1   1   1   1   1   1   1   1   1
   1   1   1   1   1   1   1   1   1
   1   1   1   1   1   1   1   1   1
   1   1   1   1   1   1   1   1   1
   1   1   1   1   1   1   1   1   1
Is it positive definite?
chol(Chat)
ans = 9×9
   2.67277883583431    -1.03421030446445     -1.2084731809849     2.06380560509652   -0.204706283184804    0.806458388475214    0.254484155852974   -0.110435926130482    -1.42353186195163
                  0     1.44121763274409   -0.632629609899025   0.0436611971548337     2.04596945062528   -0.865317769332076   -0.915022610023341   -0.342050600422094    0.225579483920938
                  0                    0     1.21754350079729   -0.411520683102374   -0.713730089267441   -0.287683645097916    0.488937585526114    0.804601892187321    0.634554444877452
                  0                    0                    0    0.595060257838539    0.268062710613992    0.672827235140599      0.7923880020854  -0.0735515099094856    0.656580985420117
                  0                    0                    0                    0    0.188120234777501   -0.734921447455287    0.561354533943111   -0.581777457518581    -1.15545200162012
                  0                    0                    0                    0                    0   3.17503150592186e-07  -2.22042401570526e-07   2.16972142007105e-07   4.72583158618215e-07
                  0                    0                    0                    0                    0                    0   1.28184635446111e-07  -6.71237111288756e-08  -6.58245425263812e-08
                  0                    0                    0                    0                    0                    0                    0   6.82856991135454e-08   4.22720994512424e-08
                  0                    0                    0                    0                    0                    0                    0                    0   6.66400187462506e-08
It is still only rank 5.
rank(Chat)
ans =
5
And the difference between C and Chat is tiny.
norm(C - Chat)
ans =
6.65785480782041e-15
But now I can use pdist:
pdist(xyz,'mahalanobis',Chat)
ans = 1×4950
   2.84925932765838   2.73002732127091   3.83997957085061   4.03333939000031   2.75441716919389   3.60073537163287
    3.2499790686024   2.97358017479128   2.79076137167206   2.98269119406068   3.46565395188214   4.56772098588518
    3.7438426016466   4.92760165362847   3.02478700649402   2.98815858291422   3.54821876309422   4.75961550223585
   2.18340391142699     4.959005099167   2.68021454856295   4.91579508476726   4.51371497606922    3.3176144021624
   2.30523684809823   3.87213400910461   1.84740687254188   2.31108008274239   4.35978978108475   3.48587365249073
(display truncated; the full vector has 4950 elements)
So all of the interpoint distances between members of XYZ are well posed and finite.
Be careful though: what happens if we compute the distance between random points that are NOT in the hyperplane of the data used to generate that covariance matrix?
xyzrand = randn(5,9);
squareform(pdist(xyzrand,'mahalanobis',Chat))
ans = 5×5
                  0   131169588.353458   85585167.3579053    112269487.18868   119107661.675588
   131169588.353458                  0   50814318.9383866   108312925.248608   99442514.7904085
   85585167.3579053   50814318.9383866                  0   95798661.1311307   95528822.6837144
    112269487.18868   108312925.248608   95798661.1311307                  0   46152830.7902491
   119107661.675588   99442514.7904085   95528822.6837144   46152830.7902491                  0
As you can see, all of those distances are now virtually infinite, at least as well as pdist can determine them using a Mahalanobis distance.
But if I had tried to use pdist with the matrix C instead, it would have failed, in either case.
1 Comment
Mustafa Al-Nasser on 31 Jul 2022
Thank you very much for your answer; that is a great one.
Actually, I knew that multiplying by the transpose completely changes the distance value, but I was not interested in the value itself, only in how far apart the points are relative to each other.
But your tool is much better than my idea.
Thanks a lot.
