To calculate mahalanobis distance when the number of observations are less than the dimension

Question

Pradeep Krishnamurthy on 31 Mar 2017

https://jp.mathworks.com/matlabcentral/answers/333008-to-calculate-mahalanobis-distance-when-the-number-of-observations-are-less-than-the-dimension

Answered: Ilya on 30 Aug 2017

I work in neuroscience, and my data has fewer trials (observations) than neurons (dimensions). When I use the mahal function on my data, I get the following error:

Error using mahal (line 38)
The number of rows of X must exceed the number of columns.

Instead of posting my data, which is huge, you can run the following code that issues the same error.

A = rand(1,100);   % new data point: 1 observation, 100 dimensions
B = rand(10,100);  % reference data: 10 observations, 100 dimensions
d = mahal(A,B);    % errors because B has fewer rows than columns


Answer 1

John D'Errico on 31 Mar 2017

https://jp.mathworks.com/matlabcentral/answers/333008-to-calculate-mahalanobis-distance-when-the-number-of-observations-are-less-than-the-dimension#answer_261252

Edited: John D'Errico on 31 Mar 2017

The problem is that the Mahalanobis distance is not defined in your case.

https://en.wikipedia.org/wiki/Mahalanobis_distance

You can't compute a meaningful distance when the result would be undefined. Why do I say this? A Mahalanobis distance requires a covariance matrix. A NON-singular covariance matrix. If your matrix is singular, then the computation will produce garbage, since you cannot invert a singular matrix. Since you don't have sufficient data to estimate a complete covariance matrix, mahal must fail.
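To make the rank argument concrete, here is a minimal sketch using the random data from the question (not the poster's real recordings). With n observations, the sample covariance has rank at most n - 1, so 10 observations in 100 dimensions leave at least 91 directions with zero variance. The fixed seed is only there for reproducibility.

```matlab
% Sketch: sample covariance from 10 observations in 100 dimensions.
rng(0);                 % fixed seed, only for reproducibility
B = rand(10,100);       % 10 observations, 100 dimensions (as in the question)
C = cov(B);             % 100-by-100 sample covariance
r = rank(C);            % numerical rank; cannot exceed size(B,1) - 1
fprintf('rank(C) = %d of %d\n', r, size(C,1));
fprintf('rcond(C) = %g\n', rcond(C));   % near zero means inversion is unreliable
```

A rank well below 100 and a reciprocal condition number near zero are what trigger the warning when you try to invert C yourself.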

Think about it in terms of what a Mahalanobis distance means, and what a singular covariance matrix tells you. A singular covariance matrix tells you that you have NO information in some spatial directions about the system under study. So when you try to invert that, you get infinities, essentially infinite uncertainty.

2 Comments

Pradeep Krishnamurthy on 31 Mar 2017

Thank you for taking the time to respond. That is exactly the problem I ran into when I tried writing my own Mahalanobis function. I had to invert a covariance matrix that was ill-conditioned, and MATLAB warned me that the results may not be accurate.

I am using the Mahalanobis distance for a classification problem. Can you think of a classifier similar to Mahalanobis with which I can circumvent this problem?

John D'Errico on 31 Mar 2017


The idea behind Mahalanobis distance is to look at a point in the context of what your data shows. Your data has no information in some directions. With fewer data points than dimensions, that MUST be true. What does no information mean? ZERO variance in that direction.

So, now think of it in terms of how you compute Mahalanobis distance. You look in some direction and divide by the standard deviation, implicitly measuring how many standard deviations out a point is, in the context of the information that you have. But what if the standard deviation in that direction is zero?

Maybe I'm not making this clear. Were we talking in person, I could see if it was making sense to you. Suppose we had a very simple problem, with a NON-singular covariance matrix. I'll make one up here.

C =
        1e-08        5e-05
        5e-05         1.25
[V,D] = eig(C)
V =
           -1        4e-05
        4e-05            1
D =
        8e-09            0
            0         1.25

I chose C so it has one very small eigenvalue. If we were to submit a point in the direction of that very small eigenvalue, then mahal would say it was very far out. But go out an equal Euclidean distance in the direction of the normal-sized eigenvalue, and a Mahalanobis distance would not say it was far out. Essentially, Mahalanobis distance reflects the data.
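As a rough check of that claim, the sketch below takes a unit-length step along each eigenvector of C and computes its Mahalanobis distance from the origin. The zero mean and the helper name mahalDist are assumptions made for illustration. Along an eigenvector with eigenvalue lambda, the distance of a unit step works out to 1/sqrt(lambda).

```matlab
% Sketch: Mahalanobis distance of unit-length steps along each eigenvector of C.
C = [1e-8 5e-5; 5e-5 1.25];            % covariance matrix from the example above
[V,D] = eig(C);                        % symmetric C: eigenvalues in ascending order
mahalDist = @(x) sqrt(x' * (C \ x));   % hypothetical helper; assumes zero mean
dSmall = mahalDist(V(:,1));            % step along the small-eigenvalue direction
dLarge = mahalDist(V(:,2));            % step along the large-eigenvalue direction
fprintf('small-eigenvalue direction: %g\n', dSmall);  % about 1/sqrt(8e-9), roughly 1.1e4
fprintf('large-eigenvalue direction: %g\n', dLarge);  % about 1/sqrt(1.25), roughly 0.89
```

The same Euclidean step is more than ten thousand times "farther" in one direction than in the other.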

But suppose we have a singular covariance matrix? Now the matrix will have at least one zero eigenvalue. Anything seen in your test point in that direction must be infinitely far out.

The formula for Mahalanobis distance is simple to write, and simple to compute. The wiki link I gave shows it. If you look online, you will probably see people suggesting you can use a pseudo-inverse instead of inv there. (I did look, and I did see exactly that idea.)

Essentially, that would just result in the computation ignoring anything in a test point that was in the null-space of your reference data. To be honest, I really don't think much of the idea. Closing your eyes to something does not make it go away, or mean that it does not exist. That is effectively what pinv would do there.

Could you use pinv? Sure. I'd really want to add an indicator of how much stuff you were ignoring by using pinv there, so how much of a test point was actually in the null space of your data.
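A minimal sketch of that idea, using the random data from the question: compute the pinv-based squared distance, and alongside it the fraction of the centered test point's squared length that lies in the null space of the reference data. The name ignoredFrac and the choice of a squared-length ratio as the indicator are assumptions for illustration, not an established convention.

```matlab
% Sketch: pinv-based Mahalanobis distance plus a measure of what pinv ignores.
rng(0);                       % fixed seed, only for reproducibility
A = rand(1,100);              % new data point
B = rand(10,100);             % 10 observations, 100 dimensions
mu = mean(B,1);
x  = (A - mu)';               % centered test point as a column vector
C  = cov(B);                  % singular: rank is at most size(B,1) - 1
d2 = x' * pinv(C) * x;        % squared distance within the span of the data only

% Orthonormal basis for the directions in which the data actually varies
Bc = B - mu;                  % centered reference data (implicit expansion, R2016b+)
Q  = orth(Bc');               % basis of the row space of Bc
xIn = Q * (Q' * x);           % part of x that pinv "sees"
ignoredFrac = norm(x - xIn)^2 / norm(x)^2;   % share of squared length in the null space
fprintf('pinv distance^2 = %g, ignored fraction = %.2f\n', d2, ignoredFrac);
```

An ignored fraction close to 1 would mean the distance is based on only a small part of the test point.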


Answer 2

Hugo Malagon on 29 Aug 2017

https://jp.mathworks.com/matlabcentral/answers/333008-to-calculate-mahalanobis-distance-when-the-number-of-observations-are-less-than-the-dimension#answer_279607

Reduce the dimensions, for example with principal component analysis.
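One way this could look, as a sketch on the random data from the question: project onto the first k principal components and call mahal on the scores. The value k = 3 is an arbitrary illustration. It needs to be smaller than the number of observations, and how many components to keep depends on your data. This assumes the Statistics and Machine Learning Toolbox (pca, mahal).

```matlab
% Sketch: reduce to k principal components, then call mahal on the scores.
rng(0);                                   % fixed seed, only for reproducibility
A = rand(1,100);                          % new data point
B = rand(10,100);                         % 10 observations, 100 dimensions
k = 3;                                    % hypothetical choice; keep k < size(B,1)
[coeff,scoreB,~,~,~,mu] = pca(B,'NumComponents',k);
scoreA = (A - mu) * coeff;                % project the new point onto the same components
d2 = mahal(scoreA, scoreB);               % note: mahal returns the SQUARED distance
fprintf('squared Mahalanobis distance in %d components: %g\n', k, d2);
```

Like the pinv approach, this ignores whatever part of the test point lies outside the retained components.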


Answer 3

Ilya on 30 Aug 2017

https://jp.mathworks.com/matlabcentral/answers/333008-to-calculate-mahalanobis-distance-when-the-number-of-observations-are-less-than-the-dimension#answer_279630

For classification, use a regularized discriminant or a pseudo-discriminant. Both options are supported in fitcdiscr. Regularization adds a positive value to the diagonal of the covariance matrix to make it full-rank. A pseudo-discriminant amounts to taking pinv.
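A minimal sketch of both options, assuming the Statistics and Machine Learning Toolbox. The two-class data, the labels, and the Gamma value of 0.5 are made up for illustration. 'DiscrimType' and 'Gamma' are fitcdiscr name-value arguments, and you would normally tune Gamma (for example by cross-validation) rather than fix it.

```matlab
% Sketch: discriminant analysis with fewer observations than dimensions.
rng(0);                                   % fixed seed, only for reproducibility
X = [randn(10,100); randn(10,100) + 1];   % made-up data: two classes, 10 trials each
y = [ones(10,1); 2*ones(10,1)];           % made-up class labels
A = randn(1,100) + 1;                     % new trial to classify

% Regularized linear discriminant; Gamma sets the amount of regularization
mdlReg  = fitcdiscr(X, y, 'DiscrimType','linear', 'Gamma',0.5);   % 0.5 is arbitrary
% Pseudo-linear discriminant; uses the pseudo-inverse of the covariance
mdlPinv = fitcdiscr(X, y, 'DiscrimType','pseudolinear');

labelReg  = predict(mdlReg,  A);
labelPinv = predict(mdlPinv, A);
fprintf('regularized: class %d, pseudo-linear: class %d\n', labelReg, labelPinv);
```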
