Why does pca on my matrix give a first value in latent that is greater than one?

Question

Penny13 on 5 Mar 2019


Direct link to this question

https://jp.mathworks.com/matlabcentral/answers/448414-why-pca-on-my-matrix-gives-the-first-number-in-latent-matrix-greater-than-one

Commented: Penny13 on 11 Mar 2019

I have a 626284-by-26 matrix whose entries are all zeros and ones. I ran [coeff,score,latent] = pca(X) on it, and latent came back with the following values:

1.47069819212040

0.338544895320084

0.225716863688052

0.188056189419163

0.157949433440297

0.126385063251976

0.0906964951134501

0.0773105845697984

0.0738595589018172

0.0659590250255644

0.0616215954476751

0.0537688669401442

0.0262686347674844

0.0160550157883815

0.0112744279903577

0.0105353514551859

6.11095771880279e-33

6.03879225801973e-33

5.96730010116445e-33

So what could be the reason?

Thank you for your guidance.


Answer 1

David Goodmanson on 6 Mar 2019


Direct link to this answer

https://jp.mathworks.com/matlabcentral/answers/448414-why-pca-on-my-matrix-gives-the-first-number-in-latent-matrix-greater-than-one#answer_364010

Edited: David Goodmanson on 6 Mar 2019


Hi Penny,

Is there a reason that you think that a matrix of all ones and zeros can't have a latent value greater than 1? Here is a counterexample:

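n = 50;
m = 20;
% Stack a block of ones, an upper-triangular block of ones, and a block of zeros
A = [ones(n,m); triu(ones(m,m)); zeros(n,m)];
[coeff,score,latent] = pca(A);
rA = rank(A)
% results
latent =
    4.4864
    0.2955
    0.0826
    0.0379
    0.0218
    0.0143
    0.0102
    0.0077
    0.0061
    0.0050
    0.0042
    0.0036
    0.0032
    0.0029
    0.0026
    0.0025
    0.0023
    0.0022
    0.0022
    0.0021
rA = 20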
    

The triu matrix was inserted so that every column is linearly independent, which sidesteps a potentially artificial trick situation where a lot of columns are identical. Matrix A has full rank of 20.

pca starts out by subtracting the mean of each column, so the idea here was to make the excursions from the mean as large as possible. With only 1 and 0 available, this means creating columns that are half ones and half zeros (or close to it). After that, constructing a bunch of columns that are nearly parallel puts most of the deviation along a single axis.
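A quick way to check that reasoning, sketched here with cov and eig as a stand-in for pca (this assumes pca's default centered behavior, where latent holds the eigenvalues of the sample covariance matrix). The variable names C, ev, colVar and totalVar are only illustrative. No single 0/1 column can have a variance much above 0.25, but the total variance of all 20 columns can pile up on one axis:

```matlab
n = 50;
m = 20;
A = [ones(n,m); triu(ones(m,m)); zeros(n,m)];

C = cov(A);                      % m-by-m sample covariance (columns are centered)
C = (C + C')/2;                  % guard against round-off asymmetry before eig
ev = sort(eig(C), 'descend');    % eigenvalues, largest first

colVar   = var(A);               % per-column sample variance, close to p*(1-p)
totalVar = sum(colVar);          % equals trace(C), hence the sum of all eigenvalues

fprintf('largest column variance: %.4f\n', max(colVar));
fprintf('largest eigenvalue:      %.4f\n', ev(1));
fprintf('share on first axis:     %.1f%%\n', 100*ev(1)/totalVar);
```

The sum of the eigenvalues always equals the sum of the column variances, so with 20 nearly parallel columns the first eigenvalue can be far larger than any individual column variance.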

3 comments

David Goodmanson on 8 Mar 2019

Hi Penny,

There is plenty of information out there, starting with 'help pca' and then Wikipedia, but in brief: yes, the latent matrix is as you say, but there is no reason the variances need to be small. A variance is just the average of the squared deviations from the mean, and it can be large. If you take a set of data and multiply all the values by 10, the variance goes up by a factor of 100. It's not like the correlation coefficient, which is normalized and comes out between -1 and +1.
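To see the scaling point on a small case, here is a minimal sketch with two made-up 0/1 vectors (x and y are illustrative data, not taken from the question):

```matlab
x = [0 1 1 0 1 0 0 1]';          % made-up 0/1 data
y = [0 1 0 0 1 1 0 1]';

v1    = var(x);                  % variance of the original data
v10   = var(10*x);               % variance after multiplying every value by 10
ratio = v10 / v1;                % scaling by 10 multiplies the variance by 10^2

R1  = corrcoef(x, y);            % 2-by-2 correlation matrix
R10 = corrcoef(10*x, y);         % same data with x rescaled

fprintf('variance ratio: %g\n', ratio);
fprintf('correlation before and after scaling: %.4f, %.4f\n', R1(1,2), R10(1,2));
```

The variance grows with the square of the scale factor, while the correlation coefficient is unchanged because it is normalized by the standard deviations.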

The latent variable is as you say. Coefficients are the components of the principal axes, which are unit vectors, so the sum of squares of each column of the coefficient matrix is 1. Scores are the coordinates of each mean-centered measurement (row) along the principal axes; the variance of each column of scores is the corresponding latent value.
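These relationships can be checked numerically. The sketch below rebuilds the three quantities from an eigendecomposition of the covariance matrix instead of calling pca, assuming pca's default centering; the names V, lat and sc are illustrative stand-ins for coeff, latent and score, and individual columns may differ in sign from what pca returns:

```matlab
n = 50;
m = 20;
X = [ones(n,m); triu(ones(m,m)); zeros(n,m)];

Xc = X - mean(X);                        % center each column (implicit expansion, R2016b+)
C  = (Xc' * Xc) / (size(X,1) - 1);       % sample covariance matrix
C  = (C + C')/2;                         % guard against round-off asymmetry

[V, D]     = eig(C);                     % columns of V are unit-length principal axes
[lat, idx] = sort(diag(D), 'descend');   % stand-in for latent, largest first
V  = V(:, idx);                          % stand-in for coeff, reordered to match
sc = Xc * V;                             % stand-in for score: coordinates along the axes

colNorms = sum(V.^2);                    % sum of squares of each column, should be 1
scoreVar = var(sc)';                     % variance of each score column

fprintf('max |colNorms - 1|:   %.2e\n', max(abs(colNorms - 1)));
fprintf('max |scoreVar - lat|: %.2e\n', max(abs(scoreVar - lat)));
```

Both differences should be at round-off level: the axes are unit vectors, and the variance of the data along each axis is the matching eigenvalue.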

Penny13 on 11 Mar 2019

Thank you so much for your answer, David.
