PCA: number of attributes much bigger than number of data samples

Michał Kowalczyk on 23 Jul 2011
Hello, I would like to apply PCA to data in which I have 100 samples, each represented by 10000 variables. So we have the following situation: [m, n] = size(myData); gives m = 100 and n = 10000.
In this case, calling PCA this way:
[pc,score,latent,tsquare] = princomp(zscore(myData));
returns score and latent with only m-1 = 99 nonzero components. Everything above index 99 is equal to 0. Why? Can I trust the values returned by this function?
Thank you for any help. Michael

Accepted Answer

Daniel Shub on 23 Jul 2011
This is not how I typically run PCA; I usually have many more samples than variables. I think the components returned by PCA are still valid, in that each component explains the maximal remaining variance in the data. The reason you get only 99 nonzero components is that you have only 100 samples: princomp centers the data (and zscore does too), which removes one degree of freedom, so 100 centered samples span at most a 99-dimensional subspace and N-1 components explain all of the variance. When you have more variables than samples, the remaining directions carry zero variance and are not unique, so whatever coefficients are returned for them are arbitrary.
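
To see the N-1 limit directly, here is a minimal sketch using random data of the same shape as in the question. The data, the seed, and the variable names Z, s, tol and numNonzero are made up for illustration; it uses svd from base MATLAB instead of princomp, so it is only a check of the rank argument, not a reproduction of what princomp does internally.

```matlab
% Hypothetical data with the same shape as in the question (random, illustration only).
rng(0);                          % fix the seed so the run is repeatable
m = 100; n = 10000;
myData = randn(m, n);

% Standardize each column by hand (same idea as zscore, without the toolbox).
mu = mean(myData, 1);
sigma = std(myData, 0, 1);
Z = bsxfun(@rdivide, bsxfun(@minus, myData, mu), sigma);

% After centering, the rows of Z sum to zero, so they are linearly
% dependent and Z has rank at most m-1.
s = svd(Z);                      % min(m,n) singular values, in descending order
latent = s.^2 / (m - 1);         % variance along each principal direction

% Count the singular values that are above a rank-style tolerance.
tol = max(size(Z)) * eps(max(s));
numNonzero = sum(s > tol);       % expected to be m-1 for generic data
fprintf('Nonzero components: %d of %d\n', numNonzero, numel(s));
```

Since every standardized column has unit variance, sum(latent) should come out close to n, which is a quick way to confirm that the nonzero components account for all of the variance.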

More Answers (1)

Arturo Moncada-Torres on 24 Aug 2011
I recommend looking at this great tutorial by Will Dwinnel. I think you will find everything you need there.
