what does pc1 and pc2 represent?

Question

Neo 2023 年 2 月 8 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/1908530-what-does-pc1-and-pc2-represent

コメント済み: Neo 2023 年 2 月 12 日

Screenshot 2023-02-07 at 8.28.08 PM.png

Hello, I have a plot of pca 3d plot and the axes are pc1, pc2 and pc3. How do i determine what each pc component represents?

Also how can i establish the variance that each pc plot contributes? I attached an example for reference.

How can i determine this?

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Image Analyst 2023 年 2 月 8 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/1908530-what-does-pc1-and-pc2-represent#answer_1166775

The pca function will tell you what each PC is composed of. It gives you the weights of the original independent variables, right? You can think of a 3-D PCA as a rotation of a coordinate system.

6 件のコメント
4 件の古いコメントを表示4 件の古いコメントを非表示

Image Analyst 2023 年 2 月 12 日

編集済み: Image Analyst 2023 年 2 月 12 日

MATLAB Online で開く

kmeans_relabel_class_numbers.m

Once you know what class numbers are you can take, for example, the mean distance from the centroid. I'm sure you've figured it out by now, but anyway here's how I'd do it. The class assignments (indexes) and centroids are both returned from kmeans. Something like (untested)

numClasses = 3;
[assignedClasses, centroids] = kmeans(xy, numClasses);
for k = 1 : numClasses
    % Get points assigned to the k'th class
    thisxy = xy(assignedClasses == k, :);
    x = thisxy(:, 1);
    y = thisxy(:, 2);
    % Get distances from every point in this class to the centroid of this class.
    distances = sqrt((x - centroids(k, 1)) .^ 2 + (y - centroids(k, 2)) .^ 2);
    % Get mean and standard deviation of those distances.
    meanDistances(k) = mean(distances);
    stDevDistances(k) = std(distances);
end

If you want to reorder the class numbers, see the attached demo.

Neo 2023 年 2 月 12 日

Okay I will study this thank you image analyst, you are my inspiration

サインインしてコメントする。

Answer 2

John D'Errico 2023 年 2 月 10 日

1
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/1908530-what-does-pc1-and-pc2-represent#answer_1168690

MATLAB Online で開く

For example...

load fisheriris
whos
  Name           Size            Bytes  Class     Attributes

  cmdout         1x33               66  char                
  meas         150x4              4800  double              
  species      150x1             18100  cell                
[COEFF, SCORE, LATENT] = pca(meas);

So the first component is huge compared to the others, in terms of the total variance explained. The total variance in that system is:

sum(var(meas))
ans = 4.5730

And the

LATENT
LATENT = 4×1
    4.2282
    0.2427
    0.0782
    0.0238

So most of the variability lies in the first component.

But what do the components mean, physically? This is eomthing very difficult to know at times.

COEFF(:,1)
ans = 4×1
    0.3614
   -0.0845
    0.8567
    0.3583

Those coefficients represent the linear combination chosen of the various original variables. But trying to say what the linear combination means can be difficult. A biologist might try to infer some sort of meaning to those various weights. And I suppose you might decide that variables like {sepal length, sepal width, petal length, petal width} might all be related in some way, and in combination might tell you something about the plant. But don't go out of your way to try to assign some meaning here.

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

Image Analyst 2023 年 2 月 10 日

What about kmeans? That is unsupervised classification where you tell it the number of clusters you think the data should have and it assigns each data point to a cluster. You can certainly run it but you need to have many points. For example if you have a bunch of images that have 3 clusters that it's working well with and now pass it data with only 2 clusters but you force it to find three, it will take one of the clusters and split it up into two clusters so you will have 3 but some of those points will not be assiend to the the right cluster.

John D'Errico 2023 年 2 月 10 日

kmeans will work on 3d data. But kmeans is not the only clustering tool in existence. And yes, it helps if you have more data. The more data, the better is always the rule in anything. (Only once in a long career as a consultant did I ever tell a client they gave me more data than I really needed.) It also helps if you know how many clusters are to be found. I'm not a true statistician though, (I only play one in the movies) so I won't suggest if a better method is available for the clustering. That would surely depend on the actual data anyway.

サインインしてコメントする。

what does pc1 and pc2 represent?

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

採用された回答

6 件のコメント
4 件の古いコメントを表示4 件の古いコメントを非表示

その他の回答 (1 件)

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

参考

カテゴリ

タグ

Community Treasure Hunt

what does pc1 and pc2 represent?

0 件のコメント -2 件の古いコメントを表示-2 件の古いコメントを非表示

採用された回答

6 件のコメント 4 件の古いコメントを表示4 件の古いコメントを非表示

その他の回答 (1 件)

3 件のコメント 1 件の古いコメントを表示1 件の古いコメントを非表示

参考

カテゴリ

タグ

Community Treasure Hunt

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

6 件のコメント
4 件の古いコメントを表示4 件の古いコメントを非表示

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示