K-means clustering output matrix C (represents centroid locations)

Rui Silva
Rui Silva on 22 Sep 2022
Commented: Rui Silva on 22 Sep 2022
I am using the kmeans function to cluster a principal components matrix (pcs) of dimension 13149x10, where each column of pcs represents a principal component, in decreasing order of explained variance. The number of clusters that I want to use is k = 8. I want to plot the locations of the 8 centroids resulting from the kmeans algorithm, but the output matrix C (which gives the locations of the centroids) has a size of 8x10 (k-by-p). Why are there 10 columns/coordinates in C to define the position of each centroid? Shouldn't there be 2 columns?
The command that I am using is:
[idx,C,sumd,D] = kmeans(pcs(:,1:10),k,'Replicates',10,'Start','cluster');
Appreciate your help.
Thank you!


the cyclist
the cyclist on 22 Sep 2022
You have 8 centroids, each of which is a point in a 10-dimensional space. Each row gives the 10 coordinates needed to locate that centroid in 10 dimensions.
I'm not sure why you expected 2 columns.
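If the goal is a 2-D plot, one option is to project both the observations and the centroids onto the first two principal components, i.e. use only the first two columns of pcs and C. Here is a minimal sketch of that idea. The real pcs matrix is not available here, so the random 500x10 matrix below is a stand-in (an assumption for illustration only), and the plot styling is arbitrary. kmeans and gscatter require the Statistics and Machine Learning Toolbox.

```matlab
% Stand-in data (assumption): random values in place of the real 13149x10 pcs matrix.
rng(1);                      % make the example reproducible
pcs = randn(500, 10);        % 500 observations, 10 principal components
k = 8;

% Same call as in the question: clustering is done in all 10 dimensions.
[idx, C] = kmeans(pcs(:,1:10), k, 'Replicates', 10, 'Start', 'cluster');

% C is k-by-p: one row per centroid, one column per dimension.
disp(size(C))                % 8 10

% Plot only the first two coordinates of the observations and the centroids.
figure;
gscatter(pcs(:,1), pcs(:,2), idx);
hold on;
plot(C(:,1), C(:,2), 'kx', 'MarkerSize', 12, 'LineWidth', 2);
hold off;
xlabel('PC 1');
ylabel('PC 2');
title('Clusters and centroids projected onto PC 1 and PC 2');
```

Keep in mind that this is only a projection: the clusters were formed using all 10 components, so groups that overlap in the PC 1 / PC 2 plane may still be well separated in the other dimensions.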
Rui Silva
Rui Silva on 22 Sep 2022
Thanks for your reply.
I was confusing this with the examples in the documentation, which are in two-dimensional space. Now it makes sense.
Kind regards,
Rui Silva


