How to manually set K-means centroids when classifying an image

Hello World (wasn't that what the books told you to print way back when you started doing HTML?...)
I am exploring the kmeans function in MATLAB to classify an RGB image into three classes. I would like to force kmeans to use specific centroid locations. As far as I can tell from the documentation, I should use the 'start' option, but I cannot figure out how to set it correctly. In the images, I want to separate blue sky from water and land. Let's say that I find the sky to have an average RGB value of [120,130,190], water [110,150,150], and land [120,140,120]. Could any of you give an example of how to force kmeans to use these centroids? Thank you in advance for any input!


Shashank Prasanna on 27 Mar 2014
If your data matrix X is n-by-p and you want to cluster the data into 3 clusters, then each centroid is 1-by-p. You can stack the centroids for the 3 clusters into a single 3-by-p matrix and pass it to kmeans as the starting centroids:
C = [120,130,190;110,150,150;120,140,120];
I am assuming here that your matrix X is n-by-3.
This is explained in the kmeans documentation, under the 'start' option.
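For an actual image, the pixels first have to be arranged as an n-by-3 matrix. The following is only a minimal sketch: the image here is synthetic (built from the three RGB values in the question plus noise), and the variable names `img`, `labelsTrue`, and `labelImg` are made up for illustration. It assumes the Statistics Toolbox is available for kmeans.

```matlab
% Synthetic stand-in for an RGB image (assumption: use imread on real data).
rng(0);                                        % reproducible noise
C = [120,130,190; 110,150,150; 120,140,120];   % sky, water, land (from the question)
rows = 30; cols = 40;
labelsTrue = randi(3, rows, cols);             % hypothetical true class per pixel
img = reshape(C(labelsTrue(:), :), rows, cols, 3) + 3*randn(rows, cols, 3);
img = uint8(img);

% kmeans expects one observation per row: flatten to n-by-3 and convert to double.
X = double(reshape(img, [], 3));

% 'Start' takes a k-by-p matrix of initial centroids; here k = 3 and p = 3.
[idx, Cfinal] = kmeans(X, 3, 'Start', C);

% Put the cluster indices back into image shape.
labelImg = reshape(idx, rows, cols);
```

Note that C is only the starting point: kmeans still moves the centroids during its iterations, and the adjusted positions are returned in Cfinal.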
  1 Comment
Andreas Westergaard on 29 Mar 2014
Shashank, thank you very much. Just what I was requesting.


More Answers (2)

Tom Lane on 29 Mar 2014
If your goal is to specify the centroids in advance, and not just have kmeans start with them and adjust them as things go along, then I think you don't want to use kmeans at all. Just use pdist2, find the closest centroid for each point, and classify into the cluster defined by the closest centroid.
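A minimal sketch of that nearest-centroid approach, using the RGB values from the question. The pixel values in X are made up for illustration; on a real image X could come from double(reshape(img, [], 3)).

```matlab
C = [120,130,190; 110,150,150; 120,140,120];   % fixed centroids: sky, water, land

% Hypothetical pixel values, one RGB triple per row.
X = [118 131 188; 111 149 152; 121 139 119; 125 128 195];

D = pdist2(X, C);            % n-by-3 matrix of Euclidean distances to each centroid
[~, idx] = min(D, [], 2);    % index of the closest centroid for each row of X
```

Here the centroids never move, so the same C can be reused unchanged for every image.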
  2 Comments
Image Analyst on 29 Mar 2014
That is the main reason automatic thresholds are not always robust. If the thing you are looking for can occupy anywhere from 0% to 100% of an image, then thresholds that are picked automatically, or clustering that forces you to pick a certain number of clusters, are not robust. They will fail if the image does not contain the proper number of pixels belonging to those classes. For most or all of my color classification applications I use fixed values to determine the class. I use a training set to determine where the classes will be, and once I decide on them, they are fixed for all images. That way I can get area fractions for all color classes whether a class is absent, covers 100% of the image, or is somewhere in between. If your image had only one real cluster and you told kmeans to find 4 clusters, it would still chop the image up into 4 clusters, when it should actually report only one. If the 3 other "real" colors were also present, it would find all four accurately.
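To illustrate the area-fraction point, here is a rough sketch. It assumes that "fixed values" means assigning each pixel to the nearest of a fixed set of class centres, which is only one possible reading; the centres reuse the RGB values from the question, and the pixel data is made up so that one class (water) is missing entirely.

```matlab
C = [120,130,190; 110,150,150; 120,140,120];   % fixed class centres: sky, water, land

% Hypothetical image containing only sky and land pixels, no water.
X = [repmat([121 131 189], 6, 1); repmat([119 141 121], 4, 1)];

[~, idx] = min(pdist2(X, C), [], 2);           % nearest fixed class for each pixel

% Fraction of pixels per class; the [k 1] size keeps absent classes as zeros.
areaFraction = accumarray(idx, 1, [size(C,1) 1]) / numel(idx);
```

With fixed classes the missing water class simply gets an area fraction of zero, whereas kmeans with k = 3 would still split these pixels into three groups.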


Image Analyst on 27 Mar 2014
Please mark the Answer as accepted if that's what you were looking for. Thanks.
  3 Comments
Andreas Westergaard on 29 Mar 2014
Edited: Andreas Westergaard on 29 Mar 2014
Hi "Image Analyst", I tried the segmentation you suggested and it looks promising. I will pursue it a bit more. My initial idea of using kmeans was because I need to process images under different light conditions. Thank you again for your valuable input. By the way, I tried to accept your answer as well, but apparently I am only allowed to accept one answer. I gave a vote instead...

