What exactly does the sumd output of the kmeans clustering function calculate?

Onur Kapucu on 8 May 2018
Commented: Onur Kapucu on 8 May 2018
I am doing basic experiments with the kmeans function. As a very simple example, say I have a data set of 4 items with 1 attribute, which is their value:
Data=[1;2;3;4];
If I split this data set into 2 clusters, I should get one centroid at 1.5 and another at 3.5:
[idx,C,sumd]=kmeans(Data,2)
C =
1.5000
3.5000
and that is what I get. However, to my understanding, sumd in this case should be:
abs(1-1.5)+abs(2-1.5) or abs(3-3.5)+abs(4-3.5)
ans =
1
but I am getting sumd as:
sumd =
0.5000
0.5000
for both clusters, instead of 1 for each.
My question is: what exactly does sumd calculate?

Accepted Answer

Ameer Hamza on 8 May 2018
Edited: Ameer Hamza on 8 May 2018
If you look at the documentation of kmeans(), you will see that it uses the squared Euclidean distance by default, so sumd is the within-cluster sum of squared point-to-centroid distances. You should therefore calculate it like this:
abs(1-1.5).^2+abs(2-1.5).^2 or abs(3-3.5).^2+abs(4-3.5).^2
ans =
0.5 (both cases)
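
To make the arithmetic explicit, here is a minimal sketch that recomputes both candidate values by hand. It does not call kmeans; the assignment vector idx is an assumption (clusters {1,2} and {3,4}, as implied by the centroids in the question), and the names sumd_sq and sumd_abs are only illustrative.

```matlab
% Recompute sumd by hand for the clustering reported in the question.
Data = [1; 2; 3; 4];
idx  = [1; 1; 2; 2];   % assumed cluster assignment: {1,2} and {3,4}
C    = [1.5; 3.5];     % centroids reported by kmeans in the question

k = numel(C);
sumd_sq  = zeros(k, 1);   % within-cluster sums of squared distances
sumd_abs = zeros(k, 1);   % within-cluster sums of absolute distances
for j = 1:k
    d = Data(idx == j) - C(j);   % signed point-to-centroid differences
    sumd_sq(j)  = sum(d.^2);
    sumd_abs(j) = sum(abs(d));
end

disp(sumd_sq)    % 0.5 and 0.5, the sumd that kmeans returned
disp(sumd_abs)   % 1 and 1, the values expected in the question
```

Each point lies 0.5 from its centroid, so squaring gives 0.25 per point and 0.5 per cluster, whereas the unsquared distances add up to 1 per cluster.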

More Answers (1)

the cyclist on 8 May 2018
It's because the default distance metric is the squared Euclidean distance (used both for the minimization and for the reported sumd). See the Distance input parameter.
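
If the sum of absolute differences is what you want, one option is to change the metric. The sketch below assumes the Statistics and Machine Learning Toolbox is available and uses the 'cityblock' value of the Distance parameter. The 'Start' centroids [1; 4] are an arbitrary choice made here only so the run is reproducible, and check is a hypothetical helper variable.

```matlab
Data = [1; 2; 3; 4];

% Cluster with the city block (L1) distance instead of the default
% squared Euclidean distance. 'Start' fixes the initial centroids.
[idx, C, sumd] = kmeans(Data, 2, 'Distance', 'cityblock', 'Start', [1; 4]);

% Cross-check: recompute the within-cluster sums of absolute
% differences from the returned assignments and centroids.
check = zeros(2, 1);
for j = 1:2
    check(j) = sum(abs(Data(idx == j) - C(j)));
end

disp([sumd check])   % the two columns should agree
```

With this metric, sumd is the sum of absolute point-to-centroid differences, so for the clusters {1,2} and {3,4} with centroids 1.5 and 3.5 it works out to 1 per cluster, as in the original hand calculation.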
