Scatter-plot of data in which the cluster membership is coded by colors.

2 ビュー (過去 30 日間)
Mark S
Mark S 2021 年 5 月 31 日
編集済み: Mark S 2021 年 6 月 1 日
Hello, I have created a dendrogram of my given data.
NumCluster = 1566;
dist = pdist(alldata, 'euclidean');
GroupsMatrix = linkage(dist, 'complete');
clust = cluster(GroupsMatrix, 'maxclust', NumCluster);
E = evalclusters(alldata,clust,'CalinskiHarabasz')
[H,T,perm] = dendrogram(GroupsMatrix, 1566, 'colorthreshold', 'default');
I want to create now a scatter-plot of the data in which the cluster membership is coded by colors. I have tried to implement it like this
gscatter (alldata(:,1),alldata(:,2), E.OptimalY,'rbgk','xod')
Update:
The error is now gone but all the scatter plot has the same color. How can I cluster the membership by different colors? And is my E chosen correctly for the number of clusters? For my E I have 1566 cluster. I did not know if this is okay.
  2 件のコメント
the cyclist
the cyclist 2021 年 5 月 31 日
It can be difficult to diagnose issues without the data. Specifically, we don't see how E is defined. Can you upload the data, such that we can actually run your code and reproduce the error?
Mark S
Mark S 2021 年 6 月 1 日
Yes, I have attached the data

サインインしてコメントする。

回答 (1 件)

KSSV
KSSV 2021 年 6 月 1 日
編集済み: KSSV 2021 年 6 月 1 日
gscatter (alldata(:,1),alldata(:,2),clust,'rbgk','xod')
Check the option E.OptimalY, it is empty []. So all the points are shown by same color/ maekers.
  3 件のコメント
KSSV
KSSV 2021 年 6 月 1 日
編集済み: KSSV 2021 年 6 月 1 日
alldata = csvread('Data.csv') ;
NumCluster = 10; % <-----change cluster number here
dist = pdist(alldata, 'euclidean');
GroupsMatrix = linkage(dist, 'complete');
clust = cluster(GroupsMatrix, 'maxclust', NumCluster);
E = evalclusters(alldata,clust,'CalinskiHarabasz') ;
gscatter (alldata(:,1),alldata(:,2),clust)
Mark S
Mark S 2021 年 6 月 1 日
編集済み: Mark S 2021 年 6 月 1 日
Thanks. It works fine. One additional question: How can I find an optimal cluster number? Is it best to vary the NumCluster myself or is there another method? I have found this in the matlab help:
https://it.mathworks.com/matlabcentral/answers/76879-determining-the-optimal-number-of-clusters-in-kmeans-technique
klist=2:500;%the number of clusters you want to try
myfunc = @(X,K)(kmeans(X, K));
eva = evalclusters(alldata,myfunc,'CalinskiHarabasz','klist',klist)
classes=kmeans(alldata,eva.OptimalK);
I get here for my optimalK=3. But I am not sure if this is ok. Is the calculation for the optimal cluster numbers so ok?

サインインしてコメントする。

タグ

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by