How can the data be accurately clustered?

Question

Med Future, 17 Feb 2024

Direct link to this question

https://jp.mathworks.com/matlabcentral/answers/2083163-how-can-the-data-be-accurately-clustered

Commented: Med Future, 19 Feb 2024

Greetings. I have a dataset named "Data" with four columns. I ran the K-means algorithm with K = 6 and stored the rows of each cluster in the cells of the variable "clusters". However, I have noticed that some values are assigned to the wrong clusters.

I need guidance on reassigning the clusters using a specific algorithm. The ground truth is shown in the attached image. I would greatly appreciate any help.

disp('Calculating Centroid')
K = 6;
% Partition the rows of Data into K clusters
[idx, C, sumdist] = kmeans(Data, K, 'Display', 'final');
dataset = Data;
% Append the cluster index as a fifth column (no preallocation needed)
dataset_idx = [dataset(:,1:4), idx];
% Collect the rows of each cluster in its own cell
clusters = cell(K,1);
for i = 1:K
    clusters{i} = dataset_idx(dataset_idx(:,5) == i, :);
end
cluster_assignments = idx;
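One thing worth checking in the code above: `kmeans` uses squared Euclidean distance by default, so a column with values in the millions can outweigh a column with values in the tens. The following is a minimal sketch, not the poster's method. It assumes the ground truth depends mainly on columns 2 and 4, and it uses made-up data (`pairs`, `n`, and `block` are hypothetical names) because the real `Data` is not shown.

```matlab
% Synthetic stand-in for Data; the real values are not shown in the post.
rng(0);                                          % reproducible random numbers
pairs = [1200000 10; 1250000 30; 1360000 20];    % hypothetical (column 2, column 4) pairs
n = 40;                                          % rows per group
Data = zeros(0, 4);
for g = 1:size(pairs, 1)
    block = [rand(n,1), pairs(g,1) + 1000*randn(n,1), ...
             rand(n,1), pairs(g,2) + 0.5*randn(n,1)];
    Data = [Data; block];                        %#ok<AGROW>
end

K = size(pairs, 1);
X = zscore(Data(:, [2 4]));                      % standardize so both columns carry equal weight
[idx, C] = kmeans(X, K, 'Replicates', 10);       % keep the best of 10 random starts

% Cross-tabulate the found clusters against the groups used to build the data.
truth = repelem((1:K)', n);
disp(crosstab(truth, idx))
```

Cluster numbers returned by `kmeans` are arbitrary, so a cross-tabulation like the one printed here is a more reliable check than comparing label values directly.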

7 Comments (5 older comments not shown)

Med Future, 18 Feb 2024

@Image Analyst I believe I may not have communicated my requirement clearly earlier. Let me provide more detail:

I have four features in columns 1, 2, 3, and 4. My goal is to use machine learning or deep learning methods to cluster these values. For instance, I know that the ground truth of Cluster 1 has feature values of 1,200,000 and 1,250,000 in Column 2. Corresponding to these values, Column 4 contains the value 30.

The issue arises when using K-means clustering, as it often misclusters the results, as shown in the cluster cell. In such cases, we need to be able to reassign the clusters based on the feature values.
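If the ground-truth feature values are known in advance, one possible way to reassign rows is to skip the random initialization and label each row by its nearest known prototype. The sketch below assumes one prototype per cluster given as a (column 2, column 4) pair. The numbers in `prototypes` and `X` are placeholders, not the values from the attached image, and both variable names are hypothetical.

```matlab
% Hypothetical prototypes: one row per ground-truth cluster, [column 2, column 4].
prototypes = [1200 10;
              1230 20;
              1250 30];

% Small made-up sample with the same two columns.
X = [1201 10;
     1249 30;
     1231 20;
     1199 10];

% Divide each column by the prototype spread so neither column dominates.
scale = std(prototypes, 0, 1);
D = pdist2(X ./ scale, prototypes ./ scale);   % rows: samples, columns: prototypes

% Assign each row to the closest prototype.
[~, newIdx] = min(D, [], 2);
disp(newIdx')                                  % prints 1 3 2 1
```

The same prototypes could instead be passed to `kmeans` through its `'Start'` option as a K-by-p matrix of initial centroids, which lets the algorithm refine them rather than fixing them.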

Med Future, 19 Feb 2024

clusters.mat

@Image Analyst

I have attached the file clusters.mat, in which each cluster is stored with its specific values. You can also see the ground-truth image. I am getting the following problems:

- Cluster 1:
  - Represents correctly clustered values
- Cluster 2:
  - Missing values: 1360, 1380
  - Contains some values from Cluster 5 with column 4 values of 10, 20, 30, 40
- Cluster 3:
  - Should only include values of 1250
  - Includes values from Cluster 4 and Cluster 6 with values of 1200
  - Values of 1230 with corresponding column 4 value of 20 are moved to Cluster 4
- Cluster 4:
  - Contains values of 1230 with corresponding column 4 value of 20 from Cluster 3
- Cluster 5:
  - Includes values from Cluster 2 with column 4 values of 10, 20, 30, 40
- Cluster 6:
  - Contains only the 2nd column value of 1210
  - Corresponding column 4 value of 10

See the attached image below. Please help me solve this issue.

Ground Truth Image


Answer 1

Catalytic, 18 Feb 2024

Direct link to this answer

https://jp.mathworks.com/matlabcentral/answers/2083163-how-can-the-data-be-accurately-clustered#answer_1411113

Maybe clusterdata would be useful. It has lots of different algorithmic options you could try.
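As a rough illustration of what trying different options could look like, the sketch below runs `clusterdata` with two linkage methods on made-up, well-separated data. The data, the choice of `'ward'` and `'average'`, and the cluster count of 3 are assumptions for the example only, not settings recommended for the poster's dataset.

```matlab
% Made-up data: three well-separated groups of 30 points each.
rng(1);                                      % reproducible random numbers
X = [0.2*randn(30,2) + [0 0];
     0.2*randn(30,2) + [5 5];
     0.2*randn(30,2) + [0 5]];

Z = zscore(X);                               % standardize the columns first

% Agglomerative clustering with two different linkage methods.
T_ward = clusterdata(Z, 'Linkage', 'ward',    'MaxClust', 3);
T_avg  = clusterdata(Z, 'Linkage', 'average', 'MaxClust', 3);

% Compare the two labelings; label numbers are arbitrary, so use a cross-tabulation.
disp(crosstab(T_ward, T_avg))
```

Other name-value options such as `'Distance'` can be varied the same way, and each result compared against the ground truth with `crosstab`.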

1 Comment

Med Future, 18 Feb 2024

@Catalytic However, when using this clustering method, there are instances where it misclusters the results. In such cases, it becomes necessary to reassign the clusters based on the feature values. Could you assist me with this?
