How to run k means clustering on multiple items in a text file

Question

Example3.txt

I am trying to find a way to run k-means clustering on data from the attached text file. The first column identifies an individual item; for example, A8888888888888888880011 is one item, and it appears several times in the data. I would like to run k-means clustering for each item in the text file (the attached file has been shortened by a few thousand lines). I read the data in and sort it using:

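fid = fopen(filename);
cell_data = textscan(fid, '%s %n %n');   % column 1: item ID, columns 2-3: numeric
fclose(fid);
% [cell_data{:}] cannot join the string IDs with the numeric columns,
% so hold them in a table instead (requires R2013b or later).
table_data = table(cell_data{1}, cell_data{2}, cell_data{3});
sort_data = sortrows(table_data, [1, 3]);   % sort by item ID, then by column 3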

After this, I would like to have one coordinate for each item, the coordinate being the point (column b, column c) returned by k-means clustering.

So after running this on the text file, I would end up with 3 points: one each for items 11, 12, and 13.

I hope this makes sense, but if there are any questions, please ask.

4 Comments

sixwwwwww, 14 October 2013

In your example, you have a cluster of 8 elements (7 x and 1 y). Can we take the mean of the 8 points as the cluster point, or must we select exactly one of these 8 points?

Jonathan LeSage, 14 October 2013

Could you clarify what data you are trying to cluster in your example? Are you trying to cluster column one data or column two as one-dimensional vectors? Or are you trying to cluster two-dimensional data?

Answer 1

Yatin, 14 October 2013

Hi,

You can try the "kmeans" function once your data is arranged in the format it requires: a numeric matrix with one observation per row. Its second output is the matrix of cluster centroids, and each centroid is the best single-point representation of its cluster. For more information, see the documentation:

http://www.mathworks.com/help/stats/kmeans.html
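
As a rough illustration of the suggestion above, here is a minimal sketch that groups rows by item ID and calls kmeans once per item. The sample IDs and coordinates are made up for illustration and are not taken from Example3.txt, and k = 1 is an assumption based on the question asking for a single point per item. The kmeans function requires the Statistics Toolbox.

```matlab
% Hypothetical sample standing in for the parsed file:
% item IDs (column 1) and the two numeric columns (columns 2 and 3).
items = {'A11'; 'A11'; 'A12'; 'A12'; 'A13'; 'A13'};
xy    = [1 2; 3 4; 10 10; 12 14; 5 5; 7 9];

% Map each distinct item ID to an integer group index.
[unique_items, ~, group] = unique(items);

k = 1;  % assumed: one representative point per item
centroids = zeros(numel(unique_items), size(xy, 2));
for g = 1:numel(unique_items)
    pts = xy(group == g, :);      % all rows belonging to this item
    [~, C] = kmeans(pts, k);      % C holds the k centroid locations
    centroids(g, :) = C;          % valid as written only for k = 1
end
```

With k = 1 and the default squared Euclidean distance, the centroid of each group is simply the mean of its points, so mean(pts, 1) would give the same result without the toolbox. For a larger k, each item would yield k rows, and centroids would need to be stored in a cell array or a 3-D array instead.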
