The "cluster" command on big data: it used to run fast but now runs slowly

Kerim Khemraev on 30 Nov 2016
Edited: Kerim Khemraev on 1 Dec 2016
Hello!
I have a matrix in the variable "dat".
Number of rows = 564372
Number of columns = 11
Each row represents an observation, and I need to cluster this data. The "kmeans" command runs fast, and now I'm trying agglomerative clustering. I computed the linkage (it took about 8 hours) with the command:
Z=linkage(dat,'centroid','euclidean','savememory','on');
Then I came home and computed a few clusterings with different thresholds:
T=cluster(Z,'cutoff',1.4);
I was extremely surprised when I saw that the cluster computation took only 10-15 seconds and the result was fine. Then I saved my linkage data:
dlmwrite('Z-linkage.txt',Z);
The next day I launched MATLAB, imported Z-linkage.txt and tried to compute the clusters again. This time it runs very slowly: it may take hours, and I have no idea what the problem is.
Please help!
Thank you for any suggestions.
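
One thing worth checking, offered as a hypothesis rather than a confirmed diagnosis: dlmwrite writes only 5 significant digits by default, so a linkage matrix reloaded from a text file may not equal the one that was saved. With 564372 observations, the cluster indices in the first two columns of Z run past one million, which does not fit in 5 significant digits. The sketch below uses a small synthetic matrix X (a hypothetical stand-in for dat) and temporary file names to compare a MAT-file round trip with text round trips.

```matlab
% Minimal sketch on synthetic data; X is a hypothetical stand-in for dat.
rng(0);                                   % fixed seed so the example is repeatable
X = rand(200, 3);                         % 200 observations, 3 variables
Z = linkage(X, 'centroid', 'euclidean');  % same method and metric as in the question

% Option 1: binary MAT-file, which stores doubles exactly.
matFile = [tempname '.mat'];
save(matFile, 'Z');
S = load(matFile);
Zmat = S.Z;

% Option 2: text file with the default dlmwrite precision (5 significant digits).
txtFile = [tempname '.txt'];
dlmwrite(txtFile, Z);
Ztxt = dlmread(txtFile);

% Option 3: text file with 17 significant digits, which should preserve doubles.
txtFileFull = [tempname '.txt'];
dlmwrite(txtFileFull, Z, 'precision', '%.17g');
Zfull = dlmread(txtFileFull);

% Compare each reloaded matrix with the original.
exactMat = isequal(Zmat, Z);
exactTxt = isequal(Ztxt, Z);
maxErrFull = max(abs(Zfull(:) - Z(:)));
fprintf('MAT-file exact: %d\n', exactMat);
fprintf('Default text exact: %d\n', exactTxt);
fprintf('Max error with %%.17g text: %g\n', maxErrFull);

% Clean up the temporary files.
delete(matFile);
delete(txtFile);
delete(txtFileFull);
```

If the reloaded matrix differs from the original, running cluster on the copy saved with save and load is a quick way to see whether the slowdown comes from the file round trip.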

Answers (1)

John D'Errico on 30 Nov 2016
Since we have absolutely nothing to go on about the actual data, I can only guess.
Clustering tools usually use random starts. That means you may get lucky sometimes and see rapid convergence.
  1 Comment
Kerim Khemraev on 1 Dec 2016
Edited: Kerim Khemraev on 1 Dec 2016
Thank you for the reply.
Maybe, but I computed clusters many times (about 50) during one MATLAB session, and it was always fast. I don't think I got a lucky random start that many times.
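
A small check of this point, as a sketch under stated assumptions: unlike kmeans, cutting a precomputed hierarchical tree with cluster involves no random initialization, so repeated calls on the same Z should return identical labels whatever the random number state. The matrix X below is hypothetical synthetic data, not the data from the question.

```matlab
% Minimal sketch on synthetic data; X is hypothetical, not the poster's dat.
rng(1);                                   % fixed seed for the synthetic data
X = rand(300, 4);                         % 300 observations, 4 variables
Z = linkage(X, 'centroid', 'euclidean');  % same method and metric as in the question

% Cut the tree twice, changing the random state in between.
% Note: with only 'cutoff' given, cluster uses the 'inconsistent' criterion.
T1 = cluster(Z, 'cutoff', 1.4);
rng('shuffle');
T2 = cluster(Z, 'cutoff', 1.4);

% The two label vectors should match if no randomness is involved.
sameLabels = isequal(T1, T2);
fprintf('Labels identical across calls: %d\n', sameLabels);
```

If the labels match, the difference between the fast and the slow session is more likely to come from the Z matrix itself than from a random start.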
I can share my data. Here it is:
