Clustering/sorting of close data points

Question

0 votes

Hello all,

I would like to group data points that are close to each other. This is what my data looks like: 214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570 ...

Can anybody suggest an easy way to do this?

Thanks.

Raju

6 comments

Raju Kumar on 11 Jul 2022

Edited: Raju Kumar on 11 Jul 2022

Hello @Walter Roberson,

Thanks for your reply. How would you choose 'tol' for the following data if I want to group two or more close values?

sortedData = uniquetol(Data, tol?) % Data = [214981 366893 455877 455877 455877 455878 ...]

Raju

Walter Roberson on 11 Jul 2022

@Raju Kumar

Consider, for example, your value 455877: what are the minimum and maximum values that you would want to be considered part of the same group as 455877, if those values were encountered as data?

Consider also 214981: what are the minimum and maximum values that you would want to be considered part of the same group as 214981, if those values were encountered as data?

I ask about two different values because the boundary for higher values might not have the same range as for lower values. For example, in [1 8 25] the 8 might be considered relatively far from the 1 (since it is 8 times the value), whereas by the time you get to 200000, the value 200010 might be considered "close" to 200000, since the difference is small relative to the value.
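Once the acceptable range around each value is decided, one way to apply it is MATLAB's `uniquetol`. The following is only a sketch under an assumption: the value `absTol = 5` is a hypothetical absolute tolerance chosen for illustration, not something stated in this thread. By default `uniquetol` scales the tolerance by the largest absolute value in the data, so `'DataScale', 1` is passed here to make the tolerance an absolute difference.

```matlab
% Sample values copied from the question.
data = [214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570];

% ASSUMPTION: values within 5 units of each other belong together.
absTol = 5;

% With 'DataScale', 1, two values u and v are treated as equal
% when abs(u - v) <= absTol.
[reps, ~, groupId] = uniquetol(data, absTol, 'DataScale', 1);

% reps holds one representative per group (sorted ascending);
% groupId(k) is the index into reps for data(k).
disp(reps);
disp(groupId(:)');
```

If closeness should instead depend on the size of the values, as in the 200000 versus 200010 example, the tolerance would have to be chosen differently for small and large values, which is exactly why the question about each value's range matters.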


Answer 1

Image Analyst on 11 Jul 2022

Edited: Image Analyst on 12 Jul 2022

0 votes

This is really too broad a question to answer yet. You haven't even plotted your data or told us what "close" is. I suggest you start by reading this page:

https://www.mathworks.com/help/stats/cluster-analysis.html

Maybe you can simply take the histogram.
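As a rough illustration of the histogram idea, the sketch below bins the sample values with `histcounts` and uses the bin index as a coarse group label. The bin width of 1e5 and the upper edge of 1.5e6 are assumptions picked to fit the sample values, not values suggested in this thread.

```matlab
% Sample values copied from the question.
data = [214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570];

% ASSUMPTION: a fixed bin width of 1e5 is coarse enough to keep
% near-identical values together.
binWidth = 1e5;
edges = 0:binWidth:1.5e6;

% counts(k) is the number of values in bin k;
% binIdx(k) is the bin that data(k) falls into.
[counts, ~, binIdx] = histcounts(data, edges);

% Bins that actually contain data.
occupiedBins = find(counts > 0);
disp(occupiedBins);
disp(binIdx);
```

One caveat with fixed bins: two values that are very close but sit on either side of a bin edge end up in different bins, so this is best treated as a first look at the data rather than a final grouping.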

If you have any more questions, then attach your data and code to read it in with the paperclip icon after you read this:

TUTORIAL: How to ask a question (on Answers) and get a fast answer

This is how I'd classify your data. Basically, you can do it manually for so few data points. For far more data points, you can try the Classification Learner app on the Apps tab of the tool ribbon. Even for these few, it looks like an SVM might work well. But please attach far more data so we can find the best classifier.

data = [214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570]';

classes = [1,1,1,1,1,1,2,2,3,3]';

plot(data, 'b.', 'MarkerSize', 30);

grid on;

2 comments

Raju Kumar on 12 Jul 2022

Edited: Raju Kumar on 12 Jul 2022

matlab.mat

Hi @Image Analyst

Thanks for the pointers and information. Here is the data file. I basically want to group, or number, the values whenever they are close, using an increasing counter (1, 2, 3, and so on): number 1 should be assigned to the first set of close values, number 2 to the second set, and so on.

Raju
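The numbering described above can be sketched by sorting the values and starting a new group whenever the gap to the previous value exceeds a threshold. This is a minimal sketch, not a method proposed by anyone in the thread, and the threshold `gapTol = 5` is a hypothetical value that would need to be chosen from the actual data.

```matlab
% Sample values copied from the question.
data = [214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570];

% ASSUMPTION: a gap larger than 5 between neighbouring sorted values
% starts a new group.
gapTol = 5;

% Work on a sorted column vector so neighbours are adjacent.
sortedData = sort(data(:));

% The first value always starts group 1; each later value starts a
% new group only if it is more than gapTol above its predecessor.
isNewGroup = [true; diff(sortedData) > gapTol];

% The running count of group starts gives the labels 1, 2, 3, ...
groupId = cumsum(isNewGroup);

disp([sortedData, groupId]);
% groupId is [1 2 3 3 3 3 4 4 5 5]' for the sample values
```

Unlike a fixed tolerance around a representative value, this gap rule chains neighbours together, so a long run of slowly increasing values would all receive the same number.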

Image Analyst on 12 Jul 2022

raju.mat

Exactly how are you seeing clusters in that data?

s = load('raju.mat')

s = struct with fields:

ToA: [26182×1 double]

toa = s.ToA;

classes = [1,1,1,1,1,1,2,2,3,3]';

plot(toa, 'b.', 'MarkerSize', 10);

grid on;

xlabel('Index of Vector')

ylabel('Value of toa')
