Extracting some points and finding some nearest elements.
I have data that I clustered with the dbscan method. Now I need to pick 5 different elements from each cluster, find the 5 nearest elements of each of those points, and group them.
In the figure below, some points are marked (in pencil) and their 5 nearest elements are grouped (black circles).
[I marked only 3 clusters as an example; I need this for all of the clusters.]
After that, how can I remove the clusters that do not have 5 nearest elements? Can anybody please help me?
clc;
clear;
data=xlsread('glass.xlsx');
minpts=6;
epsilon=4;
[idx, corepts] = dbscan(data,epsilon,minpts);
gscatter(data(:,1),data(:,2),idx);

Answers (1)
Image Analyst
8 Mar 2020
I don't even think you need dbscan for this. You just need to define a length that separates "near enough" and "too far away". Then you just check every point in the array to see if it has 5 that are near enough, and keep those.
nearEnough = 0.02; % Distance threshold; whatever you want.
x = data(:,1);
y = data(:,2);
indexesToKeep = false(size(x)); % Initialize to not keeping any of them.
for k = 1 : length(x)
    % Euclidean distance from the k'th point to every point (including itself).
    distances = sqrt((x(k) - x).^2 + (y(k) - y).^2);
    % Subtract 1 so the k'th point (distance 0) is not counted as its own neighbor.
    if sum(distances <= nearEnough) - 1 >= 5
        % At least 5 other points are close enough to this k'th point, so keep this point.
        indexesToKeep(k) = true;
    end
end
x = x(indexesToKeep);
y = y(indexesToKeep);
12 Comments
sreelekshmi ms
8 Mar 2020
Image Analyst
8 Mar 2020
Not sure what you're saying. But my code should work. You can apply it to each colored group (that comes from dbscan) one at a time if you want.
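Applying the neighbor check to each dbscan group one at a time could look roughly like the sketch below. It is only an illustration: the small `data` and `idx` arrays are made-up stand-ins for the question's data and dbscan labels, and `minNeighbors`, `keep`, `keptData`, and `keptIdx` are hypothetical names. It also assumes MATLAB R2016b or later for implicit expansion.

```matlab
% Made-up stand-in for the question's data and dbscan labels (assumption):
% cluster 1 is a tight group of 6 points, cluster 2 has only 2 points.
data = [0 0; 0.01 0; 0 0.01; 0.01 0.01; 0.005 0.005; 0.005 0; 1 1; 1.01 1];
idx  = [1; 1; 1; 1; 1; 1; 2; 2];
nearEnough = 0.02;    % distance threshold, whatever suits your data
minNeighbors = 5;     % required number of near neighbors

keep = false(size(idx));             % points that pass the check
clusterIds = unique(idx(idx > 0));   % dbscan labels noise as -1, so skip it
for c = clusterIds.'
    members = find(idx == c);
    x = data(members, 1);
    y = data(members, 2);
    % Pairwise distances within this cluster only (implicit expansion).
    D = sqrt((x - x.').^2 + (y - y.').^2);
    % Count neighbors within range, excluding the point itself.
    numNear = sum(D <= nearEnough, 2) - 1;
    keep(members) = numNear >= minNeighbors;
end
keptData = data(keep, :);
keptIdx  = idx(keep);
```

With this toy input, all 6 points of cluster 1 are kept and both points of cluster 2 are dropped, so a cluster with no surviving points simply disappears from `keptIdx`.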
sreelekshmi ms
8 Mar 2020
Image Analyst
9 Mar 2020
For step 2, describe how you pick those 5 points from all the points in that class.
Not sure what step 4 is supposed to do.
What is your definition of "near" or "not near"? How far -- what distance is that?
sreelekshmi ms
9 Mar 2020
Edited: sreelekshmi ms
9 Mar 2020
Image Analyst
9 Mar 2020
I still have no idea how you're going to pick the first 5 points. Let's say you have 7000 points and there are 1000 points in each of 7 clusters. Now, which 5 of those 7000 would you pick in your step 2?
And once you've picked those initial 5 points, you will check to see how many "near" neighbors each has. Like point 1 may have 20 near neighbors, point 2 may have 3 near neighbors, point 3 may have 6 near neighbors, point 4 may have 250 near neighbors, and point 5 may have 2 near neighbors. So points 1, 3, and 4 have more than 5 near neighbors and go into "class 1" while points 2 and 5 have fewer than 5 near neighbors and so they go into class 2. Class 2 has 6998 points - all except the two points that have at least 5 near neighbors. Is that correct?
sreelekshmi ms
9 Mar 2020
Image Analyst
9 Mar 2020
So when you're picking the 5 from each class "that are maximum far apart", how do you define that? Do you look at each point in the class and
- find the distance to the nearest other point, or
- find the average distance from every other point in the class, or
- find the average distance to a certain number of points, like the average distance to the 8 closest other points?
Are you using any of those definitions of maximum? Or some other definition?
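To make the three candidate definitions above concrete, here is a rough sketch that computes each of them for every point of one class. The `pts` array is made-up data, `k` is set to 2 only because the toy class is tiny (the text above uses 8), and the final selection rule is just one possible assumption, not necessarily what the asker has in mind. It assumes MATLAB R2016b or later for implicit expansion.

```matlab
% Made-up points of one class (assumption, for illustration only).
pts = [0 0; 1 0; 0 1; 5 5; 2 2];
k = 2;   % number of closest points to average over

n = size(pts, 1);
x = pts(:, 1);
y = pts(:, 2);
% Pairwise Euclidean distances (implicit expansion).
D = sqrt((x - x.').^2 + (y - y.').^2);
D(1:n+1:end) = Inf;           % ignore each point's distance to itself

sortedD = sort(D, 2);         % each row ascending; the Inf ends up last
nearestDist = sortedD(:, 1);                % distance to the nearest other point
meanDistAll = mean(sortedD(:, 1:n-1), 2);   % average distance to every other point
meanDistK   = mean(sortedD(:, 1:k), 2);     % average distance to the k closest points

% One possible selection rule (assumption): rank points by a chosen score.
[~, order] = sort(meanDistK, 'descend');
mostIsolated = order(1);      % index of the point with the largest score
```

Taking `order(1:5)` instead of `order(1)` would give the 5 highest-scoring points of a larger class, though whether that matches "maximum far apart" depends on which definition is intended.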
sreelekshmi ms
10 Mar 2020
sreelekshmi ms
10 Mar 2020
sreelekshmi ms
10 Mar 2020
sreelekshmi ms
11 Mar 2020