Extracting some points and finding some nearest elements.

I have data that I clustered with the dbscan method. Now I need to pick 5 different elements from each cluster, then find the 5 nearest elements of each picked point and group them.
In the figure below, some points are marked (in pencil) and their 5 nearest elements are grouped (black circles).
[I marked only 3 clusters as an example; I need this for all clusters.]
After that, how can I remove the clusters that do not have 5 nearest elements? Anybody, please help me.
clc;
clear;
data=xlsread('glass.xlsx');
minpts=6;
epsilon=4;
[idx, corepts] = dbscan(data,epsilon,minpts);
gscatter(data(:,1),data(:,2),idx);

Answers (1)

Image Analyst on 8 Mar 2020

0 votes

I don't even think you need dbscan for this. You just need to define a length that separates "near enough" and "too far away". Then you just check every point in the array to see if it has 5 that are near enough, and keep those.
nearEnough = 0.02; % Whatever you want.
x = data(:,1);
y = data(:,2);
indexesToKeep = false(1, length(x)); % Initialize to not keeping any of them.
for k = 1 : length(x)
    distances = sqrt((x(k) - x).^2 + (y(k) - y).^2);
    % Count points within nearEnough, excluding the k'th point itself (distance 0).
    if sum(distances <= nearEnough) - 1 >= 5
        % At least 5 are close enough to this k'th point, so keep this point.
        indexesToKeep(k) = true;
    end
end
x = x(indexesToKeep);
y = y(indexesToKeep);

12 Comments

sreelekshmi ms on 8 Mar 2020
Thank you sir.
I need to find the dense areas; that is why I used dbscan. After this step, I want to merge these clusters (as linkage does).
The steps are: find the dense areas, choose 5 points from each dense area, group the 5 nearest elements of each chosen point, merge the groups, and remove those that do not have at least 5 elements.
dist2 = (data(:,1) - P(:,1).').^2 + (data(:,2) - P(:,2).').^2;
[~,id] = mink(dist2,5,1);
clusters = data(id);
link=linkage(clusters);
figure()
dendrogram(link)
uni=length(link);
outl=rmoutliers(link);
In the step that finds the nearest elements, the clusters I get depend on the nearest values. How can I get the correct result?
P holds the 5 different elements chosen from each cluster.
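One likely cause of the unexpected result in the snippet above is `clusters = data(id)`: because `id` contains row numbers, indexing `data` with it alone uses linear indexing and returns single values from the first column rather than whole coordinate rows. A minimal sketch of collecting the 5 nearest rows per chosen point follows. The synthetic `data`, the hand-picked `P`, and the `groups` cell array are assumptions for illustration only, not part of the original post.

```matlab
% Synthetic stand-ins (assumptions): data is N-by-2, P holds the chosen points.
rng(0);                              % Reproducible example
data = rand(200, 2);
P = data([10 50 90 130 170], :);     % Hypothetical choice of 5 points

% Squared distance from every data point (rows) to every point in P (columns).
dist2 = (data(:,1) - P(:,1).').^2 + (data(:,2) - P(:,2).').^2;

% Row indices of the 5 smallest distances in each column (requires R2017b+ for mink).
[~, id] = mink(dist2, 5, 1);         % id is 5-by-size(P,1)

% Collect full coordinate rows for each group instead of data(id).
groups = cell(size(P, 1), 1);
for j = 1:size(P, 1)
    groups{j} = data(id(:, j), :);   % 5-by-2 block of the neighbors of P(j,:)
end
```

Because `P` is drawn from `data` here, the first neighbor of each point is the point itself. If that should not count, one option is to request 6 with `mink` and drop the first row of each group.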
Image Analyst on 8 Mar 2020
Not sure what you're saying. But my code should work. You can apply it to each colored group (that comes from dbscan) one at a time if you want.
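Applying the neighbor check to one dbscan group at a time could look roughly like the sketch below. It assumes `idx` holds the cluster labels returned by `dbscan` (which uses -1 for noise); the synthetic `data`, the hand-made `idx`, the threshold value, and the names `keep`, `keptData`, and `keptIdx` are illustrative assumptions.

```matlab
% Synthetic stand-ins (assumptions): in the thread, idx comes from
% dbscan(data, epsilon, minpts).
rng(1);
data = [0.05*randn(50, 2); 0.05*randn(50, 2) + 1];  % Two tight blobs
idx  = [ones(50, 1); 2*ones(50, 1)];                % Hypothetical cluster labels
nearEnough   = 0.05;                                % Distance threshold (assumption)
minNeighbors = 5;

keep = false(size(data, 1), 1);
labels = unique(idx(idx > 0));       % dbscan marks noise as -1; skip it
for c = labels.'
    members = find(idx == c);        % Rows belonging to cluster c
    x = data(members, 1);
    y = data(members, 2);
    for k = 1:numel(members)
        d = sqrt((x(k) - x).^2 + (y(k) - y).^2);
        % Count same-cluster neighbors within the threshold, excluding the point itself.
        if sum(d <= nearEnough) - 1 >= minNeighbors
            keep(members(k)) = true;
        end
    end
end
keptData = data(keep, :);
keptIdx  = idx(keep);
```

Plotting `gscatter(keptData(:,1), keptData(:,2), keptIdx)` afterwards would show only the points that passed the check, still colored by their original group.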
sreelekshmi ms on 8 Mar 2020
Yes sir, it is giving x and y.
How can I apply it to each colored group (that comes from dbscan)? I need to apply it to all groups.
I need the following steps:
  1. Group the data based on density (group the dense regions).
  2. Find 5 points from each group.
  3. Find the 5 nearest elements of each point from step 2 and group them (and plot them).
  4. Merge the data (like linkage).
  5. Remove those that do not have at least 5 nearest elements.
Please help me.
Image Analyst on 9 Mar 2020
For step 2, describe how you pick those 5 points from all the points in that class.
Not sure what step 4 is supposed to do.
What is your definition of "near" or "not near"? How far -- what distance is that?
sreelekshmi ms on 9 Mar 2020
Edited: sreelekshmi ms on 9 Mar 2020
In step 2, select the 5 elements that are far from each other.
In step 4, if any two of the points from step 2 do not have 5 nearest elements, merge them into one group (for example, one group contains 2 elements and the other contains 3). After that, remove the groups that still do not have 5 elements, even after merging (if any).
The distance can take any value; suppose it is 0.4.
How can I get these?
Image Analyst on 9 Mar 2020
I still have no idea how you're going to pick the first 5 points. Let's say you have 7000 points and there are 1000 points in each of 7 clusters. Now, which 5 of those 7000 would you pick in your step 2?
And once you've picked those initial 5 points, you will check to see how many "near" neighbors each has. Like point 1 may have 20 near neighbors, point 2 may have 3 near neighbors, point 3 may have 6 near neighbors, point 4 may have 250 near neighbors, and point 5 may have 2 near neighbors. So points 1, 3, and 4 have more than 5 near neighbors and go into "class 1" while points 2 and 5 have fewer than 5 near neighbors and so they go into class 2. Class 2 has 6998 points - all except the two points that have at least 5 near neighbors. Is that correct?
sreelekshmi ms on 9 Mar 2020
Group the data based on density (dense areas are grouped). Suppose we get 6 dense areas; we take them one by one. From the 1st dense area, we choose the 5 points that are maximally far apart, and likewise for the others. Then we get 6*5=30 points. Take the 30 points and check again for the nearest points (the nearest elements that satisfy a minimum user-defined threshold).
Merging: if a cluster group X contains 4 elements, it needs only 1 more element to make a group; if a cluster group Y contains 1 element, then merge these clusters (only when the X and Y clusters are near).
Class distribution:
At least 5 elements: class 1.
6-20: class 2.
21-30: class 3.
31-40: class 4.
41-50: class 5.
More than 50: class 6. (System-defined classes can also be chosen.)
How can I do these? Please help me.
Thank you.
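The class distribution listed above could be assigned with `discretize`, as in the rough sketch below. It assumes "at least 5 elements" means exactly 5 for class 1 and that groups with fewer than 5 elements get no class; the `counts` vector is made-up example data, not a result from the glass data set.

```matlab
% Hypothetical neighbor counts for a handful of groups (made-up values).
counts = [3 5 6 20 21 30 31 40 41 50 51 200];

% Bin edges: [5,6) -> 1, [6,21) -> 2, [21,31) -> 3, [31,41) -> 4,
% [41,51) -> 5, [51,Inf] -> 6. Counts below 5 fall outside and become NaN.
edges = [5 6 21 31 41 51 Inf];
classes = discretize(counts, edges);

% Groups without a class (fewer than 5 elements) are candidates for removal.
toRemove = isnan(classes);
```

If class 1 is meant to cover a different range, only the first entries of `edges` need to change.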
Image Analyst on 9 Mar 2020
So when you're picking the 5 from each class "that are maximum far apart", how do you define that? Do you look at each point in the class and
  1. find the distance to the nearest other point, or
  2. find the average distance from every other point in the class, or
  3. find the average distance to a certain number of points, like the average distance to the 8 closest other points?
Are you using any of those definitions of maximum? Or some other definition?
sreelekshmi ms on 10 Mar 2020
2. Find the average distance from every other point in the class.
Then take 5 elements from each class and continue with step 3 and the others.
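Under that definition (option 2), step 2 could be sketched as follows: for each cluster, compute every point's mean distance to the other points in the same cluster and take the 5 largest. The synthetic `data`, the hand-made `idx`, and the names `nPick`, `P`, and `Plabel` are assumptions for illustration; in the thread, `idx` would come from `dbscan`.

```matlab
% Synthetic stand-ins (assumptions): two blobs with hand-made labels.
rng(2);
data = [0.1*randn(40, 2); 0.1*randn(60, 2) + 2];
idx  = [ones(40, 1); 2*ones(60, 1)];
nPick = 5;                           % Points to pick per cluster

labels = unique(idx(idx > 0));       % Skip dbscan noise (label -1)
P = zeros(0, 2);                     % Selected points
Plabel = zeros(0, 1);                % Cluster label of each selected point
for c = labels.'
    pts = data(idx == c, :);
    n = size(pts, 1);
    % Pairwise Euclidean distances within the cluster (n-by-n).
    D = sqrt((pts(:,1) - pts(:,1).').^2 + (pts(:,2) - pts(:,2).').^2);
    % Mean distance to the other points; the self-distance is 0, so divide by n-1.
    meanDist = sum(D, 2) / max(n - 1, 1);
    % Indices of the largest mean distances (requires R2017b+ for maxk).
    [~, order] = maxk(meanDist, min(nPick, n));
    P = [P; pts(order, :)];                          %#ok<AGROW>
    Plabel = [Plabel; repmat(c, numel(order), 1)];   %#ok<AGROW>
end
```

Note that this criterion tends to select points on the edge of a cluster, and the selected points are not guaranteed to be far from one another.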
sreelekshmi ms on 10 Mar 2020
How can I do this using the synthetic dataset?
sreelekshmi ms on 10 Mar 2020
At least for the glass data set, how can I apply the steps I described above? Please help me.
sreelekshmi ms on 11 Mar 2020
Is there any way to divide the data based on dense areas? If so, please help me.


Asked: 8 Mar 2020

Commented: 11 Mar 2020
