Removing neighbours that are too close to each other in a vector

Rafael Marques on 19 Jan 2023
Answered: Stephen23 on 19 Jan 2023
Hello,
I am trying to solve a problem in which I have, for example, the following vector:
x = [1 8 9 10 15 20 22 25 34]
And I want the output to be like this:
x = [1 9 15 22 34]
So basically, I want to find which values are too close to each other and remove them, while keeping the "middle" one (the one closest to the average of the points removed).
I've tried with
uniquetol(x, 5, 'DataScale', 1)
But uniquetol() only lets me select the highest or lowest of the values removed. Any ideas or help are welcome.
  3 Comments
DGM on 19 Jan 2023
Since you're operating on multiple elements simultaneously, it's unclear what should happen when you have a run of similar values, as your filter may both accept and reject an element in succession.
x = [1 2 3 4];
Assuming a 3-element window:
At step 1, you reject 1 and 3, and you accept 2
At step 2, you reject 2 and 4, and accept 3.
Is this merely determined by the filter direction?
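The conflict described in this comment can be reproduced with a short sketch. It assumes a 3-element sliding window that keeps the element closest to the window mean; the names `w` and `kept` are illustrative and do not come from the thread.

```matlab
x = [1 2 3 4];
kept = zeros(1, numel(x) - 2);          % one choice per window position
for k = 2:numel(x) - 1
    w = x(k-1:k+1);                     % 3-element window centred on x(k)
    [~, i] = min(abs(w - mean(w)));     % index of the element closest to the window mean
    kept(k-1) = w(i);
    fprintf('window [%d %d %d] keeps %d\n', w, w(i));
end
```

The first window keeps 2 and rejects 3, while the second keeps 3 and rejects 2, so the final result depends on how the overlapping decisions are combined.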
Rafael Marques on 19 Jan 2023
First of all, thank you for your answers!
The vector is designed to never have ties.
And assuming:
x = [1 2 3 4];
The output should be x = [2]: since 2 and 3 are equally close to the average, we pick the smaller of the two. I'm sorry I wasn't clear in my question.
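One possible reading of the rule stated here can be sketched as follows. The assumptions are mine, not the thread's: the input is sorted, two neighbours are "too close" when their gap is smaller than `tol`, and close neighbours chain into one group. The names `tol`, `g`, and `z` are illustrative.

```matlab
x   = [1 8 9 10 15 20 22 25 34];    % sorted input from the question
tol = 5;                            % assumption: a gap smaller than tol means "too close"
g   = cumsum([1, diff(x) >= tol]);  % start a new group wherever the gap is at least tol
z   = zeros(1, g(end));
for k = 1:g(end)
    v = x(g == k);                  % members of group k, in ascending order
    [~, i] = min(abs(v - mean(v))); % min returns the first minimum, so a tie goes to the smaller value
    z(k) = v(i);
end
disp(z)                             % 1 9 15 22 34
```

With x = [1 2 3 4] the same code forms a single group with mean 2.5 and returns 2, which matches the tie-breaking rule described in this comment. A different definition of "too close" (for example, a gap of exactly 5 also counting as close) would merge more values and change the result.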


Accepted Answer

Stephen23 on 19 Jan 2023
x = [1,8,9,10,15,20,22,25,34]
x = 1×9
1 8 9 10 15 20 22 25 34
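% third output of uniquetol: for each element of x, the index of its group
[~,~,y] = uniquetol(x, 5, 'DataScale',1);
% from each group, keep the element at the middle position
z = accumarray(y(:),x(:),[],@(v) v(ceil(end/2)))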
z = 5×1
1 9 15 22 34
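Note that the anonymous function passed to accumarray selects by position within each group, not by distance to the group mean. A small sketch of that selector on hand-picked vectors (the vectors and the names `mid`, `a`, `b`, `c` are illustrative):

```matlab
mid = @(v) v(ceil(end/2));     % same selector as in the answer above
a = mid([8 9 10]);             % odd length: the centre element, 9
b = mid([1 2 3 4]);            % even length: the lower of the two centre elements, 2
c = mid([1 2 3 20]);           % returns 2, although mean([1 2 3 20]) = 6.5 is closest to 3
```

For evenly spread groups such as those in the question the two criteria agree, but for a skewed group the positional middle and the value closest to the mean can differ.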

More Answers (2)

DGM on 19 Jan 2023
Edited: DGM on 19 Jan 2023
Here's my guess. The behavior is dependent on whether you can actively accept or reject entries. If it's assumed that values can only be rejected, then:
x = [1 8 9 10 15 20 22 25 34];
%x = [1 9 15 22 34];
wradius = 1; % window half-width (1 gives a 1x3 window)
badrange = 5; % a window spanning this much or less is "too close"
goodvalues = true(size(x));
for k = 1+wradius:numel(x)-wradius
    sidx = k-wradius:k+wradius; % indices covered by the window
    sample = x(sidx);
    if max(sample)-min(sample) <= badrange
        av = mean(sample);
        [~,idx] = min(abs(sample-av)); % first element closest to the window mean
        mask = false(size(sidx)); % sized to the window, so other wradius values work
        mask(idx) = true;
        goodvalues(sidx) = mask & goodvalues(sidx); % values can only be rejected, never re-accepted
    end
end
output = x(goodvalues)
output = 1×5
1 9 15 22 34
I'm only picking the first value that's "closest" to the mean. If there are ties, it's going to keep the leftmost match.
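As a check on the reject-only assumption, here is a sketch that runs the same loop on the x = [1 2 3 4] case raised in the comments. It reuses the variable names from the code above and changes only the input.

```matlab
x = [1 2 3 4];
wradius = 1;                     % 1x3 window
badrange = 5;
goodvalues = true(size(x));
for k = 1+wradius:numel(x)-wradius
    sidx = k-wradius:k+wradius;
    sample = x(sidx);
    if max(sample)-min(sample) <= badrange
        [~,idx] = min(abs(sample-mean(sample)));
        mask = false(size(sidx));
        mask(idx) = true;
        goodvalues(sidx) = mask & goodvalues(sidx);   % reject-only update
    end
end
output = x(goodvalues);          % empty: every element has been rejected
```

The first window keeps only 2 and the second keeps only 3, and because a rejected value is never re-accepted, nothing survives. The asker's stated expectation for this input was [2], so the reject-only rule matches the original example but not this one.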

Image Analyst on 19 Jan 2023
Sounds like you want to use the dbscan clustering algorithm.
In the diagram from Wikipedia, the red and yellow points are all in one cluster because each of them is within a certain distance of at least one other point in the cluster. Point N is not in that cluster. It's in its own cluster because it is not within the specified distance of any of the other points.
I'm attaching an example as applied to images but you could apply it to a 1-D numberline like you have. Once you find the clusters, replace each cluster with the mean of the cluster and then remove duplicates.
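A minimal sketch of that idea on the 1-D data from the question, assuming the Statistics and Machine Learning Toolbox is available. The values of `epsilon` and `minpts` are my assumptions and are not given in the thread.

```matlab
% Requires Statistics and Machine Learning Toolbox (dbscan).
x = [1 8 9 10 15 20 22 25 34];
epsilon = 4;      % assumed neighbourhood radius; not specified in the thread
minpts  = 1;      % a single point may form its own cluster, so nothing is labelled noise
idx = dbscan(x(:), epsilon, minpts);    % cluster label for each element of x
m = accumarray(idx, x(:), [], @mean);   % mean of each cluster, as suggested above
```

Cluster means need not be members of x. For instance, if 20, 22, and 25 end up in one cluster, their mean is about 22.33, so a further step that picks the nearest member would be needed to reproduce the output requested in the question.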
dbscan is one of the best clustering algorithms and one you should learn about even if you don't see how it can apply here.
