Clustering of data in a 1-D vector

paganelle, 28 Oct 2020
Commented: paganelle, 28 Oct 2020
I have a large logical vector that looks like V = [0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 ..............]
I need to find the position of each group of 1s (let's say the center of each group), but if two groups of 1s are too close to each other (say, fewer than 3 zeros in between), I need to treat them as a single group. That is, at the first stage I need to find the groups (bold-underlined elements) and then find the center element of each group (a shift of +/-1 element does not matter).
1st stage (clusterization): [0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 ..............]
2nd stage (find a center of each cluster): [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 ..............]
The way I have implemented it now is as follows: I smooth the entire vector (it is a couple of million elements). The span is chosen to equal the maximum expected length of a group, and then I look for local maxima (islocalmax) with 'MinSeparation' set to the minimum distance between groups. It works, but it is really slow (I have 360x180 = 64800 vectors; yes, it is a LAT/LONG grid with ~10M elements in each vector).
Is there any way to speed this up? I believe there should be some "textbook" examples of it!

Accepted Answer

Adam Danz, 28 Oct 2020
Edited: Adam Danz, 28 Oct 2020
There are lots of alternatives.
  • Input A is a vector of 1s and 0s.
  • n is the minimum number of 0s needed to separate two groups of 1s.
  • T is a table showing the start index, stop index, and length of each group of 1s, where groups split by fewer than n zeros are joined into one.
A = [0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 1 1 1 1];
% Length of each group of consecutive 1s
T = table();
T.OnesLength = diff(find([0;A(:);0]==0))-1;
T(T.OnesLength==0,:) = [];
% Index of 1st '1' in each group of consecutive 1s
T.OnesStart = find(diff([0;A(:)])==1);
% Index of last '1' in each group of consecutive 1s
T.OnesStop = T.OnesStart + T.OnesLength - 1;
% Determine the number of 0s between consecutive 1s
ZerosBetween = [T.OnesStart(2:end) - T.OnesStop(1:end-1); NaN]-1;
disp(T)
    OnesLength    OnesStart    OnesStop
    __________    _________    ________
         3             4            6
         3             9           11
         6            18           23
         2            29           30
         1            32           32
         2            34           35
         1            37           37
         4            42           45
% join groups of consecutive 1s with less than n zeros between.
n = 3;
joinGroups = ZerosBetween < n;
t = find(diff([0;joinGroups])==1);
f = find(diff([0;joinGroups])==-1);
T.remove = false(height(T),1);
for i = 1:numel(t)
    T.OnesStop(t(i)) = T.OnesStop(f(i));
    T.OnesLength(t(i)) = sum(T.OnesLength(t(i):f(i))) + sum(ZerosBetween(t(i):f(i)-1));
    T.remove(t(i)+1:f(i)) = true;
end
T(T.remove,:) = [];
T.remove = [];
disp(T)
    OnesLength    OnesStart    OnesStop
    __________    _________    ________
         8             4           11
         6            18           23
         9            29           37
         4            42           45
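For very long vectors, the joining step can also be written without a loop or a table. The following is only a sketch of that idea, not part of the answer's code: the names runStart, runStop, gap, keep, clusterStart, clusterStop and clusterLength are made up for illustration, and it assumes A contains at least one 1.

```matlab
% Loop-free sketch: join runs of 1s separated by fewer than n zeros.
% Assumes A contains at least one 1 (otherwise the indexing below errors).
A = [0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 1 1 1 1];
n = 3;                                 % gaps shorter than n zeros are bridged

d = diff([0; A(:); 0]);                % +1 where a run starts, -1 just after it ends
runStart = find(d == 1);               % index of the first 1 in each run
runStop  = find(d == -1) - 1;          % index of the last 1 in each run

gap  = runStart(2:end) - runStop(1:end-1) - 1;   % number of zeros between neighbouring runs
keep = gap >= n;                                 % true where a gap really separates two clusters

clusterStart  = runStart([true; keep]);    % first run, plus every run that follows a kept gap
clusterStop   = runStop([keep; true]);     % every run that precedes a kept gap, plus the last run
clusterLength = clusterStop - clusterStart + 1;

disp([clusterLength clusterStart clusterStop])
```

For the example vector this gives the same four groups as the table above (starts 4, 18, 29, 42 and stops 11, 23, 37, 45).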
Now you can use the segment length and the start/stop indices to compute the segment centers.
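As a minimal sketch of that last step, the centers can be taken as the midpoint of each start/stop pair. The start and stop values below are copied from the second table; the variable names center and C, and the choice of floor for even-length groups, are assumptions made here (the question states that a shift of +/-1 element does not matter).

```matlab
% Merged group limits, copied from the second table above
OnesStart = [4; 18; 29; 42];
OnesStop  = [11; 23; 37; 45];

% Center of each group; floor picks the left of the two middle
% elements when a group has an even number of elements
center = floor((OnesStart + OnesStop) / 2);

% Logical marker vector with a single 1 at each center,
% same length as the example vector A (45 elements)
C = false(1, 45);
C(center) = true;

disp(center.')
```

With a table T as built above, the same expression would read floor((T.OnesStart + T.OnesStop)/2).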
  1 Comment
paganelle, 28 Oct 2020
Perfect, thank you!
It is ~5 times faster than the method I used previously.
