How to find a chunk of a certain number of zeros inside a vector

Hi all,
I have a vector of ones and zeros randomly distributed.
e.g.: A = [0;1;1;0;0;0;0;1;1;1;1;0;1;]
What I want is to find the location of the first zero of the first chunk with 4 OR MORE zeros appearing in the vector.
In this example the result would be:
pos = 4;
The size of the group of zeros doesn't have to be necessarily 4, this was just an example.
I cannot find a simple way to do this, but most probably there's a command for this kind of operation that I cannot recall.
Many thanks in advance,
Pedro Cavaco

Accepted Answer

David Young on 21 Jun 2011

0 votes

A = [0;1;1;0;0;0;0;1;1;1;1;0;1;]
n = 4;
To find the first group of 4 or more zeros:
p = regexp(char(A.'), char(zeros(1, n)), 'once')
To find the first group of exactly 4 zeros:
zz = char(zeros(1,n));
p = regexp(char(A.'), ['(?<=^|' char(1) ')' zz '(' char(1) '|$)'], 'once')

5 Comments

Pedro Cavaco on 21 Jun 2011
It gives p=1 where the result should be p=4.
David Young on 21 Jun 2011
It gives 4 when I run it!
Pedro Cavaco on 21 Jun 2011
Maybe I didn't explain it well.
Your last answer works fine, but it looks for a group of zeros whose size is exactly 'n'.
I only need to find the first group of zeros whose size is greater than or equal to 'n', not exactly 'n'.
BTW, what is all the ?<=^|' for?
It's a very fancy solution ;)
David Young on 21 Jun 2011
There are two solutions in this answer. The first of them works for the case of n or more zeros. The ?<= is a lookbehind operator to ensure that the match is at the start of the group of zeros - there is a requirement that the character before the zeros is char(1) or the start of the string. See doc regexp and follow the link to "Regular Expressions" for more details. You don't need this for the simple solution which finds groups of 4 or more zeros.
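To make the lookaround idea concrete, here is a minimal sketch; it is not David's exact pattern. It assumes the vector is first mapped to the printable characters '0' and '1' (an assumption made here for readability, whereas the answer above matches char(0) and char(1) directly). A negative lookbehind and a negative lookahead then require that the matched zeros are neither preceded nor followed by another zero. The test vector and the variable names are made up for illustration.

```matlab
% Hypothetical test vector: a run of 5 zeros (starting at 3), then a run of 4 (starting at 9).
A = [0;1;0;0;0;0;0;1;0;0;0;0;1];
n = 4;

% Assumption: map 0/1 to the characters '0'/'1' so the pattern is readable.
s = char(A(:).' + '0');                      % '0100000100001'

% Exactly n zeros: no '0' immediately before (lookbehind) or after (lookahead).
patExact = sprintf('(?<!0)0{%d}(?!0)', n);
pExact = regexp(s, patExact, 'once');        % 9

% n or more zeros: no lookaround needed, the first n zeros of the run match.
pAtLeast = regexp(s, sprintf('0{%d}', n), 'once');   % 3
```

The run of five zeros is skipped by the exact pattern because the lookahead sees a fifth zero, which is why the two results differ.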
David Young on 21 Jun 2011
By the way, in the case of n or more zeros, it's not obvious whether to use my first answer, with regexp, or Andrei's answer, with strfind. For very long strings, it may be faster to use regexp because of its 'once' option; however, strfind is simpler and will have a lower overhead.
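As a rough side-by-side of the two options for "n or more zeros", the sketch below runs both on the vector from the question. It assumes the data is converted to the characters '0' and '1' first, which is a choice made here for readability and not what either answer does; the variable names are illustrative only.

```matlab
A = [0;1;1;0;0;0;0;1;1;1;1;0;1];
n = 4;
s = char(A(:).' + '0');                          % '0110000111101'

% regexp with 'once' returns only the first match.
pFirstRegexp = regexp(s, sprintf('0{%d}', n), 'once');

% strfind returns the start of every window of n zeros; keep the first one.
allStarts = strfind(s, repmat('0', 1, n));
if isempty(allStarts)
    pFirstStrfind = [];                          % no run of n zeros found
else
    pFirstStrfind = allStarts(1);
end
```

Both variables end up as 4 here. Which one is faster on a given input is best measured on that input, since it depends on how early the first run appears.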


More Answers (3)

Andrei Bobrov on 21 Jun 2011

0 votes

EDIT
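A1 = A(:)';
out = strfind([1 A1],[1 0]); % start of every group of zeros (no -1 needed: the padded index of the 1 equals the index of the first zero in A1)
strfind([A1 1],[0 0 1]); % last two zeros of every group of two or more zeros
...
strfind([A1 1],[zeros(1,4) 1]); % last 4 zeros of every group of 4 or more zeros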

6 Comments

Titus Edelhofer on 21 Jun 2011
Nice! I guess, you mean something like strfind(A1, zeros(1,n)) where n=4 was asked?
Titus
Pedro Cavaco on 21 Jun 2011
I guess this finds the first zero in A but doesn't assure that you have 3 more zeros in front of it.
David Young on 21 Jun 2011
Even with Titus Edelhofer's correction, this still finds all occurrences, not just the first.
Andrei Bobrov on 21 Jun 2011
@Titus Edelhofer. strfind([1 A1 1],[zeros(1,4) 1])-1
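The padding trick can be pushed a little further. The sketch below is one possible reading, not Andrei's code: it pads with a 1 on the left to get the start of every run of at least n zeros, and with a 1 on both sides to get runs of exactly n zeros. It relies on strfind accepting numeric row vectors in MATLAB, as the code in this answer already does; the test vector and variable names are made up for illustration.

```matlab
% Hypothetical test vector: a run of 5 zeros (starting at 3), then a run of 4 (starting at 9).
A = [0;1;0;0;0;0;0;1;0;0;0;0;1];
n = 4;
A1 = A(:).';

% A leading 1 marks where a run begins, so each run is reported once, at its start.
% The padded index of that 1 equals the index of the first zero in A1.
startsAtLeast = strfind([1 A1], [1 zeros(1, n)]);      % [3 9]

% A 1 on both sides of the pattern pins the run length to exactly n.
startsExact = strfind([1 A1 1], [1 zeros(1, n) 1]);    % 9

% The question only asks for the first run of n or more zeros.
if isempty(startsAtLeast)
    pos = [];
else
    pos = startsAtLeast(1);                            % 3
end
```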
Andrei Bobrov on 21 Jun 2011
Speed comparison:
>> A = +(rand(10000,1)<.2);
tic, zz = char(zeros(1,4));
p = regexp(char(A(:).'), ['(?<=^|' char(1) ')' zz '(' char(1) '|$)'], 'once'); toc
Elapsed time is 0.002538 seconds.
>> tic, A1 = A(:)';strfind([A1 1],[zeros(1,4) 1]);toc
Elapsed time is 0.000652 seconds.
Andrei Bobrov on 21 Jun 2011
It's Matt Fig's idea.


Gerd on 21 Jun 2011

0 votes

Hi Pedro,
Just programming it straightforwardly, I would use
A = [0;1;1;0;0;0;0;1;1;1;1;0;1;];
cons = 4;
indices = find(A==0);
for ii=1:numel(indices)-cons+1 % +1 so that the last possible group is checked too
    if (indices(ii+1)-indices(ii) == 1) && (indices(ii+2)-indices(ii+1)==1) && (indices(ii+3)-indices(ii+2)==1)
        disp(indices(ii));
    end
end
The result is 4.
Gerd

3 Comments

David Young on 21 Jun 2011
You could put a "break" in the conditional to make this more efficient, since only the first occurrence is required. Also, it's probably more useful to assign the result to a variable rather than calling disp.
Pedro Cavaco on 21 Jun 2011
Gerd, in your solution you have this big IF condition. Will it still work if 'cons' becomes say 100 or would you have to include more && (indices(ii+4)-indices(ii+3)==1) .... ?
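One way the loop could be made independent of the group size is sketched below. It is a variation on Gerd's idea, not his code: since the zero indices are strictly increasing, 'cons' of them are consecutive exactly when the last one is cons-1 larger than the first, so a single comparison replaces the chain of && tests. The variable pos is a name introduced here; the break and the assignment follow David's suggestions above.

```matlab
A = [0;1;1;0;0;0;0;1;1;1;1;0;1];
cons = 4;                          % required run length; works for any positive integer
indices = find(A == 0);            % positions of all zeros, in increasing order

pos = [];                          % stays empty if no run of 'cons' zeros exists
for ii = 1:numel(indices) - cons + 1
    % indices(ii) ... indices(ii+cons-1) are consecutive iff they span exactly cons-1.
    if indices(ii + cons - 1) - indices(ii) == cons - 1
        pos = indices(ii);         % first zero of the first qualifying run
        break                      % only the first occurrence is needed
    end
end
```

For the vector above this leaves pos equal to 4, and setting cons to 100 needs no further changes to the condition.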
Gerd on 21 Jun 2011
Hi Pedro,
I tried both solutions (David's and mine) in a .m file.
Please have a look at the result.
tic;
A = [0;1;1;0;0;0;0;1;1;1;1;0;1;];
cons = 4;
indices = find(A==0);
for ii=1:numel(indices)-cons+1 % +1 so that the last possible group is checked too
    if (indices(ii+1)-indices(ii) == 1) && (indices(ii+2)-indices(ii+1)==1) && (indices(ii+3)-indices(ii+2)==1)
        disp(indices(ii));
    end
end
t1 = toc;
tic;
A = [0;1;1;0;0;0;0;1;1;1;1;0;1;];
n = 4;
p = regexp(char(A.'), char(zeros(1, n)), 'once');
disp(p);
t2 = toc;
With your test vector, both are really fast.


David Young on 21 Jun 2011

0 votes

Another approach to finding the first group of 4 or more zeros:
A = [0;1;1;1;0;0;0;0;1;1;1;1;0;1;0;0;0;1];
n = 4;
c = cumsum(A);
pad = zeros(n, 1)-1;
ppp = find([c; pad] == [pad; c]) - (n-1);
p = ppp(1)
EDIT Code corrected - n replaced by (n-1) to give correct offset.
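The reasoning behind the cumulative sum is that it does not change across a stretch of zeros, so two entries n apart are equal exactly when the n elements between them are all zero. The sketch below is a variant of the code above, not the code itself: it prepends a 0 to the cumulative sum (an assumption made here so that a run starting at the very first element is also detected) and asks find for only the first hit, which yields an empty result when no such run exists. The name c0 is introduced here for illustration.

```matlab
A = [0;1;1;1;0;0;0;0;1;1;1;1;0;1;0;0;0;1];
n = 4;

% c0(j) is the number of ones strictly before position j.
c0 = [0; cumsum(A(:))];

% c0(j+n) == c0(j) means A(j:j+n-1) contains no ones, i.e. n zeros start at j.
p = find(c0(n+1:end) == c0(1:end-n), 1);     % 5; empty if there is no such run
```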

3 Comments

Pedro Cavaco on 21 Jun 2011
Nice, this solution works very well, David. Thank you very much!!!
Just a small correction though:
It was finding the position immediately before the first zero, so I simply did:
ppp = find([c; pad] == [pad; c]) - (n-1);
And then it works fine.
Pedro Cavaco on 21 Jun 2011
But like you said in your first answer, it is much faster with regexp!!! :D
David Young on 21 Jun 2011
Sorry, you are right!

