フィルターのクリア

Find double repetitions in a (sorted) array.

1 回表示 (過去 30 日間)
bbb_bbb
bbb_bbb 2017 年 10 月 20 日
コメント済み: Andrei Bobrov 2017 年 10 月 23 日
Given an array submitted in a form of struct field, containing integer numbers. For convenience, let's assume that the numbers are already sorted in ascending order:
>> s.x
ans =
2
ans =
2
ans =
5
ans =
5
ans =
5
ans =
8
ans =
8
Find indexes of elements, which occur exact 2 times:
ind =
1 2 6 7
  4 件のコメント
Jan
Jan 2017 年 10 月 21 日
編集済み: Jan 2017 年 10 月 21 日
The iterative growing of arrays is a standard mistake from the view point of efficiency. Simply pre-allocate:
d = diff(x);
j = 0;
ind = zeros(1, numel(d));
indi = 1;
for i=1:numel(d)
if d(i)==0
j=j+1;
else
if j==1
ind(indi) = i-1;
ind(indi+1) = i;
indi = indi + 2;
end
j=0;
end
end
ind = ind(1:indi-1);
This does not catch the case, if the last two elements are equal.
bbb_bbb
bbb_bbb 2017 年 10 月 21 日
This does not catch the case, if the last two elements are equal.
Adding this line repairs this:
if d(end)==0, d(end+1)=1; end
This variant seems to be the fastest.

サインインしてコメントする。

採用された回答

Andrei Bobrov
Andrei Bobrov 2017 年 10 月 20 日
編集済み: Andrei Bobrov 2017 年 10 月 23 日
x= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
[~,~,g] = unique(x); % OR for last versions of MATLAB: g = findgroups(x)
c = accumarray(g,1:numel(x),[],@(x){x});
out = cell2mat(c(cellfun(@numel,c) == 2));
or
[a,~,g] = unique(x);
out = find(ismember(x,a(accumarray(g,1) == 2)));
or (FIXED)
out = reshape(strfind([1,diff(x(:)')~=0,1],[1 0 1]) + [0;1],[],1);
out = reshape(bsxfun(@plus,strfind([1,diff(x(:)')~=0,1],[1 0 1]),[0;1]),[],1); % for old MATLAB
  13 件のコメント
bbb_bbb
bbb_bbb 2017 年 10 月 23 日
This is ok. I sligtly modified it for speed. The fastest and concisiest algorithm of all suggested!
out = strfind([true,diff(x')~=0,true],[1 0 1]);
out = reshape([out;out+1],1,[]);
Andrei Bobrov
Andrei Bobrov 2017 年 10 月 23 日
Thank you mister Bbb_bbb.

サインインしてコメントする。

その他の回答 (4 件)

Rik
Rik 2017 年 10 月 20 日
編集済み: Rik 2017 年 10 月 20 日
It always pays off to get rid of loops and/or pre-allocating your output.
x= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
s = struct('x', num2cell(x));
x=[s.x];
%only newer releases: 0.000778 seconds
tic
count=histcounts(x,0.5 : max(x)+0.5);
ind=find(sum(x==find(count==2)'));
toc
%should work on most releases: 0.000628 seconds
tic
count=histcounts(x,0.5 : max(x)+0.5);
count=find(count==2);
ind=find(sum(repmat(x,length(count),1)==repmat(count',1,length(x))));
toc
%your loop: 0.001100 seconds
tic
d=diff(x); j=0; ind=[];
for i=1:numel(d)
if d(i)==0
j=j+1;
else
if j==1
ind(end+1)=i-1;
ind(end+1)=i;
end
j=0;
end
end
toc
  8 件のコメント
Rik
Rik 2017 年 10 月 21 日
That's because x has a different shape:
x1= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
s = struct('x', num2cell(x1));
x2=[s.x];
x1 is 11x1 and x2 is 1x11
bbb_bbb
bbb_bbb 2017 年 10 月 21 日
This variant isn't working at big arrays:
x=randi([1,1e6],1e5,1); x=sort(x)';
count=histcounts(x,0.5 : max(x)+0.5);
count=find(count==2);
ind=find(sum(repmat(x,length(count),1)==repmat(count',1,length(x))));
Error using repmat
Maximum variable size allowed by the program is exceeded.

サインインしてコメントする。


Image Analyst
Image Analyst 2017 年 10 月 20 日
You didn't tag it as homework. Is it? This will do it:
% Assignment of a struct with a field containing integer numbers
x= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
s = struct('x', num2cell(x));
numbers = [s.x]
[groupNumber, groupValue] = findgroups(numbers)
counts = histcounts(groupNumber)
ofGroupSize2 = find(counts == 2) % Find those only if they have a length of 2.
values = groupValue(ofGroupSize2)
indexes = find(ismember(numbers, values))
  2 件のコメント
bbb_bbb
bbb_bbb 2017 年 10 月 20 日
編集済み: bbb_bbb 2017 年 10 月 21 日
No, its no homework - so called "just-for-fun project".
[groupNumber, groupValue] = findgroups(numbers)
Undefined function or variable 'findgroups'.
Matlab 2015a
Image Analyst
Image Analyst 2017 年 10 月 20 日
編集済み: Image Analyst 2017 年 10 月 21 日
You can use regionprops() instead of findgroups() if you have an old version and have the Image Processing Toolbox. See my separate answer with demo code.

サインインしてコメントする。


Jan
Jan 2017 年 10 月 21 日
Your code looks like the input is sorted. The other approaches do not have this limitation. If it is really sorted:
d = [true; diff(x) ~= 0]; % TRUE if values change
b = x(d); % Elements without repetitions
k = find([d', true]); % Indices of changes
n = diff(k);
is2 = find(n==2);
ind4 = reshape([k(is2); k(is2)+1], 1, []);
Code taken from FEX: RunLength.

Image Analyst
Image Analyst 2017 年 10 月 21 日
編集済み: Image Analyst 2017 年 10 月 21 日
If you have the Image Processing Toolbox, you can use regionprops():
% Assignment of a struct with a field containing integer numbers
x= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
s = struct('x', num2cell(x));
numbers = [s.x] % A labeled "image"
% Find lengths of each run of numbers plus the indexes where they occur.
props = regionprops(numbers, 'Area', 'PixelIdxList')
% Extract from structure into one vector.
allLengths = [props.Area]
% Find those only if they have a length of 2.
ofGroupSize2 = find(allLengths == 2)
% Find indexes of those runs with length 2.
indexes = [props(ofGroupSize2).PixelIdxList]
% Shape into row vector
indexes = reshape(sort(indexes(:)), 1, [])
  1 件のコメント
bbb_bbb
bbb_bbb 2017 年 10 月 21 日
This works, but the variant is the longest (6.2 sec on 1e6 elements vector). The fastest is 0.03 sec.

サインインしてコメントする。

カテゴリ

Help Center および File ExchangeMatrix Indexing についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by