Finding the indexes of multiple substrings within a larger string.

Question

Steve, 24 March 2023 (latest comment: Steve, 1 April 2023)

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/1934764-finding-the-indexes-of-multiple-substrings-within-a-larger-string

I'm trying to find the indexes of all two-digit pairs in a very long string of digits, say c. I can easily find all occurrences of one pair at a time, for example strfind(c, '00') … strfind(c, '01'), but I want a way to do this for all one hundred pairs, 00 to 99. I tried this:

x = 0:99;
% Format 0..99 as two-digit numbers separated by spaces
dig = sprintf('%02d ', x);
% Split the text into one cell per pair
dub_dig = strsplit(dig);
% Convert the cell array to a string array
dub_dig_str = string(dub_dig);

How do I get this sequence of strings (dub_dig_str) to work in something like a for loop using the strfind function? When I try this it crashes. I would like to output a matrix of indexes of where each pair occurs, for all pairs.

Thanks
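For reference, the loop described in the question could be sketched roughly as follows. This is only an illustration: the short test string and the variable names pairs and idx are assumptions, not part of the original post. strfind takes a single pattern per call, so the patterns are passed one at a time, and the results go into a cell array because each pair can occur a different number of times.

```matlab
% Illustrative data (assumption: c is a char row vector of digits)
c = '0012300';

% Build the 100 two-digit patterns '00'..'99' as a cell array of char vectors
pairs = arrayfun(@(k) sprintf('%02d', k), 0:99, 'UniformOutput', false);

% One cell per pattern, since the number of matches differs between patterns
idx = cell(numel(pairs), 1);
for k = 1:numel(pairs)
    idx{k} = strfind(c, pairs{k});   % start positions, overlapping matches included
end

idx{1}    % start positions of '00'
idx{13}   % start positions of '12' (the pair with value k-1 is stored in cell k)
```

A rectangular matrix of indexes would need padding (for example with NaN), which is why a cell array is the more natural container here.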


Answer 1 (accepted answer)

Stephen23, 24 March 2023 (edited 24 March 2023)

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/1934764-finding-the-indexes-of-multiple-substrings-within-a-larger-string#answer_1200414

idx = regexp(c,'\d\d')     %   no overlaps
idx = regexp(c,'\d(?=\d)') % with overlaps
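A small sketch of the difference between the two patterns, using a made-up four-digit input (the string and the variable names are assumptions for illustration). In the first pattern each match consumes both digits, so the next search starts after them. In the second, the lookahead (?=\d) only checks that a digit follows without consuming it, so every digit that has a digit after it is reported.

```matlab
c = '1234';                               % illustrative input

% Each match uses up two characters: finds '12' and '34'
idx_no_overlap = regexp(c, '\d\d')

% Lookahead does not consume the second digit: finds '12', '23' and '34'
idx_overlap = regexp(c, '\d(?=\d)')
```

Both calls return the start positions of two-digit runs, but neither says which pair was found at each position.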


Stephen23, 26 March 2023 (edited 27 March 2023)

"My goal is to output a separate row of indexes for each pair of numbers (one hundred total, 00to99), stating where each appears in c."

Aaah, so you actually want to compare the pairs against another set with a specific order, which is what you were achieving with the loop. Here is an alternative approach:

c = char(randi(+'09',1,123)) % random data
c = '636906240866589674219474419874013401492319709264753858931901195001783553643124423974644494528171633587231581331511128165489'
% Character pairs:
[T,U] = meshgrid('0':'9'); % all pairs
P = cellstr([T(:),U(:)])  % all pairs
P = 100×1 cell array
    {'00'}
    {'01'}
    {'02'}
    {'03'}
    {'04'}
    {'05'}
    {'06'}
    {'07'}
    {'08'}
    {'09'}
    {'10'}
    {'11'}
    {'12'}
    {'13'}
    {'14'}
    {'15'}
    {'16'}
    {'17'}
    {'18'}
    {'19'}
    {'20'}
    {'21'}
    {'22'}
    {'23'}
    {'24'}
    {'25'}
    {'26'}
    {'27'}
    {'28'}
    {'29'}
Q = cellstr(c([1:end-1;2:end]).'); % data pairs
% Find indices of data pairs:
[~,X] = ismember(Q,P);
% Place indices into cell array:
Y = (1:numel(Q)).';
Z = accumarray(X,Y,[100,1],@(a){a})
Z = 100×1 cell array
    {[      64]}
    {4×1 double}
    {0×0 double}
    {0×0 double}
    {0×0 double}
    {0×0 double}
    {[       5]}
    {0×0 double}
    {[       9]}
    {[      44]}
    {0×0 double}
    {3×1 double}
    {2×1 double}
    {2×1 double}
    {[      36]}
    {2×1 double}

Checking the indices of '00' and some random pair:

Z{1}
ans = 64
Z{strcmp(P,'23')}
ans = 3×1
    39
    80
   103

You can probably do something similar with table operations. Let's try it now:

D = cell2table(Q, 'VariableNames',"Pair");
D.Index = (1:numel(Q)).';
G = groupsummary(D,"Pair",@(a){a})
G = 69×3 table
     Pair     GroupCount     fun1_Index
    ______    __________    ____________

    {'00'}        1         {[      64]}
    {'01'}        4         {4×1 double}
    {'06'}        1         {[       5]}
    {'08'}        1         {[       9]}
    {'09'}        1         {[      44]}
    {'11'}        3         {3×1 double}
    {'12'}        2         {2×1 double}
    {'13'}        2         {2×1 double}
    {'14'}        1         {[      36]}
    {'15'}        2         {2×1 double}
    {'16'}        2         {2×1 double}
    {'17'}        2         {2×1 double}
    {'19'}        5         {5×1 double}
    {'21'}        1         {[      19]}
    {'23'}        3         {3×1 double}
    {'24'}        2         {2×1 double}

Steve, 1 April 2023

Thank you. This works. I must admit, as a beginner, some of the code looks cryptic (e.g., "@(a){a}"), and the cell array output 'Z' is hard to work with mathematically, but I'm sure it's possible. I'm appreciating the tradeoffs between classic numerical functions and string approaches.
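On the "@(a){a}" part, a minimal sketch with made-up numbers may help (subs, vals and wrap are illustrative names, not from the answer). The expression is an anonymous function that receives all values belonging to one group as a vector a and wraps them in a 1x1 cell, which lets accumarray return groups of different sizes. The sort call is an addition here so that the order inside each cell is predictable.

```matlab
subs = [2; 1; 2; 3; 2];        % group number of each element (illustrative)
vals = [10; 20; 30; 40; 50];   % values to collect per group

% Same idea as @(a){a}: wrap the values of one group in a cell
wrap = @(a) {sort(a)};
Z = accumarray(subs, vals, [4, 1], wrap);

% Numeric summaries are often easier to work with than the cells themselves
counts = cellfun(@numel, Z)    % number of values in each group
```

Group 4 receives no values, so its cell stays empty and its count is zero. Applied to the pair data, the same cellfun call would give the number of occurrences of each pair.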


Answer 2

Walter Roberson, 24 March 2023

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/1934764-finding-the-indexes-of-multiple-substrings-within-a-larger-string#answer_1200434

c = 'a91bb48353'
c = 'a91bb48353'
mask = ismember(c, '0':'9');
odd_pair = find(mask(1:2:end-1) & mask(2:2:end)) * 2 - 1
odd_pair = 1×2
     7     9
even_pair = find(mask(2:2:end-1) & mask(3:2:end)) * 2
even_pair = 1×3
     2     6     8
pair_starts_at = union(odd_pair, even_pair)
pair_starts_at = 1×5
     2     6     7     8     9
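As a hedged aside, the odd/even split above appears to be equivalent to comparing the mask with a copy of itself shifted by one position. The sketch below reuses the same sample string and is offered as an alternative formulation, not as the method of the answer.

```matlab
c = 'a91bb48353';
mask = ismember(c, '0':'9');                        % true where c holds a digit

% A pair starts at k when both c(k) and c(k+1) are digits
pair_starts_at = find(mask(1:end-1) & mask(2:end))
```

For this input it returns the same positions as the union of odd_pair and even_pair.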


Walter Roberson, 26 March 2023

c = char(randi([0 9], 1, 30) + '0')
c = '305452612469209463851343808968'
C = c - '0';
odds = C(1:2:end-1) * 10 + C(2:2:end);
evens = C(2:2:end-1) * 10 + C(3:2:end);
odd_idx = (1:numel(odds)) * 2 - 1;
even_idx = (1:numel(evens)) * 2;
indices = accumarray([odds(:); evens(:)] + 1, [odd_idx(:); even_idx(:)], [], @(locs){locs});
populated = find(~cellfun(@isempty, indices));
[num2cell(populated-1), indices(populated)]
ans = 27×2 cell array
    {[ 5]}    {[       2]}
    {[ 8]}    {[      26]}
    {[ 9]}    {[      14]}
    {[12]}    {[       8]}
    {[13]}    {[      21]}
    {[20]}    {[      13]}
    {[24]}    {[       9]}
    {[26]}    {[       6]}
    {[30]}    {[       1]}
    {[34]}    {[      22]}
    {[38]}    {2×1 double}
    {[43]}    {[      23]}
    {[45]}    {[       4]}
    {[46]}    {2×1 double}
    {[51]}    {[      20]}
    {[52]}    {[       5]}
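A possible variation on the code above, sketched under the assumption that c contains only digit characters: the pair values can be computed for every start position in one step, without separating odd and even positions. Fixing the output size at 100 rows keeps one cell for every pair from 00 to 99, including pairs that never occur. The names codes and starts are illustrative.

```matlab
c = '305452612469209463851343808968';   % the sample string shown above
C = c - '0';                            % digit characters -> numbers 0..9

% Numeric value (0..99) of the pair starting at each position
codes = C(1:end-1) * 10 + C(2:end);
starts = (1:numel(codes)).';

% Row k+1 holds the sorted start positions of the pair with value k
indices = accumarray(codes(:) + 1, starts, [100, 1], @(locs) {sort(locs)});

indices{38 + 1}                         % start positions of the pair '38'
```

For the sample string this lists positions 18 and 24 for '38', in line with the 2×1 entry in the output above.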

Steve, 1 April 2023

Thank you Walter. This method worked for me as well. Cheers
