How could I search a text for varying patterns?

Joey Cavale, 23 May 2018
Hello, I am trying to search through an Excel file of model names and pull out only the option codes.
For example, the model names in the Excel sheet look like: '1010AB-50K-TB' '1010AB-25K-B' '1010AB-100K'
and I want to create a new array with just the option codes (in this case AB), so it would look like this:
'AB' 'AB' 'AB'
and so on. The option codes are two or three letters long, and each letter can be anything from A to Z (AB, AE, AF, AAA, ABD). The only consistency in the model names is the four digits before the option code and the dash after it. What do y'all think is the best way to do this?
4 Comments
Joey Cavale, 23 May 2018
Here are some examples
Paolo, 23 May 2018
I have edited my regular expression; the first letter can now also range from A to Z. Check out my answer below.

Accepted Answer

Paolo, 23 May 2018 (edited 23 May 2018)
For a pattern of two to three letters, each ranging from A to Z and followed by a dash, you can use the following code:
expression = '([A-Z][A-Z]?[A-Z](?=-))';
model_names = {'A1010ACZ-50K-AAA' '1010AB-25K-B' '1010AT-100K'};
[tokens] = regexp(cell2mat(model_names),expression,'tokens');
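The expression has three elements, each matching one letter from A to Z (the [A-Z]). The second element is optional (denoted by the question mark), so the pattern matches either two or three letters. The lookahead (?=-) requires a dash directly after the letters but does not include it in the tokens. Note that cell2mat joins the model names into one character vector before the search.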
The code returns:
tokens{:} = {'ACZ'},{'AB'},{'AT'}
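To see what the lookahead changes, here is a minimal sketch on a single model name from the question. The quantifier form [A-Z]{2,3} is shorthand assumed here for the same two-to-three-letter pattern, and the variable names are only illustrative.

```matlab
% One model name taken from the question.
name = '1010AB-50K-TB';

% Without a lookahead, the dash is consumed and becomes part of the match.
withDash = regexp(name, '[A-Z]{2,3}-', 'match', 'once');

% With the lookahead (?=-), the dash must follow but is not part of the match.
codeOnly = regexp(name, '[A-Z]{2,3}(?=-)', 'match', 'once');

disp(withDash)   % AB-
disp(codeOnly)   % AB
```

The trailing 'TB' is not picked up in either case because no dash follows it.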
5 Comments
Paolo, 23 May 2018
Are you working with each model number individually? If so, you can either use:
expression = '([A-Z][A-Z]?[A-Z](?=-)).*'
or specify the 'once' option in the function call:
[tokens] = regexp(model_names,expression,'tokens','once');
Or are you working with a cell array?
Joey Cavale, 23 May 2018
Ahhh... the 'once' option in the regexp function works wonders, thank you!
expression = '([A-Z][A-Z]?[A-Z](?=-))';
[txt] = regexp(MODEL, expression, 'match','once');
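For anyone applying this to a whole column at once, the following is a rough sketch of how 'match' with 'once' behaves on a cell array. The MODEL contents below are made up for illustration (the last entry deliberately has no dash), and hasCode is a hypothetical helper variable.

```matlab
% Hypothetical column of model names; the last entry has no dash.
MODEL = {'1010AB-50K-TB'; '1010AAA-25K-B'; '1010AF-100K'; '1010'};
expression = '([A-Z][A-Z]?[A-Z](?=-))';

% With 'once', each cell holds a char vector (empty where nothing matched)
% instead of a nested cell array of matches.
txt = regexp(MODEL, expression, 'match', 'once');

% Flag the entries that produced an option code and keep only those.
hasCode = ~cellfun(@isempty, txt);
codes = txt(hasCode);
```

Entries without a match come back empty, so no explicit loop or contains check is needed to skip them.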

More Answers (1)

jonas, 23 May 2018 (edited 23 May 2018)
Not sure if it's the best way, but the option code starts at str(5) and ends just before the first dash, so this should work:
[~,s] = regexp(str, '-', 'match')
out=str(5:s(1)-1)
2 Comments
Joey Cavale, 23 May 2018
MODEL = TXT(:,1); % Model names
for i = 7:length(MODEL)
    if ~contains(MODEL{i,1},'-')
        continue % Skip names that do not contain a dash
    end
    str = MODEL{i,1};
    [~,s] = regexp(str,'-','match'); % s: positions of the dashes
    out{i,1} = str(5:s(1)-1); % Characters 5 up to the first dash
end
Got this to work. I had to make it a little janky, but it ended up getting the job done. Thanks for the help!
Paolo, 23 May 2018
Unsure as to why you selected this answer. The code is limited in application and will not work if the model names change format for some reason. For example, a change from
1010AB-50K-TB
to
10105AB-50K-TB
will result in an incorrect output:
5AB
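The difference can be checked with a short sketch. It uses strfind to locate the dash rather than the regexp call from the answer, and the [A-Z]{2,3} quantifier as shorthand for the two-to-three-letter pattern; both are substitutions made here for brevity.

```matlab
% Model name with five leading digits instead of four.
str = '10105AB-50K-TB';

% Fixed start index: take characters 5 up to just before the first dash.
dashIdx = strfind(str, '-');
byIndex = str(5:dashIdx(1)-1);

% Pattern based: two or three letters directly followed by a dash.
byRegex = regexp(str, '[A-Z]{2,3}(?=-)', 'match', 'once');

disp(byIndex)   % 5AB
disp(byRegex)   % AB
```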
