Need help with regexpi expression for multiple variants of the same phrase

Kellen Krajewski, 14 December 2022
Edited: Stephen23, 14 December 2022
I have a question about using regexpi to determine whether certain phrases appear in a text file. The text files were created by multiple individuals, who use slightly different phrasing for the same variable. For example, in a text file containing a gait evaluation, the same finding may be recorded as either 'slow cadence' or 'slow stepping'. My original code was as follows:
data=fileread('Test.txt');
A=isempty(regexpi(data,{'slow cadence','slow stepping'}));
However, this version can return a false positive, as it seems to mix and match the strings within the {}. For example, the following code for the same file returns '0' from isempty even though neither phrase matches completely:
data=fileread('Test.txt');
A=isempty(regexpi(data,{'fast cadence','slow stepping'}));
I feel like I am missing a simple command to indicate that A can be 'slow cadence' OR 'slow stepping'. Any help is much appreciated.

Answers (2)

Stephen23, 14 December 2022
Edited: Stephen23, 14 December 2022
You will probably also find the 'once' option very useful (here I inverted the logical output, because true = contains is usually much simpler to work with than a head-twisting true = does not contain):
str = fileread('Test.txt');
idx = ~isempty(regexpi(str, 'slow (cadence|stepping)','once'))
idx = logical
1
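As a rough illustration of why the cell-array call in the question never looks empty, and how the alternation avoids that, here is a minimal sketch. The sample sentence is an assumption standing in for the contents of Test.txt, and the variable names are only illustrative.

```matlab
% Inline sample text stands in for the contents of Test.txt (an assumption).
str = 'Patient walks with slow stepping and a narrow base.';

% A cell array of expressions returns one result per expression, so the
% output is itself a 1x2 cell array. ISEMPTY of a 1x2 cell is always false,
% no matter whether any phrase matched.
out = regexpi(str, {'fast cadence','slow stepping'});
looksNonEmpty = ~isempty(out);

% Testing each cell separately gives the real per-phrase answer.
hitPerPhrase = ~cellfun(@isempty, out);

% One expression with alternation: 'slow cadence' OR 'slow stepping'.
expr = 'slow (cadence|stepping)';
hit  = ~isempty(regexpi(str, expr, 'once'));
miss = ~isempty(regexpi('normal gait', expr, 'once'));
```

With a single expression the output is a plain index (or empty), so the ISEMPTY test means what it appears to mean.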
Using regular expressions requires reading the documentation again and again and again and again and again... it takes quite a while to become proficient and comfortable using them. Also, make sure you read the documentation.
You might also find my interactive tool useful for helping to develop regular expressions:
I should also mention that if you want to use regular expressions, then you need to read the documentation. A lot.
PS: Another approach using the newer CONTAINS and patterns:
pat = regexpPattern('slow (cadence|stepping)');
idx = contains(str,pat, 'ignorecase',true)
idx = logical
1
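If the phrases are fixed text with no regular-expression syntax, CONTAINS can also take a list of alternatives directly and returns true when any of them occurs. The following is a sketch under that assumption; the sample sentence and the names phrases and found are made up for illustration.

```matlab
% Sample text stands in for the file contents (an assumption).
str = 'Gait evaluation: Slow Cadence noted on level ground.';

% A string array of literal phrases: CONTAINS is true if ANY phrase occurs.
phrases = ["slow cadence", "slow stepping"];
idx = contains(str, phrases, 'IgnoreCase', true);

% Optionally, list which of the phrases actually occur in the text.
isFound = arrayfun(@(p) contains(str, p, 'IgnoreCase', true), phrases);
found = phrases(isFound);
```

This avoids regular expressions entirely, at the cost of only matching the exact wording listed.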

Fifteen12, 14 December 2022
I think you want to look at making regular expressions. Try this:
A=isempty(regexpi(data,'(slow cadence|slow stepping)'));
You'll probably want to do more case matching as well, using wildcards to substitute for white space, etc. You can find more here
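One possible way to tolerate variable white space is to replace the literal space with \s+, which matches one or more white-space characters (spaces, tabs, line breaks). This is only a sketch; the sample phrases below are invented for illustration.

```matlab
% \s+ allows any run of white space between the two words.
expr = 'slow\s+(cadence|stepping)';

% Hypothetical sample inputs, including extra spaces and a line break.
samples = {'slow cadence', 'Slow   stepping', ...
           sprintf('slow\nstepping'), 'fast cadence'};

% With a cell array of texts and 'once', REGEXPI returns one cell per text,
% so each cell is tested separately.
tf = ~cellfun(@isempty, regexpi(samples, expr, 'once'));
```

Here tf flags the first three samples and rejects 'fast cadence'.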
