regexp: \w* vs \w+

Jim on 28 May 2020
Commented: Ameer Hamza on 29 May 2020
Given:
t = " the cat";
pStar = "\w*";
pPlus = "\w+";
regexp(t, pStar, 'match'),
regexp(t, pPlus, 'match'),
Both return:
ans =
1×2 string array
"the" "cat"
My question is why \w*, which is supposed to match zero or more occurrences of the \w class, doesn't match the two spaces and return them, either as leading characters in the two substrings (" the" and " cat") or as additional standalone substrings. I guess I'm unclear what "zero occurrences" means here. When the regex engine examines the 1st space, why wouldn't it match it, since it indeed matches zero occurrences of the class? Perhaps someone familiar with the engine's behavior can help.

Accepted Answer

Ameer Hamza on 28 May 2020
\w does not match spaces; \s is used to match whitespace. \w matches only a-z, A-Z, 0-9, and underscore. The following example shows the difference between * and +:
t = " the cat";
pStar = "\w*\s";
pPlus = "\w+\s";
regexp(t, pStar, 'match')
regexp(t, pPlus, 'match')
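For reference, here is the same comparison with the results written out, as a minimal sketch. The variable names mStar and mPlus are introduced only for illustration. The point is that \w*\s accepts a space with no word characters in front of it, while \w+\s does not.

```matlab
t = " the cat";

% \w* may match zero word characters, so the leading space alone qualifies.
mStar = regexp(t, "\w*\s", 'match');   % " "    "the "

% \w+ needs at least one word character before the space.
mPlus = regexp(t, "\w+\s", 'match');   % "the "

disp(mStar)
disp(mPlus)
```

Neither pattern returns "cat", because no whitespace follows it in t.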
2 Comments
Jim on 28 May 2020
I think I see. Another way of putting it might be:
matches zero occurrences of pattern = doesn't match pattern
Ameer Hamza on 29 May 2020
It depends on what you mean by "doesn't match pattern". In the case of \w*, the pattern still matches; it is just optional. If word characters are present at the current position, \w* consumes them; if not, it matches nothing there and the engine goes on to match the rest of the regular expression.
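To make the "optional" idea concrete for the original \w* pattern: as far as I understand, \w* does succeed at the leading space by matching an empty string, but regexp drops zero-length matches by default (the 'noemptymatch' option). The sketch below assumes a release that supports the 'emptymatch' option. The variable names are illustrative only, and the exact number and position of the empty matches is deliberately not shown, since I have not verified it.

```matlab
t = " the cat";

% Default behaviour ('noemptymatch'): zero-length matches are dropped.
mDefault = regexp(t, "\w*", 'match');

% With 'emptymatch', zero-length matches of \w* are returned as well.
mAll = regexp(t, "\w*", 'match', 'emptymatch');

% Filtering out the empty strings should recover the default result.
mNonEmpty = mAll(strlength(mAll) > 0);
disp(isequal(mNonEmpty, mDefault))
```

So "matches zero occurrences" is a real (empty) match at the space, not a match of the space itself; it is simply not reported unless you ask for it.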

Release: R2017b
