
How can I use regexp to return a list of variable names?

Joshua Muse on 22 Feb 2018
Edited: Stephen23 on 23 Feb 2018
I want to extract variable names from a string. For my purposes, variables start with a letter or an underscore, and they end before anything except for open parenthesis ("(").
So
'sin(var1) + var2'
should become something like
["var1" "var2"]
I tried this:
testStr = 'sin(var1) + var2';
vars = regexp(testStr,'([a-zA-Z_]\w*)(?:[^(\w]|$)','tokens')
and got this:
ans =
0x0 empty cell array
What am I doing wrong?
3 Comments
Joshua Muse on 23 Feb 2018
You're right. I was not very clear about my definition of a variable. A variable:
  • begins with either an English letter or an underscore
  • contains any number of English letters, underscores, and digits
  • ends before something that is not an English letter, underscore, or digit
In the string:
'var1 * _var2 * 2var3 + func(var5))'
the variables are:
var1
_var2
var3
var5
Notice that the 2 in front of var3 is not included, nor is "func".
I've tested the expression on regex101 and it matches all of the correct names, but when I call regexp with the 'tokens' option, I don't get an array of the token text like I was expecting.
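A note on the shape of the 'tokens' output, since that is what the comment above runs into: regexp with 'tokens' returns a cell array with one element per match, and each element is itself a cell array holding that match's tokens. The sketch below uses a deliberately simple expression and a short input (both illustrative, not the asker's) to show one way of flattening that nesting into a plain list.

```matlab
% Illustrative input and expression, chosen only to show the output shape.
str = 'var1 * _var2';

% One outer cell per match; each outer cell holds a 1x1 cell with the token.
tok = regexp(str, '([a-zA-Z_]\w*)', 'tokens');

% Concatenate the inner cells to get a flat cell array of char vectors.
flat = [tok{:}];

% Optional: convert to a string array (needs R2016b or later).
names = string(flat);
```

If only the matched text is needed, the 'match' option returns the flat cell array directly and the extra step is unnecessary.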
Stephen23 on 23 Feb 2018
Edited: Stephen23 on 23 Feb 2018
func meets these three conditions:
  • "begins with either an English letter or an underscore" yes!
  • "contains any number of English letters, underscores, and digits" yes!
  • "ends before something that is not an English letter, underscore, or digit" yes!
So why is func not on your list of variables when it meets all of your conditions?
Note that the regular expression you defined on regex101 actually uses "ends before something that is not an English letter, underscore, digit, or open bracket".
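To make the difference between the two definitions concrete, here is a small sketch that applies each of them to the example string. The variable names asStated and noCalls are made up for this illustration.

```matlab
str = 'var1 * _var2 * 2var3 + func(var5))';

% Only the three listed conditions: a letter or underscore, then word
% characters. \w* is greedy, so each match runs to the first non-word character.
asStated = regexp(str, '[a-zA-Z_]\w*', 'match');
% asStated is {'var1', '_var2', 'var3', 'func', 'var5'}

% The same pattern plus "must not be followed by an open parenthesis".
% The \w inside the lookahead stops the engine from backtracking to 'fun'.
noCalls = regexp(str, '[a-zA-Z_]\w*(?![\(\w])', 'match');
% noCalls is {'var1', '_var2', 'var3', 'var5'}
```

The only difference between the two results is func, which is exactly the name that the three written conditions do not exclude.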


Answers (2)

Ji Huang on 23 Feb 2018
Edited: Ji Huang on 23 Feb 2018
I would do it in two steps. First, remove the function names, i.e. the characters immediately before an open parenthesis:
testStr = 'sin(var1) + var2';
var_step_1 = regexprep(testStr,'[\w_]{0,}\(', '\(')
This gives "(var1) + var2". Then match the variable names:
var_step_2 = regexp(var_step_1,'[\w_]{0,}', 'match')
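A possible variation on the two steps, tried on the longer example from the comments. Two things are assumptions of this sketch rather than part of the answer above: \w already includes the underscore, so [\w_] is written as \w, and the second step requires a leading letter or underscore, because a pattern that accepts any run of word characters would also take the leading digit of 2var3. The names step1 and step2 are illustrative.

```matlab
str = 'var1 * _var2 * 2var3 + func(var5))';

% Step 1: drop any name that sits directly before an open parenthesis.
step1 = regexprep(str, '\w*\(', '(');
% step1 is 'var1 * _var2 * 2var3 + (var5))'

% Step 2: match names that start with a letter or an underscore.
step2 = regexp(step1, '[a-zA-Z_]\w*', 'match');
% step2 is {'var1', '_var2', 'var3', 'var5'}
```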

Stephen23 on 23 Feb 2018
Edited: Stephen23 on 23 Feb 2018
I used regexpi for simplicity:
>> str = 'var1 * _var2 * 2var3 + func(var5))';
>> C = regexpi(str,'([A-Z_]\w*)(?![\(\w])','match');
>> C{:}
ans = var1
ans = _var2
ans = var3
ans = var5
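For the form asked for in the question, a string array such as ["var1" "var2"], the same idea can be wrapped in a small helper. This is only a sketch: extractVars is a hypothetical name, the character class is spelled out as [a-zA-Z_] so that plain regexp can be used, and the capturing parentheses are dropped because 'match' does not need them. The string conversion needs R2016b or later.

```matlab
% Hypothetical helper: names that are not followed by "(" or by another
% word character, returned as a string array.
extractVars = @(s) string(regexp(s, '[a-zA-Z_]\w*(?![\(\w])', 'match'));

vars = extractVars('sin(var1) + var2');
% vars is ["var1" "var2"]
```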
If you want to develop regular expressions then you might be interested in downloading my simple Interactive Regular Expression tool:
