Multiple named tokens in a regexp

2 ビュー (過去 30 日間)
Joseph
Joseph 2013 年 12 月 6 日
編集済み: Cedric 2013 年 12 月 28 日
Hi all,
I am having a little trouble finding out if you can have named tokens that are optional in a regular expression.
My goal is to seperate a string such as:
'CO(g) [atm] = 1000'
where it has the form
'Parameter [Units] = Value '
However, some Parameter don't have Units
'name = carbon'
If I have a regular expression pattern
'(?<Parameter>[^\[=]+)\s?\[(?<Units>[^\]]+)\]\s?=\s?(?<Value>.*)'
this will only work if all three named tokens are present.
Is there a way of modifying this to make it capture either Parameters,Units,Value or Parameters,Value? I tried to use a none capturing grouping '(?:\[?<Units>[^\]]+)\])?' but that doesn't seem to work right.
Basically, can there be optionally captured Named Tokens? If so, how do you construct the regular expression.
UPDATE:
I used:
(?<parameter>[^\[=]+)\s?\[(?<units>[^\]]+)\]?\s?=\s?(?<value>.*)|(?<parameter>[^\[=]+)\s?=\s?(?<value>.*)
So,
'co [k] = 5'
Parameter = 'co '
Units = 'k'
Value = '5'
And,
'co = 5'
Parameter = 'co '
Units = []
Value = '5'
However, the regular expressions looks very unelegant due to the redundance after the '|'. Anyone have any suggestions how to make it look better?
  1 件のコメント
Cedric
Cedric 2013 年 12 月 28 日
編集済み: Cedric 2013 年 12 月 28 日
The following is a bit simpler, but I wouldn't call it "more elegant" ..
pattern = '(?<parameter>\S+) \[?(?<unit>[^=\]]*)\]?\s*=\s*(?<value>\w+)' ;

サインインしてコメントする。

回答 (0 件)

カテゴリ

Help Center および File ExchangeCharacters and Strings についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by