How do I get regexp to read specific values in a text file?

Question

Brad on 16 May 2013

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/76067-how-do-i-get-regexp-to-read-specific-values-in-a-text-file

Answered: Anver Hisham on 18 May 2017

Accepted answer: Matt Kindig

I have a text file containing numerous blocks of data that look like this:

Type: 1 Part ID: 23568 Time Tag: 55012.12345678

Loc ID: 1 Bar ID: 9 Past LR: 0 Isync/Osync: 1

Platform: 1

Type: 1 Part ID: 23568 Time Tag: 55057.12345678

Loc ID: 1 Bar ID: 9 Past LR: 0 Isync/Osync: 1

Platform: 1

Type: 1 Part ID: 23568 Time Tag: 55123.12345678

Loc ID: 1 Bar ID: 6 Past LR: 0 Isync/Osync: 1

Platform: 23

Type: 1 Part ID: 23568 Time Tag: 55124.12345678

Loc ID: 1 Bar ID: 4 Past LR: 0 Isync/Osync: 1

Platform: 23

Type: 1 Part ID: 23568 Time Tag: 55213.12345678

Loc ID: 1 Bar ID: 9 Past LR: 0 Isync/Osync: 1

Platform: 55

Type: 1 Part ID: 23568 Time Tag: 55300.12345678

Loc ID: 1 Bar ID: 11 Past LR: 0 Isync/Osync: 1

Platform: 55

I’m attempting to extract the numeric values for the time tag, Bar ID, and Platform parameters, then store the data separately for plots.

However, I can’t seem to get the proper expression to read the individual values. I’ve tried multiple expressions. The following is the closest I’ve gotten to achieving any values at all:

indata = fileread('data.txt');

pattern = 'Platform:\s\d+';

data = regexp(indata,pattern);

This results in a <1x3 double>:

97 207 317

I’ve got to be missing something simple. Any ideas on how to get this to extract the proper values?

2 Comments

Matt Kindig on 16 May 2013

Edited: Matt Kindig on 16 May 2013

What is the difference between this format:

Platform: 1
Type: 1 Part ID: 23568 Time Tag: 55057.12345678

And this format:

Platform: 1 Type: 1 Part ID: 23568 Time Tag: 55123.12345678

Should both be handled the same?

Brad on 16 May 2013

Matt, my apologies. I didn't have the applicable carriage returns in place.


Answer 1

Matt Kindig on 16 May 2013

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/76067-how-do-i-get-regexp-to-read-specific-values-in-a-text-file#answer_85731

Edited: Matt Kindig on 16 May 2013

This seems to work for me:

pattern = 'Platform:\s+(\d+)';
data = regexp(indata, pattern, 'tokens');
data = cell2mat(cellfun(@(x) str2double(x{:}), data, 'UniformOutput', false));

Others can be done similarly.

EDIT: implemented a better way.
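As a rough sketch of how "others can be done similarly", the same token approach might be applied to the Time Tag and Bar ID fields. The in-memory sample string below is an assumption standing in for the contents of data.txt, and the variable names timeTok, barTok, timeTag and barID are illustrative only.

```matlab
% Small in-memory sample standing in for fileread('data.txt') (assumption).
indata = sprintf(['Type: 1 Part ID: 23568 Time Tag: 55012.12345678\n' ...
                  'Loc ID: 1 Bar ID: 9 Past LR: 0 Isync/Osync: 1\n' ...
                  'Platform: 1\n' ...
                  'Type: 1 Part ID: 23568 Time Tag: 55123.12345678\n' ...
                  'Loc ID: 1 Bar ID: 6 Past LR: 0 Isync/Osync: 1\n' ...
                  'Platform: 23\n']);

% One capture group per field; \s+ accepts one or more spaces or tabs.
timeTok = regexp(indata, 'Time Tag:\s+(\d+\.?\d*)', 'tokens');
barTok  = regexp(indata, 'Bar ID:\s+(\d+)', 'tokens');

% Each match yields a 1x1 cell; flatten to a cell row of char, then convert.
timeTag = str2double([timeTok{:}]);
barID   = str2double([barTok{:}]);
```

Using \s+ rather than a single \s is one way to cope with a varying number of spaces or tabs after the colon.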

1 Comment

Brad on 17 May 2013

Matt, thanks for taking a look at this. As I mention below to Cedric, these spaces and tabs are still giving me fits every now and then.


Answer 2

Cedric on 16 May 2013

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/76067-how-do-i-get-regexp-to-read-specific-values-in-a-text-file#answer_85732

Edited: Cedric on 16 May 2013

I assumed that you forgot the Platform field in the first line of your sample data and took the following instead:

 Platform: 1 Type: 1 Part ID: 23568 Time Tag: 55057.12345678
 Loc ID: 1 Bar ID: 9 Past LR: 0 Isync/Osync: 1
 Platform: 1 Type: 1 Part ID: 23568 Time Tag: 55123.12345678
 Loc ID: 1 Bar ID: 6 Past LR: 0 Isync/Osync: 1
 Platform: 23 Type: 1 Part ID: 23568 Time Tag: 55124.12345678
 Loc ID: 1 Bar ID: 4 Past LR: 0 Isync/Osync: 1
 Platform: 23 Type: 1 Part ID: 23568 Time Tag: 55213.12345678
 Loc ID: 1 Bar ID: 9 Past LR: 0 Isync/Osync: 1
 Platform: 55 Type: 1 Part ID: 23568 Time Tag: 55300.12345678
 Loc ID: 1 Bar ID: 11 Past LR: 0 Isync/Osync: 1

You can get all fields as follows:

 buffer  = fileread('data.txt') ;
 pattern = 'form:\s+(\d+).*?Tag:\s+([\d\.]+).*?Bar ID:\s+(\d+)' ;
 tokens  = regexp(buffer, pattern, 'tokens') ;
 data    = reshape(str2double([tokens{:}]), 3, []).' ;

and if you are sure that there is only one space after each colon, you can use a simpler pattern:

pattern = 'form:\s(\d+).*?Tag:\s([\d\.]+).*?Bar ID:\s(\d+)' ;

Running this leads to a data array with the following columns:

 >> data(:,1)
 ans =
     1
     1
    23
    23
    55
 >> data(:,2)
 ans =
   1.0e+04 *
    5.5057
    5.5123
    5.5124
    5.5213
    5.5300
 >> data(:,3)
 ans =
     9
     6
     4
     9
    11
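To see why the reshape and transpose give one row per record, here is a minimal sketch on a hand-written token list. The flat cell array below is hypothetical and only mimics what [tokens{:}] would contain for two records.

```matlab
% Hypothetical flattened tokens for two records: platform, time tag, bar ID.
flat = {'1', '55057.12345678', '9', '23', '55124.12345678', '4'};

vals  = str2double(flat);       % 1x6 numeric row vector
byCol = reshape(vals, 3, []);   % 3x2: reshape fills column by column, one record per column
data  = byCol.';                % 2x3: one record per row (platform, time tag, bar ID)
```

The scaled display of data(:,2) above hides the decimals; switching the display with format long g should show the full time-tag values without changing the stored numbers.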

PS: did you see my last comment in the thread with your previous question about regexp? I think that the official doc is really well built and a fantastic resource for learning regexp! :-)

Note that, depending on the length of the file(s), running REGEXP three times, each time matching a single field, might be more efficient than running it once with the current, more complex pattern.
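A minimal sketch of the one-call-per-field variant could look like this. The sample buffer is an assumption standing in for fileread('data.txt'), and grab is a hypothetical helper name, not part of MATLAB.

```matlab
% Small sample in the single-line-header layout assumed above.
buffer = sprintf(['Platform: 1 Type: 1 Part ID: 23568 Time Tag: 55057.12345678\n' ...
                  'Loc ID: 1 Bar ID: 9 Past LR: 0 Isync/Osync: 1\n' ...
                  'Platform: 23 Type: 1 Part ID: 23568 Time Tag: 55124.12345678\n' ...
                  'Loc ID: 1 Bar ID: 4 Past LR: 0 Isync/Osync: 1\n']);

% Hypothetical helper: run one pattern, take the first token of each match,
% and return the values as a numeric row vector.
grab = @(pat) str2double(cellfun(@(c) c{1}, ...
    regexp(buffer, pat, 'tokens'), 'UniformOutput', false));

platform = grab('Platform:\s+(\d+)');
timeTag  = grab('Tag:\s+([\d\.]+)');
barID    = grab('Bar ID:\s+(\d+)');

% The columns only line up if every record contains all three fields.
assert(numel(platform) == numel(timeTag) && numel(timeTag) == numel(barID));
data = [platform(:), timeTag(:), barID(:)];
```

Whether this is actually faster than the single combined pattern depends on the file; timing both versions on real data (for example with tic/toc) is the safest way to decide.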

EDIT: just to clarify, your first attempt was not that bad; you forgot the 'match' 3rd parameter, so REGEXP outputted positions of matches instead of matches (i.e. the first Platform was matched on character 97, the second on character 207, etc). Also your pattern would match e.g. 'Platform: 1' and not just '1'. If you want to match 'Platform: 1' but output just '1', you can either token-ize the expression matching the number in the pattern:

 >> pattern_tok = 'Platform:\s(\d+)' ;
 >> tokens = regexp(buffer, pattern_tok, 'tokens')
 tokens = 
    {1x1 cell}    {1x1 cell}    {1x1 cell}    {1x1 cell}    {1x1 cell}
 >> [tokens{:}]
 ans = 
    '1'    '1'    '23'    '23'    '55'

which is what Matt proposed, or use a positive look-behind and match the number:

 >> pattern_posLb = '(?<=Platform:\s)\d+' ;
 >> matches = regexp(buffer, pattern_posLb, 'match')
 matches = 
    '1'    '1'    '23'    '23'    '55'
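The difference between the default output (start positions) and the 'match' output, plus the final conversion to numbers, can be sketched as follows. The three-line sample string is an assumption made for illustration.

```matlab
% Tiny sample (assumption) with three Platform entries.
buffer = sprintf('Platform: 1 Type: 1\nPlatform: 23 Type: 1\nPlatform: 55 Type: 1\n');

% Without an output option, regexp returns the start index of each match.
starts = regexp(buffer, 'Platform:\s\d+');

% With 'match' and a look-behind, only the digits themselves are returned.
matches = regexp(buffer, '(?<=Platform:\s)\d+', 'match');   % cell row of char

% Convert the matched text to a numeric row vector for plotting.
platform = str2double(matches);
```

Both the token form and the look-behind form end with the same str2double step, so the choice between them is mostly a matter of readability.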

1 Comment

Brad on 17 May 2013

Cedric, I've been using the REGEXP function quite a bit lately. Every now and then I still get stuck when trying to account for the spaces and tabs. REGEXP is quite a powerful function. I just used it on another small project handed to me this afternoon. Practice is paying off!! Thanks, again.


Answer 3

Anver Hisham on 18 May 2017

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/76067-how-do-i-get-regexp-to-read-specific-values-in-a-text-file#answer_267493

You can use the grepValues library:

data = grepValues('data.txt','Platform');
