Extracting specific repeating lines of text after a heading using fgetl and textscan

Question

Vincent Scalfani 2016 年 7 月 19 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/296250-extracting-specific-repeating-lines-of-text-after-a-heading-using-fgetl-and-textscan

コメント済み: Vincent Scalfani 2016 年 7 月 21 日

Here is an example of the data I am working with. I would like to extract the line directly following each KEY tag. The files have many thousands of these, so I need to create a loop with textscan or something similar.

> <NAME>
mary
> <AGE>
30
> <KEY>
RDHQFKQIGNG
> <NAME>
john
> <AGE>
56
> <KEY>
JFJNNFNFKFNN

Desired result:

RDHQFKQIGNG
JFJNNFNFKFNN

Here is where I am at (adapted from a similar question in the past), the code does not seem to be moving the cursor, and instead works for the first one, and then grabs all data after it, instead of just the data following the KEY line.

f = fopen('data.txt', 'rt'); 
tline = fgetl(f);
while isempty(strfind(tline, '> <KEY>'))
    if tline == -1 
        break;
    end
    line = fgetl(f);
end
if tline ~= -1
    data = textscan(f,'%s','Delimiter','\r\n');
else
    disp('not found');
end
fclose(f);

Thanks!

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Stephen23 2016 年 7 月 19 日

1
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/296250-extracting-specific-repeating-lines-of-text-after-a-heading-using-fgetl-and-textscan#answer_229053

MATLAB Online で開く

temp1.txt

>> str = fileread('temp1.txt');
>> C = regexp(str,'(?<=> <KEY>\s+)\S+','match')
C = 
  'RDHQFKQIGNG'    'JFJNNFNFKFNN'

Tested on this file:

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

Stephen23 2016 年 7 月 20 日

MATLAB Online で開く

temp1.m

Try this:

  E = regexp(str,'^> <KEY>\s+\S+','match','lineanchors');
  E = strtrim(strrep(E,'> <KEY>',''));

And have a play with this script:

Vincent Scalfani 2016 年 7 月 21 日

Amazing!!! PERFECT. It took 1 second to process over 4 million lines of text. Thanks so much for your time.

サインインしてコメントする。

Extracting specific repeating lines of text after a heading using fgetl and textscan

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

採用された回答

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

その他の回答 (0 件)

参考

カテゴリ

タグ

製品

Community Treasure Hunt

Extracting specific repeating lines of text after a heading using fgetl and textscan

0 件のコメント -2 件の古いコメントを表示-2 件の古いコメントを非表示

採用された回答

3 件のコメント 1 件の古いコメントを表示1 件の古いコメントを非表示

その他の回答 (0 件)

参考

カテゴリ

タグ

製品

Community Treasure Hunt

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示