How to make textscan robust against non-matching lines?

Question

Joan Vazquez, 8 April 2021

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/796102-how-to-make-textscan-robust-against-non-matching-lines

Commented: Stephen23, 9 April 2021

data.txt

I have files with lines that I want to parse, preferably with textscan. In between those lines, there may be lines to be skipped (their format and number are unpredictable, but each one definitely starts on a new line). What is the best way to deal with this?

E.g. for the attached data, this stops outputting #HELLOMATHWORKS messages after line 4.

fid = fopen('data.txt');
out = textscan(fid,'#HELLOMATHWORKS,%[^,],%n');
fclose(fid);

This is an MWE extracted from a large code base.


Answer 1

Stephen23, 8 April 2021

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/796102-how-to-make-textscan-robust-against-non-matching-lines#answer_670412

Edited: Stephen23, 9 April 2021


data.txt

str = fileread('data.txt');
tkn = regexp(str,'#HELLOMATHWORKS,([^,]+),(\S+)','tokens');
tkn = vertcat(tkn{:})
tkn = 6×2 cell array
    {'COM1'}    {'2146'}
    {'COM1'}    {'2147'}
    {'COM1'}    {'2148'}
    {'COM1'}    {'2149'}
    {'COM1'}    {'2150'}
    {'COM1'}    {'2151'}
vec = str2double(tkn(:,2))
vec = 6×1
        2146
        2147
        2148
        2149
        2150
        2151

2 comments

Joan Vazquez, 8 April 2021

Edited: Joan Vazquez, 8 April 2021

This does not produce the same output as my code:

tmp =
  1×2 cell array
    {6×1 cell}    {6×1 double}

(Actually, my messages have many more fields; this was just an MWE with two. I have many similar functions that use textscan to parse messages, and I wanted to avoid refactoring them.)

Working directly with regular expressions is a good idea, but it seems that the formatSpec input parameter of textscan is not a regular expression; it is more limited.

Anyway, it's OK for the moment. I'll accept the answer, thanks.
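For readers who want the regexp result in the same shape that textscan returns (one cell per field, with text fields as a cell column and numeric fields as a double column), the tokens can be repacked column by column. The following is only a minimal sketch: the sample text in str is hypothetical and merely stands in for data.txt, whose real contents are not shown in this thread.

```matlab
% Hypothetical sample text standing in for data.txt (the real file is not shown).
str = sprintf(['#HELLOMATHWORKS,COM1,2146\n', ...
               'some unrelated line\n', ...
               '#HELLOMATHWORKS,COM1,2147\n']);

% Extract the two fields of every matching message, as in the answer above.
tkn = regexp(str, '#HELLOMATHWORKS,([^,]+),(\S+)', 'tokens');
tkn = vertcat(tkn{:});                 % N-by-2 cell array of char vectors

% Repack column-wise: the text field stays a cell column,
% the numeric field becomes a double column.
out = {tkn(:,1), str2double(tkn(:,2))};
```

With more fields, each additional capture group adds one column to tkn, so the last line would need one entry per column, converted with str2double where the field is numeric.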

Stephen23, 9 April 2021

@Joan Vazquez: I presume that the text #HELLOMATHWORKS is not what is actually in your file. If the actual text contains some unique character that does not exist anywhere else in the file, you might be able to leverage the LineEnding/EndOfLine option to achieve the goal of reading the file data using textscan.


Answer 2

Joan Vazquez, 8 April 2021

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/796102-how-to-make-textscan-robust-against-non-matching-lines#answer_670177

This works, but it does not seem like the best solution. Ideally, I would tell textscan "skip everything until a new line starts with #HELLOMATHWORKS".

filetext = fileread('data.txt');
expr = '[^\n]*#HELLOMATHWORKS[^\n]*';
% Find and return all lines that contain the text '#HELLOMATHWORKS'.
matches = regexp(filetext,expr,'match');
% Make it a 1xN char to feed textscan
goodlines = sprintf('%s\n', matches{:});
tmp = textscan(goodlines,'#HELLOMATHWORKS,%[^,],%n');
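Note that the pattern above keeps every line that contains #HELLOMATHWORKS anywhere, not only lines that start with it. A possible variant, sketched here under the assumption that valid messages always begin at the start of a line, anchors the match with the regexp options 'lineanchors' and 'dotexceptnewline'. The sample text in filetext is hypothetical and only stands in for data.txt.

```matlab
% Hypothetical sample text standing in for data.txt (the real file is not shown).
% The second line mentions the tag mid-line and should be skipped.
filetext = sprintf(['#HELLOMATHWORKS,COM1,2146\n', ...
                    'noise mentioning #HELLOMATHWORKS,COM9,1 mid-line\n', ...
                    '#HELLOMATHWORKS,COM1,2147\n']);

% 'lineanchors' lets ^ and $ match at the start and end of each line;
% 'dotexceptnewline' stops .* at the end of the current line.
matches = regexp(filetext, '^#HELLOMATHWORKS.*$', 'match', ...
                 'lineanchors', 'dotexceptnewline');

% Rejoin the kept lines and parse them with the unchanged format string.
goodlines = sprintf('%s\n', matches{:});
tmp = textscan(goodlines, '#HELLOMATHWORKS,%[^,],%n');
```

The existing textscan format strings stay untouched, so the filtering step could be placed in front of each parsing function without refactoring it.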
