read empty line by textscan

Question

zhiwen wan, 10 April 2019

https://jp.mathworks.com/matlabcentral/answers/455574-read-empty-line-by-textscan

Edited: Rik, 11 April 2019

Hi Everyone,

I am trying to organize a text file with 12,000 lines, which is too large for readtable, so I chose textscan instead.

The problem is that textscan skips all the empty lines, and I need the exact line numbers of certain elements in the original file.

I searched a lot online without success. I tried code like the following to clear the whitespace setting, but it did not help:

default = textscan(fid,'%s%s','Delimiter','=','whitespace', '')

Thank you for your help!

2 Comments

Rik, 11 April 2019

Did you try either suggested solution? If you still have issues, we'll be happy to help.

Jeremy Hughes, 11 April 2019

I know someone has already added a solution, and it's a fine solution for what you're doing. But I'm surprised that readtable has a problem. Can you attach a sample?

12,000 lines isn't all that large, especially if there are only two columns.

If you have R2019a, you might also try:

M = readmatrix(filename,'OutputType','string','Delimiter','=','Whitespace','')

Answer 1

Rik, 10 April 2019

https://jp.mathworks.com/matlabcentral/answers/455574-read-empty-line-by-textscan#answer_370050

Edited: Rik, 10 April 2019

If your file doesn't contain any special characters, you could try fileread (which reads a file as one long char array), then split it with regexp. If you aren't sure about the encoding of special characters, you may consider my readfile function (which returns a cell array with 1 element per line, also for empty lines).

default = fileread(filename);
default = regexp(default,'\n','split');
%or:
default = readfile(filename);

The output of those two methods is equivalent if there are no special characters encoded in the file. The allowed characters are shown below. (readfile doesn't have this restriction)

% $%&'()*+,-./0123456789:;<=>?@
% ABCDEFGHIJKLMNOPQRSTUVWXYZ
% [\]^_`abcdefghijklmnopqrstuvwxyz{|}~
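Since the original goal was to recover line numbers, here is a minimal sketch of how the split result can be used for that. The sample file contents and the key 'gamma=' are made up for illustration; the split pattern '\r?\n' is my own choice so that both Unix and Windows line endings are handled.

```matlab
% Write a small sample file (hypothetical contents) so the sketch is self-contained.
filename = [tempname '.txt'];
fid = fopen(filename,'w');
fprintf(fid,'alpha=1\n\nbeta=2\n\n\ngamma=3\n');
fclose(fid);

txt   = fileread(filename);            % whole file as one char row vector
lines = regexp(txt,'\r?\n','split');   % one cell per line; empty lines become ''

% Line numbers (in the original file) of every line that starts with the key.
key    = 'gamma=';                     % hypothetical key
lineNo = find(strncmp(lines,key,numel(key)));

delete(filename);                      % clean up the temporary file
```

Because empty lines are kept as empty cells, the index into lines is the line number in the file. Note that a trailing newline at the end of the file produces one extra empty cell at the end.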

5 Comments (3 older comments not shown)

Jeremy Hughes, 11 April 2019

Edited: Jeremy Hughes, 11 April 2019

default = regexp(default,'\n','split');

This won't work if the file uses Windows line endings (\r\n); at the very least you'll be left with trailing \r characters.

If you're using R2016b or later, try:

https://www.mathworks.com/help/matlab/ref/splitlines.html

default = splitlines(default);

It's a little more robust, and since it has only one job to do, probably slightly faster than regexp.
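To make the difference concrete, here is a small sketch comparing the two approaches on a made-up string with Windows line endings. The sample text is an assumption for illustration only.

```matlab
txt = sprintf('a=1\r\n\r\nb=2');        % hypothetical text with \r\n line endings

byRegexp = regexp(txt,'\n','split');    % splits at line feeds only
bySplit  = splitlines(txt);             % R2016b or later

lastCode   = double(byRegexp{1}(end));  % 13: the carriage return is still attached
emptyKept  = isempty(bySplit{2});       % true: the empty line is kept as ''
```

With regexp the first element still ends in char 13, which would break later comparisons such as strcmp(line,'a=1'), while splitlines returns clean lines and still keeps the empty one.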

Rik, 11 April 2019

Edited: Rik, 11 April 2019

To make the regexp splitting more robust (which will be in my next version of readfile):

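CRLF=[13 10]; % carriage return, line feed
CRLF=CRLF([any(default==13) any(default==10)]); % keep only the ones that occur in the text
if isempty(CRLF),CRLF=10;end % no line breaks found: fall back to line feed
default = regexp(default,char(CRLF),'split'); % pass the pattern to regexp as char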

splitlines will probably be faster, while the code I showed here is backwards compatible to R14 (v7.0, which was when regexp was expanded to support outkeys).

Edit:

I just noticed I had this line already in my function:

str(str==13)='';

So readfile already splits it correctly for \r\n files.
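As a rough sketch of that strip-then-split order (not the actual readfile source; the sample text is made up), assuming the file uses \n or \r\n line endings. A file that uses bare \r as its only line ending would collapse into a single line with this approach.

```matlab
str = sprintf('a=1\r\n\r\nb=2\r\n');    % hypothetical text with \r\n line endings
str(str==13) = '';                      % drop every carriage return
lines = regexp(str,char(10),'split');   % split at the remaining line feeds
```

The empty line survives as an empty cell, and the trailing line break yields one extra empty cell at the end.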

Answer 2

Bob Thompson, 10 April 2019

https://jp.mathworks.com/matlabcentral/answers/455574-read-empty-line-by-textscan#answer_370045

Edited: Rik, 10 April 2019

I'm going to guess that the extra lines are not consistent?

Generally, I would suggest reading the entire file in as one string, then splitting it at the new line characters. The exact coding may be a bit off from the below example, but it should put you on the right track.

default = fread(fid,'*char')'; % Read the whole file as one char row vector (textscan with '%s' would split at whitespace instead)
default = regexp(default,'\r?\n','split'); % Split into one cell per line; empty lines are kept as empty cells
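Once the file is split into one cell per line, each line can be separated at the '=' sign while keeping its position. This is only a sketch: the cell array below is a made-up stand-in for the split result, and the variable names keys and values are my own.

```matlab
lines = {'alpha=1','','beta=2','','gamma=3'};   % hypothetical split result

keys   = cell(size(lines));
values = cell(size(lines));
for k = 1:numel(lines)
    idx = strfind(lines{k},'=');                % positions of '=' in this line
    if isempty(idx)
        keys{k}   = lines{k};                   % no '=' (e.g. an empty line)
        values{k} = '';
    else
        keys{k}   = lines{k}(1:idx(1)-1);       % text before the first '='
        values{k} = lines{k}(idx(1)+1:end);     % text after the first '='
    end
end

lineNo = find(strcmp(keys,'gamma'));            % line number of the 'gamma' entry
```

Because keys and values have one entry per line, including the empty ones, the index k is the line number in the original file.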

3 Comments (1 older comment not shown)

Bob Thompson, 10 April 2019

Yes, I do. Thank you for catching that, I was using repmat for other things recently.

zhiwen wan, 11 April 2019

Thank you very much Bob, problem solved:)
