Whitespace problem with textscan

When I use textscan, it doesn't seem to treat multiple whitespace characters as a single delimiter. The file has both space and tab delimiters:
2012-10-15 K01 5.83 5.05 5.73 6.41 4.28
2012-10-15 K01 5.25 5.80 6.41 4.28
2012-10-15 K01 4.28
I am using data = textscan(fd, form,'HeaderLines',2); and data{:,3} becomes
5.83, 5.25 and 4.28
but should be
5.83, NaN and NaN.
I've tried using the whitespace function but cannot get it to work.

1 Comment

Stephen23 on 8 Oct 2014
MATLAB does not have a whitespace function.


Accepted Answer

Stephen23 on 8 Oct 2014
Edited: Stephen23 on 8 Oct 2014


It does if you use the 'MultipleDelimsAsOne' option.
This is explained in the section "Treat Repeated Delimiters as One" of the textscan documentation.
Although, given your example output, you probably want to use the 'EmptyValue' option, which is explained in the section "Specify Delimiter and Empty Value Conversion".
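As a rough illustration of how these two options differ, here is a minimal sketch on a made-up comma-delimited string (not the poster's file). The names txt, C1, C2, C3 and the fill value -1 are assumptions chosen only for this example.

```matlab
% Made-up input with one empty field between the two commas.
txt = '1,,3';

% Default behaviour: the empty numeric field is filled with NaN.
C1 = textscan(txt, '%f%f%f', 'Delimiter', ',');

% 'EmptyValue' changes the fill value used for empty numeric fields.
C2 = textscan(txt, '%f%f%f', 'Delimiter', ',', 'EmptyValue', -1);

% 'MultipleDelimsAsOne' collapses the repeated commas into one delimiter,
% so only two fields remain and the information about the gap is lost.
C3 = textscan(txt, '%f%f', 'Delimiter', ',', 'MultipleDelimsAsOne', true);
```

If missing values must stay in their own column, as in the question, collapsing repeated delimiters is probably the wrong tool, and 'EmptyValue' (or its NaN default) is the relevant option.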

8 Comments

Stefán on 8 Oct 2014
Thanks for the reply. Unfortunately neither option seemed to help. My problem is that, for some reason, MATLAB doesn't assign NaN to the empty space.
My file has both tab and space as delimiters, which might be causing this. In line two, after the K01, there is empty space, and MATLAB seems to ignore it and assigns the next value, 5.25, to that slot.
So I don't want to treat multiple delimiters as one. I have tried:
data = textscan(fd, form,'HeaderLines',2,'MultipleDelimsAsOne',0,'EmptyValue',NaN);
But it doesn't seem to work.
Should I change something in that command?
Stephen23 on 8 Oct 2014
Can you please show the formatSpec that you are using?
Stefán on 8 Oct 2014
Thanks for the help, I specified tab as the single delimiter and it worked.
Stephen23 on 8 Oct 2014
Edited: Stephen23 on 8 Oct 2014
Define a string A, with tab delimiters and newlines:
A = {'2012-10-15 K01 5.83 5.05 5.73 6.41 4.28','2012-10-15 K01 5.25 5.80 6.41 4.28','2012-10-15 K01 4.28'};
A = sprintf('%s\n',A{:});
Run textscan :
C = textscan(A,'%s%s%f%f%f%f%f');
C is a 1x7 cell array, containing two cells of strings (i.e. columns 1:2), and five numeric arrays (i.e. columns 3:7). The last array should be [4.28;4.28;4.28], but is [4.28;NaN;NaN].
Weird.
Stephen23 on 8 Oct 2014
Aha, those "empty" fields are not empty, they contain space characters.
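One way to check for such hidden characters is to look at the character codes of a line. The snippet below is only a sketch on a made-up fragment (the variable names are assumptions for illustration); character code 9 is a tab and 32 is a space.

```matlab
% Made-up fragment: a tab, a single space as placeholder, another tab.
fragment = sprintf('K01\t \t5.25');

% Convert the characters to their numeric codes to make them visible.
codes = double(fragment);

% Count the tab (9) and space (32) characters.
nTabs   = sum(codes == 9);
nSpaces = sum(codes == 32);
```

For a real file, the same check could be applied to a line read with fgetl before deciding on the textscan options.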
Stefán on 8 Oct 2014
Yes, sorry if I didn't make that point clear. The reason I didn't want to specify a delimiter is that I thought there was a space in between the date and K01.
Again, thanks for the help.
Stephen23 on 8 Oct 2014
This works correctly:
>> S = sprintf('anna\t123.45\t \t67.890\nbob\t \t \t9.7');
>> textscan(S,'%s%f%f%f','Whitespace','','TreatAsEmpty',' ','Delimiter','\t')
ans = {{'anna';'bob'},[123.45;NaN],[NaN;NaN],[67.89;9.7]}
Try the above options with your formatSpec, and just one line, then try it with two lines...
Stephen23 on 8 Oct 2014
The file format uses mixed delimiters (a space and tab), and also uses the space as a placeholder for missing data. I would suggest that the easiest way to deal with this would be to do some pre-processing:
  • replace the space between the date and 'K01' with a tab. This would be easy to achieve using regexprep (e.g. regexprep(str,' K','\tK')).
  • replace the place-holder spaces with nothing... keep those fields empty! This might be the better option, as then you might be able to use (almost) the default settings for textscan.
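The two pre-processing steps above could be sketched roughly as follows. This is a minimal sketch, not a tested solution for the original file: the sample text is reconstructed from the description in this thread (a space between the date and K01, tabs between the other fields, a single space as placeholder for missing data), and it assumes that after the first step the only remaining spaces are placeholders. The variable names are made up for the example, and the tab is inserted as a literal character rather than via '\t' in the replacement.

```matlab
TAB = char(9);  % literal tab character

% Assumed layout: "date<space>K01", then tab-separated numeric fields,
% with a single space standing in for each missing value.
str = [sprintf('2012-10-15 K01\t5.83\t5.05\t5.73\t6.41\t4.28\n'), ...
       sprintf('2012-10-15 K01\t \t5.25\t5.80\t6.41\t4.28\n'), ...
       sprintf('2012-10-15 K01\t \t \t \t \t4.28')];

% Step 1: replace the space between the date and 'K01' with a tab.
str = regexprep(str, ' K', [TAB 'K']);

% Step 2: delete the placeholder spaces so those fields are truly empty.
% This assumes no other spaces are left in the data at this point.
str = strrep(str, ' ', '');

% Tab is now the only delimiter; empty numeric fields default to NaN.
C = textscan(str, '%s%s%f%f%f%f%f', 'Delimiter', '\t', 'Whitespace', '');
```

With this layout, C{3} is expected to contain 5.83 followed by two NaN values, which is the result asked for in the question. For a file, the text could first be read into a string (for example with fileread) before applying the same steps.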


More Answers (0)


Asked: 8 Oct 2014

Commented: 8 Oct 2014
