Textscan doesn't do what it's told
michael warshowsky
13 Jul 2017
Edited: michael warshowsky
13 Jul 2017
I have a function that reads large data sets and extracts the data, but every once in a while there is a random string in the file and textscan just stops. I told it to treat that string as empty, but it didn't work. When it gave me the error and showed me the offending line of code, the treat-as-empty part wasn't even shown.
This is the error:
Error using textscan
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 14535, field number 46) ==> #IND00 13.038326 89.999989 53.000000 66.605282 359.999965
13.013328 90.000023 36.000000 210.258485 359.016463 15.170686 270.000018 48.000000 265.060631 180.778297 14.631801 270.000025 1.#QNAN0 1.#QNAN...
Error in importfile (line 13)
dataArray = textscan(fid, formatSpec, endRow(1)-startRow(1)+1, 'Delimiter', delimiter, 'MultipleDelimsAsOne', true, 'TextType', 'string',
'EmptyValue', NaN, 'HeaderLines', startRow(1)-1, 'ReturnOnError', false, 'EndOfLine', '\r\n');
Accepted Answer
Walter Roberson
13 Jul 2017
I have made the same mistake myself, thinking that 'EmptyValue' instructed textscan to treat strings in numeric fields as if they were NaN. Instead, 'EmptyValue' tells textscan which value to substitute when it detects an empty field.
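To see the difference, here is a minimal sketch on made-up comma-separated input. The sample strings and the -1 placeholder are illustrative assumptions, not taken from the file in the question. 'EmptyValue' only fills fields that are genuinely empty, while a non-numeric token in a %f field has to be listed in 'TreatAsEmpty' before textscan will accept it.

```matlab
% A genuinely empty field (between the two commas) is filled with EmptyValue.
A = textscan('1,,3', '%f%f%f', 'Delimiter', ',', 'EmptyValue', -1);

% A junk token is not an empty field. Listing it in TreatAsEmpty makes
% textscan treat it as one, so the default EmptyValue (NaN) is substituted.
B = textscan('1,1.#QNAN0,3', '%f%f%f', 'Delimiter', ',', ...
             'TreatAsEmpty', {'1.#QNAN0'});

disp(A{2})   % value substituted for the empty field
disp(B{2})   % value substituted for the junk token
```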
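The option you want is 'TreatAsEmpty', which lists the strings that should be read as empty numeric fields:
S = '13.013328 90.000023 36.000000 210.258485 359.016463 15.170686 270.000018 48.000000 265.060631 180.778297 14.631801 270.000025 1.#QNAN0 -1.#IND00 7.1234';
fmt = repmat('%f', 1, 15);   % 15 numeric fields, matching the sample line
textscan(S, fmt, 'TreatAsEmpty', {'1.#QNAN0', '-1.#IND00'})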
ans =
1×15 cell array
Columns 1 through 9
{[13.013328]} {[90.000023]} {[36]} {[210.258485]} {[359.016463]} {[15.170686]} {[270.000018]} {[48]} {[265.060631]}
Columns 10 through 15
{[180.778297]} {[14.631801]} {[270.000025]} {[NaN]} {[NaN]} {[7.1234]}
You should also watch out for 1.#INF00 and -1.#INF00, which stand for +infinity and -infinity. You could add them to the 'TreatAsEmpty' list, but then they would come out as NaN instead of as infinities.
If you need the #INF00 entries to come out as infinities, you will have to read the file in as text and do text replacements before calling textscan(). You can pass a character vector to textscan in place of a file identifier in order to process its content.
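A minimal sketch of that text-replacement route, assuming the tokens look exactly like the ones mentioned in this thread (1.#INF00, -1.#INF00, 1.#QNAN0, -1.#IND00). The sample string stands in for text read from the file, and the regular expressions are an illustration of one way to do the replacement, not something from the original post.

```matlab
% Stand-in for the file contents; for a real file this could come from
% fileread(filename) (filename is a placeholder here).
raw = '13.013328 1.#INF00 -1.#INF00 1.#QNAN0 -1.#IND00 7.1234';

% Rewrite the infinity tokens as Inf / -Inf, keeping the sign.
clean = regexprep(raw, '(-?)1\.#INF0*', '$1Inf');

% Rewrite the NaN-like tokens (QNAN, IND, SNAN) as NaN.
clean = regexprep(clean, '-?1\.#(QNAN|IND|SNAN)0*', 'NaN');

% textscan accepts the cleaned text in place of a file identifier.
C = textscan(clean, '%f');
values = C{1};
disp(values.')
```

With this approach no 'TreatAsEmpty' list is needed for these tokens, because %f already understands Inf, -Inf, and NaN. For the real file, the cleaned text would be passed to textscan with the original format and options.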
3 Comments
Walter Roberson
13 Jul 2017
The error message shows that your file has 1.#QNAN0 in it. You are only processing -1.#IND00. I show treating both as empty, and I used your actual data and showed the result of testing on it.
More Answers (1)
Star Strider
13 Jul 2017
I cannot follow your code. However, using frewind repositions the file pointer at the beginning of the file. I would use the fseek function instead.
This is from a previous Answer (that worked) and illustrates the sort of approach I would take with a file such as yours:
st = fseek(fidi, 1, 'bof');                 % Move 1 byte past the beginning of the file
k1 = 1;                                     % Counter
while (st == 0) && (~feof(fidi))            % Stop at end of file or after an unsuccessful 'fseek'
    data{k1} = textscan(fidi, '%f%f', 'Delimiter','\t', 'HeaderLines',4, 'CollectOutput',1);
    st = fseek(fidi, 50, 'cof');            % Skip 50 bytes past the point where textscan stopped
    k1 = k1 + 1;
end
You will obviously have to experiment with it to get it to work with your file. I would retain the while conditions, since the combination will prevent an infinite loop if the file does not have a valid end-of-file indicator.
5 Comments
Star Strider
13 Jul 2017
One option might be to add to your textscan arguments:
'TreatAsEmpty', {'-1.#IND00'}
and any other weird strings that exist in your file that you can find and define. Try that first.
If you still have problems, use your textscan format descriptor instead of mine, then see if my code works with your file.