Import Tab Delimited Data with Missing Values

6 ビュー (過去 30 日間)
Chris
Chris 2011 年 11 月 1 日
Hello,
I am looking to import data (txt file) that is separated by tab, but has missing values. The file location is here to see what I mean <ftp://ftp.ncdc.noaa.gov/pub/data/paleo/contributions_by_author/zachos2001/zachos2001.txt>
Ideally I'd like to import this just to make simple x-y graphs, replacing the missing values as NaN; unfortunately, they just left the missing values blank and so Matlab throws in data from one column into another. Someone recommended excel but I'd prefer to avoid this since I am working on a Linux and am using the OpenOffice version of "excel" which I don't like using. Help appreciated.
Thanks

回答 (4 件)

sco1
sco1 2011 年 11 月 1 日
EDIT: Never mind what I had originally posted, I didn't look closely enough at what MATLAB gave me, I'll see if I can figure out a workaround.
EDIT2: I know you don't like using excel, but you can save the data as a text file, read it into excel, and then read the excel file into Matlab (I used fixed width 'delimiting' and deleted the 'Genus' column). Not the most elegant solution, but it seems to work.

Chris
Chris 2011 年 11 月 1 日
Alright thanks, it's only a problem because I can't use excel on this computer, but I can probably get Matlab access on another mac (which has excel), unless someone knows how to do it more directly.

sco1
sco1 2011 年 11 月 1 日
Another less elegant solution would be to loop fgetl(), which will return each line of the file
For example: document = fopen(file); testline = fgetl(document);
If you put the fgetl into a loop, it will read through the entire document line by line and output the entire string of characters. For example, line 1 would be '607 0 3.28 0.88' and line 3 would be '607 0.002 3.16 0.96 3.252 0.72'. However, since there is no constant delimiter in the file you linked, you would need to have some sort of logic for figuring out where there are holes in the data, and I don't think it's a trivial task unless I'm using a methodology that's more complex than is necessary.
It seems like using excel to at least format the data would save you a lot of coding time. Then you can use the time you saved go whack the scientists at NASA in the head and tell them to use comma delimited files like normal people.
  1 件のコメント
Walter Roberson
Walter Roberson 2011 年 11 月 1 日
It _is_ a trivial task... in FORTRAN.

サインインしてコメントする。


Walter Roberson
Walter Roberson 2011 年 11 月 1 日
When I fetch that file, including by using ftp directly, I do not get any tab characters. Instead, I see columns of fixed width. In order the widths are
3, 19, 16, 11, 11, 11, 11
Unfortunately, textscan() and fscanf() and sscanf() do not support field widths, so I suggest you proceed in a manner similar to this, where LinesToSkip is the number of header lines:
IN = char(textscan(fid, '%s', 'HeaderLines', LinesToSkip, 'Delimiter', '\n')) ;
close(fid);
N{1} = str2double(cellstr(IN(:,1:3)));
N{2} = str2double(cellstr(IN(:,4:22)));
N{3} = IN(:,23:38);
N{4} = str2double(cellstr(IN(:,39:50)));
N{5} = str2double(cellstr(IN(:,51:61)));
N{6} = str2double(cellstr(IN(:,62:72)));
N{7} = str2double(cellstr(IN(:,73:83)));
clear IN
Going via char() and then using cellstr() all over the place is a mechanism to work around the fact that the occasional line is shorted because one or more trailing fields is not present. The char() forces all the lines to be the same size, allowing you to split the text by column numbers. str2double() can then be used, but it will not work on character arrays, only on cell string arrays, so the cellstr() call.

カテゴリ

Help Center および File ExchangeText Data Preparation についてさらに検索

タグ

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by