How to read in a large mixed csv data file
Hi,
Please excuse me if this is difficult to understand - I am new to Matlab and coding in general.
I’m trying to read in a large mixed data file that I can then manipulate. The file is 53 columns by about 8.8 million observations (rows) and is in csv format separated by commas. It is arranged as follows:
Numbers only: Columns 1-28, 30-34, 36, 50, 53
All other columns are text only (e.g. Johannesburg) or mixed text and numbers (e.g. E44). Some include spaces (e.g. Cape Town) and others symbols like slashes (e.g. A00-A09).
It is not clear to me if the first row is headings or not.
I’m assuming I need to use either readtable or textscan, but so far have been unsuccessful with the code.
Thanks for the help! Noah
2 Comments
Image Analyst
1 Jul 2015
Crop out just a few rows, say 20, and upload the cropped file here with the paper clip icon. I think readtable should work, unless it gets confused by columns that hold numbers only in some rows, letters only in others, and perhaps a mix of the two. Post your readtable() call.
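One way to crop out the first 20 rows without opening the 8.8-million-row file in an editor is to copy lines one at a time with fgetl. This is only a minimal sketch: the file names are placeholders, and the first loop merely builds a small stand-in file so the snippet runs on its own (in practice, bigName would point at the real csv).

```matlab
% Build a small stand-in for the large csv so this sketch runs anywhere.
% (Assumption: in practice, set bigName to the path of the real file.)
bigName = [tempname '.csv'];
fid = fopen(bigName, 'wt');
for k = 1:100
    fprintf(fid, '%d,Cape Town,E44,%g\n', k, k / 2);
end
fclose(fid);

% Copy at most nKeep lines into a separate sample file.
sampleName = [tempname '_sample.csv'];   % placeholder name
nKeep = 20;
fin  = fopen(bigName, 'rt');
fout = fopen(sampleName, 'wt');
nCopied = 0;
while nCopied < nKeep
    tline = fgetl(fin);      % fgetl returns -1 (not char) at end of file
    if ~ischar(tline)
        break
    end
    fprintf(fout, '%s\n', tline);
    nCopied = nCopied + 1;
end
fclose(fin);
fclose(fout);
```

With the sample in hand, a call such as T = readtable(sampleName, 'ReadVariableNames', false) can be tried on 20 rows instead of the whole file, and looking at the first copied line should also show whether the file starts with headings.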
Answers (1)
Walter Roberson
1 Jul 2015
fmts = repmat({'%^[,]'}, 1, 53); % %^[,] includes spaces but %s ends at spaces
fmts([1:28, 30:34, 36, 50, 53]) = {'%g'};
fmt = [fmts{:}];
headers = 1; %maybe 0?
fid = fopen('YourFile.csv','rt');
datacell = textscan(fid, fmt, 'HeaderLines', headers, 'Delimiter', ',');
fclose(fid);
and now datacell{1} is the first column, datacell{2} is the second column, and so on. The text columns will be a cell array of strings, one per row.
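As a rough illustration of working with that output, the sketch below uses a hand-made three-column stand-in for datacell (the values and the variable names nummat, cities and isCT are made up for the example). For the real file, numcols would be [1:28, 30:34, 36, 50, 53].

```matlab
% Hand-made stand-in for textscan output: numeric, text, numeric columns.
datacell = {[1; 2; 3], {'Cape Town'; 'Johannesburg'; 'Cape Town'}, [10.5; 20.5; 30.5]};
numcols  = [1, 3];              % real file: [1:28, 30:34, 36, 50, 53]

% Numeric columns can be concatenated side by side into one matrix.
% This assumes every column has the same number of rows, which may not
% hold if textscan stopped early on a line it could not parse.
nummat = [datacell{numcols}];   % 3-by-2 double

% Text columns are cell arrays, so use {} to get one string out.
cities    = datacell{2};
firstCity = cities{1};

% Logical indexing on a text column to select rows of a numeric column.
isCT   = strcmp(cities, 'Cape Town');
meanCT = mean(datacell{3}(isCT));
```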
2 Comments
Walter Roberson
1 Jul 2015
Sorry, I had two errors. The correction is
fmts = repmat({'%[^,]'}, 1, 53); % %[^,] includes spaces but %s ends at spaces
fmts([1:28, 30:34, 36, 50, 53]) = {'%f'};
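For reference, here is a sketch of the whole read with the corrected format specifiers in place. To keep it runnable on its own it first writes a three-row stand-in file with 53 columns. The stand-in contents, the temporary file name, and headers = 0 are assumptions made for the demonstration, not properties of the real data.

```matlab
ncols   = 53;
numcols = [1:28, 30:34, 36, 50, 53];    % numeric columns, from the question

% Stand-in file: '1.5' in numeric columns, 'Cape Town' in text columns.
fields          = repmat({'Cape Town'}, 1, ncols);
fields(numcols) = {'1.5'};
rowtxt = strjoin(fields, ',');
fname  = [tempname '.csv'];             % placeholder for the real file
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', rowtxt, rowtxt, rowtxt);
fclose(fid);

% Corrected format: %[^,] reads up to the next comma (spaces included),
% %f reads a floating-point number.
fmts          = repmat({'%[^,]'}, 1, ncols);
fmts(numcols) = {'%f'};
fmt     = [fmts{:}];
headers = 0;                            % stand-in has no heading row; maybe 1 for the real file
fid = fopen(fname, 'rt');
datacell = textscan(fid, fmt, 'HeaderLines', headers, 'Delimiter', ',');
fclose(fid);
delete(fname);                          % remove the stand-in file
```

Note that %[^,] only works here because the last column (53) is numeric. A text column in the final position has no trailing comma to stop at, so it would need a different specifier.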