restructure cell array from textscan

Peter Lobato on 3 Aug 2018
Edited: Jan on 5 Aug 2018
Hi folks, I'm using textscan to bring in data from a .csv file and turn it into a cell array. The resulting cell array has one row and multiple columns, and each element holds its own {M x N} cell (i.e., each column in my .csv dataset gets compressed into its own cell).
C =
Columns 1 through 3
{5x1 cell} {5x1 cell} {5x1 cell}
Is there a way to instead generate a cell array with a single column and multiple rows, each row containing one row of data from my .csv file (i.e., each row in my .csv dataset gets compressed into its own cell)? I'm trying to get it to look like this (if it's even possible):
C =
{3x1 cell}
{3x1 cell}
{3x1 cell}
{3x1 cell}
{3x1 cell}
The reason is that I'm bringing in a large number of very large .csv files, concatenating them, and exporting the result as a tab-delimited .txt file using fprintf. Due to memory limitations, I can only import one file at a time to write to the output file.
Thanks!
  3 Comments
Image Analyst on 3 Aug 2018
Why not just simply use csvread() to read the data into a double array? Why hassle with the complications of a cell array???
Peter Lobato on 4 Aug 2018
Normally using csvread or dlmread would be way easier, but those (I think) can only import numeric data. The problem with the data I have is each cell can randomly have either numbers or text inside them (e.g. if a thermocouple starts to fail, a column will intermittently switch between a number and "INF", also I want to keep the intermittent "INF" so I can see if/when a thermocouple is failing). I figured the easiest way would be to import everything as a cell array so it doesn't matter what is inside each cell.
I attached an example data file. All files have the exact same headers, same number of columns, but variable number of rows (usually around 100000).


Accepted Answer

Jan on 4 Aug 2018
Some test data:
C = {};
for i1 = 1:3
    for i2 = 1:5
        C{i1}{i2} = i1*10+i2;
    end
end
Now the conversion:
CC = num2cell(cat(1, C{:}), 1)
Maybe you want to transpose CC. But is this really useful?
"Reason being is I'm bringing in a large number of very large .csv files, concatenating them and exporting as a tab-delimited .txt file using fprintf."
It seems to be much easier to import the files as text, use strrep to replace the commas with tabs, and append the result as a string. The conversion and reshaping of cells is most likely a waste of time.
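If the reshaping is wanted anyway, here is a minimal sketch of the same idea for the layout textscan actually returns (a 1 x N row of M x 1 cells). The test data and the variable names rows and line1 are made up for illustration; note that each row ends up as a 1 x 3 cell rather than 3 x 1.

```matlab
% Hypothetical stand-in for textscan output: a 1x3 cell, each element a 5x1 cell of char
C = cell(1, 3);
for col = 1:3
    C{col} = cell(5, 1);
    for row = 1:5
        C{col}{row} = sprintf('%d', col*10 + row);
    end
end

% Put the columns side by side (5x3 cell), then group each row into its own cell
rows = num2cell(cat(2, C{:}), 2);   % 5x1 cell, each element a 1x3 cell

% One row can then be joined into a single tab-delimited line
line1 = strjoin(rows{1}, sprintf('\t'));
```

Joining each row with strjoin avoids passing the data itself as the fprintf format string.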
  3 Comments
Peter Lobato on 5 Aug 2018
I think I figured it out from the last thing you said: replacing commas with \t. One thing I didn't see earlier, and that completely threw me off, is that in my data file there is whitespace before every non-numeric entry, such as "INF" or "NAN" (I didn't see it in Excel, but happened to see it in Notepad). Evidently, textscan starts a new cell when it encounters whitespace by default. To get around it, I just set the 'Whitespace' parameter to something I know isn't anywhere in the file (in this case '|'; sort of a cheesy workaround, but it works). I also got rid of 'Delimiter', so each row of data is a single string, including commas. The function call looked like this:
C = textscan(fid,repmat('%s',1,311),'HeaderLines',6,'Whitespace','|');
Then to print,
fid2 = fopen('RPECS.txt','a+');
for c = 1:length(C{1})
CC = [strrep(char(C{1}(c)),',','\t'),'\r\n'];
fprintf(fid2,CC,'Delimiter','\t');
clear CC
end
Thanks so much for the help!
Jan on 5 Aug 2018
Edited: Jan on 5 Aug 2018
fprintf does not have a "Delimiter" argument.
What about:
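Str = fileread(FileName);                % read the whole file as one char vector
Str = strrep(Str, ',', sprintf('\t'));   % replace commas with tabs
Str = strrep(Str, ' ', ''); % remove spaces?
fid = fopen(OutputFile, 'a');            % append mode; fopen opens read-only by default
fwrite(fid, Str, 'char');
fclose(fid);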
This replaces all commas with tabs. But is this useful at all? What do you actually want to achieve? Which transformation is wanted? I have the impression that converting the output of textscan is only a confusing indirection.
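For the original task (many files, one in memory at a time), the text-based approach could be looped as in this rough sketch. The file names and contents are hypothetical and are created in tempdir only so the example runs on its own; it also does not skip the header lines mentioned in the question.

```matlab
% Hypothetical input files, written here only so the sketch is self-contained
inFiles  = {fullfile(tempdir, 'part1.csv'), fullfile(tempdir, 'part2.csv')};
contents = {sprintf('1,2, INF\n'), sprintf('3, NAN,4\n')};
for k = 1:numel(inFiles)
    fid = fopen(inFiles{k}, 'w');
    fwrite(fid, contents{k}, 'char');
    fclose(fid);
end

outFile = fullfile(tempdir, 'combined.txt');   % hypothetical output name
if exist(outFile, 'file'), delete(outFile); end

% Convert and append one file at a time, so only one file is held in memory
for k = 1:numel(inFiles)
    str = fileread(inFiles{k});
    str = strrep(str, ',', sprintf('\t'));     % commas -> tabs
    str = strrep(str, ' ', '');                % drop the stray spaces before INF/NAN
    fid = fopen(outFile, 'a');                 % 'a' appends to the existing output
    fwrite(fid, str, 'char');
    fclose(fid);
end

result = fileread(outFile);
```

If every file repeats the same header block, it would have to be stripped from all but the first file before appending.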


More Answers (1)

jonas on 4 Aug 2018
Edited: jonas on 4 Aug 2018
What about this solution? I've replaced one of the numbers in the first column with a string, just to make sure it works.
%% Read data
opts = detectImportOptions('example_data.csv','NumHeaderLines',6);
opts = setvartype(opts,'double');
T = readtable('example_data.csv',opts);
%% Keep the first 15 columns only (T becomes a numeric matrix)
T = T{:,1:15};
%% Save to cell array if you really want (one cell per column)
B = mat2cell(T,size(T,1),ones(1,15))'
B =
15×1 cell array
[15×1 double]
[15×1 double]
[15×1 double]
...
Strings are stored as NaN, except if the string is Inf, in which case it is stored as Inf. If you, for some reason, want strings instead of doubles, then change the argument of setvartype to 'string'.
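A small sketch of the 'string' variant, assuming a made-up two-column file written to a temporary location (the column names T1 and T2 and the values are hypothetical, not taken from the attached data):

```matlab
% Hypothetical two-column file, written to a temporary location for this sketch
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, 'T1,T2\n1.5,INF\n2.5,3.0\n');
fclose(fid);

opts = detectImportOptions(csvFile);
opts = setvartype(opts, 'string');   % keep every field as text instead of double
T = readtable(csvFile, opts);

% Check that every table variable was imported as a string array
isText = varfun(@isstring, T, 'OutputFormat', 'uniform');
delete(csvFile);
```

With every variable imported as text, entries such as "INF" are kept as written instead of being converted to Inf or NaN.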
