Hello, I have a CSV file that looks like this:

* text here
* more text...
1,20,3,4
2,30,4,5
* text again
3,4,6,7
*text

And so it goes on. How do I read the CSV file and get only the numeric data? Everything that has a "*" followed by text should be discarded. Thank you.

doc textscan % NB: optional 'commentstyle' parameter

Read numeric data with csvread

Daniel on 12 Feb 2015

Edited: dpb on 12 Feb 2015

Okay. I created TestFile.csv with the data and text as in my question.

Now my code is:

fileID=fopen('TestFile.csv')
N=4
cdata=textscan(fileID,'%f %f %f %f', ...
               N,'CollectOutput',1,'CommentStyle','*')
I get:
cdata = 
      [1x4 double]

I can't figure out how to get the data from each column in "cdata".

Thank you.

dpb on 12 Feb 2015

For these cases where there's no need for a cell array at all I wrap textscan in cell2mat as--

cdata=cell2mat(textscan(fileID,'%f %f %f %f', ...
             N,'CollectOutput',1,'CommentStyle','*'));

In general you dereference a cell array with the "curlies" as

cdata{:}

for the full array or "nested indexing" of

cdata{1}(r,c)

for a given array element.

See the doc on cell arrays for the fuller details.

But the short story here is that there's no need for the cell array, and it's unfortunate there's not a way to tell textscan to forego the needless creation of one when it isn't needed.
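A small illustration of the two indexing forms described above. The cell `c` is made up for this sketch and only mimics the shape textscan returns with 'CollectOutput' set; the other variable names are likewise illustrative.

```matlab
% Hypothetical 1x1 cell holding a 3x4 double, shaped like the
% textscan result when 'CollectOutput' is on.
c = {[1 20 3 4; 2 30 4 5; 3 4 6 7]};

sub  = c(1);        % parentheses: still a 1x1 cell
m    = c{1};        % curly braces: the 3x4 double itself
x    = c{1}(2,3);   % nested indexing: row 2, column 3 of the matrix
col2 = c{1}(:,2);   % every row of column 2
```

Here `iscell(sub)` is true while `m` is an ordinary numeric matrix, which is why the curly braces are needed before any row/column subscripts.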

Daniel on 17 Feb 2015

Edited: Daniel on 17 Feb 2015

Thank you! My cdata looks like below when I use cell2mat:

cdata =

1 NaN NaN NaN

The "1" is from row 1, column 1 of my TestFile.csv. I thought it could be a bad CSV file, but I tried opening other files too and got the same result.

Am I using the wrong formatSpec?

dpb on 17 Feb 2015

Dunno...you don't show what you did in context...w/ the sample file copied into a text file here the example worked fine. NaN indicates a conversion of something not recognizable as a number so perhaps there's an embedded hidden character in the file or somesuch???
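One way to check for hidden characters is to look at the raw bytes of the file. The sketch below is only an illustration: it writes a throwaway file that deliberately starts with a UTF-8 byte-order mark (an assumption made for the demo, not something known about the file in this thread) and then lists every byte outside printable ASCII, tab, CR and LF.

```matlab
% Build a demo file with a UTF-8 BOM in front of one data line
% (hypothetical content, chosen only to have something to detect).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fwrite(fid, [239 187 191 double(sprintf('1,20,3,4\n'))], 'uint8');
fclose(fid);

% Read the file back as raw bytes
fid = fopen(fname, 'r');
bytes = fread(fid, Inf, 'uint8=>double')';
fclose(fid);
delete(fname);

% Flag anything that is not printable ASCII, tab, LF or CR
isOdd  = bytes > 126 | (bytes < 32 & ~ismember(bytes, [9 10 13]));
oddPos = find(isOdd);    % byte positions of suspicious characters
oddVal = bytes(isOdd);   % their numeric values
```

For a real file, replace the demo part with the actual file name; an empty `oddPos` suggests the file holds nothing but plain text.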

Daniel on 18 Feb 2015

Okay. There should not be any hidden characters in the file. That is confirmed.

This is my script:

---

fileID=fopen('TestFile.csv')

N=4

cdata=cell2mat(textscan(fileID,'%f %f %f %f',N,'CollectOutput',1,'CommentStyle','*'))

---

And this is the result from Matlab:

---

fileID = 8

N = 4

cdata = 1 NaN NaN NaN

---

And you have the exact same thing and it works for you? That is strange.

Thanks anyway!

dpb on 18 Feb 2015

Edited: dpb on 18 Feb 2015

Ayup...

>> type test.csv
* text here
* more text...
1,20,3,4
2,30,4,5
* text again
3,4,6,7
*text
>> fid=fopen('test.csv');
>> cell2mat(textscan(fid,repmat('%f',1,4),'delimiter',',',    ...
                                          'commentstyle','*', ...
                                          'collectoutput',1))
ans =
   1    20     3     4
   2    30     4     5
   3     4     6     7
>>

ADDENDUM

Oh, I see it isn't the exact same thing; you don't need or want the repeat count specifier. That tells textscan to apply the format string N times, but your file isn't consistent, so it breaks when it finds a non-numeric line. It would possibly work that way if 'commentstyle' forced the whole file to be processed first, with the comment lines removed and the result then parsed, but textscan works sequentially, not globally: it simply skips a line beginning with the comment character when it finds one and tries to convert the next line.
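For anyone who wants to reproduce the working call as a script, here is a minimal self-contained sketch. It writes the sample data from the question to a temporary file (the file and variable names are arbitrary) and reads it without a repeat count. Note that it also passes 'Delimiter', ',' as in the transcript above, which the earlier script did not; that difference may well matter too.

```matlab
% Write the sample file from the question to a temporary location
sampleLines = {'* text here', '* more text...', '1,20,3,4', ...
               '2,30,4,5', '* text again', '3,4,6,7', '*text'};
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sampleLines{:});
fclose(fid);

% Read it back: one %f per column, comma delimiter,
% lines starting with '*' treated as comments, no repeat count.
nCols = 4;
fid = fopen(fname, 'r');
c = textscan(fid, repmat('%f', 1, nCols), ...
             'Delimiter', ',', ...
             'CommentStyle', '*', ...
             'CollectOutput', true);
fclose(fid);
delete(fname);

data = c{1};   % numeric matrix, one row per data line
```

Taking `c{1}` does the same job as wrapping the call in cell2mat here, since 'CollectOutput' leaves a single cell.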

Daniel on 20 Feb 2015

Edited: Daniel on 20 Feb 2015

Thank you for your help! It works fine now. So if I had five columns instead of four, I would write "1,5" in the repmat call. Now I get how it works.

dpb on 20 Feb 2015

Ayup; it's the silly way C implemented its format strings, ignoring the long-existing pattern used in Fortran wherein there can be a repeat specifier. Just to show they were smarter, the implementers reversed the order of the width field and the conversion type, so there's now no way to write a repeat count unambiguously. In Fortran FORMAT it would be 4F8.0; in MATLAB, which uses the C i/o libraries, one has to use repmat to double up or write them all explicitly. On the newsgroup I'm working with a guy at this instant with a 159-column file...writing %f 159 separate times is rather painful, as his initial plea noted, until one either has the "a-ha!" moment one's self or somebody shows you the trick (S Lord pointed it out to me years ago; I had never thought of repmat for strings for the purpose despite complaining for years). At one time I wrote a mex file that accepted Fortran FORMAT strings, used the Fortran i/o, and passed the values back. Unfortunately I lost the source in the retirement move and haven't had the gumption to re-invent it since.
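The repmat trick in isolation, as a short sketch (variable names are illustrative only):

```matlab
% Five columns: repeat the '%f' conversion five times
nCols = 5;
fmt = repmat('%f', 1, nCols);        % gives '%f%f%f%f%f'

% Same idea for a wide file, e.g. the 159-column case
wideFmt = repmat('%f', 1, 159);
nConversions = numel(strfind(wideFmt, '%f'));   % count the conversions
```

The resulting character vector can be passed to textscan as the format argument in place of a hand-typed string.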

OK, enough geezer stories/griping... :)
