Read numeric data with csvread
Hello,
I got a csv-file that looks like this.
* text here
* more text...
1,20,3,4
2,30,4,5
* text again
3,4,6,7
*text
And so it goes on.
How do I read the csv-file and get only the numeric data? Every line that starts with a "*" followed by text should be discarded.
Thank you.
Accepted Answer
dpb
12 Feb 2015
doc textscan % NB: optional 'commentstyle' parameter
8 Comments
Okay. I created TestFile.csv with the data and text as in my question.
Now my code is:
fileID=fopen('TestFile.csv')
N=4
cdata=textscan(fileID,'%f %f %f %f', ...
N,'CollectOutput',1,'CommentStyle','*')
I get:
cdata =
[1x4 double]
I can't figure out how to get the data from each column in "cdata".
Thank you.
dpb
12 Feb 2015
For cases like this where there's no need for a cell array at all, I wrap textscan in cell2mat, as:
cdata=cell2mat(textscan(fileID,'%f %f %f %f', ...
N,'CollectOutput',1,'CommentStyle','*'));
In general you dereference a cell array with the "curlies", as
cdata{:}
for the full contents, or with "nested indexing", as
cdata{1}(r,c)
for a given array element.
See the doc on cell arrays for fuller details.
But the short story here is that there's no need for the cell array, and it's unfortunate there's no way to tell textscan to forgo the needless creation of one when it isn't needed.
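To make the curly-brace vs. parenthesis distinction concrete, here is a minimal sketch. The cell below is a hand-built stand-in for what textscan returns with 'CollectOutput',1 (an assumption for illustration, not output from the actual file); the variable names M, col2, elem and sub are made up for the example.

```matlab
% Hypothetical stand-in for textscan output with 'CollectOutput',1:
% a 1x1 cell holding one numeric matrix.
cdata = {[1 20 3 4; 2 30 4 5; 3 4 6 7]};

M    = cdata{1};        % curly braces return the contents (a 3x4 double)
col2 = cdata{1}(:,2);   % nested indexing: every row of column 2
elem = cdata{1}(2,3);   % nested indexing: the element at row 2, column 3
sub  = cdata(1);        % parentheses return a 1x1 cell, not the matrix
```

With the matrix pulled out via cdata{1} (or via cell2mat as above), each column is just M(:,k).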
Thank you! My cdata looks like below when I use cell2mat:
cdata =
1 NaN NaN NaN
The "1" is from row 1, column 1 of my TestFile.csv. I thought it could be a bad csv-file, but I tried opening other files too and got the same result.
Am I using the wrong formatSpec?
dpb
17 Feb 2015
Dunno...you don't show what you did in context...w/ the sample file copied into a text file here the example worked fine. NaN indicates a conversion of something not recognizable as a number so perhaps there's an embedded hidden character in the file or somesuch???
Daniel
18 Feb 2015
Okay. There should not be any hidden characters in the file. That is confirmed.
This is my script:
---
fileID=fopen('TestFile.csv')
N=4
cdata=cell2mat(textscan(fileID,'%f %f %f %f',N,'CollectOutput',1,'CommentStyle','*'))
---
And this is the result from Matlab:
---
fileID = 8
N = 4
cdata = 1 NaN NaN NaN
---
And you have the exact same thing and it works for you? That is strange.
Thanks anyway!
Ayup...
>> type test.csv
* text here
* more text...
1,20,3,4
2,30,4,5
* text again
3,4,6,7
*text
>> fid=fopen('test.csv');
>> cell2mat(textscan(fid,repmat('%f',1,4),'delimiter',',', ...
'commentstyle','*', ...
'collectoutput',1))
ans =
1 20 3 4
2 30 4 5
3 4 6 7
>>
ADDENDUM
Oh, I see it isn't the exact same thing; you don't need/want the repeat count specifier. That tells it to apply the format string N times, but your file isn't consistent, so it breaks when it finds a non-numeric form. It would possibly work that way if 'commentstyle' were to force the whole file to be processed, the comment lines removed, and then that result processed, but textscan works sequentially, not globally, simply skipping a line beginning with the comment character when it finds one and trying to convert the next line.
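For anyone wanting to reproduce this end to end, here is a sketch of the working call as a self-contained script. It writes the sample data from the question to a temporary file and reads it back without a repeat count. The names fname, txt and data, the temporary file, and the fclose/delete cleanup are additions for illustration; note that, like the session above and unlike the original script, the call also passes 'Delimiter' as ','.

```matlab
% Write the sample from the question to a temporary file (hypothetical name).
fname = [tempname '.csv'];
txt = {'* text here', '* more text...', '1,20,3,4', '2,30,4,5', ...
       '* text again', '3,4,6,7', '*text'};
fid = fopen(fname, 'w');
for k = 1:numel(txt)
    fprintf(fid, '%s\n', txt{k});
end
fclose(fid);

% Read it back: four numeric fields per line, comma-delimited,
% lines starting with '*' treated as comments, no repeat count.
fid = fopen(fname, 'r');
data = cell2mat(textscan(fid, repmat('%f', 1, 4), ...
                         'Delimiter', ',', ...
                         'CommentStyle', '*', ...
                         'CollectOutput', 1));
fclose(fid);
delete(fname);   % clean up the temporary file
```

Going by the session shown above, data should come back as the 3-by-4 matrix of the numeric rows.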
Thank you for your help! It works fine now. So if I had five columns instead of four I would write "1,5". Now I get how it works.
dpb
20 Feb 2015
Ayup; it's the silly way C implemented its format strings, ignoring the long-existing pattern used in Fortran wherein there can be a repeat specifier. Just to show they were smarter, the implementers reversed the order of the width field and the conversion type, so there's now no way to write a repeat count unambiguously. In Fortran FORMAT it would be 4F8.0; in Matlab, which uses the C i/o libraries, one has to use repmat to double up or write them all out explicitly. On the newsgroup I am working with a guy at this instant with a 159-column file...writing %f 159 separate times is rather painful, as his initial plea noted, until one either has the "a-ha!" moment one's self or somebody shows you the trick. (S Lord pointed it out to me years ago; I had never thought of repmat for strings for the purpose despite complaining for years.) At one time I wrote a mex file that accepted Fortran FORMAT strings, used the Fortran i/o, and passed the values back. Unfortunately I lost the source in the retirement move and haven't had the gumption to re-invent it since.
OK, enough geezer stories/griping... :)
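A small sketch of the repmat trick for building the format string, using the 159-column count mentioned above plus a five-column case. The variable names and the one-line input string are made up for the example; textscan is given a character vector here instead of a file ID just to keep it self-contained.

```matlab
% Build '%f%f...%f' with one specifier per column instead of typing it out.
nCols = 159;                        % column count from the case above
fmt   = repmat('%f', 1, nCols);     % 159 copies of '%f' in one char vector

% Same idea for a five-column line, parsed from a character vector.
fmt5 = repmat('%f', 1, 5);
row  = cell2mat(textscan('1,2,3,4,5', fmt5, ...
                         'Delimiter', ',', 'CollectOutput', 1));
```

The "1,5" in repmat is just the tiling size: one row, five copies of the string.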