How to import data from a large CSV file

Hi all, I have a file called DJ.csv which has 5 columns: 1) dates (01/02/2007), 2) times (30.42.0), 3) prices (12553, 12442), 4) codes (DJ123) and 5) trade size.
I want to read columns 3 and 5 (price and trade size) into MATLAB. I am having some trouble because the CSV is quite big.
I tried this:
fileID = fopen('K:\test\test\DJ.csv');
A = fread(fileID,'double');
fclose(fileID);
But it only gives me a vector of values which are not the same as my data. Any help would be very much appreciated.
Thanks.

1 Comment

Mate 2u on 24 Dec 2013
As a note, importdata works, but it is not suitable for very large files.


Accepted Answer

dpb on 25 Dec 2013
Edited: dpb on 26 Dec 2013

fread is for unformatted (binary) stream files; you have a formatted, delimited text file, so see
doc textscan % and friends
If you really only want/need the two columns, something like this (air code, untested)
[p,s]=textread('K:\test\test\DJ.csv','%*s%*s,%f%*f%f','delimiter',',');
ought to do, unless the third column really does use a comma as the decimal point in a file that is also comma-delimited. In that case you've got a problem: you'll have to read three values instead of just two, preprocess the file, or otherwise handle the decimal separator, because MATLAB can't (and you can't expect it to) know the difference between a comma used as a delimiter and a comma used as a decimal point.
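To illustrate the first point of this answer, here is a minimal sketch of why fread with 'double' returns numbers that look nothing like the file contents: it takes the raw character bytes eight at a time and reinterprets each group as one double. The temporary file name is made up for the demonstration, and the sample line is borrowed from the data posted in this thread.

```matlab
% Sketch only: write one sample line to a throwaway file (hypothetical name).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '01/02/2007,00:15:00.000,12540,DJH07,1\n');
fclose(fid);

% What the original attempt did: treat the text bytes as binary doubles.
fid = fopen(fname, 'r');
A = fread(fid, 'double');                % each value is built from 8 raw bytes
fclose(fid);

% Read just the first 8 bytes to see what the first double was built from.
fid = fopen(fname, 'r');
bytes = fread(fid, 8, 'uint8=>uint8');   % first 8 characters as raw bytes
fclose(fid);
delete(fname);

disp(char(bytes'));                      % the characters behind those bytes
disp(typecast(bytes', 'double'));        % the same bytes viewed as one double
disp(A(1));                              % matches the typecast result
```

The value printed for A(1) is just the characters 01/02/20 reinterpreted as a floating-point number, which is why a text-oriented reader such as textscan is needed here.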

7 Comments

Mate 2u on 2 Jan 2014
Edited: Mate 2u on 2 Jan 2014
Hi, unfortunately this does not work.
My data is in this form:
01/02/2007 21:58.0 12541 DJH07 1
01/02/2007 22:50.0 12541 DJH07 1
01/02/2007 30:42.0 12545 DJH07 1
01/02/2007 11:31.0 12553 DJH07 2
01/02/2007 51:48.0 12554 DJH07 2
01/02/2007 13:30.0 12554 DJH07 1
01/02/2007 16:14.0 12554 DJH07 3
Could somebody help me please?
Walter Roberson on 2 Jan 2014
fid = fopen('K:\test\test\DJ.csv', 'r');
datacell = textscan(fid,'%*s%*s,%f%*s%f','delimiter',',');
fclose(fid);
prices = datacell{1};
tradesize = datacell{2};
Mate 2u on 2 Jan 2014
Hi there,
Still not working.
When I run the above, I get fid=3 and datacell as a [1x2 cell] whose contents are empty, so prices and tradesize are empty too.
Note that the data above was pasted from Excel. Thanks for all your help.
dpb on 2 Jan 2014
The question is what the actual file looks like. Is there perhaps a header row ahead of the data, so that you also need 'headerlines',1 as an argument pair in the textscan call?
At least fid=3 indicates that the file did open successfully.
Remember when you're testing to always either
frewind(fid)
or
fid= fclose(fid);
and then reopen between attempts; otherwise you'll leave the file pointer somewhere other than the beginning, which is bound to cause confusion at best.
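A small sketch of the file-pointer issue described here, assuming a throwaway two-line file (the file name is invented for the demonstration; the rows come from this thread). The second textscan call starts where the first one stopped, at the end of the file, so it should come back empty until frewind resets the pointer.

```matlab
% Sketch only: build a tiny sample file (hypothetical name).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '01/02/2007,00:15:00.000,12540,DJH07,1\n');
fprintf(fid, '01/02/2007,00:21:58.000,12541,DJH07,1\n');
fclose(fid);

fmt = '%s%s%f%s%f';                              % keep all five columns
fid = fopen(fname, 'r');
first  = textscan(fid, fmt, 'Delimiter', ',');   % reads through to end of file
second = textscan(fid, fmt, 'Delimiter', ',');   % pointer is at EOF, nothing left
frewind(fid);                                    % move pointer back to the start
third  = textscan(fid, fmt, 'Delimiter', ',');   % reads the data again
fclose(fid);
delete(fname);

fprintf('rows read: %d, %d, %d\n', ...
    numel(first{3}), numel(second{3}), numel(third{3}));
```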
Mate 2u on 2 Jan 2014
Edited: Walter Roberson on 3 Jan 2014
Hi there, there is no header row, just data. The data as opened in Notepad is the following:
01/02/2007,00:15:00.000,12540,DJH07,1
01/02/2007,00:21:58.000,12541,DJH07,1
01/02/2007,00:22:50.000,12541,DJH07,1
01/02/2007,00:30:42.000,12545,DJH07,1
01/02/2007,01:11:31.000,12553,DJH07,2
01/02/2007,01:51:48.000,12554,DJH07,2
01/02/2007,02:13:30.000,12554,DJH07,1
01/02/2007,02:16:14.000,12554,DJH07,3
01/02/2007,02:21:40.000,12554,DJH07,1
01/02/2007,02:26:48.000,12558,DJH07,1
01/02/2007,02:50:44.000,12555,DJH07,1
01/02/2007,03:14:57.000,12557,DJH07,1
01/02/2007,03:22:41.000,12559,DJH07,1
Each data entry is on its own line in the Notepad file. Thanks so much for your help.
Walter Roberson on 3 Jan 2014
datacell = textscan(fid,'%*s%*s%f%*s%f','delimiter',',');
The previous version had a stray comma in the format.
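For reference, a self-contained sketch of the corrected call, run against a few of the rows posted above. The temporary file stands in for DJ.csv and its name is invented for the demonstration; everything else follows the format string given in this comment.

```matlab
% Sketch only: write three sample rows to a throwaway file (hypothetical name).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', ...
    '01/02/2007,00:15:00.000,12540,DJH07,1', ...
    '01/02/2007,00:21:58.000,12541,DJH07,1', ...
    '01/02/2007,01:11:31.000,12553,DJH07,2');
fclose(fid);

% %*s skips a text field, %f reads a number: skip date, time and code,
% keep price (column 3) and trade size (column 5).
fid = fopen(fname, 'r');
datacell = textscan(fid, '%*s%*s%f%*s%f', 'Delimiter', ',');
fclose(fid);
delete(fname);

prices    = datacell{1};
tradesize = datacell{2};
disp([prices tradesize]);
```

If the whole file is too large to read in one call, textscan also accepts a repeat count as a third argument, textscan(fid, fmt, N), which applies the format N times; calling it in a loop until feof(fid) is true is one way to process the file in blocks.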
Mate 2u on 5 Jan 2014
Thank you, worked well.


Asked: 24 Dec 2013
Commented: 5 Jan 2014
