Big data processing, Datastore function

Vincent Thevenot, 31 May 2015
Answered: Aaditya Kalsi, 1 June 2015
Hi,
I have to deal with big files containing three tab-separated columns. The files contain several million rows, so I run out of memory when loading them. I tried the "datastore" function, and it works very well, but MATLAB returns an error when the file contains more than 594000 rows.
Here is the message:
Error using matlab.io.datastore.TabularTextDatastore/read (line 41)
The data in Files does not appear to be tabular, with the same number of fields in each row and in each column. Verify the Text Format and Advanced Text Format Properties.
Error in test_datastore (line 17)
s=read(ds);
It seems to be a problem with the format, but I tried different parts of the file, and MATLAB always returns this message when there are more than 594000 rows.
Here is my code (very simple, just to test the function):
ds=datastore('essai_RM7_1_test_3.txt','ReadVariableNames',0,'TextscanFormats',{'%q','%f','%f'},'RowDelimiter',' ');
ds.RowsPerRead = 100000;
count = 0;
while hasdata(ds)
s=read(ds);
count = count + 1
end
count
Here are some rows of the file:
24/04/2015 09:58:06.220351 -1.143072E-2 1.277841E-1
24/04/2015 09:58:06.220957 2.736964E-3 9.289337E-2
24/04/2015 09:58:06.221562 -7.244674E-3 3.169246E-2
24/04/2015 09:58:06.222167 2.487282E-2 -6.050338E-2
24/04/2015 09:58:06.222773 1.344811E-1 -1.312878E-1
24/04/2015 09:58:06.223378 7.464026E-2 -1.944335E-1
24/04/2015 09:58:06.223984 -6.966816E-2 -2.088179E-1
24/04/2015 09:58:06.224589 -5.196927E-2 -1.842140E-1
24/04/2015 09:58:06.225195 6.998909E-2 -1.819939E-1
So, has anybody encountered this kind of problem? Is there a different way to deal with such a big file? I have to perform various calculations (FFT, RMS, …)
Thanks in advance for your help

Answers (1)

Aaditya Kalsi, 1 June 2015
It seems like there is an issue with the data within the file at around row 594000. You could try:
while hasdata(ds)
[s, info]=read(ds);
disp(info); % DISPLAY CURRENT STATE
count = count + 1
end
This will tell you where the last successful read was.
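If the loop stops with an error before you can inspect the output, one option is to wrap the read in try/catch and keep the info from the last chunk that succeeded. This is only a minimal sketch, assuming ds is the datastore created in the question and that info carries the Offset and NumCharactersRead fields documented for TabularTextDatastore; lastInfo and rowsRead are names introduced here for illustration.

```matlab
reset(ds);                 % start again from the beginning of the file
lastInfo = [];             % info from the last chunk that was read successfully
rowsRead = 0;              % number of rows read successfully so far
while hasdata(ds)
    try
        [s, info] = read(ds);
    catch err
        % Report how far we got before the failing chunk
        fprintf('Read failed after %d rows: %s\n', rowsRead, err.message);
        break
    end
    lastInfo = info;
    rowsRead = rowsRead + height(s);
end
if ~isempty(lastInfo)
    % Approximate position just past the last good chunk
    % (exact only if one character is stored as one byte)
    fprintf('Last good chunk ended near offset %d\n', ...
        lastInfo.Offset + lastInfo.NumCharactersRead);
end
```

The reported row count narrows the problem to the chunk that follows it, so reducing ds.RowsPerRead and rerunning should narrow it further.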
I have a suspicion that the second file is different from the first and that is the error you are seeing.
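To check for a malformed row directly, the file can also be scanned line by line without loading it into memory. The following is a sketch under assumptions not confirmed in this thread: that the columns are separated by tabs and that every valid row has exactly three fields. The variables delim, expectedFields, lineNo and txt are illustrative names, so adjust delim if your file uses a different separator.

```matlab
fname = 'essai_RM7_1_test_3.txt';   % file name taken from the question
delim = sprintf('\t');              % ASSUMPTION: tab-separated columns
expectedFields = 3;                 % ASSUMPTION: three fields per valid row

fid = fopen(fname, 'r');
if fid == -1
    error('Could not open %s', fname);
end

lineNo = 0;
txt = fgetl(fid);                   % fgetl returns -1 (not char) at end of file
while ischar(txt)
    lineNo = lineNo + 1;
    % Number of fields = number of delimiters + 1
    nFields = numel(strfind(txt, delim)) + 1;
    if nFields ~= expectedFields
        fprintf('Line %d has %d fields: %s\n', lineNo, nFields, txt);
    end
    txt = fgetl(fid);
end
fclose(fid);
```

Any line printed by this loop (a blank line, a truncated row, or a stray header, for example) would be a candidate for the row that breaks the tabular read. Expect this to be slow on files with several million rows, since it reads one line at a time.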
