Why does fread of a 2 GB file need more than 8 GB of RAM?

Question

Gabriel, 4 June 2013

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/77960-why-fread-a-2-gb-file-needs-more-than-8-gb-of-ram

textscan is too slow.

Thus, I want to load a 2 GB file in RAM with fread (fast), then scan it.

fread works well with small files, but if I try fread(fid, '*char') on a 2 GB file, RAM usage spikes above my 8 GB limit for some reason and I get an out-of-memory error.

Ideas?

2 comments

Jan, 4 June 2013

Please post the full code, because there might be unexpected problems.

Gabriel, 4 June 2013

Well, the code is simple:

fid = fopen(filename, 'r');
test = fread(fid, '*char');   % reads the whole file into a char column vector
fclose(fid);

Answer 1

Jan, 4 June 2013

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/77960-why-fread-a-2-gb-file-needs-more-than-8-gb-of-ram#answer_87665

Reading a 2 GB file into a CHAR array requires 4 GB of RAM, because MATLAB uses 2-byte chars. Depending on how you store the data, the contents of a temporary array may then be copied, in which case 8 GB is the expected memory consumption. I would actually expect that this copy can be avoided, so it would help if you showed us the code fragment.
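
The 2-bytes-per-char point can be checked on a small file. The following is a minimal sketch, not the asker's code: the temporary file and the variable names are made up for illustration. It reads the same bytes once as char and once as uint8 and compares the in-memory sizes. Reading as uint8 halves the footprint of the result, but it does not by itself explain a spike beyond 8 GB.

```matlab
% Small ASCII test file (hypothetical stand-in for the 2 GB file).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, uint8(repmat('0123456789', 1, 1000)), 'uint8');
fclose(fid);

fid = fopen(fname, 'r');
asChar = fread(fid, Inf, '*char');    % char array: 2 bytes per element
frewind(fid);
asByte = fread(fid, Inf, '*uint8');   % uint8 array: 1 byte per element
fclose(fid);
delete(fname);

% Compare the memory taken by the two results.
infoChar = whos('asChar');
infoByte = whos('asByte');
fprintf('char: %d bytes, uint8: %d bytes\n', infoChar.bytes, infoByte.bytes);
```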

2 comments

Gabriel, 4 June 2013

Precisely. I expect it to require 4 GB, yet when I watch the system monitor, the whole thing goes over 8 GB and into swap.

I also understand the part about data being copied when passed into functions, etc. But shouldn't FREAD be able to load a 2 GB file into a 4 GB char array without needing more than 8 GB of RAM?

Jan, 4 June 2013

Edited: Jan, 4 June 2013

I've seen equivalent behavior in another FREAD implementation (not in MATLAB): the required final size was not determined by FSEEK; instead, the file was read in chunks until the buffer was full, and then the buffer was re-allocated at double the size. After the obvious drawbacks had been pointed out in a discussion, the author decided to replace the doubling method with a smarter Fibonacci sequence. :-)
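
Jan's comment contrasts growing a buffer with determining the final size up front. A minimal sketch of the second approach is shown here, using fseek and ftell to get the file size and then asking fread for exactly that many elements. The test file and variable names are assumptions for illustration, and nothing in this thread establishes whether MATLAB's own fread grows a buffer internally, so this may or may not reduce peak memory.

```matlab
% Small binary test file (hypothetical stand-in for the large file).
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fwrite(fid, uint8(1:200), 'uint8');
fclose(fid);

fid = fopen(fname, 'r');
fseek(fid, 0, 'eof');      % jump to the end of the file
nBytes = ftell(fid);       % current position equals the file size in bytes
frewind(fid);              % back to the start before reading

% Request exactly nBytes elements, stored as a uint8 row vector.
raw = fread(fid, [1, nBytes], '*uint8');
fclose(fid);
delete(fname);
```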

Answer 2

Iain, 4 June 2013

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/77960-why-fread-a-2-gb-file-needs-more-than-8-gb-of-ram#answer_87670

As Jan implied, passing around variables often leads to memory duplication - 2GB arrays get COPIED when put into functions.

The "Out of memory" error normally comes up when MATLAB cannot find a single chunk of RAM big enough for a variable.

Use much smaller chunks of memory, and read the file in and parse it in chunks of, say, 64MB.
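
A rough sketch of chunked reading and parsing follows, under some assumptions: the file holds one number per line, the chunk size is tiny so that the example runs on a generated test file (64 * 2^20 would correspond to the 64 MB suggested above), and all names are made up for illustration. Because a chunk can end in the middle of a line, the partial line is carried over into the next chunk.

```matlab
% Small test file standing in for the large input (hypothetical data).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%d\n', 1:1000);
fclose(fid);

chunkBytes = 256;     % e.g. 64 * 2^20 for a real file
leftover = '';        % partial line carried into the next chunk
total = 0;
count = 0;

fid = fopen(fname, 'r');
while true
    raw = fread(fid, [1, chunkBytes], 'uint8=>char');
    if isempty(raw)
        break;                          % end of file reached
    end
    buf = [leftover, raw];
    lastNl = find(buf == char(10), 1, 'last');
    if isempty(lastNl)
        leftover = buf;                 % no complete line yet
        continue;
    end
    leftover = buf(lastNl+1:end);       % keep the incomplete tail
    vals = sscanf(buf(1:lastNl), '%f'); % parse only complete lines
    total = total + sum(vals);
    count = count + numel(vals);
end
fclose(fid);

if ~isempty(leftover)                   % last line without a trailing newline
    vals = sscanf(leftover, '%f');
    total = total + sum(vals);
    count = count + numel(vals);
end
delete(fname);
```

Only one chunk plus a short leftover is held in memory at a time, so the peak footprint depends on chunkBytes and not on the file size.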

2 comments

Walter Roberson, 4 June 2013

The arrays will only get copied if they are modified; otherwise the data pointer will point to the original storage.

Gabriel, 4 June 2013

I think I did not express myself well; I apologize. Parsing is not the issue. I fully expect scanning functions to be (relatively) memory-hungry.

What I don't quite get is why FREAD needs so much overhead to load a 2 GB+ file into the workspace.

Answer 3

Gabriel, 4 June 2013

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/77960-why-fread-a-2-gb-file-needs-more-than-8-gb-of-ram#answer_87689

Edited: Gabriel, 4 June 2013

In any case, I have found a workaround for textscanning large ASCII files (4 GB and beyond) that contain numbers.

The trick is to pad the numbers with Perl or sed before reading them into MATLAB. If you pad your numbers with leading 0s, every line has the same number of chars, so FREAD is easy to execute in chunks.

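Example:

fid = fopen(filename, 'r');
while ~feof(fid)
    % X lines of fixed width: read bytesPerLine * X bytes (placeholder names)
    tmp = fread(fid, [1, bytesPerLine * X], '*char');
    data = textscan(tmp, '%f');   % adjust the format to the file's columns
    process(data);                % user-supplied processing step
end
fclose(fid);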

With this trick, I went from 3 MB/sec to 130 MB/sec for processing a file.
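
For a self-contained illustration of the fixed-width idea, here is a minimal sketch under stated assumptions: the padding is done with fprintf inside MATLAB instead of Perl or sed, the file contains one 8-digit zero-padded integer per line, and the chunk size and variable names are hypothetical. Since every line is exactly 9 bytes, each chunk read by fread ends on a line boundary and can be handed to textscan directly.

```matlab
% Build a small zero-padded test file (stand-in for the Perl/sed step).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%08d\n', 1:1000);   % 8 digits + newline = 9 bytes per line
fclose(fid);

bytesPerLine  = 9;      % fixed width, known because of the padding
linesPerChunk = 128;    % hypothetical chunk size; tune for real files
total = 0;

fid = fopen(fname, 'r');
while true
    tmp = fread(fid, [1, bytesPerLine * linesPerChunk], 'uint8=>char');
    if isempty(tmp)
        break;                      % nothing left to read
    end
    data = textscan(tmp, '%f');     % parse the chunk from memory
    total = total + sum(data{1});   % placeholder for the real processing
end
fclose(fid);
delete(fname);
```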
