How to efficiently integrate big data without using memory / (How to create big data)
- In a study I will produce large arrays.
- Each array will be at least 500 MB in size.
- Each array will have the same number of rows.
- The total size of the dataset will be approximately 20 GB or more.
- Somehow I have to create a single variable/array that holds all of the data, about 20 GB in size.
matfile seems like a good solution. However, as the file grows, access gets slower. How can I handle this problem?
9 comments
Walter Roberson
18 Aug 2015
I wonder if compression is leading to slowdowns? I do not know whether -v7.3 with matfile uses compression; see discussion http://www.mathworks.com/matlabcentral/answers/15521-matlab-function-save-and-v7-3 and http://www.mathworks.com/matlabcentral/answers/137592-compress-only-selected-variables-when-saving-to-mat
Accepted Answer
JMP Phillips
19 Aug 2015
Edited: Walter Roberson
19 Aug 2015
Here are some things you could try:
Use the matfile function, which allows you to access and change variables directly in MAT-files, without loading them into memory: http://au.mathworks.com/help/matlab/large-mat-files.html http://au.mathworks.com/help/matlab/ref/matfile.html
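As a rough illustration of the matfile approach for the layout in the question (blocks with the same number of rows, combined into one variable), here is a minimal sketch. The sizes, the file name bigdata_sketch.mat and the variable name data are made up for the example, and rand stands in for however each array is actually produced. It also assumes that writing the last element first is enough to allocate the whole variable on disk, so the file does not have to grow on every write.

```matlab
% Hypothetical sizes, kept small so the sketch runs quickly.
nRows   = 1000;   % every block has the same number of rows
nCols   = 50;     % columns per block (assumed)
nBlocks = 4;      % number of blocks to combine (assumed)

fname = fullfile(tempdir, 'bigdata_sketch.mat');   % hypothetical file name
if exist(fname, 'file'), delete(fname); end

% A new file created through matfile uses the -v7.3 format.
m = matfile(fname, 'Writable', true);

% Allocate the full variable on disk by assigning its last element.
m.data(nRows, nCols*nBlocks) = 0;

for k = 1:nBlocks
    block = rand(nRows, nCols);     % stand-in for one produced array
    c1 = (k-1)*nCols + 1;           % first column of this block
    c2 = k*nCols;                   % last column of this block
    m.data(:, c1:c2) = block;       % write one contiguous column range
end

% Read back only part of the variable; the rest stays on disk.
firstBlock = m.data(:, 1:nCols);
sz = size(m, 'data');               % size query without loading the data
```

Only one block is held in memory at a time. Writing contiguous ranges of whole columns should suit MATLAB's column-major storage, but whether that removes the slowdown described in the question would have to be measured.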
Structure your data differently: if you are representing the data as doubles, maybe you can afford less precision and use a smaller type such as int32. For example, with a scale factor of 1e4 you can represent a double value such as 100.3425 as the integer 1003425.
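The scaling idea can be sketched like this. The scale factor 1e4 and the value 100.3425 come from the example above; the other values and the variable names are made up. Keep in mind that int32 saturates instead of wrapping, so anything whose scaled magnitude exceeds intmax('int32') would be clipped.

```matlab
scale = 1e4;                          % keeps 4 decimal places (assumed to be enough)
x     = [100.3425; -2.5; 0.0001];     % example double values

xi    = int32(round(x * scale));      % 4 bytes per element instead of 8
xBack = double(xi) / scale;           % convert back when the values are needed

maxErr = max(abs(xBack - x));         % rounding error, bounded by 0.5/scale for in-range values
limit  = double(intmax('int32')) / scale;   % largest magnitude that fits after scaling

fprintf('stored %d, max error %.3g, usable range +/- %.4f\n', xi(1), maxErr, limit);
```

This halves the storage compared with double, at the cost of a fixed resolution of 1/scale and a limited value range.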
With MATLAB:
- Use a 64-bit version of MATLAB.
- Try disabling compression when saving the files, with the -v6 option.
Optimize your PC for your task:
- In Task Manager, close any unnecessary processes running at the same time, including taskbar junk (Adobe update, Java update, etc.).
- Disable your antivirus, which might be scanning the file and slowing things down.
- In Task Manager, give higher priority to the MATLAB process (see http://www.sevenforums.com/tutorials/83361-priority-level-set-applications-processes.html).
- Increase your virtual memory or page file size: http://windows.microsoft.com/en-au/windows/change-virtual-memory-size#1TC=windows-7
- Defragment your hard drive.
- Run MATLAB from your local hard drive, not from a network drive or external hard drive.
- Save the .mat file to a local hard drive with plenty of free space, not to a network drive or external hard drive.
- For faster disk access, use a solid-state drive (SSD).
2 comments
Walter Roberson
19 Aug 2015
The -v6 option is incompatible with matfile and with objects over 2 GB.
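To make that constraint concrete: partial reads and writes through matfile need the -v7.3 format, so a sketch of saving without compression while staying compatible with matfile could look like the following. It assumes a MATLAB release in which save accepts the -nocompression flag (R2017a or later, as far as I know); the file name and the variable A are made up.

```matlab
A = rand(1000, 200);                                 % stand-in for real data
fname = fullfile(tempdir, 'nocompress_sketch.mat');  % hypothetical file name

% -v7.3 keeps the file usable with matfile and allows variables over 2 GB;
% -nocompression (assumed to be available in this release) skips compression.
save(fname, 'A', '-v7.3', '-nocompression');

m    = matfile(fname);        % open the existing file for partial access
part = m.A(1:10, 1:5);        % reads only this block from disk
```

On releases without that flag, -v7.3 alone still works with matfile, but the data is written compressed.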
More Answers (0)