Save file sizes not lining up

aboharbf on 18 Dec 2019
Commented: Walter Roberson on 19 Dec 2019
So I am attempting to save a relatively large workspace. Before realizing I could use the '-v7.3' flag, I wrote a script to chop up the struct MATLAB wouldn't save and save it in 5 slices. According to the warning prompt, MATLAB says the size of this struct is over 2 GB. The peculiar part is this:
  • The 'slice and save' method produces 6 files with a total size of about 1.3 GB, including the aforementioned 'over 2 GB' struct (5 of the pieces actually come out to under a GB).
  • When I did use the '-v7.3' flag in save, it took 12 minutes to save and resulted in a file which was 10 GB. I have absolutely no idea where all this data is coming from.
Any insight would be appreciated.


dpb on 18 Dec 2019
Any insight would depend upon knowing precisely what the "slice and save" consisted of and the specifics of the struct.
There's overhead in a struct; the more fields it has, the more bookkeeping is needed to keep track of them. Presumably you saved the pieces as arrays from which to rebuild the fields; that is one way I could see doing it. save also has some overhead of its own, which is undoubtedly somewhat larger for higher-level data types than for simple arrays. That is counteracted by compression, at least for arrays; I'd presume it is also effective for the content of struct variables. So there is no easy answer.
>> x=linspace(0,1,1001); % arbitrary double vector
>> whos x % memory footprint
Name Size Bytes Class Attributes
x 1x1001 8008 double
>> save x x
>> whos -file x % reports the in-memory size, not the size in storage
Name Size Bytes Class Attributes
x 1x1001 8008 double
>> dx=dir('x.mat') % file size < half actual in memory
dx =
struct with fields:
name: 'x.mat'
folder: 'C:\Users\Duane\Documents\MATLAB\Work'
date: '18-Dec-2019 13:46:01'
bytes: 3515
isdir: 0
datenum: 7.377775736226852e+05
>> clear s % let's compare struct to array
>> s.x=x; % same content
>> whos s
Name Size Bytes Class Attributes
s 1x1 8184 struct
>> save s s % as before, reflects memory but 8184-8008 = 176 bytes overhead
>> whos -file s.mat
Name Size Bytes Class Attributes
s 1x1 8184 struct
>> ds=dir('s.mat') % size on disk
ds =
struct with fields:
name: 's.mat'
folder: 'C:\Users\Duane\Documents\MATLAB\Work'
date: '18-Dec-2019 13:38:54'
bytes: 3542
isdir: 0
datenum: 7.377775686805556e+05
>> ds.bytes-dx.bytes % additional overhead of only 27 bytes on disk; compression recaptured most of the 176
ans =
27
>> s.y=rand(size(s.x)); % let's add another field
>> whos s % exactly double in memory
Name Size Bytes Class Attributes
s 1x1 16368 struct
>> save s s % now save new struct w/ two fields...
>> ds2=dir('s.mat')
ds2 =
struct with fields:
name: 's.mat'
folder: 'C:\Users\Duane\Documents\MATLAB\Work'
date: '18-Dec-2019 13:42:18'
bytes: 11132
isdir: 0
datenum: 7.377775710416667e+05
The last result shows a sizable jump (11132 bytes versus 3542) to support the second field in the struct relative to one field. Of course, the random data likely contributes to that as well, since it does not compress as effectively, but I didn't pursue that difference.
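The compression effect mentioned above can be checked in isolation. The following is a minimal sketch, assuming a default installation where '-v7' compression is available; the variable and file names (smooth, noisy, fSmooth, fNoisy) are illustrative and not from the original session.

```matlab
% Sketch: same number of doubles, one regular and one random,
% each saved to its own temporary MAT-file for comparison.
n = 1001;
smooth = linspace(0, 1, n);    % regular values, as in the session above
noisy  = rand(1, n);           % random values, expected to compress poorly

fSmooth = [tempname '.mat'];   % temporary file names (illustrative)
fNoisy  = [tempname '.mat'];
save(fSmooth, 'smooth', '-v7');   % -v7 format applies compression
save(fNoisy,  'noisy',  '-v7');

dSmooth = dir(fSmooth);
dNoisy  = dir(fNoisy);
fprintf('smooth: %d bytes on disk\n', dSmooth.bytes);
fprintf('noisy : %d bytes on disk\n', dNoisy.bytes);

delete(fSmooth);               % clean up the temporary files
delete(fNoisy);
```

Both variables occupy 8008 bytes in memory, so any difference in the reported file sizes comes from how well the content compresses rather than from the container.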
  5 Comments
Walter Roberson on 19 Dec 2019
The version 5 .mat file format is documented, and the version 7 format is an extension of it (possibly a couple more object types, but otherwise compatible). The version 7.3 format is completely different, though.
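One way to see how much the format alone matters is to save the same variable with each version flag and compare the resulting files. This is only a sketch: the struct array s is a made-up stand-in for the original poster's data, and the sizes it prints will differ by release and content. The '-v6', '-v7', and '-v7.3' flags are the documented options of save; '-v7.3' files are HDF5-based.

```matlab
% Sketch: one struct array saved in three MAT-file formats.
% The data here is hypothetical, chosen to have many small elements.
s = struct('a', num2cell(rand(1, 2000)));   % 1x2000 struct, one scalar each

versions = {'-v6', '-v7', '-v7.3'};
sizes = zeros(1, numel(versions));
for k = 1:numel(versions)
    f = [tempname '.mat'];                  % temporary file name
    save(f, 's', versions{k});
    d = dir(f);
    sizes(k) = d.bytes;
    fprintf('%-6s %10d bytes\n', versions{k}, sizes(k));
    delete(f);                              % clean up
end
```

Running the same loop on a slice of the real struct would give a rough per-format size estimate before committing to a 12-minute save.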
Unfortunately, every struct entry needs to be stored separately, because it is permitted for S(1).field to be a different size and datatype from S(2).field.
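The per-entry cost is easy to observe in memory. The sketch below is illustrative only (the names H, S, A, and vals are assumptions): it first shows that mixed types per element are legal, then compares a struct array of scalars with a plain array holding the same numbers.

```matlab
% Each element of a struct array may hold a different type and size.
H(1).field = 42;          % double scalar
H(2).field = 'text';      % char row vector, same field name

% Same 1000 numbers stored two ways.
n = 1000;
vals = rand(1, n);
S = struct('field', num2cell(vals));   % 1xn struct array, one scalar each
A = vals;                              % one contiguous double array

wS = whos('S');
wA = whos('A');
fprintf('struct array: %d bytes\n', wS.bytes);
fprintf('plain array : %d bytes\n', wA.bytes);
```

The plain array needs 8 bytes per value, while the struct array additionally carries a header for every element.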
Structs can be convenient for organizing storage, but they are not the most efficient form of data storage. Arrays are more efficient. You might even consider using tables instead of structs, as long as all of the fields with the same name are scalars of the same datatype: MATLAB stores tables as one cell array with an array for each variable, so there is only one datatype per variable instead of one datatype for each indexed instance of the variable.
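As a rough illustration of the table suggestion, the sketch below converts a struct array of same-typed scalars with struct2table and compares the footprint. The field names id and value are hypothetical, and the on-disk numbers are printed rather than predicted, since they depend on release and content.

```matlab
% Sketch: struct array of same-typed scalars versus the equivalent table.
n = 1000;
S = struct('id',    num2cell((1:n).'), ...
           'value', num2cell(rand(n, 1)));   % n-by-1 struct array
T = struct2table(S);                         % one column per field

wS = whos('S');
wT = whos('T');
fprintf('struct in memory: %d bytes\n', wS.bytes);
fprintf('table  in memory: %d bytes\n', wT.bytes);

fS = [tempname '.mat'];                      % temporary files (illustrative)
fT = [tempname '.mat'];
save(fS, 'S', '-v7');
save(fT, 'T', '-v7');
dS = dir(fS);
dT = dir(fT);
fprintf('struct on disk  : %d bytes\n', dS.bytes);
fprintf('table  on disk  : %d bytes\n', dT.bytes);
delete(fS);
delete(fT);
```

This only works cleanly when every element's field has the same type and is scalar; otherwise struct2table falls back to cell columns and the benefit shrinks.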


More Answers (0)
