How can I efficiently save and access large arrays generated in nested loops?

3 views (last 30 days)
Luqman Saleem on 22 Aug 2024
Answered: Walter Roberson on 22 Aug 2024
I need to run nested for-loops over the variables J1 and J2. The range for J1 is 1 to 41, and the range for J2 is 1 to 9. Inside these loops, I evaluate 16 functions, each of which returns an array of complex numbers with a size of 500 by 502.
I used the following given method to save the data, and it produced an 11 GB file, which seems very large. Is this normal? What is an efficient way to save this data at the end of the calculation?
What I want to do with this data afterward:
I will need to access the 16 arrays, A1 to A16, within the same J1 and J2 loop to perform other operations. Therefore, I want to store the data in a way that allows easy access to these 16 arrays within the loops.
My method to store data:
all_data = cell(41,9);
for J1 = 1:41
    for J2 = 1:9
        % evaluate 16 functions to get 16 arrays (A1 to A16) of size 500 x 502:
        all_data{J1,J2} = struct("A1", A1, ...
            "A2", A2, ...
            "A3", A3, ...
            "A4", A4, ...
            "A5", A5, ...
            "A6", A6, ...
            "A7", A7, ...
            "A8", A8, ...
            "A9", A9, ...
            "A10", A10, ...
            "A11", A11, ...
            "A12", A12, ...
            "A13", A13, ...
            "A14", A14, ...
            "A15", A15, ...
            "A16", A16);
    end
end
save('Saved_Data.mat', 'all_data', '-v7.3');

Accepted Answer

Matt J on 22 Aug 2024
Edited: Matt J on 22 Aug 2024
I used the following given method to save the data, and it produced an 11 GB file, which seems very large.
The memory consumption is about right if you are using double floats:
numGB=prod([500,502,16, 41,9])*8/2^30
numGB = 11.0410
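One caveat worth noting: the question says the arrays hold complex numbers, and a complex double element occupies 16 bytes in MATLAB (8 for the real part, 8 for the imaginary part), so the in-RAM footprint would be roughly twice this estimate:

```matlab
% Complex double arrays use 16 bytes per element (8 real + 8 imaginary),
% doubling the size relative to the real-valued estimate above.
numGB_complex = prod([500,502,16, 41,9])*16/2^30
% numGB_complex = 22.0820
```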
In terms of RAM access, it would probably be faster to organize it as a multidimensional array, as below, and as single floats if you don't need double precision.
all_data = rand(500,502,16, 41,9, "single");
for J2 = 1:9
    for J1 = 1:41
        for J3 = 1:16   % evaluate the 16 functions func{J3}
            all_data(:,:,J3,J1,J2) = func{J3}(___);
        end
    end
end
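With this layout, each of the 16 arrays can be pulled back out inside the later loops by simple indexing, which addresses the "easy access within the loops" requirement. A hypothetical sketch (`all_data` as organized above):

```matlab
% Hypothetical access pattern inside the later processing loops:
for J1 = 1:41
    for J2 = 1:9
        A3 = all_data(:,:,3,J1,J2);   % the third of the 16 arrays
        % ... perform the other operations on A3 here ...
    end
end
```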
3 comments
Luqman Saleem on 22 Aug 2024
Edited: Luqman Saleem on 22 Aug 2024
Alright, I tried saving the data in 41*9 = 369 folders with 16 CSV files each. The total size of all the files combined is again about 12 GB.
Matt J on 22 Aug 2024
Edited: Matt J on 22 Aug 2024
Yes, I don't think you can hope for much compression on disk, unless perhaps the data is sparse or consists of integers.


More Answers (1)

Walter Roberson on 22 Aug 2024
all_data{J1,J2} = struct("A1", A1,...
You are creating a separate struct for each {J1,J2}, complete with all of the per-struct overhead. It would be more efficient to use
all_data(J1,J2) = struct("A1", A1, ...
so as to create a struct array. Struct arrays have lower overhead than a separate scalar struct for each case.
You will need to initialize all_data differently. I suggest:
clear all_data
for J1 = 41:-1:1
    for J2 = 9:-1:1
        % evaluate 16 functions to get 16 arrays (A1 to A16) of size 500 x 502:
        all_data(J1,J2) = struct("A1", A1, ...
Counting backwards like this has the side effect of initializing the struct array to its full size on the first iteration and then filling in the remaining elements, which avoids growing the struct array dynamically.
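A minimal sketch of this pattern, with placeholder arrays standing in for the function results (the rand calls and the two-field struct are illustrative only; the real code would build all 16 fields A1..A16):

```matlab
clear all_data
for J1 = 41:-1:1
    for J2 = 9:-1:1
        A1 = rand(500,502);   % placeholder for the real function output
        A2 = rand(500,502);   % ... likewise for A2 through A16 ...
        % assigning to all_data(41,9) first allocates the full struct array
        all_data(J1,J2) = struct("A1", A1, "A2", A2);
    end
end
% Fields are then accessed as, e.g., all_data(J1,J2).A1
```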

Release: R2024a