Best practice for asynchronous data saving
I have a simulation code which exhibits the usual bursty IO behavior of FDTD simulations. To wit,
x = lotsOfData();
for iter = 1:bigNumber
    x = takeTimestep(x);
    if mod(iter, stepsPerSave) == 0
        dumpDataToDisk(x, iter);
    end
end
For context, x is expected to be on the order of 50-100 GB. The long gaps during blocking calls to dumpDataToDisk(), in which no timesteps can be taken, represent FLOPS down the drain, and of course tempus fugit.
Granted that sufficient memory to make a copy of x is available, in C, for example, I would use $THREADING_MODEL to create a saver thread, hand that thread a copy of x, and let it call dumpDataToDisk() so the main thread can resume calculating. The Google intertube suggests (correct me if I'm wrong!) that
- This isn't an unusual performance problem with MATLAB and I/O
- There are several IPC options available: the PCT, or various socket, mmap/shm, or file-based implementations
- There's no official solution to the asynchronous file I/O question (apparently there is async write to sockets?)
It appears that system("command & ") can provide the fork-and-exec-like operation I'd like at a high level, so I might then write, e.g.,
x = lotsOfData();
for iter = 1:bigNumber
    x = takeTimestep(x);
    if mod(iter, stepsPerSave) == 0
        while checkBusyFileFlag(); pause(.025); end
        dumpDataToBurstBuffer(x, iter);
        system("~/bufferDrainAssistant.sh & "); % writes "1" to busy file, does mv, writes "0"
    end
end
This appears to provide pretty much what I'd want from the threaded model, predicated on the availability of a high-speed I/O device. Is there an official or best-practice way to handle this?
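The loop above relies on checkBusyFileFlag(), which the post does not define. A minimal sketch of what it might look like follows, assuming the drain script keeps a small text file holding "1" while the move is in progress and "0" otherwise. The flag file location (drain.busy in tempdir) is a hypothetical choice, not something stated in the post.

```matlab
function busy = checkBusyFileFlag(flagFile)
    % Hypothetical helper: returns true while the drain script reports "busy".
    % Assumption: the script writes "1" to flagFile when it starts moving
    % data and "0" when it finishes. The default path is a placeholder.
    if nargin < 1
        flagFile = fullfile(tempdir, 'drain.busy');
    end
    busy = false;                      % no flag file yet means nothing is draining
    if isfile(flagFile)
        busy = strcmp(strtrim(fileread(flagFile)), '1');
    end
end
```

Because the script is launched with &, there may be a short window before it writes "1". If consecutive saves could ever be close together, writing the flag from MATLAB just before the system() call would close that window.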
1 Comment
Gavin
May 1, 2025
I can't believe something as simple as asynchronous writing to disk isn't a standard operation. All those cores in a modern computer and MATLAB only uses one! I never write a lot of data at a time, yet I still see a significant slowdown of my time-critical code, and it gets worse over time. We are losing whole seconds on a writetable call even when writing only a single row!
Answers (1)
Raymond Norris
June 20, 2025, 19:34
@Erik Keever in R2021b we introduced backgroundPool into MATLAB (i.e., it doesn't require Parallel Computing Toolbox) for running code in the background using parfeval. Would this suffice?
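A rough sketch of how the save loop from the question might look with backgroundPool and parfeval is below. It is an illustration under stated assumptions, not a tested recipe for 50-100 GB arrays: a small random array stands in for lotsOfData(), x + 1 stands in for takeTimestep(x), and saveSnapshot is a hypothetical helper that writes raw binary with fwrite. Whether a particular I/O function can run on a thread-based pool depends on the release, so it is worth checking the Extended Capabilities section of that function's documentation.

```matlab
function files = asyncSaveSketch()
    % Stand-ins (assumptions): a small array replaces lotsOfData(), and
    % x + 1 replaces takeTimestep(x). saveSnapshot is a hypothetical helper.
    x = rand(256);
    bigNumber = 20;
    stepsPerSave = 5;
    outDir = tempname;
    mkdir(outDir);

    pendingSave = [];                  % future for the save in flight, if any
    for iter = 1:bigNumber
        x = x + 1;                     % stand-in for takeTimestep(x)
        if mod(iter, stepsPerSave) == 0
            if ~isempty(pendingSave)
                wait(pendingSave);     % blocks only if the previous save is unfinished
            end
            % Returns a future immediately; the write runs on a background worker.
            pendingSave = parfeval(backgroundPool, @saveSnapshot, 0, x, iter, outDir);
        end
    end
    if ~isempty(pendingSave)
        wait(pendingSave);             % let the final save finish before returning
    end
    files = dir(fullfile(outDir, 'snapshot_*.bin'));
end

function saveSnapshot(x, iter, outDir)
    % Hypothetical helper: dumps x as raw binary in its native class.
    fid = fopen(fullfile(outDir, sprintf('snapshot_%06d.bin', iter)), 'w');
    if fid < 0
        error('saveSnapshot:open', 'Could not open the snapshot file.');
    end
    fwrite(fid, x, class(x));
    fclose(fid);
end
```

Since MATLAB arrays have value semantics, the worker sees x as it was when parfeval was called, even though the main loop keeps updating x. The cost is that up to one extra copy of x may be alive while a save is in flight, which matches the memory assumption in the question. The wait call does not report failures, so the future's Error property is the place to look if a snapshot goes missing.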