Distributed and spmd not running faster
I think I'm missing something fundamental about using distributed arrays with spmd. If I run the following, the distributed version takes ~0.2s while the non-distributed version completes in ~0.04s (with a process pool matching the cores on my machine).
x = ones(10000, 10000);
tic
x = x * 2.3;
toc
x
x = distributed(ones(10000, 10000));
tic
spmd
x = x * 2.3;
end
toc
gather(x)
What am I missing?
Edit: I moved tic and toc so that they come after the array initialization and before x is displayed. I realized that calling distributed takes a while because it spreads the array across the processes, and that gather takes time as well, so neither is included in the timings now.
2 Comments
Walter Roberson
25 Jan 2025 18:43
x = distributed(ones(10000, 10000));
Better would be
x = ones(10000, 10000, 'distributed');
That should reduce the overall execution time, but should not change the parts you are timing.
Accepted Answer
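A minimal sketch of the difference between the two construction styles, assuming Parallel Computing Toolbox is installed and a pool can be started. The size n and the variable names are illustrative, not from the original post, and the timings will vary by machine.

```matlab
% Sketch: compare building on the client and copying vs. building on the workers.
% Assumes Parallel Computing Toolbox; n is deliberately smaller than 10000.
if isempty(gcp('nocreate'))
    parpool;                              % open the default pool so startup is not timed
end

n = 4000;

tic
x1 = distributed(ones(n, n));             % create on the client, then send pieces to the workers
tCopy = toc;

tic
x2 = ones(n, n, 'distributed');           % create each piece directly on the workers
tDirect = toc;

fprintf('distributed(ones(...)): %.3f s, ones(..., ''distributed''): %.3f s\n', tCopy, tDirect);
```

Both variables end up as distributed arrays with the same contents; only the construction cost differs.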
Edric Ellis
27 Jan 2025 8:00
You're not missing anything. If you're only using the cores on your local machine, distributed is unlikely to be much use to you. The primary goal of distributed is to run on the memory of multiple machines, and enable computations that would otherwise not be possible. A simple breakdown would be:
- Desktop MATLAB is generally good for large array operations that fit in memory
- gpuArray can be even better, if you have a suitable GPU (better still if you can run in lower precision such as single)
- distributed is best for array operations that fit in the memory of multiple machines
- tall works well for operations on data backed by some form of storage (e.g. disk), where whole arrays can never fit in memory, even across a cluster
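As a rough illustration of the list above, here is the same scaling operation written for each array type. This is only a sketch: it assumes Parallel Computing Toolbox, the GPU branch runs only if a device is detected, and the tall array is built from an in-memory vector purely for demonstration (in practice it would normally come from a datastore). The size n and the variable names are made up for the example.

```matlab
% Sketch: one operation, four array types. Sizes are illustrative.
n = 2000;

a = ones(n, n);                           % ordinary in-memory array
a = a * 2.3;

if gpuDeviceCount > 0                     % only if a supported GPU is present
    g = ones(n, n, 'single', 'gpuArray'); % single precision, created on the GPU
    g = g * 2.3;
    gHost = gather(g);                    % copy the result back to host memory
end

d = ones(n, n, 'distributed');            % pieces live on the pool workers
d = d * 2.3;
dHost = gather(d);                        % collect the pieces on the client

t = tall(ones(n, 1));                     % demo only: tall arrays usually wrap a datastore
t = t * 2.3;                              % evaluation is deferred
tHost = gather(t);                        % triggers the actual computation
```

The code looks nearly identical in each case; what changes is where the data lives and what the data movement costs.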
Desktop MATLAB already runs many suitable operations in a multi-threaded manner, so on a single machine distributed is unlikely to do better for those operations. In fact, for basic operations, if desktop MATLAB cannot multi-thread them, that may well mean that a distributed implementation is either not possible or not efficient.
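One way to see how much the built-in multi-threading contributes is to time the same operation with the thread count temporarily limited to one. This is a sketch using maxNumCompThreads and timeit; the array size is illustrative, and for a memory-bound operation such as scalar multiplication the difference may be small.

```matlab
% Sketch: time x * 2.3 with the default thread count and with a single thread.
x = ones(4000, 4000);

nDefault = maxNumCompThreads;             % remember the current thread limit
tMulti = timeit(@() x * 2.3);             % default (multi-threaded) setting

maxNumCompThreads(1);                     % restrict MATLAB to one computational thread
tSingle = timeit(@() x * 2.3);

maxNumCompThreads(nDefault);              % restore the original setting

fprintf('%d threads: %.4f s, 1 thread: %.4f s\n', nDefault, tMulti, tSingle);
```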
In your case, one of the main things you're timing is the overhead of going into and out of an spmd context.
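To get a feel for that overhead, one option is to time a trivial spmd block next to the real one, and also to run the multiplication without spmd at all, since distributed arrays support such operations directly on the client. This is a sketch assuming Parallel Computing Toolbox; n, dummy, and the timing variable names are illustrative.

```matlab
% Sketch: separate the cost of entering/leaving spmd from the multiply itself.
if isempty(gcp('nocreate'))
    parpool;                              % open the default pool so startup is not timed
end

n = 4000;                                 % smaller than the 10000 in the question
x = ones(n, n, 'distributed');

spmd
    dummy = 1;                            % warm-up: the first spmd block can carry one-time costs
end

tic
spmd
    dummy = 1;                            % trivial body: mostly spmd entry/exit overhead
end
tEmpty = toc;

tic
spmd
    x = x * 2.3;                          % inside spmd, x is seen as a codistributed array
end
tSpmd = toc;

tic
y = x * 2.3;                              % same operation on the distributed array, no spmd
tDirect = toc;

fprintf('trivial spmd: %.4f s, spmd multiply: %.4f s, direct multiply: %.4f s\n', ...
    tEmpty, tSpmd, tDirect);
```

If tEmpty accounts for a large share of tSpmd, most of the measured time is overhead rather than arithmetic.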
More Answers (0)