Distributed arrays unevenly distributed

Maria on 13 Oct 2021
Commented: Oli Tissot on 12 Nov 2021
Hi,
I have a remote cluster with 8 nodes, and each node has 16 GB of memory.
I am running an example with a large 3D matrix of size around 10000 x 4500 x 8, which I am now trying to run as a batch job. The matrix is created directly in the function as a distributed array:
H_sym = zeros(m,m,LENGTH_BETA,'distributed')+1j*zeros(m,m,LENGTH_BETA,'distributed');
However, when I look at the status of each node (in Linux, with htop), I see that all cores of all nodes are working, and every node except the first has a constant 4 GB of memory in use. On the first node, memory usage varies between 8 GB and 13 GB.
Why does only the first node use more memory, and why does its usage change over time? Shouldn't 'distributed' spread the matrix evenly across all nodes?
Best
Maria
1 Comment
Oli Tissot on 12 Nov 2021
Hi Maria,
When distributed arrays are constructed, they are distributed as evenly as possible along the second dimension. In your case, that means 4500 is split into 8 parts, so some workers get 10000x562x8 local parts while others get 10000x563x8 local parts. So not all workers use exactly the same amount of memory, but I don't believe that explains the discrepancy you're seeing. I suspect the computation you do afterwards on H_sym involves communication between workers, so workers receiving messages use more memory. That could explain what you are seeing. What computation are you doing on H_sym after creating it?
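To put rough numbers on that split, here is a small sketch in plain MATLAB (no Parallel Computing Toolbox needed). It assumes one worker per node (8 workers), that the leftover columns go to the first workers, and 16 bytes per complex double element; none of these assumptions were confirmed in the thread.

```matlab
% Sizes quoted in the thread; numWorkers = 8 is an assumption (one worker per node).
nRows = 10000; nCols = 4500; nPages = 8;
numWorkers = 8;

% Split nCols as evenly as possible. Assumption: the first 'extra'
% workers each receive one additional column.
base  = floor(nCols / numWorkers);
extra = mod(nCols, numWorkers);
colsPerWorker = base * ones(1, numWorkers);
colsPerWorker(1:extra) = base + 1;

% A complex double element occupies 16 bytes (8 real + 8 imaginary).
bytesPerElement = 16;
localMB = nRows .* colsPerWorker .* nPages .* bytesPerElement / 1e6;

fprintf('columns per worker: %s\n', mat2str(colsPerWorker));
fprintf('local part size (MB): %s\n', mat2str(localMB, 6));
fprintf('largest minus smallest local part: %.2f MB\n', max(localMB) - min(localMB));
```

Under these assumptions the local parts differ by only about a megabyte, which is far too small to account for a gap of several gigabytes between nodes.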
Finally, the way you're building H_sym is correct, but there is a more efficient way to do it:
H_sym = zeros(m,m,LENGTH_BETA,'like',distributed(1i));
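If it helps, the layout can also be checked directly on the workers. The following is a minimal sketch, assuming Parallel Computing Toolbox with a parallel pool available, and using small illustrative sizes rather than the ones from the question. The variable names localSize, codist, and distDim are made up for this example.

```matlab
% Small illustrative sizes (assumed), not the sizes from the question.
m = 100; LENGTH_BETA = 8;

% Complex distributed zeros created in a single step.
H_sym = zeros(m, m, LENGTH_BETA, 'like', distributed(1i));

spmd
    % Inside spmd, each worker sees H_sym as a codistributed array
    % and can query its own local part and the distribution scheme.
    localSize = size(getLocalPart(H_sym));
    codist    = getCodistributor(H_sym);
    distDim   = codist.Dimension;
    fprintf('worker %d: local part %s, distributed along dimension %d\n', ...
            labindex, mat2str(localSize), distDim);
end
```

Each worker reports the size of the piece it holds and the dimension the array is split along, which makes it easy to compare the actual layout with what htop shows per node.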
Cheers,
Oli

Release

R2021a
