Is there an easy way to find out which workers are running on the same host in a Generic Cluster job so I can efficiently allgather?

Say I have the following script which submits a job to a Generic parallel cluster, which has procsPerNode=2:
j.NumWorkersRange=[4 4];
What this will do is request 2 nodes from my cluster, each of which will individually run 2 MATLAB workers in parallel, which altogether will run mySpmdFunction as though it were launched within an spmd statement (so they can do things like labSend to communicate and use labindex to get an ID, etc.).
My question is, is there any way for the nodes to know which other workers are 'local'--i.e., which ones reside on the same piece of hardware versus which ones are remote? A way to use reflection to find this information is preferred, but if that's not available will MATLAB consistently assign workers to nodes sequentially (so then workers 1 and 2 will always share a node and workers 3 and 4 will always share a node in the example)? If there's no way to inquire what workers share nodes, is there a way to inquire and find the GenericCluster the workers are running on so I can find the procsPerNode property?
For that matter, is there a built-in allgather function? I'm really only investigating this to implement my own allgather from scratch...


Edric Ellis on 30 Aug 2022
You can use gop to perform general MPI-style all-reduce operations, and a special case of that is gcat which can operate as an MPI-style gather or all-gather. For example:
Starting parallel pool (parpool) using the 'local' profile ... Connected to the parallel pool (number of workers: 2).
x = gcat(labindex);
1 2
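
The same gcat call can also be pointed at the locality question. What follows is a minimal sketch rather than a built-in MATLAB facility: it assumes every node provides a hostname command that returns a distinct name per physical machine, and the variable names myHost, allHosts and localPeers are purely illustrative. Each worker all-gathers the host names and then picks out the worker indices whose host matches its own.

```matlab
spmd
    % Ask the operating system for this worker's host name.
    % Assumption: a `hostname` command exists on every node.
    [status, out] = system('hostname');
    if status ~= 0
        error('Could not determine the host name on worker %d.', labindex);
    end
    myHost = strtrim(out);

    % All-gather: every worker receives a 1-by-numlabs cell array in which
    % element k is the host name reported by worker k.
    allHosts = gcat({myHost});

    % Indices of the workers on the same host as this one (itself included).
    localPeers = find(strcmp(allHosts, myHost));
end
```

With procsPerNode=2 and four workers, each worker would be expected to end up with a two-element localPeers. Which indices are paired depends on how the scheduler places the processes, so the sketch discovers the layout at run time instead of assuming that workers 1 and 2 share a node.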
