Parpool worker distribution on an HPC cluster running Windows Server 2012: how to choose/constrain the number of cores used in each compute node?

My university's HPC cluster has 32 compute nodes with 16 cores each, for a total of 512 cores, and each node has 64 GB of RAM. So when the cluster is used at full capacity, the parallel (SPMD) code running in each worker should not use much more than 4 GB of RAM; otherwise the computations will slow down due to system page swapping, or the pool will crash with an "out of memory" error.
The cluster runs Windows Server 2012, and I noticed that MATLAB allocates workers consecutively. That is, the cores on compute node N+1 are assigned only after all the cores on nodes 1, 2, ..., N have been assigned.
My issue is that I am developing a program that will probably demand 8 GB per core, and I need to use as many cores as possible. I was planning to open a parpool in such a way that I get only 8 cores from node 1, 8 cores from node 2, and so on, but I don't know how to do it without wasting resources.
Is there any way of solving this in Windows?
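To make the numbers in the question concrete, here is a minimal MATLAB sketch of the memory arithmetic. The cluster figures come from the question; the variable names are purely illustrative, and the 8 GB per-worker demand is the asker's estimate rather than a measured value. The sketch only computes the target layout; it does not show how to make the scheduler honor it.

```matlab
% Cluster figures taken from the question above.
numNodes       = 32;   % compute nodes
coresPerNode   = 16;   % cores per node
ramPerNodeGB   = 64;   % RAM per node, in GB

% Assumed per-worker memory demand (the 8 GB estimate from the question).
ramPerWorkerGB = 8;

% Workers that fit in one node's RAM, capped by the cores available.
workersPerNode = min(coresPerNode, floor(ramPerNodeGB / ramPerWorkerGB));

% Largest pool that stays within RAM on every node,
% provided the workers are spread evenly across nodes.
maxPoolSize = numNodes * workersPerNode;

fprintf('%d workers per node, %d workers in total\n', workersPerNode, maxPoolSize);
```

With these figures the target is 8 workers per node and 256 workers in total. If workers are allocated consecutively as described above, simply requesting a 256-worker pool would instead fill the first 16 nodes with 16 workers each, which is exactly the situation the question is trying to avoid.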
  1 Comment
Alvaro on 21 Dec 2022
This might depend on how your university cluster is set up, but you could try this first:
Could you elaborate a bit more on how your university cluster is set up? Are you sending batch jobs through some custom or specific scheduler? Are you using MATLAB Parallel Server by any chance?


Answers (0)
