Running parfor on multiple nodes using Slurm

Siamak Abolhassani on 28 Jun 2022
Commented: Abdolrazzagh on 8 Dec 2023
Hello everyone,
I would like to run a MATLAB script, written to run in parallel on multiple cores using parfor, remotely on a supercomputer. Running it on a single node (e.g., a node with 48 cores) is not the problem; the problem is running it across multiple nodes (more than 48 cores).
Attached is a simple 10-line MATLAB script (parEigen.m) that uses parfor, along with the shell script I used and the Slurm output from the supercomputer. The Slurm output shows that the script ran successfully on 48 cores (1 node). However, I am looking for a way to run the MATLAB script on more cores (multiple nodes) on the supercomputer.
I would really appreciate any help with this. Please note that I am a regular user of the supercomputer: I do not have access to the MATLAB GUI or to the Parallel Computing Toolbox settings.
My MATLAB script:
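function [elapsedTime] = test_for()
    % Request one fewer worker than the number of Slurm tasks
    % (presumably leaving one task for the MATLAB client itself).
    nworker = str2double(getenv('SLURM_NTASKS')) - 1
    % Look up the default cluster profile ('local' in the output below).
    defaultProfile = parallel.defaultClusterProfile
    myCluster = parcluster(defaultProfile);
    parpool(myCluster, nworker)
    N = 1000;
    A = zeros(N,1);
    tic;
    % Each iteration is independent, so parfor can distribute them.
    parfor i = 1 : N
        E = eig(rand(100))+i;
        A(i) = E(1);
    end
    elapsedTime = toc;
end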
My shell script:
#!/bin/bash
#SBATCH --job-name="pforTest"
#SBATCH --time=00:15:00
#SBATCH --ntasks=48
#SBATCH --cpus-per-task=1
#SBATCH --partition=compute
#SBATCH --mem-per-cpu=1GB
#SBATCH --account=researcher
module load matlab
matlab -r parEigen
Slurm output:
MATLAB is selecting SOFTWARE OPENGL rendering.
< M A T L A B (R) >
Copyright 1984-2021 The MathWorks, Inc.
R2021b (9.11.0.1769968) 64-bit (glnxa64)
September 17, 2021
To get started, type doc.
For product information, visit www.mathworks.com.
defaultProfile =
'local'
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 47).
ans =
ProcessPool with properties:
Connected: true
NumWorkers: 47
Cluster: local
AttachedFiles: {}
AutoAddClientPath: true
IdleTimeout: 30 minutes (30 minutes remaining)
SpmdEnabled: true
ans =
0.9884
1 Comment
Abdolrazzagh on 8 Dec 2023
Dear Siamak, I am facing this problem too. How did you solve it?

Accepted Answer

Raymond Norris on 28 Jun 2022
The local scheduler will only spawn workers on the same machine that runs the MATLAB client (e.g., on a single Slurm compute node). To run a parallel job that spans multiple nodes, you'll need MATLAB Parallel Server. With it, you'll have the option to submit the job from MATLAB running on your desktop machine or from MATLAB running on your Slurm cluster (as you're doing now with the local scheduler).
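To see this limit from inside a job, one could compare what Slurm allocated with what the local profile offers. This is only a minimal sketch: it assumes it runs inside a Slurm job with Parallel Computing Toolbox available, and it relies on the `SLURM_NTASKS` and `SLURM_JOB_NUM_NODES` environment variables that Slurm sets. In newer releases the 'local' profile may appear under a different name.

```matlab
% Sketch: compare the Slurm allocation with the 'local' profile's capacity.
% Outside a Slurm job the environment variables are empty and
% str2double returns NaN.
nTasks = str2double(getenv('SLURM_NTASKS'));
nNodes = str2double(getenv('SLURM_JOB_NUM_NODES'));

% The 'local' profile only knows about the machine MATLAB is running on.
localCluster = parcluster('local');

fprintf('Slurm allocation: %g task(s) on %g node(s)\n', nTasks, nNodes);
fprintf('Local profile worker limit: %d\n', localCluster.NumWorkers);
```

If the job was granted tasks on more than one node, the first line reports them, but a pool opened on the local profile still stays on the node where the MATLAB client started.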
3 Comments
Raymond Norris on 28 Jun 2022
  1. Yes, the local scheduler is defined by the "local" profile.
  2. Regarding MATLAB Parallel Server, your organization would first need to have purchased it. Then it needs to be installed on your Slurm cluster by someone with privileges to install it. Lastly, you need to create a Slurm profile. You can find more information here; however, it focuses mostly on using a GUI. https://www.mathworks.com/help/matlab-parallel-server/install-and-configure-matlab-parallel-server-for-slurm-1.html
However, in the simplest case, you can just call the following:
slurm = parallel.cluster.Slurm;
pool = slurm.parpool(96);
I would suggest setting certain values, such as
jsl = fullfile(getenv('HOME'),'.matlab','SlurmJobStorageLocation');
mkdir(jsl);
slurm.JobStorageLocation = jsl;
% Change "192" to the number of MATLAB Parallel Server licenses you own
slurm.NumWorkers = 192;
slurm.saveAsProfile('slurm')
From then on, when you want to run your jobs, reference the new profile, slurm. For example:
slurm = parcluster('slurm');
pool = slurm.parpool(96);
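Applied to the script from the question, this might look like the sketch below. It assumes MATLAB Parallel Server is installed on the cluster and that a profile named 'slurm' was saved as described above. The function name `parEigenSlurm` and the `nWorkers` argument are hypothetical, introduced only for illustration.

```matlab
function elapsedTime = parEigenSlurm(nWorkers)
% Hypothetical variant of the question's function that opens the pool
% on the saved 'slurm' profile instead of the default 'local' profile.
    myCluster = parcluster('slurm');          % assumes this profile exists
    pool = parpool(myCluster, nWorkers);      % workers may span several nodes
    cleanup = onCleanup(@() delete(pool));    % close the pool on exit or error

    N = 1000;
    A = zeros(N, 1);
    tic;
    parfor i = 1:N
        E = eig(rand(100)) + i;
        A(i) = E(1);
    end
    elapsedTime = toc;
end
```

For example, `parEigenSlurm(96)` would request 96 workers. As far as I understand, the pool is then obtained through a separate job that MATLAB submits to Slurm, so the job running the MATLAB client itself should only need a single task rather than `--ntasks=48`.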
There's more you can add via SubmitArgs, such as walltime, partition, GPUs, etc. Contact Technical Support for more help.
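A minimal sketch of what that could look like follows. It assumes that "SubmitArgs" refers to the `SubmitArguments` property of the Slurm cluster object and that the 'slurm' profile from above already exists. The option values are copied from the shell script in the question and will differ from site to site.

```matlab
% Sketch: attach extra sbatch options to the saved 'slurm' profile.
slurm = parcluster('slurm');   % profile name saved earlier in this thread

% Values taken from the question's shell script; adjust for your cluster.
slurm.SubmitArguments = ['--time=00:15:00 --partition=compute ' ...
                         '--account=researcher --mem-per-cpu=1GB'];

% Persist the change so later parcluster('slurm') calls pick it up.
saveProfile(slurm);
disp(slurm.SubmitArguments)
```

These arguments should then be passed to the scheduler for every job submitted through the profile, including the job behind `slurm.parpool(96)`.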
Siamak Abolhassani on 28 Jun 2022
Thank you for your answer, Raymond! That was great!

More Answers (0)

Release

R2020b
