Parpool errors on SLURM computing system

Caleb_Holt on 14 Aug 2018
I'm running a script that relies on a parfor loop. To initialize my parpool I use the command
parpool(20) % where 20 is the number of cores I have requested.
Occasionally I get the error
Error using parpool (line 113)
Failed to start a parallel pool. (For information in addition to the causing
error, validate the profile 'local' in the Cluster Profile Manager.)
Caused by:
Error using parallel.internal.pool.InteractiveClient>iThrowWithCause (line
675)
Failed to start pool.
Error using parallel.Job/submit (line 351)
Error closing file
/gpfs/home/cholt/.matlab/local_cluster_jobs/R2017b/Job47.in.mat.
The file may be corrupt.
Sometimes my code works fine, and sometimes I get that error. I don't understand why the error appears only some of the time. How can I fix it? Thanks for your help. -C
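A minimal sketch for inspecting where the 'local' profile writes its job files, since the error points at a file under local_cluster_jobs. It only reads properties of the cluster object and does not start a pool. The variable names storageDir and listing are illustrative, not from the original post.

```matlab
% Inspect the 'local' cluster profile named in the error message.
c = parcluster('local');
storageDir = c.JobStorageLocation;   % folder that holds the Job*.in.mat files

fprintf('Job files are written to: %s\n', storageDir);
fprintf('Profile allows up to %d workers\n', c.NumWorkers);

% Count the job entries already present in that folder.
listing = dir(fullfile(storageDir, 'Job*'));
fprintf('%d existing job entries found\n', numel(listing));
```

If the folder printed here sits on a file system that is full or over quota, writes to it can fail intermittently, which would be consistent with an error that appears only some of the time.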

Answers (2)

Zenin Easa Panthakkalakath on 17 Aug 2018
Hey Caleb,
One or more of the workers may never have fully started, which can be caused by certain preference settings. Try regenerating the MATLAB preferences.
Also, please check whether your "startup.m" file contains any commands that alter the MATLAB preferences. If it does, comment them out and run the program again.
Regards
Zenin
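The two checks suggested above can be done from the MATLAB prompt. This is a small sketch, assuming you only want to see whether a startup.m is on the path and where the preference directory is; startupPath is an illustrative variable name.

```matlab
% Look for a startup.m on the MATLAB search path.
startupPath = which('startup');      % empty char if nothing is found
if isempty(startupPath)
    disp('No startup.m found on the MATLAB path.');
else
    fprintf('startup.m found at: %s\n', startupPath);
end

% Show the preference directory used by this MATLAB session.
fprintf('Preference directory: %s\n', prefdir);
```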
  1 Comment
Caleb_Holt on 17 Aug 2018
Hey Zenin
I haven't messed with the preference settings at all, that file is still empty. I also don't have a startup.m file. I think the problem is that it is trying to access a job file that doesn't exist. I'm not sure if I should make those files or what needs to happen.



Zenin Easa Panthakkalakath on 20 Aug 2018
Hey Caleb,
I understand that you haven't made any changes to the preference settings. In that case, the cluster profile may not have been validated. Please refer to these pages to validate it:
Please let me know if this works for you.
Regards
Zenin
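Separately from the validation in the Cluster Profile Manager, a quick manual smoke test can show whether the 'local' profile is able to start workers at all. This is only a sketch, not a replacement for the built-in validation. The pool size of 2 and the trivial loop body are arbitrary choices for illustration.

```matlab
% List the cluster profiles known to this MATLAB installation.
disp(parallel.clusterProfiles);

% Close any pool that is already open, then start a small one.
delete(gcp('nocreate'));
pool = parpool('local', 2);

% Run trivial work to confirm that the workers respond.
results = zeros(1, 4);
parfor k = 1:4
    results(k) = k^2;
end

delete(pool);
disp(results);
```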
  3 Comments
Caleb_Holt on 20 Aug 2018
In particular these lines of the error message
Error using parallel.Job/submit (line 351)
Error closing file /gpfs/home/cholt/.matlab/local_cluster_jobs/R2017b/Job47.in.mat.
The file may be corrupt.
I think I've reached the storage limit on my home directory, where the .matlab directory is stored, and that is what's throwing this error. How can I change where these job .mat files are stored?
Zenin Easa Panthakkalakath on 21 Aug 2018
Hey Caleb,
The directory that you've mentioned is the preference directory.
>> prefdir % returns the preference directory
To set a custom preference directory, please try the solution in this article.
Regards
Zenin
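Another option, not mentioned in the thread, is to leave the preference directory alone and change only where the cluster object stores its job files, through its JobStorageLocation property. The sketch below assumes that tempdir points to a location with free space; on a cluster you would likely substitute your site's scratch path. It also assumes the script runs inside a SLURM job, so that SLURM_JOB_ID is set. The names storageDir, jobId and nWorkers are illustrative.

```matlab
% Build a per-job storage folder outside the home directory.
% ASSUMPTION: tempdir has free space; replace with your scratch path if not.
jobId = getenv('SLURM_JOB_ID');          % set by SLURM inside a job
if isempty(jobId)
    jobId = 'interactive';               % fallback when not run under SLURM
end
storageDir = fullfile(tempdir, ['matlab_jobs_' jobId]);
if ~exist(storageDir, 'dir')
    mkdir(storageDir);
end

% Point the 'local' cluster object at that folder for this session only.
c = parcluster('local');
c.JobStorageLocation = storageDir;

% Start the pool from the modified cluster object.
% 20 is the core count from the original post, capped by the profile limit.
nWorkers = min(20, c.NumWorkers);
delete(gcp('nocreate'));
pool = parpool(c, nWorkers);

% ... parfor work goes here ...

delete(pool);
```

Because the change is made on the cluster object and the profile is not saved, it applies only to the current session. Using a separate folder per SLURM job also means that jobs running at the same time do not write to the same storage folder.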


Release

R2017b
