Issue with launching Parallel Workers when using TORQUE or PBS

Mohammad Abouali on 2 Dec 2016
Commented: Mohammad Abouali on 5 Dec 2016
Hi,
I am attaching the full MATLAB output produced when I set the UseParallel option of GA to true.
This has been causing trouble when launching the parallel pool.
Note that I am scheduling hundreds of such runs on a computer cluster using TORQUE (or PBS; I submit them with the qsub command).
So I am not running them interactively. I am attaching both the full MATLAB output and the scheduler script that I use to submit the job with qsub.
Any help is appreciated.

Accepted Answer

Edric Ellis on 5 Dec 2016
One problem might be that the job storage locations are colliding, and you're ending up with many processes trying to write data to the same location. You could work around this by creating a local cluster instance using a unique job storage location. Something like this:
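% Generate a unique folder name under the system temp directory and create it
tempLoc = tempname;
mkdir(tempLoc);
% Build a local cluster object that keeps its job data in that folder
clus = parallel.cluster.Local('JobStorageLocation', tempLoc);
% Start the parallel pool on this cluster instead of the default profile
parpool(clus);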
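Since every run now creates its own storage folder, hundreds of scheduled jobs will leave hundreds of folders behind. The following is only a sketch of one way to clean up afterwards, assuming the work is done in a single script; the `parfor` loop is a placeholder for the real computation and is not part of the original answer.

```matlab
% Create a private job storage folder for this MATLAB session.
tempLoc = tempname;
mkdir(tempLoc);
clus = parallel.cluster.Local('JobStorageLocation', tempLoc);
pool = parpool(clus);

try
    % Placeholder for the real work (for example, the GA run).
    parfor k = 1:8
        pause(0.1);
    end
catch err
    % On failure, shut the pool down and remove the folder before re-raising.
    delete(pool);
    rmdir(tempLoc, 's');
    rethrow(err);
end

% Normal exit: shut the pool down, then remove the storage folder.
delete(pool);
rmdir(tempLoc, 's');
```

Deleting the pool before removing the folder matters, because the workers may still hold files in the job storage location while the pool is open.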
2 Comments
Mohammad Abouali on 5 Dec 2016
Thanks Edric. I also suspect that this is what is happening, i.e. two different scheduled jobs trying to use the same storage location.
I will try the solution you provided and get back to you on whether it resolves the issue.
Thank you again.
Mohammad Abouali on 5 Dec 2016
Your suggestion fixed that problem. However, some of the runs then had trouble executing the GA objective function.
One thing I noticed is that I was requesting nodes=1:ppn=4 but then launching 4 workers. That makes a total of 5 processes (1 main MATLAB process + 4 parallel workers).
I rescheduled the jobs using the workaround you provided and also increased ppn to 5. So far things seem to be going fine.
So I think we got the issue resolved. I am going to accept the answer at this point, and if I later find that it is not working I will post another question.
Thank you so much for your help. I really appreciate it.
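The process count described in this comment (one client MATLAB plus the workers) can also be handled from inside MATLAB, by sizing the pool from the scheduler's allocation rather than hard-coding 4 workers. The sketch below assumes that TORQUE exports the requested ppn value in the environment variable `PBS_NUM_PPN`; the fallback value of 5 is just the ppn setting mentioned above, and `numWorkers` is a name introduced here for illustration.

```matlab
% Assumption: TORQUE exports the requested ppn value as PBS_NUM_PPN.
ppn = str2double(getenv('PBS_NUM_PPN'));
if isnan(ppn)
    ppn = 5;   % fallback when the variable is not set (matches nodes=1:ppn=5)
end

% Reserve one slot for the client MATLAB process; the rest go to workers.
numWorkers = max(ppn - 1, 1);

% Unique job storage location, as suggested in the accepted answer.
tempLoc = tempname;
mkdir(tempLoc);
clus = parallel.cluster.Local('JobStorageLocation', tempLoc);
clus.NumWorkers = numWorkers;   % cap the cluster at the computed worker count
parpool(clus, numWorkers);

% Enable parallel evaluation of the objective function in GA.
opts = optimoptions('ga', 'UseParallel', true);
```

With ppn=5 this opens 4 workers, so the total of 5 processes matches the 5 slots requested from the scheduler.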

More Answers (0)