How to restart a CommunicatingJob using only the MATLAB workspaces?

Alberto Brandl on 11 Feb 2021
Commented: Alberto Brandl on 12 Feb 2021
My code runs several jobs on an HPC cluster by calling createCommunicatingJob and then assigning a task to each job. After submission, the main code exits. This works nicely, but sometimes I request the wrong amount of time or RAM, and some jobs do not save any results. My question: I see that MATLAB creates a folder and a workspace with the same name as the job. Is it possible to load those workspaces and requeue the job with different SubmitArguments, without doing it manually? I would do it from the HPC manager, but my Slurm configuration does not allow requeueing a job that ended in TIMEOUT.

Accepted Answer

Raymond Norris on 11 Feb 2021
There's no automated process for reading the Job files and recreating the job. It'd be much easier to recreate the steps you've already run. For that reason, I'd suggest keeping all of this in a script and simply rerunning the script.
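
A minimal sketch of what such a rerunnable script might look like is given here. It rests on assumptions that are not in the thread: the default profile is assumed to be a built-in Slurm cluster (`parallel.cluster.Slurm`), which exposes a `SubmitArguments` property; `myTaskFcn`, the case names, the inputs, the worker count, and the Slurm flags are all hypothetical placeholders. With a generic-scheduler profile the submit arguments are usually set through the cluster's `AdditionalProperties` instead.

```matlab
% submit_jobs.m: sketch of a rerunnable submission script.
% ASSUMPTIONS: default profile is a parallel.cluster.Slurm cluster;
% myTaskFcn, the case table, and the Slurm flags are placeholders.

c = parcluster;   % cluster object from the default profile

% One entry per job: a label, the task inputs, and the Slurm resources.
cases = struct( ...
    'name',       {'caseA', 'caseB'}, ...
    'inputs',     {{10, 0.5}, {20, 0.1}}, ...
    'submitArgs', {'--time=02:00:00 --mem=8G', '--time=06:00:00 --mem=32G'});

% Edit this list to resubmit only the cases that did not finish.
toRun = {'caseB'};

for k = 1:numel(cases)
    if ~ismember(cases(k).name, toRun)
        continue
    end

    % Scheduler flags used for this submission (assumed property, see above).
    c.SubmitArguments = cases(k).submitArgs;

    % Naming the job after the case makes it easy to identify later.
    job = createCommunicatingJob(c, 'Type', 'spmd', 'Name', cases(k).name);
    job.NumWorkersRange = [4 4];               % placeholder worker count

    % One task, no output arguments, inputs taken from the case table.
    createTask(job, @myTaskFcn, 0, cases(k).inputs);
    submit(job);

    fprintf('Submitted %s as job ID %d\n', cases(k).name, job.ID);
end
```

Keeping the inputs and the resource requests in one table means that a job that ran out of time or memory can be resubmitted by changing its `submitArgs` entry and listing its name in `toRun`.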
1 Comment
Alberto Brandl on 12 Feb 2021
That's a pity. It would be worth considering for a future release, because it would be very convenient. In my case, for instance, each job has different input arguments, and I have to work out manually which job actually failed, even when I know the job ID. It is nothing too hard, but loading and re-executing the job might be faster. Thanks!
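
The manual lookup described in this comment could be partly scripted by querying the job objects that the cluster profile still knows about. The following is only a sketch under assumptions: it relies on the jobs still being present in the profile's job storage location, and on an unfinished or errored job being visible through the job `State` or the task `ErrorMessage`. How a Slurm TIMEOUT actually shows up in `State` depends on the scheduler integration and has not been verified here.

```matlab
% list_unfinished_jobs.m: sketch for finding jobs that did not complete.
% ASSUMPTION: a timed-out job appears with a State other than 'finished'
% or with a non-empty task error message; this depends on the integration.

c = parcluster;      % same profile that was used for submission
jobs = c.Jobs;       % jobs found in the profile's job storage location

for k = 1:numel(jobs)
    j = jobs(k);
    if isempty(j.Tasks)
        continue     % nothing to inspect for a job without tasks
    end

    t = j.Tasks(1);  % first task holds the function and its inputs
    hasError = ~isempty(t.ErrorMessage);

    if ~strcmp(j.State, 'finished') || hasError
        fprintf('Job %d (%s): state %s\n', j.ID, j.Name, j.State);
        disp(t.Function);         % task function that was submitted
        disp(t.InputArguments);   % cell array of the inputs it received
    end
end
```

The printed inputs can then be matched against the submission script to decide which jobs to rerun with larger time or memory requests.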

Release

R2020a
