- Which platform is MATLAB Parallel Server running on, Linux or Windows?
- Which scheduler are you using (MJS, PBS, etc.)?
- What size pool are you running?
- How many cores per node?
- How much RAM per node?
Unable to submit task result (Matlab parallel server)
4 ビュー (過去 30 日間)
古いコメントを表示
Hi,
I am running some tests on a cluster. I create a job, and I submit several tasks. But, I get the following error
Error: Cannot rerun task because there are no rerun attempts left (The task has no rerun attempts left.).
Original cancel message:
java.lang.Exception: Unable to submit task result - MATLAB will now exit and restart.
Where shall I start to look at? What does practically this error mean? Is it a problem on the client side, or on the cluster side?
0 件のコメント
回答 (1 件)
Raymond Norris
2021 年 12 月 2 日
Hi Maria,
A few questions first:
If you're running non-MJS, try the following. I'll show using both batch and parpool.
setenv('MDCE_DEBUG','true')
cluster = parcluster;
% If you're using batch
job = cluster.batch();
job.wait
cluster.getDebug(job)
% If you're using parpool
pctconfig('preservejobs',true);
pool = cluster.parpool();
cluster.getDebug(cluster.Jobs(end))
If you're using MJS
mjs = parcluster;
mjs.ClusterLogLevel = 4;
% Call either batch or parpool
mjs.getClusterLogs()
Perhaps the log file will display something else. If I had to guess, I'm betting you're running out of memory.
0 件のコメント
参考
カテゴリ
Help Center および File Exchange で Parallel Computing Fundamentals についてさらに検索
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!