Parpool Fail 2015a HPC
Hi,
We recently purchased an HPC machine for simulation research. It has 64 cores (4 AMD Opteron processors) and 256 GB of RAM. The OS is Linux (CentOS 6).
We have installed MATLAB R2015a 64-bit on the machine.
I am trying to open 64 parallel workers with the parpool(64) command, but it gives an error. Even the cluster profile cannot be validated. When I reduce the number to 25, it works, but with any number above 25 I get the error. The validation output is included below. I would really appreciate any help with this. It is worth mentioning that on the same machine under Windows 10 I can use all 32 cores and 32 hyperthreads (Windows detects only 2 physical CPUs), so in Windows I can open up to 64 workers.
Stage: SPMD job test (createCommunicatingJob)
Status: Failed
Description:The job errored or did not reach state finished.
Command Line Output:(none)
Error Report:(none)
Debug Log:
LOG FILE OUTPUT:
[14] < M A T L A B (R) >
[14] Copyright 1984-2015 The MathWorks, Inc.
[14] R2015a (8.5.0.197613) 64-bit (glnxa64)
[14] February 12, 2015
[21] < M A T L A B (R) >
[21] Copyright 1984-2015 The MathWorks, Inc.
[21] R2015a (8.5.0.197613) 64-bit (glnxa64)
[21] February 12, 2015
[6] < M A T L A B (R) >
[6] Copyright 1984-2015 The MathWorks, Inc.
[6] R2015a (8.5.0.197613) 64-bit (glnxa64)
[6] February 12, 2015
[16] < M A T L A B (R) >
[16] Copyright 1984-2015 The MathWorks, Inc.
[16] R2015a (8.5.0.197613) 64-bit (glnxa64)
[16] February 12, 2015
[30] < M A T L A B (R) >
[30] Copyright 1984-2015 The MathWorks, Inc.
[30] R2015a (8.5.0.197613) 64-bit (glnxa64)
[30] February 12, 2015
[7] < M A T L A B (R) >
[7] Copyright 1984-2015 The MathWorks, Inc.
[7] R2015a (8.5.0.197613) 64-bit (glnxa64)
[7] February 12, 2015
[24] < M A T L A B (R) >
[24] Copyright 1984-2015 The MathWorks, Inc.
[24] R2015a (8.5.0.197613) 64-bit (glnxa64)
[24] February 12, 2015
[2] < M A T L A B (R) >
[2] Copyright 1984-2015 The MathWorks, Inc.
[2] R2015a (8.5.0.197613) 64-bit (glnxa64)
[2] February 12, 2015
[12] < M A T L A B (R) >
[12] Copyright 1984-2015 The MathWorks, Inc.
[12] R2015a (8.5.0.197613) 64-bit (glnxa64)
[12] February 12, 2015
[0] < M A T L A B (R) >
[0] Copyright 1984-2015 The MathWorks, Inc.
[0] R2015a (8.5.0.197613) 64-bit (glnxa64)
[0] February 12, 2015
[20] < M A T L A B (R) >
[20] Copyright 1984-2015 The MathWorks, Inc.
[20] R2015a (8.5.0.197613) 64-bit (glnxa64)
[20] February 12, 2015
[31] < M A T L A B (R) >
[31] Copyright 1984-2015 The MathWorks, Inc.
[31] R2015a (8.5.0.197613) 64-bit (glnxa64)
[31] February 12, 2015
[9] < M A T L A B (R) >
[9] Copyright 1984-2015 The MathWorks, Inc.
[9] R2015a (8.5.0.197613) 64-bit (glnxa64)
[9] February 12, 2015
[10] < M A T L A B (R) >
[10] Copyright 1984-2015 The MathWorks, Inc.
[10] R2015a (8.5.0.197613) 64-bit (glnxa64)
[10] February 12, 2015
[28] < M A T L A B (R) >
[28] Copyright 1984-2015 The MathWorks, Inc.
[28] R2015a (8.5.0.197613) 64-bit (glnxa64)
[28] February 12, 2015
[18] < M A T L A B (R) >
[18] Copyright 1984-2015 The MathWorks, Inc.
[18] R2015a (8.5.0.197613) 64-bit (glnxa64)
[18] February 12, 2015
[13] < M A T L A B (R) >
[13] Copyright 1984-2015 The MathWorks, Inc.
[13] R2015a (8.5.0.197613) 64-bit (glnxa64)
[13] February 12, 2015
[23] < M A T L A B (R) >
[23] Copyright 1984-2015 The MathWorks, Inc.
[23] R2015a (8.5.0.197613) 64-bit (glnxa64)
[23] February 12, 2015
[25] < M A T L A B (R) >
[25] Copyright 1984-2015 The MathWorks, Inc.
[25] R2015a (8.5.0.197613) 64-bit (glnxa64)
[25] February 12, 2015
[15] < M A T L A B (R) >
[15] Copyright 1984-2015 The MathWorks, Inc.
[15] R2015a (8.5.0.197613) 64-bit (glnxa64)
[15] February 12, 2015
[26] < M A T L A B (R) >
[26] Copyright 1984-2015 The MathWorks, Inc.
[26] R2015a (8.5.0.197613) 64-bit (glnxa64)
[26] February 12, 2015
[5] < M A T L A B (R) >
[5] Copyright 1984-2015 The MathWorks, Inc.
[5] R2015a (8.5.0.197613) 64-bit (glnxa64)
[5] February 12, 2015
[11] < M A T L A B (R) >
[11] Copyright 1984-2015 The MathWorks, Inc.
[11] R2015a (8.5.0.197613) 64-bit (glnxa64)
[11] February 12, 2015
[17] < M A T L A B (R) >
[17] Copyright 1984-2015 The MathWorks, Inc.
[17] R2015a (8.5.0.197613) 64-bit (glnxa64)
[17] February 12, 2015
[19] < M A T L A B (R) >
[19] Copyright 1984-2015 The MathWorks, Inc.
[19] R2015a (8.5.0.197613) 64-bit (glnxa64)
[19] February 12, 2015
[27] < M A T L A B (R) >
[27] Copyright 1984-2015 The MathWorks, Inc.
[27] R2015a (8.5.0.197613) 64-bit (glnxa64)
[27] February 12, 2015
[8] < M A T L A B (R) >
[8] Copyright 1984-2015 The MathWorks, Inc.
[8] R2015a (8.5.0.197613) 64-bit (glnxa64)
[8] February 12, 2015
[22] < M A T L A B (R) >
[22] Copyright 1984-2015 The MathWorks, Inc.
[22] R2015a (8.5.0.197613) 64-bit (glnxa64)
[22] February 12, 2015
[29] < M A T L A B (R) >
[29] Copyright 1984-2015 The MathWorks, Inc.
[29] R2015a (8.5.0.197613) 64-bit (glnxa64)
[29] February 12, 2015
[3] < M A T L A B (R) >
[3] Copyright 1984-2015 The MathWorks, Inc.
[3] R2015a (8.5.0.197613) 64-bit (glnxa64)
[3] February 12, 2015
[4] < M A T L A B (R) >
[4] Copyright 1984-2015 The MathWorks, Inc.
[4] R2015a (8.5.0.197613) 64-bit (glnxa64)
[4] February 12, 2015
[1] < M A T L A B (R) >
[1] Copyright 1984-2015 The MathWorks, Inc.
[1] R2015a (8.5.0.197613) 64-bit (glnxa64)
[1] February 12, 2015
[21]
[14]
[6]
[21]To get started, type one of these: helpwin, helpdesk, or demo.
[21]For product information, visit www.mathworks.com.
[21]
[14]To get started, type one of these: helpwin, helpdesk, or demo.
[14]For product information, visit www.mathworks.com.
[14]
[16]
[6]To get started, type one of these: helpwin, helpdesk, or demo.
[6]For product information, visit www.mathworks.com.
[6]
[30]
[2]
[16]To get started, type one of these: helpwin, helpdesk, or demo.
[16]For product information, visit www.mathworks.com.
[16]
[30]To get started, type one of these: helpwin, helpdesk, or demo.
[30]For product information, visit www.mathworks.com.
[30]
[0]
[20]
[9]
[28]
[2]To get started, type one of these: helpwin, helpdesk, or demo.
[2]For product information, visit www.mathworks.com.
[2]
[7]
[31]
[10]
[0]To get started, type one of these: helpwin, helpdesk, or demo.
[0]For product information, visit www.mathworks.com.
[0]
[24]
[25]
[12]
[23]
[9]To get started, type one of these: helpwin, helpdesk, or demo.
[9]For product information, visit www.mathworks.com.
[9]
[26]
[20]To get started, type one of these: helpwin, helpdesk, or demo.
[20]For product information, visit www.mathworks.com.
[20]
[28]To get started, type one of these: helpwin, helpdesk, or demo.
[28]For product information, visit www.mathworks.com.
[28]
[17]
[31]To get started, type one of these: helpwin, helpdesk, or demo.
[31]For product information, visit www.mathworks.com.
[31]
[5]
[13]
[7]To get started, type one of these: helpwin, helpdesk, or demo.
[7]For product information, visit www.mathworks.com.
[7]
[11]
[18]
[10]To get started, type one of these: helpwin, helpdesk, or demo.
[10]For product information, visit www.mathworks.com.
[10]
[19]
[8]
[15]
[25]To get started, type one of these: helpwin, helpdesk, or demo.
[24]To get started, type one of these: helpwin, helpdesk, or demo.
[25]For product information, visit www.mathworks.com.
[25]
[24]For product information, visit www.mathworks.com.
[24]
[12]To get started, type one of these: helpwin, helpdesk, or demo.
[12]For product information, visit www.mathworks.com.
[12]
[29]
[23]To get started, type one of these: helpwin, helpdesk, or demo.
[26]To get started, type one of these: helpwin, helpdesk, or demo.
[23]For product information, visit www.mathworks.com.
[23]
[26]For product information, visit www.mathworks.com.
[26]
[17]To get started, type one of these: helpwin, helpdesk, or demo.
[17]For product information, visit www.mathworks.com.
[17]
[27]
[13]To get started, type one of these: helpwin, helpdesk, or demo.
[13]For product information, visit www.mathworks.com.
[13]
[19]To get started, type one of these: helpwin, helpdesk, or demo.
[19]For product information, visit www.mathworks.com.
[19]
[11]To get started, type one of these: helpwin, helpdesk, or demo.
[11]For product information, visit www.mathworks.com.
[11]
[5]To get started, type one of these: helpwin, helpdesk, or demo.
[5]For product information, visit www.mathworks.com.
[5]
[18]To get started, type one of these: helpwin, helpdesk, or demo.
[3]
[18]For product information, visit www.mathworks.com.
[18]
[22]
[8]To get started, type one of these: helpwin, helpdesk, or demo.
[8]For product information, visit www.mathworks.com.
[8]
[15]To get started, type one of these: helpwin, helpdesk, or demo.
[15]For product information, visit www.mathworks.com.
[15]
[29]To get started, type one of these: helpwin, helpdesk, or demo.
[29]For product information, visit www.mathworks.com.
[29]
[1]
[4]
[27]To get started, type one of these: helpwin, helpdesk, or demo.
[27]For product information, visit www.mathworks.com.
[27]
[3]To get started, type one of these: helpwin, helpdesk, or demo.
[3]For product information, visit www.mathworks.com.
[3]
[22]To get started, type one of these: helpwin, helpdesk, or demo.
[22]For product information, visit www.mathworks.com.
[22]
[1]To get started, type one of these: helpwin, helpdesk, or demo.
[1]For product information, visit www.mathworks.com.
[1]
[4]To get started, type one of these: helpwin, helpdesk, or demo.
[4]For product information, visit www.mathworks.com.
[4]
[14] Academic License
[21] Academic License
[6] Academic License
[14]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[21]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[0] Academic License
[16] Academic License
[14]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core
[14]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core/iSetup
[14]2015-11-07 17:01:29 | This process will exit on any fault.
[21]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core
[21]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core/iSetup
[30] Academic License
[14]2015-11-07 17:01:29 | This process will exit when its parent process dies.
[21]2015-11-07 17:01:29 | This process will exit on any fault.
[14]2015-11-07 17:01:29 | About to initialize MPI.
[21]2015-11-07 17:01:29 | This process will exit when its parent process dies.
[21]2015-11-07 17:01:29 | About to initialize MPI.
[6]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[9] Academic License
[2] Academic License
[13] Academic License
[0]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[28] Academic License
[12] Academic License
[31] Academic License
[10] Academic License
[6]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core
[6]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core/iSetup
[6]2015-11-07 17:01:29 | This process will exit on any fault.
[29] Academic License
[6]2015-11-07 17:01:29 | This process will exit when its parent process dies.
[16]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[26] Academic License
[6]2015-11-07 17:01:30 | About to initialize MPI.
[19] Academic License
[30]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[5] Academic License
[0]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[0]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[9]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[0]2015-11-07 17:01:30 | This process will exit on any fault.
[0]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[2]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[25] Academic License
[0]2015-11-07 17:01:30 | About to initialize MPI.
[7] Academic License
[13]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[20] Academic License
[16]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[16]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[8] Academic License
[18] Academic License
[16]2015-11-07 17:01:30 | This process will exit on any fault.
[30]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[28]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[31]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[30]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[24] Academic License
[12]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[16]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[30]2015-11-07 17:01:30 | This process will exit on any fault.
[16]2015-11-07 17:01:30 | About to initialize MPI.
[23] Academic License
[10]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[9]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[15] Academic License
[30]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[9]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[17] Academic License
[29]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[9]2015-11-07 17:01:30 | This process will exit on any fault.
[30]2015-11-07 17:01:30 | About to initialize MPI.
[9]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[2]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[2]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[9]2015-11-07 17:01:30 | About to initialize MPI.
[2]2015-11-07 17:01:30 | This process will exit on any fault.
[26]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[19]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[2]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[11] Academic License
[13]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[13]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[2]2015-11-07 17:01:30 | About to initialize MPI.
[5]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[13]2015-11-07 17:01:30 | This process will exit on any fault.
[28]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[28]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[31]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[31]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[28]2015-11-07 17:01:30 | This process will exit on any fault.
[12]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[13]2015-11-07 17:01:30 | Unexpected error setting up process monitor. Error returned:
[13]Unexpected Standard exception from MEX file.
[13]What() is:boost::thread_resource_error
[13]..
[13]Error in distcomp_evaluate_filetask_core>iSetupProcessMonitoringThreads (line 622)
[13] dct_psfcns('pidwatch', pidToWatch)
[13]Error in distcomp_evaluate_filetask_core>iMaybeSetupProcessMonitoringThreads (line 256)
[13] iSetupProcessMonitoringThreads;
[13]Error in distcomp_evaluate_filetask_core>iSetup (line 506)
[13]iMaybeSetupProcessMonitoringThreads();
[13]Error in distcomp_evaluate_filetask_core (line 25)
[13] runprop = iSetup(handlers, mdceDebugEnabled, outputWriterStack, isSyncTaskEvaluation, varargin);
[12]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[13]2015-11-07 17:01:30 | About to exit with code: 1
[31]2015-11-07 17:01:30 | This process will exit on any fault.
[28]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[12]2015-11-07 17:01:30 | This process will exit on any fault.
[10]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[10]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[28]2015-11-07 17:01:30 | About to initialize MPI.
[31]2015-11-07 17:01:30 | This process will exit when its parent process dies.
job aborted:
rank: node: exit code[: error message]
0: 127.0.0.1: -2
1: 127.0.0.1: -2
2: 127.0.0.1: -2
3: 127.0.0.1: -2
4: 127.0.0.1: -2
5: 127.0.0.1: -2
6: 127.0.0.1: -2
7: 127.0.0.1: -2
8: 127.0.0.1: -2
9: 127.0.0.1: -2
10: 127.0.0.1: -2
11: 127.0.0.1: -2
12: 127.0.0.1: -2
13: 127.0.0.1: -2: process 13 exited without calling init while other processes have called init
14: 127.0.0.1: -2
15: 127.0.0.1: -2
16: 127.0.0.1: -2
17: 127.0.0.1: -2
18: 127.0.0.1: -2
19: 127.0.0.1: -2
20: 127.0.0.1: -2
21: 127.0.0.1: -2
22: 127.0.0.1: -2
23: 127.0.0.1: -2
24: 127.0.0.1: -2
25: 127.0.0.1: -2
26: 127.0.0.1: -2
27: 127.0.0.1: -2
28: 127.0.0.1: -2
29: 127.0.0.1: -2
30: 127.0.0.1: -2
31: 127.0.0.1: -2
Stage: Pool job test (createCommunicatingJob)
Status: Skipped
Description:Validation skipped due to previous failure.
Command Line Output:(none)
Error Report:(none)
Debug Log:(none)
Stage: Parallel pool test (parpool)
Status: Skipped
Description:Validation skipped due to previous failure.
Command Line Output:(none)
Error Report:(none)
Debug Log:(none)
Answers (1)
Edric Ellis
9 Nov 2015
This looks like your machine ran out of resources while trying to start up the workers. Do you have any ulimit in effect?
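For anyone wanting to check this quickly, here is a minimal sketch that prints the soft and hard limits for the current user, roughly what `ulimit -a` reports in the shell. It uses Python's standard `resource` module. The choice of which limits to print is an assumption on my part: `boost::thread_resource_error` generally means a thread could not be created, and on Linux threads count toward the per-user process limit, so `ulimit -u` is the first one I would look at. On RHEL/CentOS 6 the default soft limit for non-root users is commonly 1024 (usually set in /etc/security/limits.d/90-nproc.conf), but verify that on your own machine rather than taking it from here.

```python
import resource

# Limits that commonly matter when many worker processes start at once.
# Which limit (if any) is the culprit on a given machine is an assumption.
LIMITS = {
    "max user processes (ulimit -u)": resource.RLIMIT_NPROC,
    "open files (ulimit -n)": resource.RLIMIT_NOFILE,
    "stack size in bytes (ulimit -s reports kB)": resource.RLIMIT_STACK,
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {description: (soft, hard)} for the current process."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={fmt(soft)} hard={fmt(hard)}")
```

Run it as the same user, and from the same kind of shell, that launches MATLAB, since limits are inherited from the launching session. If the soft limit for user processes is low, raising it for that user and retrying the profile validation would be a reasonable next test.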
4 Comments
Darwin
17 Oct 2016
I manage MATLAB on Linux HPC machines, and with parpool I can use as many workers as there are cores on one node. Hyperthreading does not work properly under CentOS.
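If it helps to see how many physical cores Linux reports on a node (as opposed to logical CPUs), here is a rough sketch that counts unique (physical id, core id) pairs in /proc/cpuinfo. The function name and the sample text are made up for illustration, and it assumes the kernel exposes the `physical id` and `core id` fields, which is typical on x86 but not guaranteed (some virtual machines omit them, in which case the count comes out as 0).

```python
def count_physical_cores(cpuinfo_text):
    """Count unique (physical id, core id) pairs in /proc/cpuinfo-style text."""
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line between processor blocks
        key = key.strip()
        value = value.strip()
        if key == "physical id":
            physical_id = value  # socket of the current processor block
        elif key == "core id":
            cores.add((physical_id, value))
    return len(cores)


# Hypothetical sample: 4 logical CPUs, two of which share one core.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0
processor : 1
physical id : 0
core id : 0
processor : 2
physical id : 0
core id : 1
processor : 3
physical id : 1
core id : 0
"""

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on non-Linux systems
    print("physical cores:", count_physical_cores(text))
```

The number printed is only a starting point for choosing a pool size; whether a pool of that size actually starts still depends on the resource limits discussed in the answer above.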