Awkward Behavior on MATLAB Distributed Computing Server

Reynaldo, 26 March 2013
I am seeing odd behavior when using the MATLAB Distributed Computing Server. I have 8 nodes, each with 4 cores. The issue appears when I try to open a large pool for the first time: I can only open a large MATLAB pool if I first open a series of smaller pools, increasing the number of workers each time.
Here's what I am doing:
  • I add all cores on all nodes to the available pool using Admin Center.
  • I try to open a 32-worker MATLAB pool. It gives me an error.
  • Next, I open a 2-worker MATLAB pool. It succeeds.
  • I close the pool.
  • I open a 4-worker MATLAB pool. It succeeds.
  • I close the pool.
  • I keep repeating these steps until I reach 32 workers.
Also, once I have finally opened 32 workers, I can close the pool and open a 32-worker MATLAB pool again, and it always succeeds unless I power off the server. If I power off the server, I have to repeat the steps above before I can open a 32-worker pool again. Any clue as to what might be happening? Thanks!
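The manual ramp-up described above could be scripted so that the first failing size and its error text are recorded. This is only a minimal sketch: it assumes the default cluster profile points at this cluster, and the doubling sequence of sizes is an assumption (the post only mentions 2, 4, and 32).

```matlab
% Sketch: step through increasing pool sizes and record any failure.
% The size sequence is an assumption; adjust it to the sizes actually tried.
poolSizes = [2 4 8 16 32];

for n = poolSizes
    try
        % Function form of the command "matlabpool open <n>".
        matlabpool('open', num2str(n));
        fprintf('Opened a pool of %d workers.\n', matlabpool('size'));
        matlabpool('close');
    catch err
        % Keep the full message: this is the text to post with the question.
        fprintf('Opening %d workers failed:\n%s\n', n, err.message);
    end
end
```

Running it once right after a server restart, starting directly at 32, and once with the full sequence would show whether the smaller pools really are a precondition.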
  4 comments
Friedrich, 27 March 2013
What error do you get in the first place?
Jason Ross, 27 March 2013 (edited 27 March 2013)
Are you using a single MATLAB install (meaning you install it on one node or a shared file system and use that via the network)?
What OS are you running on?
As noted above, the error message and the MATLAB version are also needed.
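The version and platform details requested above can be printed from the client session and pasted into a reply. A minimal sketch using standard MATLAB functions; it reports the client only, so the worker nodes would need to be checked separately if they use a different install:

```matlab
% Sketch: print the client-side details requested in the comments.
fprintf('MATLAB version: %s\n', version);
fprintf('Platform:       %s\n', computer);
% List installed products and their releases, including the
% Parallel Computing Toolbox.
ver
```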


Answers (1)

Sam Marshalik, 10 May 2013
Do you receive any errors when you try to open 32 workers right off the bat?
Before running matlabpool 32 or similar, can you please run setSchedulerMessageHandler(@disp)? This prints what MATLAB is trying to do, and there may be a helpful hint in the output.
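Put together, the diagnostic run might look like the sketch below. It assumes the default cluster profile is the one that fails; the try/catch wrapper and the use of getReport are additions for capturing the error text, not part of the suggestion itself.

```matlab
% Sketch: echo scheduler messages while attempting the large pool directly.
setSchedulerMessageHandler(@disp);

try
    % Function form of the command "matlabpool open 32".
    matlabpool('open', '32');
    fprintf('Pool size: %d\n', matlabpool('size'));
    matlabpool('close');
catch err
    % Print the full error report so it can be posted verbatim.
    disp(getReport(err));
end
```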
Lastly, I suggest shutting down MDCE on each node and then restarting it. After that, if you are using MJS, restart MJS and the workers in Admin Center.
NOTE: When starting the MDCE service again, use the -clean flag so that it is not held back by state left over from previous issues: mdce start -clean.
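After the restart, it may help to confirm that the job manager reports all workers as idle before opening the pool. This is a hedged sketch, assuming an MJS cluster reached through the default profile and a release that provides parcluster (R2012a or later); the isprop guard is there because other cluster types do not expose these worker counts.

```matlab
% Sketch: check how many workers the default cluster currently reports.
c = parcluster();   % cluster object for the default profile

if isprop(c, 'NumIdleWorkers')
    fprintf('Workers registered: %d, idle: %d, busy: %d\n', ...
        c.NumWorkers, c.NumIdleWorkers, c.NumBusyWorkers);
else
    disp('Default profile is not an MJS cluster; check the workers in Admin Center instead.');
end
```

If fewer than 32 workers are listed as idle right after a restart, the pool request for 32 would be expected to fail or wait, which would point at worker registration rather than at matlabpool itself.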
- Sam
