Very bad performance of parallel tasks calling a Matlab precompiled executable

Daniel on 25 Aug 2012
Hi all,
I’m having trouble understanding the behaviour of Matlab executables in a parallel environment. I’m not talking about Matlab's Parallel Computing Toolbox at all, but about a much simpler procedure.
My C++ program creates 128 MPI tasks. Each task makes a system call that invokes an executable that was written in Matlab and compiled with mcc (with the explicit flag “-singleCompThread”). Each of these system calls applies the same function to an independent subset of my data, and there is no need for communication between processors.
That should be the ideal setting for parallel computing… but my timings are miserable. The speed-up for 128 processors is nowhere near 100x: it is about 15x.
This I cannot understand. I would expect either fairly good speed-ups (as typically obtained in such “embarrassingly parallel” applications) or a speed-up of 1x (if the MCR allowed only a single instance at a time). This in-between behaviour puzzles me…
Has anybody come across this problem at some point?
Thanks, Daniel

Accepted Answer

Daniel on 27 Aug 2012
Well, I found the answer to my own question, in case other people come across the same problem: I just needed to provide a path for a temporary folder used by the MCR library.
I included these lines in the script that I submit to the cluster:
export MCR_CACHE_ROOT=./tmp/mcr_cache
mkdir -p $MCR_CACHE_ROOT
... and that's it...
  1 Comment
Walter Roberson on 27 Aug 2012
Would that have the effect of having them all looking to unpack in the same directory?
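The thread does not settle whether sharing one cache directory among all tasks causes trouble. If it did, one option would be to give each MPI task its own cache directory. The following is only a sketch under assumptions: the helper name mcr_cache_dir is made up for illustration, the rank is read from OMPI_COMM_WORLD_RANK (set by Open MPI) or PMI_RANK (set by some MPICH-based launchers), and the lines would have to run inside each task (for example in a small wrapper script invoked by the system call), since variables exported before the MPI launch are the same for every task.

```shell
#!/bin/sh
# Sketch: build a per-rank MCR cache directory (helper name is hypothetical).
# Usage: mcr_cache_dir BASE_DIR
mcr_cache_dir() {
    base=$1
    # Assumption: the MPI launcher exports one of these rank variables;
    # fall back to 0 when neither is set (e.g. a serial test run).
    rank=${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-0}}
    printf '%s/rank_%s\n' "$base" "$rank"
}

# Use node-local scratch if the scheduler provides TMPDIR, else /tmp.
MCR_CACHE_ROOT=$(mcr_cache_dir "${TMPDIR:-/tmp}/mcr_cache")
export MCR_CACHE_ROOT
mkdir -p "$MCR_CACHE_ROOT"

# The compiled executable would be started here, for example:
# exec ./my_compiled_app "$@"    # hypothetical executable name
```

Whether separate directories actually help depends on how the MCR handles concurrent access to its cache, which is not confirmed here.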


More Answers (1)

Walter Roberson on 25 Aug 2012
There is a quite high start-up time for MATLAB executables. They have to do the internal equivalent of starting up MATLAB.
Note that there is expected to be little speed-up for a MATLAB executable compared to starting MATLAB. You do get the benefit of not having to parse the routines (but you could pcode to avoid that within MATLAB), but on the other hand each executable could end up unpacking all the CTF components into a directory, which would usually be more work than the parsing.
You might get better performance by using MATLAB as the coordinating routine, starting a pool of workers.
  2 Comments
Daniel on 26 Aug 2012
Thanks Walter, but I guess that shouldn't be the problem in this case: each of the MPI tasks assigned to a single core takes hours to complete, so the start-up time should be negligible compared with the computing time.
Walter Roberson on 26 Aug 2012
What is the 15x measured relative to?
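The baseline matters because speed-up is the ratio of a reference run time to the parallel run time, and parallel efficiency is the speed-up divided by the number of processors. As a rough sketch (the function names speedup and efficiency_pct are made up for illustration), the figures quoted in the question, 15x on 128 processors, correspond to an efficiency of roughly 12%:

```shell
#!/bin/sh
# speedup T_REF T_PAR: ratio of reference run time to parallel run time.
speedup() {
    awk -v t1="$1" -v tp="$2" 'BEGIN { printf "%.2f\n", t1 / tp }'
}

# efficiency_pct SPEEDUP NPROCS: parallel efficiency in percent.
efficiency_pct() {
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.1f\n", 100 * s / p }'
}

# Figures from the question: 15x speed-up on 128 processors.
efficiency_pct 15 128
```

The reference time passed to speedup could be a single-task run of the compiled executable or a run inside interactive MATLAB; the two baselines can give quite different numbers.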

