parallelization and mex-compiled code

I want to optimize a mex-compiled function (Fortran 90 source) defined over a 1D interval by computing its values on a sufficiently fine sampling. It works fine with a for-loop, but when I try parfor (for speed) the mex-compiled code crashes (I get an error from one of the workers). Is this a documented problem, and does anyone have suggestions for how to localize what goes wrong?
I run MATLAB R2013a on Ubuntu 13.10 on a 16-core (32 virtual) machine, and I get 12 workers when I run matlabpool.

1 Comment

Matt J on 6 Feb 2014
No, there is no general prohibition against using mex files with parfor. Show us the plain for-loop and the parfor version.


Accepted Answer

Matt J on 6 Feb 2014
Edited: Matt J on 6 Feb 2014


You should try running a plain for-loop first, but with the iterations in random order, i.e., instead of
for i=1:n
...
end
run as
for i=randperm(n)
...
end
This is a good way to test whether your code is independent of the order of the iterations (a basic requirement of parfor) before the Parallel Computing Toolbox even gets involved.
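As a rough sketch of that check (not the poster's actual code): the anonymous function f below is a hypothetical stand-in for the mex routine, and the variable names are made up for illustration. The idea is to fill one result vector in natural order and one in shuffled order, then compare them.

```matlab
% Sketch of an order-independence check using a plain for-loop.
% f is a HYPOTHETICAL stand-in; replace it with the real mex function.
f = @(k) (k - 121).^2;

n = 360;
inOrder  = zeros(1, n);
shuffled = zeros(1, n);

% Reference run: iterations in natural order.
for k = 1:n
    inOrder(k) = f(k);
end

% Test run: same iterations, visited in random order.
for k = randperm(n)
    shuffled(k) = f(k);
end

% If each iteration is independent of the others, both runs agree.
assert(isequal(inOrder, shuffled), 'Results depend on iteration order');
```

If the shuffled run crashes or the assertion fails, the function probably carries state from one call to the next, which would need fixing before parfor can be expected to work.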

5 Comments

martin on 6 Feb 2014
Edited: Matt J on 6 Feb 2014
My script:
cat timeme_straightsearch.m
stupidvector=zeros(1,360);
disp('search with conventional for-loop:')
tic
for countindex = 1:360
stupidvector(countindex)=localsearchstrul_valuespectral(countindex);
end
[foundpsi,whichelement]=min(stupidvector)
toc
disp('search with parfor-loop:')
tic
parfor countindex = 1:360
stupidvector(countindex)=localsearchstrul_valuespectral(countindex);
end
[foundpsi,whichelement]=min(stupidvector)
toc
and in matlab:
>> matlabpool
Starting matlabpool using the 'local' profile ... connected to 12 workers.
>> timeme_straightsearch
search with conventional for-loop:
foundpsi =
-5.3948
whichelement =
121
Elapsed time is 186.328337 seconds.
search with parfor-loop:
Error using distcomp.remoteparfor/getCompleteIntervals (line 22)
The session that parfor is using has shut down.
Error in timeme_straightsearch (line 13)
parfor countindex = 1:360
Caused by:
Error using distcomp.remoteparfor/getCompleteIntervals (line 22)
The session that parfor is using has shut down.
The client lost connection to lab 1. This might be due to network problems, or the interactive matlabpool job might have errored.
Matt J on 6 Feb 2014
Are your parallel labs on remote machines? It does look to me like a network error, as the error message suggests.
martin on 6 Feb 2014
No, it's one machine. My interpretation of the error was that the process reporting the error was simply aware that the process on the worker had died.
Matt J on 6 Feb 2014
Can you try it on a different machine to see if it's a hardware problem? I don't see anything wrong with the code.
martin on 9 Feb 2014
Thanks for your input, I will try another machine asap. Just an additional observation: the program crashes on the Fortran 90 statement "call mxCopyPtrToReal8(inptr_xdim,realxdim,1)", i.e. a standard construction right out of the manual pages for mex.



Asked: 6 Feb 2014

Commented: 9 Feb 2014
