How to avoid broadcast variable while optimizing a cost function in parallel computing?

Question

Shmuel Lorber on 6 Dec 2022


https://jp.mathworks.com/matlabcentral/answers/1872057-how-to-avoid-broadcast-variable-while-optimizing-a-cost-function-in-parallel-computing

Answered: Edric Ellis on 7 Dec 2022

I'm trying to minimize an expensive cost function (the largest matrix in it is 2500x2500) using PSO with parallel computing. A single (!) iteration takes a couple of days and I'm not sure why. I would be very thankful for any help.

I use parallel computing to speed things up, but I currently get the message "The entire array or structure 'CostFunction' is a broadcast variable. This might result in unnecessary communication overhead". These are the problematic lines:

parfor i=1:nPop
    % Evaluation (position value in the cost function)
    particle(i).Cost = CostFunction(particle(i).Position);
end

Here CostFunction is a function handle I defined earlier in the code, and its input changes on each iteration.

Using the MATLAB Profiler I managed to get statistics on the running time of my code, which show that most of the running time is spent in that single parfor loop.

Here ICF is my original cost function, and diss and null are its children. As I understand from the flame graph, ICF and its children are not children of the parfor loop, hence the running time is divided between the loop and the cost function separately. I don't know what the time-consuming Java method is, but I do know it's part of the parallel process.

So I'm basically asking two questions:

Is the broadcast variable problem the cause of the long running time?
How can I avoid broadcasting my cost function?

Thanks in advance.


Answer 1

Edric Ellis on 7 Dec 2022


https://jp.mathworks.com/matlabcentral/answers/1872057-how-to-avoid-broadcast-variable-while-optimizing-a-cost-function-in-parallel-computing#answer_1122132

Investigating performance of parfor loops can be a bit tricky. Here are a few pointers:

Do you happen to know if your function already benefits from MATLAB's intrinsic multi-threading? (Check using your system's "Task Manager" or equivalent.) If so, using only local workers with PCT will not speed things up, as you are already using all your machine's resources. (Process workers run in single-threaded mode, so each worker might well process things more slowly than your client - but if you've got several of them, you can still get a speedup overall.)
You can check the data transfer size using ticBytes and tocBytes. However, 2500x2500 is not particularly large, and I wouldn't expect it to cause things to take that long.
You can use mpiprofile to profile the execution time on the workers - the client profile only shows that you're waiting for workers to complete their work. (This works fine with parfor, despite the name.)
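
The last two pointers can be sketched roughly as follows. This is only an illustration under stated assumptions: the matrix A, the norm-based CostFunction, the swarm size nPop and the positions array are hypothetical stand-ins, not the original ICF code or the asker's particle struct. The sketch first measures the data transferred by the loop with ticBytes and tocBytes, then repeats the loop under mpiprofile so that time spent on the workers becomes visible.

```matlab
% Hypothetical stand-in for the real problem: a function handle that
% captures a 2500x2500 matrix and is evaluated once per particle.
A = rand(2500);                    % assumed data captured by the handle
CostFunction = @(x) norm(A * x);   % placeholder cost, NOT the original ICF
nPop = 20;                         % assumed swarm size
positions = rand(2500, nPop);      % one column per particle (sliced in parfor)
costs = zeros(1, nPop);

pool = gcp();                      % start or reuse the current parallel pool

% 1) Measure the data sent to and received from the workers by this loop.
ticBytes(pool);
parfor i = 1:nPop
    costs(i) = CostFunction(positions(:, i));
end
bytes = tocBytes(pool);            % one row per worker: [sent to, received from]
fprintf('Sent to workers:       %.1f MB\n', sum(bytes(:, 1)) / 1e6);
fprintf('Received from workers: %.1f MB\n', sum(bytes(:, 2)) / 1e6);

% 2) Profile the same loop on the workers rather than on the client.
mpiprofile on
parfor i = 1:nPop
    costs(i) = CostFunction(positions(:, i));
end
mpiprofile viewer                  % opens the per-worker profile report
```

If the transferred volume turns out to be small compared with the run time, the broadcast warning is unlikely to be the bottleneck, and the per-worker report from mpiprofile is the better place to look for the time actually spent inside the cost function.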