About parallel computation and inter process communication
Hello all!
I have a piece of code that finds patterns in sequences of strings of varying length. Nothing overly complex, except that the main code includes three loops. The basic premise is as follows:
- Load the entire data set (essentially as a cell array) consisting of rows of these sequences.
- Run the main code
- Write the output to a file.
Run sequentially, without any parallel directives, this process takes "x" seconds.
Now: if I change this to:
- Load the entire data set
- Start matlabpool
- invoke spmd(n)
- Run the main code.
- Write the output to file.
The run time is approximately "10x"!!
The machine on which this is being run: 12 GB RAM, 6-core i7.
From my understanding, upon invoking spmd (since I am just interested in letting different workers perform the same job on different sets of data), the total data set is automatically divided. So, logically, the run time should decrease.
However, while trying to figure this out, I also divided the data set into process-specific files, loaded by each worker based on its "labindex". That, too, provided neither relief nor answers.
I have some background with MPI and F90, so I am assuming that the significantly increased run time with more than one worker is probably due to inter-process communication. If so, is there any way to prevent it?
The problem I am trying to solve is a disjoint one. One set of data has no bearing on the other, so there is no real need for one worker to talk to another.
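For reference, the approach I tried looks roughly like this. This is a simplified sketch: `allData` and `processOne` are placeholder names standing in for the actual data set and pattern-search routine, not the real code.

```matlab
% Sketch: partition a disjoint workload by labindex inside spmd so that
% no inter-worker communication is needed. 'allData' and 'processOne'
% are placeholders, not names from the real code.
allData    = num2cell(rand(1, 120));   % stand-in for the sequence data
processOne = @(s) numel(s);            % stand-in for the pattern search

matlabpool open                        % (parpool in later releases)
spmd
    % compute this worker's contiguous slice of indices
    n     = numel(allData);
    edges = round(linspace(0, n, numlabs + 1));
    mine  = (edges(labindex) + 1):edges(labindex + 1);
    % each worker touches only its own slice of the data
    localOut = cellfun(processOne, allData(mine));
end
out = [localOut{:}];                   % gather results from the Composite
matlabpool close
```

Note that even here, the whole of `allData` is copied to every worker when the spmd block starts; the per-worker files keyed by labindex were my attempt to avoid that copy.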
Any insight would be greatly appreciated. This really has me intrigued.
Cheers!
Answers (1)
Edric Ellis
14 Jul 2014
What sort of data are you passing into SPMD? Inside SPMD, only distributed arrays are automatically operated on in parallel. For example:
x = rand(5000);
xd = distributed.rand(5000);
spmd
    x = x * x;    % all workers operate on their own full copy of 'x'
    xd = xd * xd; % each worker has a slice of 'xd', and they collaborate
end
3 Comments
Edric Ellis
15 Jul 2014 (edited)
Unless you need the (MPI-style) communication available within SPMD, you might be better off using PARFOR which can automatically divide up your problem. For example:
% build 'c', a 50x1 cell array where each cell is 100x100
c = mat2cell(rand(5000, 100), 100 * ones(50, 1), 100);
% preallocate the output, then operate on 'c' in parallel
out = cell(numel(c), 1);
parfor idx = 1:numel(c)
    out{idx} = max(abs(eig(c{idx})));
end
The key to getting PARFOR working in this case is that you index into your cell array ("c" in the above example) using the loop variable - this ensures the data is 'sliced', and therefore can be operated on efficiently in parallel.
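To make the slicing requirement concrete, here is a small sketch contrasting the two cases. In the first loop 'c' is indexed directly by the loop variable, so it is sliced and each worker receives only its own cells; in the second, the indirect index defeats the slicing analysis, so the whole of 'c' is broadcast to every worker. (The permutation is just an illustrative way to break slicing.)

```matlab
% Sliced vs broadcast variables in PARFOR.
c   = mat2cell(rand(5000, 100), 100 * ones(50, 1), 100);
out = zeros(numel(c), 1);

parfor idx = 1:numel(c)
    out(idx) = max(abs(eig(c{idx})));       % sliced: only c{idx} is sent
end

perm = randperm(numel(c));
parfor idx = 1:numel(c)
    out(idx) = max(abs(eig(c{perm(idx)}))); % broadcast: all of 'c' is sent
end
```

For large data sets, the communication cost of the broadcast version can easily swamp the computation, which is one way a parallel loop ends up slower than the serial one.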