How are iterations assigned to workers in parfor?

Yi-xiao Liu on 8 Jan 2020
Commented: Edric Ellis on 31 Jul 2020
I am currently using parfor to process multiple raw data files. Inside the loop, each iteration first checks whether the raw file has already been processed, and only processes it if no output exists yet, like this:
RawDatalist = dir(fullfile(RawDataFolder, '*.txt'));
NumRawData = length(RawDatalist);
parfor i = 1:NumRawData
    % Placeholder: set to true if output for RawDatalist(i).name already exists
    outputExists = false;
    if outputExists
        Execute = false;
    else
        Execute = true;
    end
    if Execute
        % Process RawDatalist(i).name
    end
end
Obviously, some iterations will take less time than others because no calculation is involved. I am wondering whether iterations are 1) divided among the workers at the start of the parfor loop, or 2) handed out one by one as each worker becomes available. In the first case, some workers will sit idle while others are busy, and I would need to move the existence check out of the parfor loop.
2 Comments
Mohammad Sami on 8 Jan 2020
Edited: Mohammad Sami on 8 Jan 2020
Please see the documentation: https://www.mathworks.com/help/releases/R2019b/parallel-computing/decide-when-to-use-parfor.html. The code block in your parfor loop should be independent of other iterations. This ensures correct parallel execution. The documentation states:
Each execution of the body of a parfor-loop is an iteration. MATLAB workers evaluate iterations in no particular order and independently of each other. Because each iteration is independent, there is no guarantee that the iterations are synchronized in any way, nor is there any need for this. If the number of workers is equal to the number of loop iterations, each worker performs one iteration of the loop. If there are more iterations than workers, some workers perform more than one loop iteration; in this case, a worker might receive multiple iterations at once to reduce communication time.
Yi-xiao Liu on 8 Jan 2020
"a worker might receive multiple iterations at once to reduce communication time."
I would like to know more details: exactly when will a worker receive multiple iterations at once?
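For readers on R2019a or later, the partitioning can be set explicitly with parforOptions rather than left to the default heuristics. The following is a minimal sketch, not a description of the default behavior: the loop body, the variable names `n` and `out`, and the `pause` call are stand-ins invented for illustration. It assumes Parallel Computing Toolbox and an available pool.

```matlab
pool = gcp();            % current pool; may start one, depending on preferences
n = 12;                  % illustrative iteration count
out = zeros(1, n);

% 'fixed' partitioning with SubrangeSize 1 limits each subrange sent to a
% worker to a single iteration, at the cost of more communication.
opts = parforOptions(pool, 'RangePartitionMethod', 'fixed', 'SubrangeSize', 1);

parfor (i = 1:n, opts)
    pause(0.1 * mod(i, 3));   % stand-in for iterations of uneven cost
    out(i) = i^2;
end
```

A larger SubrangeSize reduces communication overhead but makes it more likely that one worker ends up holding several expensive iterations.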


Accepted Answer

Edric Ellis on 8 Jan 2020
As @Mohammad already commented, the parfor implementation automatically divides up the iterations of the loop onto the workers. Since R2019a, you can have some control over this division using parforOptions. The default division works well in most situations, even when the loop iterations do not take equal amounts of time. However, if there is a large imbalance, the division might not work well, and it may indeed be worth pre-computing which iterations need real work to be done.
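Pre-computing which iterations need real work could look roughly like the sketch below. The temporary folders, the file names, and the rule that NAME.txt produces NAME.mat in `OutputFolder` are all assumptions made so the example runs on its own; the real naming rule in the question is not stated.

```matlab
% Hypothetical setup: three raw files, one of which is already processed.
RawDataFolder = tempname;  mkdir(RawDataFolder);
OutputFolder  = tempname;  mkdir(OutputFolder);
for k = 1:3
    fclose(fopen(fullfile(RawDataFolder, sprintf('raw%d.txt', k)), 'w'));
end
fclose(fopen(fullfile(OutputFolder, 'raw1.mat'), 'w'));

RawDatalist = dir(fullfile(RawDataFolder, '*.txt'));

% Assumed naming rule: NAME.txt -> NAME.mat in OutputFolder.
outputFile = @(name) fullfile(OutputFolder, [name(1:end-4) '.mat']);

% Do the cheap existence check serially, before the parallel loop.
alreadyDone = arrayfun(@(f) exist(outputFile(f.name), 'file') == 2, RawDatalist);
todo = RawDatalist(~alreadyDone);

parfor i = 1:numel(todo)
    % Process todo(i).name here; every iteration now does real work.
    fprintf('Processing %s\n', todo(i).name);
end
```

Because only unprocessed files reach the parfor loop, the loop no longer contains near-instant iterations, so the remaining imbalance comes only from differences in processing time.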
4 Comments
Isidro Losada López on 31 Jul 2020
Hi,
One question related to that: I'm trying to run a parfor loop with more iterations than there are workers available in my cluster. Some of the iterations are much more expensive than others, so it would be better to compute the heavy iterations before the lighter ones. As you mentioned above, R2019a offers a way to influence this with parforOptions. The problem is that I'm using MATLAB R2014b, hahaha. So, is the order of the iterations totally random, or does MATLAB follow some pattern to decide the order?
For example, I have a set of different molecules, with their information stored in a cell array. Every element is a structure with atomic positions, atomic numbers, etc. Each iteration computes the energy and its derivatives and stores them in the same cell array. But because some molecules are bigger than others, it would be better to compute the biggest molecules first. Here is some dummy code to illustrate what I mean:
nw = 20; % number of cores
nm = 40; % number of molecules
myCluster = parcluster('local');
myParpool = parpool(myCluster, nw); % start parallel pool
mol = cell(1, nm);
parfor im = 1:nm
    mol{im} = energyFunction(mol{im});
end
What could I do to force MATLAB to follow a specific order when computing the iterations? Would changing the order of the molecules in the cell array "mol" change anything? Or is the order totally random in this version?
Thanks in advance!
Edric Ellis on 31 Jul 2020
If you know which computations are likely to take longest, you can force the ordering by using parfeval and initiating those computations first.
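One way to apply this suggestion is sketched below. The `pos` field, the atom count as a cost estimate, and the stand-in `energyFunction` are hypothetical and exist only to make the example self-contained; the real structure fields and energy routine are not shown in the thread.

```matlab
% Hypothetical stand-ins so the sketch runs on its own.
nm = 8;
mol = cell(1, nm);
for im = 1:nm
    mol{im} = struct('pos', rand(5*im, 3));   % molecule im has 5*im atoms
end
energyFunction = @(m) setfield(m, 'energy', sum(m.pos(:).^2));

% Assumed cost estimate: atom count. Submit the largest molecules first.
cost = cellfun(@(m) size(m.pos, 1), mol);
[~, order] = sort(cost, 'descend');

pool = gcp();   % current pool; may start one, depending on preferences
for k = 1:nm
    futures(k) = parfeval(pool, energyFunction, 1, mol{order(k)});
end

% Collect results as they finish; idx indexes into futures, not into mol.
for k = 1:nm
    [idx, result] = fetchNext(futures);
    mol{order(idx)} = result;
end
```

Requests are submitted in the order of the loop, so the expensive molecules are queued ahead of the cheap ones, and `order(idx)` maps each finished future back to its original position in `mol`.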

Release: R2018a