Why does innerjoin not work in parfor?

Question

Boram Lim, 4 May 2018

Votes: 0

Direct link to this question: https://jp.mathworks.com/matlabcentral/answers/399159-why-innerjoin-does-not-work-in-parfor

Answered: Edric Ellis, 8 May 2018

While debugging a parfor loop, I found that the call to innerjoin (lines 10-12 below) causes a problem. The code works in an ordinary for-loop but fails with parfor. Why does it cause a problem? I use innerjoin to randomly sample 'id' (one of the variables in my data) and merge the sample with the original dataset (dta2 here). Any ideas or solutions? Please let me know if anything needs to be clarified.

parpool(4)
N_boot = 5;
coeff_out2 = zeros(N_boot,N_coef);
parfor i = 1:N_boot
    dta2 = dta;
    decisions2 = unique(dta2.decision_id);
    Ndecisions2 = size(decisions2,1);
    sampled_id01 = randsample(decisions2,Ndecisions2,true);
    sampled_id2 = dataset2table(mat2dataset(sampled_id01));
    sampled_id2.Properties.VariableNames{1} = 'decision_id';
    resample_dta = innerjoin(sampled_id2,dta2,'Keys','decision_id');
    resample_dta = table2array(resample_dta);
    result1 = mean(resample_dta(:,1:4));
    coeff_out2(i,:) = result1;
end

Comments

Boram Lim, 5 May 2018

Error using mat2dataset (line 63)
Transparency violation error.
See Parallel Computing Toolbox documentation about Transparency

Error in Model01_interpolated_May1 (line 62)
parfor i = 1:N_boot

Boram Lim, 5 May 2018

This is the error message.


Answer 1

Edric Ellis, 8 May 2018

Votes: 2

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/399159-why-innerjoin-does-not-work-in-parfor#answer_319246

(Cross-posted from an identical question on Stack Overflow.)

Unfortunately, innerjoin uses the inputname function, which is causing the "transparency violation" error. There's a simple workaround, which is to wrap the call to innerjoin, like so:

innerjoinFcn = @(varargin) innerjoin(varargin{:});
parfor ...
    ...
    resample_dta = innerjoinFcn(sampled_id2,dta2,'Keys','decision_id');
end
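For reference, here is a minimal self-contained sketch of the wrapper approach described above. The toy table, the column d1, and the output variable nRows are assumptions made for illustration and are not the asker's data. The ids are resampled with randi rather than randsample so that the sketch does not depend on an extra toolbox, and the sampled ids are put into a table with an explicit column name.

```matlab
% Toy data (hypothetical): decision_id repeats, as in the question.
rng(0);
decision_id = randi([1 9], 50, 1);
d1 = randn(50, 1);
dta = table(decision_id, d1);

% Wrap innerjoin in an anonymous function created outside the parfor loop.
innerjoinFcn = @(varargin) innerjoin(varargin{:});

N_boot = 5;
nRows = zeros(N_boot, 1);
parfor i = 1:N_boot
    ids = unique(dta.decision_id);
    % Sample ids with replacement (randi is used here instead of randsample).
    sampled = ids(randi(numel(ids), numel(ids), 1));
    % Give the column an explicit name when building the table.
    sampled_id2 = table(sampled, 'VariableNames', {'decision_id'});
    % Call the wrapper, not innerjoin directly, inside the loop body.
    resample_dta = innerjoinFcn(sampled_id2, dta, 'Keys', 'decision_id');
    nRows(i) = height(resample_dta);
end
disp(nRows)
```

Each resample contains one copy of an id's rows for every time that id was drawn, so the row count varies from one iteration to the next.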


Answer 2

Walter Roberson, 5 May 2018

Votes: 0

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/399159-why-innerjoin-does-not-work-in-parfor#answer_318776

I can get further:

decision_id = randi([1 9], 50, 1);
d1 = randi([-10 10], 50, 1);
d2 = randi([-2 2], 50, 1);
d3 = randi([0 255], 50, 1);
dta = table(decision_id, d1, d2, d3);
N_coef = 4;
cp = gcp('nocreate');
if isempty(cp); parpool(4); end
N_boot = 5;
coeff_out2 = zeros(N_boot,N_coef);
parfor i = 1:N_boot
    dta2 = dta;
    decisions2 = unique(dta2.decision_id);
    Ndecisions2 = size(decisions2,1);
    decision_id = randsample(decisions2,Ndecisions2,true);
    sampled_id2 = table(decision_id, 'VariableNames', {'decision_id'});
    resample_dta = innerjoin(sampled_id2,dta2,'Keys','decision_id');
    resample_dta = table2array(resample_dta);
    result1 = mean(resample_dta(:,1:4));
    coeff_out2(i,:) = result1;
end

This fails at the innerjoin call rather than earlier.

The conversion to a table ran into problems when no variable names were supplied at construction. That could hypothetically be explained if the variable names themselves were not guaranteed to be the same on the workers, because by default a table takes its column names from the names of the variables being converted.
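The default naming behaviour mentioned here can be seen outside any parfor loop. In this small sketch the variable name x is an arbitrary choice for illustration: without an explicit name the column is called after the workspace variable, while the 'VariableNames' option fixes the name regardless of what the input variable is called.

```matlab
x = (1:3)';

% Column name is taken from the name of the workspace variable.
t1 = table(x);

% Column name is given explicitly, independent of the variable's name.
t2 = table(x, 'VariableNames', {'decision_id'});

disp(t1.Properties.VariableNames)
disp(t2.Properties.VariableNames)
```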

We could hypothesize that something similar might be happening with the innerjoin.

I am not sure how to fix it yet, as I am still trying to figure out what the intention of the code is, especially in regard to what should happen when there are multiple table entries with the same key.

Or is it safe to assume that the decision_id values will be unique? If so, the call to unique would seem to be redundant.

Comments

Walter Roberson, 5 May 2018

Right, but to do this efficiently I need to know whether decision_id is unique in dta, and if it is not, what sampling with it should mean.

Boram Lim, 5 May 2018

It is not unique. As shown in the example in the link, I first need to sample 5 ids from the unique decision_id values and then build a new dataset from them (it is for bootstrapping). Do you understand what I want to do in the link? Using innerjoin worked in a plain for-loop, as in the answer to the question in the link, but it seems I need to find an alternative that works in parfor.
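One possible alternative, sketched here under stated assumptions and not proposed by anyone in this thread, is to skip the join entirely and resample by row index: record once which rows belong to each id, then in each bootstrap iteration draw ids with replacement and stack the matching rows. The toy table, the column d1, and the names rowsById and boot_mean are hypothetical.

```matlab
% Toy data (hypothetical): decision_id is not unique.
rng(0);
decision_id = randi([1 9], 50, 1);
d1 = randn(50, 1);
dta = table(decision_id, d1);

% Map each unique id to the row numbers where it occurs (done once, outside the loop).
[ids, ~, grp] = unique(dta.decision_id);
rowsById = accumarray(grp, (1:height(dta))', [], @(r) {r});

N_boot = 5;
boot_mean = zeros(N_boot, 1);
parfor i = 1:N_boot
    % Draw positions into ids with replacement.
    pick = randi(numel(ids), numel(ids), 1);
    % Stack the rows of every drawn id; an id drawn twice contributes its rows twice.
    rows = vertcat(rowsById{pick});
    resample_dta = dta(rows, :);
    boot_mean(i) = mean(resample_dta.d1);
end
disp(boot_mean)
```

This should yield the same set of rows as joining the sampled ids against dta, only in a different order, which does not matter for statistics such as the mean.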
