How to perform feval function operation correctly on multiple GPUs

Question

轶凡 on 14 Nov 2024 at 13:33

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/2166328-how-to-perform-feval-function-operation-correctly-on-multiple-gpus

Commented: 轶凡 on 15 Nov 2024 at 2:30

Accepted Answer: Joss Knight

Hello!

Thank you in advance for your help!

I am currently using MATLAB R2021b on a Linux system with 4 GPUs, of which I only want to use the first two. I created a parallel pool for those two GPUs and use a CUDAKernel object inside a parfor loop. However, the CUDAKernel object cannot be serialized.

Following a suggestion from Joss Knight (many thanks for his help), I used parallel.pool.Constant to construct a Constant object so that the kernel is passed to the worker processes only once. This solved the serialization issue, but I am still unable to generate the desired output using feval.

The error message is: Attempting to access the property or method of an invalid object

Here is my code. I use parpool to create a parallel pool for the first two GPUs and use the CUDAKernel object inside the parfor loop:

parent_addr = '/data/mice_data';
date_list = '20230410mice';
if ~isempty(gcp('nocreate'))
    delete(gcp('nocreate')); 
end
device_num = gpuDeviceCount;
using_num = device_num/2;
parpool('local',using_num); % Use the first two GPUs
system(['nvcc -ptx CF.cu' cl_addr]); % Compile PTX
kCF = parallel.gpu.CUDAKernel('CF.ptx', 'CF.cu', '_CF');
kCF.ThreadBlockSize = [1024,1,1];
kCF.GridSize = [ceil(2000/kCF.ThreadBlockSize(1)),64*64, 128];
setConstantMemory(kCF,...
    Some constant data...
    );
kernel_const = parallel.pool.Constant(@()kCF);
data_addr = fullfile(parent_addr,date_list);
[mice_addr_list,mice_name_list] = findMiceFolder(data_addr); % user-defined function: retrieves the folders for all mice
parfor no_mice = 1:numel(mice_addr_list)
    rf_addr = fullfile(data_addr,mice_name_list{no_mice},'RF'); % RF path
    [rf_addr_list,rf_name_list] = findMat(rf_addr,'RFData'); % user-defined function: retrieves the RF .mat files
    for no_frame = 11:numel(rf_addr_list)
        temp_rf_addr_name = rf_addr_list{no_frame};
        rf_matrix = load(temp_rf_addr_name);
        RFData_gpu = gpuArray(single(rf_matrix));
        output_gpu = gpuArray(single(zeros(2000,128,64*64)));
        gd = gpuDevice(2-mod(no_mice,2));
        wait(gd);
        [output_gpu] = feval(kernel_const.Value, output_gpu,RFData_gpu); % the error occurs here
    end
    
end

Answer 1

Joss Knight on 14 Nov 2024 at 13:50

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/2166328-how-to-perform-feval-function-operation-correctly-on-multiple-gpus#answer_1545333

What you have passed to parallel.pool.Constant is a function that returns the captured object; what you wanted was a function that constructs the object:

kernel_const = parallel.pool.Constant(@construct_kernel);
...
    
function kernel = construct_kernel()
kernel = parallel.gpu.CUDAKernel('CF.ptx', 'CF.cu', '_CF');
kernel.ThreadBlockSize = [1024,1,1];
kernel.GridSize = [ceil(2000/kernel.ThreadBlockSize(1)),64*64, 128];
setConstantMemory(kernel,...
    Some constant data...
    );
end

Now each worker will construct its own CUDAKernel object rather than you trying to pass it from the client.
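
For illustration, the worker-side construction can be combined with the parfor loop roughly as follows. This is only a sketch under assumptions: the pool size of two, the use of GPU indices 1 and 2, the loop count, and the input array size are placeholders rather than details confirmed in the question, and the setConstantMemory call is left out because its data is not shown. The sketch selects the GPU once per worker before the kernel is built, since the gpuDevice documentation states that gpuDevice(idx) resets the chosen device, which would invalidate gpuArray and CUDAKernel data that already exist on it.

```matlab
% Sketch: one GPU per worker, kernel constructed on the worker itself.
% Assumes CF.ptx and CF.cu are on the path of every worker.
pool = parpool('local', 2);          % two workers for the first two GPUs

spmd
    % labindex is the worker index on R2021b; select the GPU once,
    % before any gpuArray or CUDAKernel is created on this worker.
    gpuDevice(labindex);
end

% Each worker evaluates construct_kernel and keeps its own kernel object.
kernel_const = parallel.pool.Constant(@construct_kernel);

parfor no_mice = 1:4                 % placeholder iteration count
    % Placeholder input; the real code loads the RF data from disk here.
    RFData_gpu = rand(2000, 128, 'single', 'gpuArray');
    output_gpu = zeros(2000, 128, 64*64, 'single', 'gpuArray');
    % kernel_const.Value is the kernel that this worker built for itself.
    output_gpu = feval(kernel_const.Value, output_gpu, RFData_gpu);
end

function kernel = construct_kernel()
% Runs on the worker, so nothing GPU-related has to be serialized.
kernel = parallel.gpu.CUDAKernel('CF.ptx', 'CF.cu', '_CF');
kernel.ThreadBlockSize = [1024, 1, 1];
kernel.GridSize = [ceil(2000/kernel.ThreadBlockSize(1)), 64*64, 128];
end
```

Note that there is no gpuDevice call inside the loop; each worker keeps the device it selected in the spmd block.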

If building the kernel requires dynamic information, you can pass it into construct_kernel as additional arguments.
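
One possible way to pass such arguments is to wrap the call in an anonymous function, so that only plain data is captured and sent to the workers. In the sketch below, the parameters n_samples and block_size are hypothetical names chosen for illustration, and the constant-memory setup is again omitted because its data is not shown in the thread.

```matlab
% Sketch: pass plain numbers to the workers, not the CUDAKernel itself.
% n_samples and block_size are hypothetical parameters for illustration.
n_samples  = 2000;
block_size = 1024;

% The anonymous function captures only two numeric values, which
% serialize without trouble; each worker builds its own kernel from them.
kernel_const = parallel.pool.Constant(@() construct_kernel(n_samples, block_size));

function kernel = construct_kernel(n_samples, block_size)
% Evaluated on the worker, so the kernel belongs to that worker's GPU.
kernel = parallel.gpu.CUDAKernel('CF.ptx', 'CF.cu', '_CF');
kernel.ThreadBlockSize = [block_size, 1, 1];
kernel.GridSize = [ceil(n_samples / block_size), 64*64, 128];
end
```

The point of this design is that the captured variables are ordinary MATLAB values; a CUDAKernel captured in the same way would bring back the original serialization problem.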

1 Comment

轶凡 on 15 Nov 2024 at 2:30

Thank you for your help, I have resolved the issue!
