Can I run a MATLAB function on multiple GPUs?
I work on image processing and handle large arrays of images. I need to run MATLAB functions, for example a median filter, on multiple GPUs. I can run them on one GPU, but I have four GPUs, so I would like each MATLAB function to use all four. Is it possible to run any MATLAB function on all 4 GPUs instead of just 1?
4 Comments
Joss Knight
6 Feb 2020
Well, if it's a colour image you could process each channel independently.
Walter Roberson
6 Feb 2020
True, and sometimes large images are multichannel / multispectral, so there might plausibly be more than 3 channels.
It is questionable whether it is efficient to transfer large planes to other processes, which would then have to transfer them to the GPU, with the filtered result going through the same two transfer steps on the way back. But it would be worth the experiment.
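The per-channel idea from the two comments above could be tried with something like the following. This is only a sketch under stated assumptions: the image is synthetic random data, the variable names are made up for illustration, and it assumes a parallel pool with no more workers than GPUs so that each worker gets its own device. Without a usable GPU it falls back to the CPU, and without Parallel Computing Toolbox the parfor loop simply runs serially.

```matlab
% Synthetic 3-channel image so the sketch runs without any files (assumption).
rgb = uint8(255 * rand(256, 256, 3));
useGPU = canUseGPU();            % fall back to the CPU when no GPU is usable
out = zeros(size(rgb), 'like', rgb);

parfor c = 1:size(rgb, 3)
    plane = rgb(:, :, c);        % one channel is sent to each worker
    if useGPU
        plane = gpuArray(plane); % second transfer: worker memory -> GPU
    end
    % gather copies the result back from the GPU (CPU arrays pass through).
    out(:, :, c) = gather(medfilt2(plane, [3 3]));
end
```

Wrapping the loop in tic/toc and comparing against a plain medfilt2 call per channel would show whether the two transfer steps cancel out the gain, which is the experiment suggested above.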
Accepted Answer
Jason Ross
4 Feb 2020
2 Comments
Muhammad Abir
4 Feb 2020
Thanks for your comment. However, I don't see any example. All I found is how to run on multiple GPUs in a for loop (https://www.mathworks.com/help/parallel-computing/examples/run-matlab-functions-on-multiple-gpus.html). For a specific function such as medfilt2, I have no for loop. In that case, how can I run it on multiple GPUs? I'd really appreciate it if you could give me an example. Thank you.
Jason Ross
4 Feb 2020
You would need to put the function in a parfor loop and iterate over your image array. This example reads all jpg files in /tmp, opens a parallel pool with as many workers as there are GPUs in your host, processes the images on the pool, and closes the pool when it is done.
As Walter indicates, this isn't sharing resources between four GPUs. Under the hood it launches four MATLAB workers and schedules the work on each GPU as it becomes available. But it may be sufficient for your needs?
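% Read every JPEG in /tmp into a cell array on the client.
filePattern = fullfile('/tmp', '*.jpg');
jpegFiles = dir(filePattern);
images = cell(1, numel(jpegFiles));
for k = 1:numel(jpegFiles)
    baseFileName = jpegFiles(k).name;
    fullFileName = fullfile('/tmp', baseFileName);
    fprintf(1, 'Now reading %s\n', fullFileName);
    images{k} = imread(fullFileName);
end
% One worker per GPU; each worker is assigned its own GPU automatically.
parpool('local', gpuDeviceCount);
filtoutput = cell(1, numel(images));
parfor ii = 1:numel(images)
    G = gpuArray(images{ii});               % copy the image to this worker's GPU
    gs = rgb2gray(G);                       % assumes the JPEG is an RGB image
    filtoutput{ii} = gather(medfilt2(gs));  % filter on the GPU, copy the result back
end
delete(gcp('nocreate'));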
More Answers (1)
Walter Roberson
4 Feb 2020
No, there is no way to run a user function on multiple GPUs at this time.
At this time, each process can use gpuDevice to select a single GPU (this is done automatically for parpool members when the pool is no larger than the number of GPUs). Selecting a second GPU device would reset the first device.
MathWorks is working on sharing load between two hardware-connected NVIDIA GPUs, and has implemented it for one particular kind of deep learning and for one other task that is not coming to mind at the moment. Unfortunately, I am having difficulty finding the appropriate postings.
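The one-GPU-per-process behaviour described above can be checked with a short sketch like the one below. It assumes Parallel Computing Toolbox and at least one supported GPU; the pool size follows the accepted answer, and labindex is the worker index inside spmd (newer releases also offer spmdIndex). Nothing here shares a single computation across GPUs; it only reports which device each worker ended up with.

```matlab
nGPU = gpuDeviceCount;
if nGPU == 0
    disp('No supported GPU found; nothing to report.');
else
    % Reuse an existing pool if there is one, otherwise open one worker per GPU.
    pool = gcp('nocreate');
    if isempty(pool)
        pool = parpool('local', nGPU);
    end
    spmd
        d = gpuDevice;   % the device currently selected on this worker
        fprintf('Worker %d uses GPU %d (%s)\n', labindex, d.Index, d.Name);
    end
end
```

If the pool has more workers than GPUs, several workers will report the same device index.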
2 Comments
shital shinde
13 Feb 2020
Actually, I am trying to use parallelization for image processing, and I have taken on the task of parallelizing image denoising. I know a simple method for denoising, but I am confused about what to do after that. Could anyone please help me with the code?
k=imread('cameraman.jpg');
k = rgb2gray(k);
figure;
subplot(221);
imshow(k);
n=imnoise(k,'salt & pepper',0.01);
subplot(222);
imshow(n);
v=medfilt2(n,[3 3]);
subplot(223);
imshow(v);
I am done with this, but I don't know how to parallelize it. Please help me with that.
Thank you.
Walter Roberson
13 Feb 2020
You can replace
n=imnoise(k,'salt & pepper',0.01);
with
per_side = 4;
fun = @(block_struct) imnoise(block_struct.data, 'salt & pepper', 0.01);
blocksize = floor([size(k, 1)/per_side, size(k,2)/per_side]);
n = blockproc(k, blocksize, fun, 'UseParallel', true);
This would break up the image into per_side by per_side pieces (and possibly a fractional piece as well) and would run imnoise upon each piece, using the Parallel Computing Toolbox, and you would have achieved your goal of using parallelization as part of the process.
You should, by the way, expect that this would be notably slower than just using imnoise for the entire matrix. imnoise() is completely vectorized; the overhead of splitting up the file and calling multiple functions and dispatching to the parpool processes, and so on, is going to overwhelm the gains from using multiple CPUs.
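The slowdown claim is easy to measure. The following is a rough timing sketch, not a benchmark: the image is synthetic, the 1024-by-1024 size is an arbitrary choice, and the license check is only an assumed way of deciding whether to ask blockproc for parallel execution. The first parallel call may also include pool start-up time.

```matlab
k = uint8(255 * rand(1024, 1024));   % synthetic grayscale image (assumption)
per_side = 4;
blocksize = floor([size(k, 1)/per_side, size(k, 2)/per_side]);
fun = @(block_struct) imnoise(block_struct.data, 'salt & pepper', 0.01);

% Assumption: only request parallel execution when the toolbox is licensed.
useParallel = logical(license('test', 'Distrib_Computing_Toolbox'));

% Time the vectorized call on the whole image against the block-wise version.
tWhole  = timeit(@() imnoise(k, 'salt & pepper', 0.01));
tBlocks = timeit(@() blockproc(k, blocksize, fun, 'UseParallel', useParallel));
fprintf('whole image: %.4f s, blockproc: %.4f s\n', tWhole, tBlocks);
```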
v=medfilt2(n,[3 3]);
medfilt2, on the other hand, cannot simply break the values into blocks and process the blocks independently, because two adjacent values that happen to fall into different fixed-size blocks influence each other. You would have to use blockproc with overlap turned on, and be careful about how you dealt with the edges and with the partial blocks at the right and bottom. You would also have the overhead of splitting up the file, calling multiple functions, dispatching to the parpool processes, and so on. It would not be at all surprising if the result were slower than just working with a single CPU.
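One way the overlap could be set up is with the 'BorderSize' and 'TrimBorder' options of blockproc. This is a sketch under assumptions rather than a tested recipe: the image is synthetic, its 100-by-130 size is chosen so that a partial block appears on the right edge, and the license check is an assumed way to decide on parallel execution. A one-pixel border is enough for a 3-by-3 window, and since both blockproc and medfilt2 pad with zeros by default, the block-wise result should match the whole-image result; the last line checks that instead of taking it on faith.

```matlab
img = uint8(255 * rand(100, 130));            % synthetic image (assumption)
n = imnoise(img, 'salt & pepper', 0.01);

per_side = 4;
blocksize = floor([size(n, 1)/per_side, size(n, 2)/per_side]);
fun = @(block_struct) medfilt2(block_struct.data, [3 3]);

% Assumption: only request parallel execution when the toolbox is licensed.
useParallel = logical(license('test', 'Distrib_Computing_Toolbox'));

% Each block is extended by one pixel on every side before filtering, and
% the extra border is trimmed from the output again.
v_blocks = blockproc(n, blocksize, fun, ...
    'BorderSize', [1 1], 'TrimBorder', true, 'UseParallel', useParallel);

v_whole = medfilt2(n, [3 3]);                 % reference: filter the whole image
fprintf('Results identical: %d\n', isequal(v_blocks, v_whole));
```

A larger filter window would need a correspondingly larger border: floor(w/2) pixels for a w-by-w window.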