Why is a for loop faster on the CPU than on the GPU?


Khieu on 21 Jun 2017


Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/345626-why-for-loop-in-cpu-faster-than-in-gpu

Answered: Joss Knight on 22 Jun 2017


I am running the following code:

p = zeros(7000000,1);
g_p = gpuArray(p);

1. CPU version

for i = 1:size(p,1)
    p(i) = 10; % a trivial assignment, just for illustration
end

2. GPU version

for i = 1:size(g_p,1)
    g_p(i) = 10; % a trivial assignment, just for illustration
end

Surprisingly, the loop runs much faster on the CPU than on the GPU. Why?

4 comments

Khieu on 21 Jun 2017

Thanks Adam

Adam on 21 Jun 2017

It is often very difficult to determine where this crossover is. I have occasionally dabbled with a GPU implementation of some computation in my code, but whether it is faster or slower on a particular run is hit and miss, since data sizes vary. I often just end up commenting it out, with a vague idea that I might come back to it one day and write some fancy code to choose the CPU or GPU version based on input size and any other appropriate metrics, but I have never done so.
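The kind of size-based dispatch described above might look roughly like the following. This is only a sketch: the threshold minGpuLength is a hypothetical value chosen for illustration, the sum of squares stands in for whatever computation is being moved, and the code assumes Parallel Computing Toolbox is installed so that gpuDeviceCount and gpuArray are available.

```matlab
% Hypothetical threshold (an assumption, not a recommendation): measure it
% on your own machine, for example with timeit and gputimeit over a range
% of input sizes.
minGpuLength = 1e6;

x = rand(5e6, 1);

if numel(x) >= minGpuLength && gpuDeviceCount > 0
    xg = gpuArray(x);          % transfer the data to the device
    y = gather(sum(xg .^ 2));  % compute on the GPU, copy the result back
else
    y = sum(x .^ 2);           % small input or no GPU: stay on the CPU
end
```

The transfer to and from the device is part of the cost, so any measured threshold should include gpuArray and gather in the timed code.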


Answers (1)

Joss Knight on 22 Jun 2017


Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/345626-why-for-loop-in-cpu-faster-than-in-gpu#answer_271520

This isn't amazing at all. You are launching 7 million kernels, each to copy one value into GPU memory. The kernels cannot overlap because they are operating on the same data, which means each must wait for the last to finish. This is not at all an efficient way to do parallel programming!
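For comparison, here is a minimal sketch of the vectorized alternative, where the whole array is filled in one operation instead of one element per loop iteration. It assumes Parallel Computing Toolbox and a supported GPU; the names p and g_p mirror the question, while g_q, tCpu, and tGpu are introduced only for illustration. Timings depend on hardware, so none are quoted.

```matlab
n = 7000000;

% CPU: fill the whole vector with a single vectorized assignment.
p = zeros(n, 1);
p(:) = 10;

% GPU: the same single indexed assignment, issued once for all elements
% rather than once per loop iteration.
g_p = zeros(n, 1, 'gpuArray');
g_p(:) = 10;

% Alternatively, create the filled array directly on the device.
g_q = 10 * ones(n, 1, 'gpuArray');

% timeit and gputimeit repeat the call several times; gputimeit also
% waits for the GPU to finish before it stops the clock.
tCpu = timeit(@() 10 * ones(n, 1));
tGpu = gputimeit(@() 10 * ones(n, 1, 'gpuArray'));
fprintf('CPU: %.4g s, GPU: %.4g s\n', tCpu, tGpu);
```

Whether the GPU wins here still depends on the device and the array size; the point is only that one array-wide operation gives the GPU something it can run in parallel.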