3D gpuArray, length of each dimension has influence on speed?

Question

Hao Zhang on 21 May 2014

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/130553-3d-gpuarray-length-of-each-dimension-has-influence-on-speed

Commented: Joss Knight on 16 June 2014

Hi,

I am doing 3D-FDTD simulations using gpuArray.

Example 1: the grid size is 32 x 289 x 289, so each field component is a 3D array of size 32 x 289 x 289. In this case the code runs at 17.1 million FDTD cells per second.

Example 2: I simply change the grid size to 289 x 289 x 32. The same code now runs at 24.6 million cells per second, a gain of about 44%!

Does anyone understand how MATLAB distributes the array across the GPU's processors? Because of the geometry of the device I am simulating, the first dimension always has to be the smallest. I could change my code so that the current first dimension becomes the third, but revising the code would take a lot of effort. So is there a way to distribute a gpuArray along a specified dimension, as codistributor1d does for codistributed arrays? Any help would be appreciated. Thanks, everybody!
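
Editor's sketch: the comparison described above could be reproduced along the following lines. The original FDTD code is not shown in the thread, so the `stencil` function here is a hypothetical stand-in with a similar neighbour-access pattern, and the variable names (`useGPU`, `sizes`, `rate`) are assumptions. The sketch falls back to the CPU when no GPU is available, in which case the timings say nothing about GPU behaviour.

```matlab
% Detect a usable GPU; fall back to the CPU if the Parallel Computing
% Toolbox or a supported device is missing.
useGPU = false;
try
    useGPU = gpuDeviceCount > 0;
catch
    % gpuDeviceCount is unavailable; keep useGPU = false.
end

% Placeholder stencil (NOT the original FDTD update): interior points
% plus scaled central differences along all three dimensions.
stencil = @(F) F(2:end-1, 2:end-1, 2:end-1) ...
    + 0.1 * (F(3:end, 2:end-1, 2:end-1) - F(1:end-2, 2:end-1, 2:end-1)) ...
    + 0.1 * (F(2:end-1, 3:end, 2:end-1) - F(2:end-1, 1:end-2, 2:end-1)) ...
    + 0.1 * (F(2:end-1, 2:end-1, 3:end) - F(2:end-1, 2:end-1, 1:end-2));

sizes = [32 289 289; 289 289 32];   % the two layouts from the question
rate  = zeros(2, 1);                % million cells per second

for k = 1:2
    F = rand(sizes(k, :), 'single');
    if useGPU
        F = gpuArray(F);
        t = gputimeit(@() stencil(F));   % GPU-aware timing
    else
        t = timeit(@() stencil(F));      % CPU timing
    end
    rate(k) = numel(F) / t / 1e6;
    fprintf('%d x %d x %d: %.1f million cells per second\n', ...
        sizes(k, 1), sizes(k, 2), sizes(k, 3), rate(k));
end
```

Both layouts contain the same number of cells, so any difference in `rate` comes from the ordering of the dimensions alone.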


Answer 1

John D'Errico on 21 May 2014

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/130553-3d-gpuarray-length-of-each-dimension-has-influence-on-speed#answer_137743

Edited: John D'Errico on 21 May 2014

No. I sincerely doubt that you could ever simply tell MATLAB to think of an array as being stored in memory in a different order WITHOUT actually changing that order.

You get the speed bump because of the order in which elements are stored in memory. If you need that speed bump, then do the work to get it. (In fact, all it takes is a permute when you make the appropriate call, so I don't totally see the problem. But maybe there is an issue in your code.)
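
Editor's sketch: the storage-order point can be made concrete on the CPU, with no GPU required. MATLAB stores arrays in column-major order, so the first subscript varies fastest in memory. The example below uses the grid size from the question; the variable names (`sz`, `strides`, `A`, `B`) are illustrative only.

```matlab
sz = [32 289 289];                     % original layout from the question

% Distance in memory (in elements) between neighbours along each
% dimension of a column-major array.
strides = [1, sz(1), sz(1) * sz(2)];   % [1 32 9248]

% Confirm the strides with the linear indices of an element and its
% three neighbours.
base = sub2ind(sz, 5, 5, 5);
assert(sub2ind(sz, 6, 5, 5) - base == strides(1));
assert(sub2ind(sz, 5, 6, 5) - base == strides(2));
assert(sub2ind(sz, 5, 5, 6) - base == strides(3));

% permute physically reorders the data so the short dimension is last.
A = reshape(1:prod(sz), sz);
B = permute(A, [2 3 1]);               % size 289 x 289 x 32
assert(isequal(size(B), [289 289 32]));
assert(B(7, 9, 3) == A(3, 7, 9));      % B(j, k, i) equals A(i, j, k)
```

With the 32-long dimension first, contiguous runs in memory are only 32 elements long; after the permute they are 289 elements long. That is a plausible reason for the difference in speed, though the thread does not confirm the exact mechanism.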

2 Comments

Hao Zhang on 21 May 2014

Hi John, thanks for your answer, but what I mean is distributing the gpuArray across the GPU's processing units, the way codistributor1d does.

Joss Knight on 16 June 2014

What do you mean by 'each GPU processing unit'? Do you have more than one GPU card? Or are you referring to the thousands of individual GPU cores, or the SMs, on that card? If the latter, then this is perhaps the source of your confusion. These cores to all intents and purposes share memory; the only meaningful way to 'distribute' parts of the array to each thread or thread block is to permute the array.
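
Editor's sketch: one way to apply the permute suggestion without rewriting the geometry setup is to permute once before the time-stepping loop and restore the original order once afterwards. The update inside the loop is a hypothetical placeholder, not the poster's FDTD stencil, and the names `E`, `Ep`, `order`, and `useGPU` are assumptions. The code runs on the CPU when no GPU is found, since the layout logic is the same either way.

```matlab
% Hypothetical field component in the original 32 x 289 x 289 layout.
E = rand(32, 289, 289, 'single');
order = [2 3 1];                 % move the short first dimension to the end

% Use the GPU when one is available.
useGPU = false;
try
    useGPU = gpuDeviceCount > 0;
catch
    % gpuDeviceCount is unavailable; stay on the CPU.
end
if useGPU
    E = gpuArray(E);
end

Ep = permute(E, order);          % 289 x 289 x 32, done once before the loop

for step = 1:10
    % Placeholder update: forward difference along what was originally
    % dimension 2 (dimension 1 after the permute).
    Ep(1:end-1, :, :) = Ep(1:end-1, :, :) + 0.1 * diff(Ep, 1, 1);
end

Eout = ipermute(Ep, order);      % back to 32 x 289 x 289
if useGPU
    Eout = gather(Eout);         % copy the result back to host memory
end
```

The two permutes cost one pass over the data each, which is usually small next to many time steps. The stencil indexing inside the loop still has to follow the permuted order.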
