gpuArray slower on newer graphics card in double precision

Question

2 votes

I have been running the following speed test in R2015a on two computers with different graphics cards:

    >> A=gpuArray(rand(5e3));
    >> T=gputimeit(@()A*A)

The first computer is an older model (Dell Precision T7500) running an older graphics card (GTX 580). The second, newer computer (Dell Precision Tower 7910) is running a newer graphics card (Titan X).

Oddly, I find that the older configuration outperforms the newer by about 20%. The GTX 580 gives T=1.1178 seconds, whereas the Titan X gives T=1.3097 seconds. When I redo the test in single precision,

    >> A=gpuArray(rand(5e3,'single'));
    >> T=gputimeit(@()A*A)

the results are more in line with my expectations. The GTX 580 gives T=0.2121 seconds, whereas the Titan X gives T=0.0491 seconds.

I'm wondering what could account for this difference. One thing that might be worth mentioning is that the Titan X is not using a fully updated driver. At the time of this writing, there is some bug in its newest driver release, making it unusable, and I am instead using driver version 353.62. Could this be the reason? If not, any other ideas?

7 Comments

Matt J on 3 Aug 2015

Edited: Matt J on 3 Aug 2015

Brendan's response does indeed look like an answer, and is supported by this article. So, Brendan, if you resubmit it as an Answer, I will accept it.

Ultimately, though, my computationally intensive work will mainly be single precision. I was just curious about the behavior I was seeing, and whether it might be due to a bad driver. So, I don't know if "the Titan X is a terrible card" is applicable to me.

Brendan Hamm on 3 Aug 2015

Added double precision to that terrible line :)


Accepted Answer

Brendan Hamm on 3 Aug 2015

2 votes

The Titan X is a terrible card to use for double precision GPGPU, as it was designed as a cheaper alternative to the other Titans with a focus on single precision (gaming). You will see that the double precision GFLOPS on the Maxwell chips is about 1/32 of the single precision figure. Compare that with the Fermi architecture used on the GTX 580, whose double precision GFLOPS is about 1/5 of its single precision figure. If you intend to use the card for double precision, I would highly recommend the Titan Z (or Black), which use the Kepler architecture. So if you have a Titan Black, switching to it would not be rolling back at all, but rather using a card that was designed with double precision as a priority.
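As a rough cross-check of those ratios, the timings reported in the question can be converted into achieved throughput. The sketch below assumes a dense n-by-n matrix product costs about 2n^3 floating point operations, which is an approximation and not something stated in the thread. The variable names are illustrative only.

```matlab
% Timings reported in the question (seconds) for A*A, A being 5000-by-5000.
n = 5e3;
flops = 2*n^3;   % assumed approximate flop count of a dense matrix product

tDouble = struct('GTX580', 1.1178, 'TitanX', 1.3097);
tSingle = struct('GTX580', 0.2121, 'TitanX', 0.0491);

cards = fieldnames(tDouble);
for k = 1:numel(cards)
    c = cards{k};
    gflopsDouble = flops / tDouble.(c) / 1e9;   % achieved double precision GFLOPS
    gflopsSingle = flops / tSingle.(c) / 1e9;   % achieved single precision GFLOPS
    ratio = tDouble.(c) / tSingle.(c);          % how many times slower double is
    fprintf('%-7s double: %7.1f GFLOPS, single: %7.1f GFLOPS, slowdown: %4.1fx\n', ...
        c, gflopsDouble, gflopsSingle, ratio);
end
```

With the numbers from the question, double precision comes out about 5.3 times slower than single on the GTX 580 and about 26.7 times slower on the Titan X, which is broadly in line with the 1/5 and 1/32 figures quoted above.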

1 Comment

Brendan Hamm on 3 Aug 2015

Edited: Brendan Hamm on 3 Aug 2015

More info can be found here as well: NVidia Comparison Wiki.

For single precision work, the Titan X is the card to use, so it looks like you made a good choice. It does have fewer cores than the Titan Z, but a higher clock rate and a lower price point.
