gpuArray slower on newer graphics card in double precision

I have been running the following speed test in R2015a on two different computers with two different graphics cards:
>> A=gpuArray(rand(5e3));
>> T=gputimeit(@()A*A)
The first computer is an older model (Dell Precision T7500) running an older graphics card (GTX 580). The second, newer computer (Dell Precision Tower 7910) is running a newer graphics card (Titan X).
Oddly, I find that the older configuration outperforms the newer one by roughly 17%. The GTX 580 gives T=1.1178 seconds, whereas the Titan X gives T=1.3097 seconds. When I redo the test in single precision,
>> A=gpuArray(rand(5e3,'single'));
>> T=gputimeit(@()A*A)
the results are more in line with my expectations. The GTX 580 gives T=0.2121 seconds, whereas the Titan X gives T=0.0491 seconds.
I'm wondering what could account for this difference. One thing that might be worth mentioning is that the Titan X is not using a fully updated driver. At the time of this writing, there is a bug in its newest driver release that makes it unusable, so I am using driver version 353.62 instead. Could this be the reason? If not, any other ideas?
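For anyone who wants to repeat the comparison on their own card, here is a minimal sketch that runs both precisions in one go and prints the device properties alongside the timings. It assumes Parallel Computing Toolbox and a supported GPU; the variable names and the 2*n^3 operation count used for the GFLOPS figure are illustrative choices, not part of the original test.

```matlab
% Sketch: time A*A in double and single precision on the selected GPU.
% Requires Parallel Computing Toolbox and a supported GPU.
n = 5e3;                 % matrix size used in the question
dev = gpuDevice;         % currently selected device
disp(dev)                % shows name, compute capability, driver info, etc.

precisions = {'double', 'single'};
T = zeros(1, numel(precisions));
for k = 1:numel(precisions)
    % Same construction as in the question: build on the host, then transfer
    A = gpuArray(rand(n, precisions{k}));
    T(k) = gputimeit(@() A*A);
    % 2*n^3 is the conventional flop count for an n-by-n matrix product
    fprintf('%-6s: %.4f s, %.1f GFLOPS\n', precisions{k}, T(k), 2*n^3/T(k)/1e9);
end
fprintf('double is %.1fx slower than single\n', T(1)/T(2));
```

If the double-to-single slowdown is close to the ratio the hardware is specified for, the driver is probably not what is holding the card back.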

7 Comments

Cedric on 31 Jul 2015
Edited: Cedric on 31 Jul 2015
Surprising. My older EVGA Titan Black gives T=1.0277 in double, and T=0.0737 in single.
No answer, but an anecdote: when I first installed the Titan Black, the power supply could not support the load, and I got terrible performance, crashes, etc. I spent a good 4 hours cursing at Windows, MATLAB (not proud of that though ;-)), and EVGA, until I got lucky enough that a heavy computation triggered a power-off of the machine, which helped me find the cause.
Matt J on 3 Aug 2015
Thanks, Cedric. It makes me wonder if I should roll back to the Titan Black.
Not sure why the power failure was a "lucky" thing, though. I'm seeing a fair number of crashes on both the Titan X and the GTX 580 as well. What was the solution? Did you just need a computer with a stronger power supply?
Cedric on 3 Aug 2015
Edited: Cedric on 3 Aug 2015
Hi Matt, it was a good thing because otherwise I would never have thought about the power supply. My PSU has two pairs of 6- and 8-pin PCI-E power outputs; one pair is white-black (6+8) and the other is blue-black (6+8). I used both white-black outputs at first and it crashed. Then I mixed them and it worked (I also tried a dual 4-pin + adapter and that went well too), which seems to indicate that the pairs are wired to separate circuits internally and that mixing simply splits the load.
Brendan Hamm on 3 Aug 2015
The Titan X is a terrible card to use for GPGPU, as it was designed as a cheaper alternative to the other Titans with a focus on single precision (gaming). On the Maxwell chips, the GFLOPS for double precision are about 1/32 of those for single precision. Compare that with the Fermi architecture used in the GTX 580, which has 1/5 the GFLOPS for double precision compared with single precision. If you intend to use the card for double precision, I would highly recommend the Titan Z (or Black), which use the Kepler architecture. So if you have a Titan Black, switching to it would not be rolling back at all, but rather using a card that was designed with double precision in mind.
Cedric on 3 Aug 2015
This looks like an answer!
Matt J on 3 Aug 2015
Edited: Matt J on 3 Aug 2015
Brendan's response does indeed look like an answer, and it is supported by this article. So, Brendan, if you resubmit it as an Answer, I will accept it.
Ultimately, though, my computationally intensive work will mainly be single precision. I was just curious about the behavior I was seeing, and whether it might be due to a bad driver. So, I don't know if "the Titan X is a terrible card" is applicable to me.
Brendan Hamm on 3 Aug 2015
Added double precision to that terrible line :)


Accepted Answer

Brendan Hamm on 3 Aug 2015

2 votes

The Titan X is a terrible card to use for double-precision GPGPU, as it was designed as a cheaper alternative to the other Titans with a focus on single precision (gaming). On the Maxwell chips, the GFLOPS for double precision are about 1/32 of those for single precision. Compare that with the Fermi architecture used in the GTX 580, which has 1/5 the GFLOPS for double precision compared with single precision. If you intend to use the card for double precision, I would highly recommend the Titan Z (or Black), which use the Kepler architecture. So if you have a Titan Black, switching to it would not be rolling back at all, but rather using a card that was designed with double precision in mind.
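As a rough sanity check, the timings reported in the question can be turned into achieved GFLOPS and a double-versus-single slowdown per card. The sketch below is plain arithmetic and needs no GPU; it assumes the conventional count of 2*n^3 floating-point operations for an n-by-n matrix product, and the variable names are only illustrative.

```matlab
% Sketch: convert the timings from the question into GFLOPS and ratios.
n = 5e3;
flops = 2*n^3;                  % conventional flop count for n-by-n A*A

cards   = {'GTX 580', 'Titan X'};
tDouble = [1.1178, 1.3097];     % seconds, double precision (from the question)
tSingle = [0.2121, 0.0491];     % seconds, single precision (from the question)

gflopsDouble = flops ./ tDouble / 1e9;
gflopsSingle = flops ./ tSingle / 1e9;
ratio = tDouble ./ tSingle;     % how many times slower double is than single

for k = 1:numel(cards)
    fprintf('%-8s double %6.1f GFLOPS, single %7.1f GFLOPS, double is %.1fx slower\n', ...
        cards{k}, gflopsDouble(k), gflopsSingle(k), ratio(k));
end
```

With these numbers, double precision comes out about 5.3 times slower than single on the GTX 580 and about 26.7 times slower on the Titan X, which is roughly in line with the 1/5 and 1/32 ratios quoted above.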

1 Comment

Brendan Hamm on 3 Aug 2015
Edited: Brendan Hamm on 3 Aug 2015
More info can be found here as well: NVIDIA Comparison Wiki.
For single-precision work, the Titan X is the card to use, so it looks like you made a good choice. It does have fewer cores than the Titan Z, but a higher clock rate and a lower price point.


Asked: 31 Jul 2015

Edited: 3 Aug 2015
