
Why does my GTX Titan Black GPU underperform in double precision calculations in MATLAB R2015a?

I experience unexpectedly slow performance of the GPU in double precision benchmarks.
I have a fast PC (Intel i7-4790 3.6 GHz, 16 GB of 1600 MHz memory, Windows 7 64-bit) with an NVIDIA GeForce GTX Titan Black GPU in a PCIe 3.0 x16 slot and an 850 W power supply. I have downloaded the video drivers and the CUDA toolkit and installed the MATLAB Parallel Computing Toolbox:
>> gpuDevice
ans =
  CUDADevice with properties:
                      Name: 'GeForce GTX TITAN Black'
                     Index: 1
         ComputeCapability: '3.5'
            SupportsDouble: 1
             DriverVersion: 7
            ToolkitVersion: 6.5000
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 6.4425e+09
           AvailableMemory: 6.2105e+09
       MultiprocessorCount: 15
              ClockRateKHz: 980000
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 1
          CanMapHostMemory: 1
           DeviceSupported: 1
            DeviceSelected: 1
I then downloaded the GPU benchmarking tool by the MathWorks Parallel Computing Toolbox Team (version updated 05 Jan 2015) from http://www.mathworks.com/matlabcentral/fileexchange/34080-gpubench and executed "gpuBench".
The results show that my GPU performs similarly to the Quadro K6000 in the single precision benchmarks. The deviations of up to 40% are expected: both cards have the same number of CUDA cores, but the memory bandwidth is higher for my Titan Black and the amount of memory is higher for the K6000.
However, the GeForce GTX Titan Black performs 4 times (!) slower than the Quadro K6000 in the double precision benchmarks! This is unexpected for several reasons.

A) Both cards are fairly similar:

Specification        K6000       Titan Black
CUDA cores           2880        2880
Clock                902 MHz     889 MHz
Memory clock         6 Gbps      7 Gbps
Memory bandwidth     288 GB/s    336 GB/s
B) There are benchmarking tests done by the MathWorks Parallel Computing Toolbox Team, shown in the attached file "Older benchmarks for GPUs". In those results, a GPU very similar to mine, the GeForce GTX Titan (an older GPU with 2688 CUDA cores, 837 MHz clock, 6 Gbps memory clock and 288 GB/s memory bandwidth), shows benchmarks very similar to the Quadro K6000:
                   DOUBLE                        SINGLE
Card               MTimes   Backslash   FFT      MTimes   Backslash   FFT
K6000              1092     421         160      3017     831         334
GTX Titan          1106     352         150      2933     582         298
My GPU              252     163         110      4221     994         409
These results indicate that my GPU card (GeForce GTX Titan Black) should be faster than or similar to the Quadro K6000. However, its double precision performance is terrible (about 4x slower in MTimes).
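The ratios behind the "4x" and "up to 40%" figures can be checked directly from the numbers in the table. The snippet below is only a small sketch of that arithmetic; the variable names are made up for illustration and the values are copied from the benchmark table in the question.

```matlab
% Benchmark scores copied from the table in the question.
% Column order in every vector: MTimes, Backslash, FFT.
k6000Double      = [1092 421 160];
titanBlackDouble = [ 252 163 110];
k6000Single      = [3017 831 334];
titanBlackSingle = [4221 994 409];

% How many times faster the K6000 is in double precision.
doubleRatio = k6000Double ./ titanBlackDouble;

% How many times faster the Titan Black is in single precision.
singleRatio = titanBlackSingle ./ k6000Single;

fprintf('Double, K6000 vs Titan Black: MTimes %.1fx, Backslash %.1fx, FFT %.1fx\n', doubleRatio);
fprintf('Single, Titan Black vs K6000: MTimes %.1fx, Backslash %.1fx, FFT %.1fx\n', singleRatio);
```

With these numbers the double precision gap is largest for MTimes (about 4.3x) and smaller for Backslash (about 2.6x) and FFT (about 1.5x), while in single precision the Titan Black leads by roughly 1.2x to 1.4x.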

Accepted Answer

MathWorks Support Team, about 23 hours ago
Edited: MathWorks Support Team, about 19 hours ago
In this particular case, double precision computing needs to be enabled, which can be done using the NVIDIA Control Panel. The external article below shows how this may be done: https://forums.evga.com/When-to-Use-Double-Precision-under-NVIDIA-Control-Panel-Manage-3D-Settings-m2252867.aspx
In general, double precision computation can be much slower than single precision on many GPUs, because some of them are designed primarily for single precision work rather than for scientific calculations involving double precision numbers.
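One way to see the double versus single precision gap on a given card, and to confirm whether a driver setting changed it, is to time the same matrix multiplication in both precisions. The following is a rough sketch, not the gpuBench implementation: the matrix size N = 4096 is an arbitrary choice for illustration, and the 2*N^3 operation count is the usual approximation for a dense N-by-N multiply.

```matlab
% Rough double- vs single-precision comparison on the selected GPU.
% N is an arbitrary size chosen for illustration; reduce it if memory is tight.
N = 4096;
gpu = gpuDevice;                 % currently selected GPU

Ad = gpuArray(rand(N));          % double-precision matrix on the GPU
As = single(Ad);                 % same data in single precision

% gputimeit runs the function several times and synchronizes the GPU.
tDouble = gputimeit(@() Ad * Ad);
tSingle = gputimeit(@() As * As);

% A dense N-by-N multiply performs about 2*N^3 floating-point operations.
gflopsDouble = 2 * N^3 / tDouble / 1e9;
gflopsSingle = 2 * N^3 / tSingle / 1e9;

fprintf('%s: double %.0f GFLOPS, single %.0f GFLOPS (single/double = %.1f)\n', ...
    gpu.Name, gflopsDouble, gflopsSingle, gflopsSingle / gflopsDouble);
```

Running this before and after changing the setting in the NVIDIA Control Panel should show whether the double precision figure moved. The absolute numbers will not match gpuBench exactly, since gpuBench uses its own sizes and timing method.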
As we are unable to provide recommendations for GPU hardware, please contact NVIDIA directly for further information on this disparity in performance.

More Answers (0)
