parallel.gpu.CUDAKernel slow on GTX 1080
I executed this MATLAB command to load a CUDA kernel:
KNNSearchGPU = parallel.gpu.CUDAKernel('Search.ptx','Search.cu');
It took about a minute on a computer with a GTX 1080, but less than a second on one with a GTX TITAN. Both have CUDA 8.0 RC installed on Ubuntu 14.04.
This happens even for an empty kernel like this in Search.cu:
__global__ void Search( float * result, const int * args, const float * pc1, const float * pc2)
{
}
From this discussion, I noticed that MATLAB may not yet support this new card: http://www.mathworks.com/matlabcentral/answers/79275-gpudevice-command-very-slow
If that's the case, when will MATLAB support the GTX 1080? Will it be in R2016b?
1 Comment
Joss Knight
15 June 2016
You need to use the toolkit supported by MATLAB, namely CUDA 7.5. If you still see the problem on your GTX 1080, can you:
- Let us know what commands you are executing on the command line to compile your PTX code.
- Let us know whether the performance problem occurs every time you load the kernel or just once, and whether running another GPU function first (e.g. gpuDevice) resolves it.
Accepted Answer
Other Answers (3)
Bosco Tjan
5 September 2016
1 vote
Thank you, Ritesh, for your timely answer! We installed a Titan X (Pascal) board and are experiencing the same issue. A follow-up question: by a "one-time compilation", do you mean one time per MATLAB session? When I exit and restart MATLAB, the same slowdown recurs. Is there any way to make the compiled code persistent across sessions?
8 Comments
Walter Roberson
6 September 2016
It should not be once per session, it should be once per install of MATLAB.
I also have a 1080, and gpuDevice is slow EVERY time you call it. The "one-time compilation" advantage only seems to apply to other functions, and only per session. Really, this is the only reason the 1080 (and presumably other Pascal cards) is usable at all. Thankfully I still have my old Maxwell Titan X, which I can prototype on. I only use the 1080 in a parfor loop for the real number crunching, where the first-time (again, per session) startup cost of compiling the arrayfun "kernels" on it is much less than the total compute time. So, for me, the only benefit of the "one-time compilation" is that subsequent runs of my parfor loop start much more quickly on the 1080.
D. Plotnick
9 September 2016
Edited: D. Plotnick, 9 September 2016
I too see a "once-per-session" issue. It's also a "once-per-command" issue. So far "gpuDevice", "gpuArray", and "gather" each require an individual multi-minute compilation period. Right now I run a script at the beginning of the session that executes "gather" and "gpuArray" to save time; however, gpuDevice is always slow, so I have to be super careful about not running out of memory. Once initialized, all functions on the Titan are blazingly fast.
I really hope we get some support for the new Pascal chips, been waiting a long time for good double-precision cards to be available again.
I found a solution on another thread: enlarging the CUDA cache makes the one-time compilation more persistent, presumably because the cache doesn't need to be cleared out as often. You can change its size (I made it 1 GB) by adding a CUDA_CACHE_MAXSIZE variable to your (Windows) environment variables and setting its value to the cache size in bytes. After making this change I no longer get the multi-minute compilations for gpuDevice and other GPU functions on my 1080s.
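For anyone else trying this, here is a minimal sketch of the same change on Linux/macOS (the 1 GB value below matches what I used; on Windows, create CUDA_CACHE_MAXSIZE under System Properties → Environment Variables instead, since export only affects the current shell):

```shell
# Enlarge the CUDA JIT compilation cache to 1 GiB (value is in bytes).
# Add this line to ~/.bashrc (or equivalent) so it persists across sessions,
# then restart MATLAB so it picks up the new environment.
export CUDA_CACHE_MAXSIZE=$((1024 * 1024 * 1024))
echo "$CUDA_CACHE_MAXSIZE"   # prints 1073741824
```

The related variable CUDA_CACHE_DISABLE should be unset or 0, otherwise caching is turned off entirely.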
Walter Roberson
10 September 2016
Good find, Nick Chng.
D. Plotnick
12 September 2016
Just tried this on a Titan X, and I can confirm it works. The first run still required ~2 minutes, but it's now down to 2 seconds, even in new MATLAB sessions. Excellent find, Nick Chng. Do you have a link to the original thread so I can thank the original author as well?
Shawn Healey
14 September 2016
Confirmed on a system with a Titan X and a 780.
Nick Chng
17 September 2016
I found it in the second of the threads you linked: the Parallel Forall blog. Glad it's working. Cheers, everyone.
Wajahat Kazmi
2 November 2016
Edited: Wajahat Kazmi, 2 November 2016
1 vote
Hi,
I had the same problem with a GTX 1080 with MATLAB R2016a and R2016b. However, when I used CUDA 8.0 with MATLAB R2014b, the problem was solved (Windows 7 and 10).
Best regards, Wajahat
Alexander K
6 December 2016
0 votes
Dear colleagues and MathWorks professionals,
I have almost the same problem: very long loading times (probably JIT recompilations) in every new MATLAB session, and even occasional crashes when trying to reset the GPU.
My configuration: GTX 1070 (Pascal) on a Core i7-6700 with 64 GB RAM; Windows 10 Pro, MATLAB R2016b, and CUDA 8.0 (installed very recently from the NVIDIA site, after MATLAB was installed).
Many thanks for the above discussion and advice, including the above-mentioned pair of threads, which are also very informative!
My question is: what if the variables CUDA_CACHE_MAXSIZE and CUDA_CACHE_DISABLE do NOT seem to exist in the registry on my workstation (Windows 10)?
How should I find or create them correctly?
Regedit does NOT find them at all! (Although the section HKEY_LOCAL_MACHINE\SOFTWARE\NVIDIA Corporation\GPU Computing Toolkit\CUDA\v8.0 does exist.)
Many thanks to all of you in advance!
Alexander K, PhD.
3 Comments
Nick Chng
11 December 2016
Hi Alexander,
In Windows 10, edit the system "environment variables" (you can Google how to access this) and add the variables and values there. Note that this is not the same as editing the registry.
Cheers, Nick
Alexander K
8 February 2017
Many thanks for your helpful answer!
yingkun yang
3 April 2019
Excuse me, Alexander.
My question is: how do I set the CUDA cache size via an environment variable (Windows 10)?
I created a system variable named CUDA_CACHE_MAXSIZE and set its value to 536870912.
But I think I've done something wrong!
Many thanks to you in advance!
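For reference, the byte values mentioned in this thread can be sanity-checked in a POSIX shell (cache sizes are specified in bytes, so 536870912 corresponds to 512 MiB and the 1 GB value mentioned earlier to 1073741824):

```shell
# Cache sizes are given to CUDA_CACHE_MAXSIZE in bytes.
echo $((512 * 1024 * 1024))    # 512 MiB -> prints 536870912
echo $((1024 * 1024 * 1024))   # 1 GiB  -> prints 1073741824
```

If the variable is set at the system level with one of these values and MATLAB is restarted afterwards, the setting itself should be correct.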