"CUDA_ERROR_LAUNCH_FAILED" error when trying to use a kernel in a ".cu" file

Heping Wan, 31 Aug 2020
Answered: Pankhuri Kasliwal, 8 Oct 2020
All the other kernels work when used in isolation. However, after I run this problematic kernel, a "CUDA_ERROR_LAUNCH_FAILED" error appears whenever I try to run any kernel in my ".cu" file again.
After running reset(gpuDevice), the error becomes "all CUDA-capable devices are busy or unavailable".
Some extra details about this problem:
  1. The problematic kernel runs the first time, but it produces no output. When I inspect the variables in the Variables window, it shows "Cannot display summaries of variables with more than 524288 elements."
  2. The GPU kernels provided by MATLAB still work after running the problematic kernel.
  3. I am using MATLAB R2020a, and CUDA and the GPU driver are the latest versions.

Answers (1)

Pankhuri Kasliwal, 8 Oct 2020
Hi,
The error message indicates that the MEX code is causing a CUDA error during kernel launch (invalid argument). The error is not picked up until the next time MATLAB checks the CUDA error flag. You can confirm this by adding a call to cudaDeviceSynchronize in your code after the kernel launch and retrieving the error code that function returns.
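As a minimal sketch of that check, the following shows one way to test for errors both at launch and after synchronization. The kernel scaleKernel and the helpers checkCuda and runScale are hypothetical names invented for illustration; they are not from the original code, and the original kernel's signature is unknown.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the one that fails.
__global__ void scaleKernel(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // bounds guard: threads past the end of the array do nothing
        data[i] *= factor;
    }
}

// Prints the CUDA error string with some context; returns true on success.
bool checkCuda(cudaError_t status, const char* what) {
    if (status != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(status));
        return false;
    }
    return true;
}

// Launches the kernel, then checks the launch and the execution separately.
bool runScale(float* d_data, int n, float factor) {
    if (n <= 0) return true;  // nothing to do; a zero-sized grid would be rejected
    const int block = 256;
    const int grid = (n + block - 1) / block;
    scaleKernel<<<grid, block>>>(d_data, n, factor);
    // Errors in the launch configuration are reported immediately.
    if (!checkCuda(cudaGetLastError(), "kernel launch")) return false;
    // Errors raised while the kernel runs are reported at the next synchronization.
    return checkCuda(cudaDeviceSynchronize(), "kernel execution");
}
```

In a MEX file, the fprintf call could be swapped for MATLAB's own error reporting so the message reaches the command window at the point of failure rather than on a later GPU call.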
Invalid argument typically refers to an invalid kernel launch parameter, such as a grid or block size larger than the maximum, or a shared memory allocation larger than the maximum.
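One way to rule out an oversized launch configuration is to compare it against the device limits before launching. The sketch below assumes the limits come from cudaGetDeviceProperties; the function name launchConfigFits is hypothetical, and it only accounts for dynamic shared memory requested at launch.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Returns true if grid, block, and dynamic shared memory fit the limits in prop.
// Note: static shared memory declared inside the kernel also counts toward the
// per-block limit and is not included in this check.
bool launchConfigFits(const cudaDeviceProp& prop, dim3 grid, dim3 block,
                      size_t sharedBytes) {
    const unsigned long long threads =
        (unsigned long long)block.x * block.y * block.z;
    if (threads == 0 || threads > (unsigned long long)prop.maxThreadsPerBlock) {
        return false;
    }
    const unsigned b[3] = {block.x, block.y, block.z};
    const unsigned g[3] = {grid.x, grid.y, grid.z};
    for (int d = 0; d < 3; ++d) {
        // Each block and grid dimension has its own maximum.
        if (b[d] > (unsigned)prop.maxThreadsDim[d]) return false;
        if (g[d] == 0 || g[d] > (unsigned)prop.maxGridSize[d]) return false;
    }
    return sharedBytes <= prop.sharedMemPerBlock;
}
```

Calling this with the properties of the current device just before the `<<<...>>>` launch gives an early, readable failure instead of an error that only surfaces on a later call.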
While it is very difficult at the moment to determine the exact cause of the GPU errors, I can provide you with more information about GPU errors and how you can best prevent them in the future:
  • From the 'gpuDevice' output that you supplied, I see that the Kernel Execution Timeout is set to 1 (enabled). This is common for display graphics cards operating in WDDM mode on Windows.
  • This flag indicates that the operating system places an upper bound on the time a CUDA kernel is allowed to execute, after which the CUDA driver times out the kernel and returns an error. The upper bound helps the operating system keep rendering to the display, because the GPU is being used for both computation and display.
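The same flag can also be read from CUDA code. The sketch below assumes the runtime attributes cudaDevAttrKernelExecTimeout and cudaDevAttrTccDriver are the relevant ones to query; the function name queryWatchdog is hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Queries whether a run-time limit applies to kernels on the given device.
// Returns false if the query itself fails (for example, an invalid device index).
bool queryWatchdog(int device, bool* timeoutEnabled, bool* tccDriver) {
    int timeoutFlag = 0;
    int tccFlag = 0;
    if (cudaDeviceGetAttribute(&timeoutFlag, cudaDevAttrKernelExecTimeout,
                               device) != cudaSuccess) {
        return false;
    }
    if (cudaDeviceGetAttribute(&tccFlag, cudaDevAttrTccDriver,
                               device) != cudaSuccess) {
        return false;
    }
    *timeoutEnabled = (timeoutFlag != 0);
    *tccDriver = (tccFlag != 0);
    std::printf("device %d: kernel execution timeout %s, TCC driver %s\n", device,
                *timeoutEnabled ? "enabled" : "disabled",
                *tccDriver ? "yes" : "no");
    return true;
}
```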
Suggestions to avoid hitting this error include:
  • Using smaller batches of computation (e.g. by reducing mini-batch size) to reduce the risk of hitting the timeout.
  • Changing Windows' Timeout Detection and Recovery (TDR) settings, for instance by changing the TdrDelay from 2 seconds (default) to 4 seconds.
  • Using an alternative GPU card that can be placed in TCC mode (a compute-only mode). This mode is available on GPU models such as Titan X or Tesla compute cards.
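For a custom kernel, the first suggestion can be approximated by splitting one long launch into several shorter ones. This is only a sketch under assumptions: scaleChunk, chunkCount, and scaleInChunks are hypothetical names, the element-wise work is a placeholder, and a suitable chunk size would have to be found by experiment.

```cuda
#include <algorithm>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel that processes count elements starting at offset.
__global__ void scaleChunk(float* data, size_t offset, size_t count, float factor) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < count) {
        data[offset + i] *= factor;
    }
}

// Number of launches needed to cover n elements in chunks of chunkSize.
size_t chunkCount(size_t n, size_t chunkSize) {
    return chunkSize == 0 ? 0 : (n + chunkSize - 1) / chunkSize;
}

// Processes n elements in separate launches so each kernel stays short.
cudaError_t scaleInChunks(float* d_data, size_t n, float factor, size_t chunkSize) {
    if (chunkSize == 0) return cudaErrorInvalidValue;
    const unsigned block = 256;
    for (size_t offset = 0; offset < n; offset += chunkSize) {
        const size_t count = std::min(chunkSize, n - offset);
        const unsigned grid = (unsigned)((count + block - 1) / block);
        scaleChunk<<<grid, block>>>(d_data, offset, count, factor);
        cudaError_t status = cudaGetLastError();
        if (status != cudaSuccess) return status;
        // Wait for this chunk to finish before the next one is launched.
        status = cudaDeviceSynchronize();
        if (status != cudaSuccess) return status;
    }
    return cudaSuccess;
}
```

Synchronizing after each chunk also means that a failure is reported for the chunk that caused it, which narrows down where the kernel goes wrong.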
The TdrDelay is controlled by registry keys in Windows. Please see these resources on TDR settings here:

Release

R2020a
