How to use printf inside a CUDA kernel?

Question

Hi,

I wonder why I cannot use printf in CUDA kernels. The code inside my file test.cu (adapted from the MathWorks help)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
__global__ void add2( double * v1, const double * v2) 
{
    int idx = threadIdx.x;
    v1[idx] += v2[idx];
    printf("identity: %d \n",idx);
    
}

compiles nicely with mexcuda:

mexcuda -ptx test.cu

but trying to run it from the command line as

k = parallel.gpu.CUDAKernel("test.ptx","test.cu");
N = 8;
k.ThreadBlockSize = N;
in1 = ones(N,1,"gpuArray");
in2 = ones(N,1,"gpuArray");
result = feval(k,in1,in2);
gather(result);

does not put any result on screen.

This link suggests some changes to the headers, such as #undef printf to avoid conflicts with mex.h, but that didn't work for me.

Umar, 28 June 2024

Edited: Walter Roberson, 28 June 2024

Hi Daniel,

A common workaround is to redirect the output of printf from the CUDA kernel to a buffer and then retrieve the buffer contents for display. Here's a modified version of the code to demonstrate this approach:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
__global__ void add2(double* v1, const double* v2, char* output) {
    int idx = threadIdx.x;
    v1[idx] += v2[idx];
    sprintf(output, "identity: %d \n", idx);
}

mexcuda -ptx test.cu

In the above code snippet, the sprintf function is used to write the output of printf to a character buffer (output). This buffer can then be accessed to retrieve the output generated by the CUDA kernel.

Please bear in mind when working with CUDA kernels in MATLAB and needing to display output, avoid using printf directly within the kernel. Instead, consider using buffer variables to store the output and retrieve it for display outside the kernel.

Hope that answers your question.

Daniel Castaño Díez, 4 July 2024

Hi Umar,

thanks, but why do you think snprintf is allowed in device code? mexcuda does protest:

/mnt/stor0hdd/castano/workplace/gpu/kernel/test.cu(21): error:
calling a __host__ function("snprintf") from a __global__
function("add2") is not allowed

Should I do something special to tell the compiler to use a device version of the function?

cheers,

D.

Umar, 4 July 2024

Hi Daniel,

In CUDA C/C++, snprintf is not allowed in device code: it is a host-only function, which is what the compiler error says, and there is no qualifier that makes it callable from a kernel. Of the stdio formatting functions, only printf is supported in device code. The buffer approach therefore needs hand-written formatting inside the kernel, or the kernel has to store raw values that are formatted on the host.
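For reference, the compiler error quoted above reflects how CUDA works: sprintf and snprintf are host-only, and of the stdio formatting functions only printf can be called from device code. Below is a minimal sketch of the buffer idea in a form that should compile, assuming one fixed-size slot per thread and a hand-written integer formatter. The names format_identity, run_add2 and kSlot are hypothetical and not part of the original code, and the host side is plain CUDA runtime code rather than MATLAB's feval.

```cuda
#include <cstdio>
#include <string>
#include <vector>
#include <cuda_runtime.h>

constexpr int kSlot = 32;  // bytes reserved per thread (assumed, ample for one line)

// Hypothetical helper: writes "identity: <value> \n" plus a NUL into slot.
// Formatting is done by hand because sprintf cannot be called on the device.
__device__ void format_identity(char* slot, int value)
{
    const char prefix[] = "identity: ";
    int pos = 0;
    for (int i = 0; prefix[i] != '\0'; ++i) slot[pos++] = prefix[i];

    unsigned int u = value < 0 ? 0u - (unsigned int)value : (unsigned int)value;
    if (value < 0) slot[pos++] = '-';

    char digits[12];               // digits are produced in reverse order
    int n = 0;
    do { digits[n++] = (char)('0' + u % 10); u /= 10; } while (u > 0);
    while (n > 0) slot[pos++] = digits[--n];

    slot[pos++] = ' ';
    slot[pos++] = '\n';
    slot[pos]   = '\0';
}

__global__ void add2(double* v1, const double* v2, char* output)
{
    int idx = threadIdx.x;
    v1[idx] += v2[idx];
    format_identity(output + idx * kSlot, idx);  // each thread owns one slot
}

// Runs add2 on two vectors of n ones and returns all messages concatenated.
// Error checking is omitted for brevity.
std::string run_add2(int n)
{
    std::vector<double> ones(n, 1.0);
    double* d_v1 = nullptr;
    double* d_v2 = nullptr;
    char*   d_out = nullptr;
    cudaMalloc(&d_v1, n * sizeof(double));
    cudaMalloc(&d_v2, n * sizeof(double));
    cudaMalloc(&d_out, n * kSlot);
    cudaMemcpy(d_v1, ones.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(d_v2, ones.data(), n * sizeof(double), cudaMemcpyHostToDevice);

    add2<<<1, n>>>(d_v1, d_v2, d_out);

    // A blocking cudaMemcpy on the default stream waits for the kernel.
    std::vector<char> h_out(n * kSlot);
    cudaMemcpy(h_out.data(), d_out, h_out.size(), cudaMemcpyDeviceToHost);

    cudaFree(d_v1);
    cudaFree(d_v2);
    cudaFree(d_out);

    std::string text;
    for (int i = 0; i < n; ++i) text += &h_out[i * kSlot];  // NUL-terminated slots
    return text;
}
```

The string returned by run_add2 can be printed on the host with printf, or passed to mexPrintf("%s", ...) from inside a MEX function if the output should land in the MATLAB command window.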

Answer 1 (accepted)

Joss Knight, 6 July 2024

Edited: Joss Knight, 6 July 2024

Just use it, and launch MATLAB from a terminal. On Linux, the output will appear in the terminal window. On Windows you will need to launch MATLAB with the undocumented options -wait -log.

Joss Knight, 8 July 2024

If you want to redirect the output into the MATLAB command window you are going to need to use one of the tricks off the internet or mentioned here to redirect the kernel stdout into a buffer so you can invoke mexPrintf with it on the host. For debugging purposes, this is more trouble than it's worth.
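One variant of the buffer trick, sketched here under assumptions, avoids formatting on the device altogether: the kernel appends raw values to a record buffer and the host does the printing. LogRecord, add2_logged and run_add2_logged are hypothetical names, and the host code is plain CUDA runtime code, not the MATLAB CUDAKernel interface.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical record type: one entry per message a thread wants to report.
struct LogRecord {
    int    thread;
    double value;
};

__global__ void add2_logged(double* v1, const double* v2,
                            LogRecord* log, int* count, int capacity)
{
    int idx = threadIdx.x;
    v1[idx] += v2[idx];

    // Reserve the next free slot; atomicAdd returns the previous counter value.
    int slot = atomicAdd(count, 1);
    if (slot < capacity) {           // records that do not fit are dropped
        log[slot].thread = idx;
        log[slot].value  = v1[idx];
    }
}

// Runs the kernel on two vectors of n ones and returns the collected records.
// Error checking is omitted for brevity.
std::vector<LogRecord> run_add2_logged(int n)
{
    std::vector<double> ones(n, 1.0);
    double*    d_v1 = nullptr;
    double*    d_v2 = nullptr;
    LogRecord* d_log = nullptr;
    int*       d_count = nullptr;
    cudaMalloc(&d_v1, n * sizeof(double));
    cudaMalloc(&d_v2, n * sizeof(double));
    cudaMalloc(&d_log, n * sizeof(LogRecord));
    cudaMalloc(&d_count, sizeof(int));
    cudaMemcpy(d_v1, ones.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(d_v2, ones.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemset(d_count, 0, sizeof(int));

    add2_logged<<<1, n>>>(d_v1, d_v2, d_log, d_count, n);

    // Blocking copies on the default stream wait for the kernel to finish.
    int count = 0;
    cudaMemcpy(&count, d_count, sizeof(int), cudaMemcpyDeviceToHost);
    if (count > n) count = n;        // the counter may exceed the capacity
    std::vector<LogRecord> records(count);
    cudaMemcpy(records.data(), d_log, count * sizeof(LogRecord),
               cudaMemcpyDeviceToHost);

    cudaFree(d_v1);
    cudaFree(d_v2);
    cudaFree(d_log);
    cudaFree(d_count);
    return records;
}
```

On the host, each record can then be formatted with an ordinary printf, or with mexPrintf inside a MEX function. The order of the records depends on thread scheduling, so it should not be relied on.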

Daniel Castaño Díez, 11 July 2024

Thanks, Joss. As you say, for debugging purposes I don't need it... I'm already happy with the output in the terminal :-)

Answer 2

Udit06, 1 July 2024

Hi Daniel,

One more suggestion that I found in the following discussion is to use "cudaDeviceSynchronize" to ensure that the kernel finishes and the driver flushes the output buffer.

https://stackoverflow.com/questions/15669841/cuda-hello-world-printf-not-working-even-with-arch-sm-20
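A minimal standalone sketch of that suggestion, assuming a plain nvcc program rather than a kernel launched through MATLAB's feval, where the host side is not written by the user. The names hello and launch_and_flush are hypothetical. Device-side printf writes into a buffer that the runtime flushes at synchronization points, so the host calls cudaDeviceSynchronize before returning:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void hello()
{
    int idx = threadIdx.x;
    printf("identity: %d \n", idx);   // device-side printf, buffered on the GPU
}

// Launches n threads in one block and waits for them.
// Returns 0 on success and 1 on any CUDA error.
int launch_and_flush(int n)
{
    hello<<<1, n>>>();

    cudaError_t err = cudaGetLastError();   // reports launch failures
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();      // waits for the kernel, flushes printf output

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Calling launch_and_flush(8) from main should write one "identity" line per thread to the process's stdout. Without the synchronization, the program may exit before the buffered output is flushed.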

If the issue still persists, you can refer to the solution given in the following discussion:

https://forums.developer.nvidia.com/t/a-simple-question-about-printf-inside-a-kernel-with-no-convincing-answer-on-google-or-nvidia-docs/79308

I hope this helps.
