CUDA Unexpected Error for nndata2gpu

Question

0 votes

Hi, I am currently trying to train a fitnet on a GPU (NVIDIA Titan Xp). However, whenever I try to format my data using nndata2gpu and gpu2nndata, I run into the following error:

Error using gpuArray/gather
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_ILLEGAL_ADDRESS

The code used is:

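  tinput=nndata2gpu(input);        % copy the inputs to the GPU
  ttarget=nndata2gpu(target);      % copy the targets to the GPU
  fundnet=configure(fundnet,input,target);
  tic
  fundnet=train(fundnet,tinput,ttarget,'useGPU','yes','showResources','yes');
  toc
  ty=fundnet(tinput);              % network output, still on the GPU
  y=gpu2nndata(ty);                % bring the output back to the CPU
  perf=perform(fundnet,target,y);  % perform returns a performance value, not a network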

The device is recognized without any problems (gpuDevice loads in less than a second), and the drivers are up to date. I am using MATLAB R2018a. Any idea what could be the source of this issue?

Many thanks in advance!

2 Comments

Joss Knight on 2 May 2018

This doesn't look good. Could you provide a standalone example - i.e. generate some data that triggers the error and include it in your code?

LukasR on 2 May 2018

Edited: LukasR on 2 May 2018

Many thanks for the answer. While generating the sample data, I found what I believe is the source of the issue: For

input=rand(20,6000000);
target=rand(1,6000000);

the error occurs while for

input=rand(20,2000000);
target=rand(1,2000000);

it doesn't. The size of my own dataset amounts to approx. 5400000x25 (inputs) and 5400000x1 (targets).

Here is the entire executable code (which triggers the error):

  input=rand(20,6000000);
  target=rand(1,6000000);
  nneurons=10;
  technet=fitnet(nneurons,'trainscg');
  % Training parameters
  technet.trainParam.epochs=10000;
  technet.trainParam.goal=0;
  technet.trainParam.min_grad=1e-6;
  technet.trainParam.max_fail=200;
  technet.trainParam.sigma=5.0e-7;
  technet.trainParam.lambda=5.0e-7;
  technet.trainParam.show=25;
  technet.trainParam.showCommandLine=false;
  technet.trainParam.showWindow=true;
  technet.trainParam.time=inf;
  % Data division
  technet.divideParam.trainRatio = 70/100;
  technet.divideParam.valRatio = 15/100;
  technet.divideParam.testRatio = 15/100;
  % Replace tansig with elliotsig in every layer that uses it
  for i=1:technet.numLayers
    if strcmp(technet.layers{i}.transferFcn,'tansig')
      technet.layers{i}.transferFcn = 'elliotsig';
    end
  end
  % Move the data to the GPU, then configure and train
  tinput=nndata2gpu(input);
  ttarget=nndata2gpu(target);
  technet=configure(technet,input,target);
  tic
  technet=train(technet,tinput,ttarget,'useGPU','yes','showResources','yes');
  toc
  ty=technet(tinput);
  technetout=gpu2nndata(ty);
  technetperformance=perform(technet,target,technetout);

Another note: The GPU training DOES work normally without the nndata2gpu command, albeit quite disappointingly (only a 1.5x speedup compared to an i7-7500U for the dataset described above). Furthermore, after the error occurs once, it will also occur for smaller datasets until I restart the whole program (in fact, I am not able to create any gpuArrays before restarting MATLAB).
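For a sense of scale, the raw arrays in the failing example can be sized with a short calculation. This is only a rough sketch: it assumes double precision (8 bytes per element) and, as I understand the nndata2gpu documentation, that the sample dimension is padded up to a multiple of 32. The variable names are illustrative and do not come from the thread.

```matlab
% Rough size estimate for the arrays in the failing example.
Q = 6000000;          % number of samples (columns of input)
nIn = 20;             % rows of input
nOut = 1;             % rows of target
bytesPerElement = 8;  % assumption: double precision

% Assumption: nndata2gpu pads the sample count up to a multiple of 32.
QQ = 32 * ceil(Q / 32);

inputMB  = QQ * nIn  * bytesPerElement / 2^20;
targetMB = QQ * nOut * bytesPerElement / 2^20;
fprintf('input: %.1f MB, target: %.1f MB\n', inputMB, targetMB);
```

Under these assumptions the input comes to less than 1 GB, well below the 12 GB on a Titan Xp, so the raw data alone would not be expected to exhaust device memory. Training needs working memory on top of this, and the AvailableMemory property of gpuDevice can be checked before and after the transfer.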

Answer 1

Joss Knight on 2 May 2018

0 votes

It looks like you have found a bug; many thanks. We will investigate. Meanwhile, my best guess is that this is caused by using more data than the GPU train function can handle. If you can reduce the size of the input without compromising your application, that is the workaround.
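One way to reduce the amount of data passed to train in a single call, without discarding samples, might be to train on blocks of columns in turn. The sketch below is an illustration under stated assumptions, not a fix confirmed in this thread: chunkSize is a hypothetical value, the data are a smaller random stand-in, and training block by block is not equivalent to training on the full set at once. The GPU step is skipped when no device or toolbox is found.

```matlab
% Sketch only: split the samples (columns) into blocks and train on each in turn.
input  = rand(20, 600000);   % smaller stand-in for the data in the question
target = rand(1, 600000);
chunkSize = 200000;          % hypothetical value; tune for your GPU

numSamples = size(input, 2);
starts = 1:chunkSize:numSamples;                   % first column of each block
stops  = min(starts + chunkSize - 1, numSamples);  % last column of each block

% Only attempt GPU training when the toolboxes and a device are present.
hasGPU = exist('gpuDeviceCount') > 0 && exist('fitnet') > 0 ...
         && gpuDeviceCount > 0;
if hasGPU
    net = fitnet(10, 'trainscg');
    net.trainParam.showWindow = false;
    net.trainParam.epochs = 50;   % assumption: a limited number of epochs per block
    net = configure(net, input, target);
    for k = 1:numel(starts)
        cols = starts(k):stops(k);
        % Each call to train starts from the weights left by the previous block.
        net = train(net, input(:, cols), target(:, cols), 'useGPU', 'yes');
    end
end
```

Passing plain arrays with 'useGPU','yes' follows the variant that was reported above to work without nndata2gpu. Whether a given chunkSize avoids the error would need to be checked on the hardware in question.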

1 Comment

Harley Edwards on 18 Aug 2018

I think I have found a similar or related error. I have a data set in which each part of the data trains well separately, but the whole set will not train together, despite there being sufficient memory and the kernel execution timeout being turned off. I have inputs of size 200x844000 and targets of size 6x844000. I can only train 325000 samples at a time on a GeForce 1080. Please let me know how I can contribute to solving this problem, or if you want my code.
