How do I know how large an array can fit on the GPU?

Question

Jae-Hee Park, 26 August 2022

Direct link to this question: https://jp.mathworks.com/matlabcentral/answers/1786790-how-do-i-know-how-large-an-array-can-fit-on-the-gpu

Commented: Joss Knight, 1 September 2022

Accepted answer: Mike Croucher

Hi

I am trying to run some analysis on the GPU using functions such as fft().

But the array is too large to process on my GPU (TITAN Xp).

So I thought of slicing the array, putting each slice on the GPU, and then collecting and reshaping the results after the calculation.

But I don't know what size fits on my GPU.

How can I find out what array size fits on my GPU?

Thank you.

Jae-Hee Park

2 comments

KSSV, 26 August 2022

Read about gpuDevice.

Jae-Hee Park, 26 August 2022

@KSSV

My gpuDevice returns the following. What can I do with it?

Name: 'NVIDIA TITAN Xp'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 11.7
ToolkitVersion: 11
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 1.2885e+10
AvailableMemory: 1.1665e+10
MultiprocessorCount: 30
ClockRateKHz: 1582000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceAvailable: 1
DeviceSelected: 1


Answer 1

Mike Croucher, 26 August 2022

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/1786790-how-do-i-know-how-large-an-array-can-fit-on-the-gpu#answer_1034300

Edited: Mike Croucher, 26 August 2022

As you've seen, gpuDevice() gives you information about your GPU. This is what I get for mine:

>> gpuDevice()
ans = 
CUDADevice with properties:
Name: 'NVIDIA GeForce RTX 3070'
Index: 1
ComputeCapability: '8.6'
SupportsDouble: 1
DriverVersion: 11.6000
ToolkitVersion: 11.2000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 8.5894e+09
AvailableMemory: 7.2955e+09
MultiprocessorCount: 46
ClockRateKHz: 1725000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceAvailable: 1
DeviceSelected: 1

The important parameter here is AvailableMemory. I have 7.2955e+09 bytes (you have rather more!). What does this mean in terms of matrix size?

A double-precision number is 8 bytes, so in theory I can have 7.2955e+09/8 = 911937500 doubles on the card. This is my hard limit, and there is nothing I can do about it. There simply isn't the capacity on my GPU to hold more than that. Consider this an upper bound. In terms of a square matrix, it's roughly 30,000 x 30,000, since

>> sqrt(911937500)
ans =
3.0198e+04
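The same arithmetic can be wrapped up so it is easy to repeat for other data types. This is only a sketch: the helper name maxSquareSide is made up for illustration, and the byte count is the AvailableMemory figure quoted above rather than a live reading.

```matlab
% Estimate the side length of the largest square matrix that fits
% in a given number of bytes (an upper bound, with no working space).
% maxSquareSide is a hypothetical helper, not a MATLAB built-in.
maxSquareSide = @(availableBytes, bytesPerElement) ...
    floor(sqrt(availableBytes / bytesPerElement));

availableBytes = 7.2955e+09;                 % AvailableMemory from the output above
nDouble = maxSquareSide(availableBytes, 8);  % double: 8 bytes per element
nSingle = maxSquareSide(availableBytes, 4);  % single: 4 bytes per element
fprintf('double: %d x %d, single: %d x %d\n', nDouble, nDouble, nSingle, nSingle);

% On a machine with a supported GPU, the live value can be read instead:
% g = gpuDevice();
% availableBytes = g.AvailableMemory;
```

Complex data doubles the bytes per element (16 for complex double), which matters if the result of fft() has to live on the card as well.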

Let's transfer a matrix that big to my GPU and see if I'm successful

>> a = zeros(3.0198e+04);
>> gpuA = gpuArray(a);
>> gpuDevice()
ans = 
  CUDADevice with properties:
                      Name: 'NVIDIA GeForce RTX 3070'
                     Index: 1
         ComputeCapability: '8.6'
            SupportsDouble: 1
             DriverVersion: 11.6000
            ToolkitVersion: 11.2000
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 8.5894e+09
           AvailableMemory: 110592
       MultiprocessorCount: 46
              ClockRateKHz: 1725000
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 1
          CanMapHostMemory: 1
           DeviceSupported: 1
           DeviceAvailable: 1
            DeviceSelected: 1

It worked, and I had 110592 bytes left over.

However, the useful limit will be rather lower than this. If I stuff my card full of data then there's no room for any GPU algorithm to do any computation. Even adding 1 to all the elements of a GPU array this big is too much. Clearly matrix addition isn't done completely in place.

>> gpuA = gpuA + 1;
Error using  + 
Out of memory on device. To view more detail about available memory on the GPU,
use 'gpuDevice()'. If the problem persists, reset the GPU by calling
'gpuDevice(1)'. 

I can at least do something though. The sum command works, for example, even though the answer isn't very interesting in this case.

>> sum(gpuA,'all')
ans =
     0

How much memory you need for computations depends on the algorithms involved, but hopefully you can use this thinking as a starting point for what you can expect to squeeze onto your GPU.
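For the slicing approach described in the question, one possible shape is sketched below. It rests on assumptions that are not in the original post: the FFT is taken along dimension 1 so that columns can be processed independently, and a safety factor of 4 is an arbitrary guess at how much working space fft() needs. The random matrix is only a small stand-in for the real data.

```matlab
% Sketch: column-wise FFT of a large host matrix in GPU-sized blocks.
x = rand(4096, 2000);            % stand-in for the large host array
g = gpuDevice();
safetyFactor = 4;                % hypothetical headroom; tune for your case
bytesPerElement = 16;            % the fft result is complex double
nRows = size(x, 1);
nCols = size(x, 2);

% Number of columns per block, based on the memory currently available.
colsPerBlock = floor(g.AvailableMemory / (safetyFactor * bytesPerElement * nRows));
colsPerBlock = max(1, min(colsPerBlock, nCols));

y = complex(zeros(size(x)));     % result is collected on the host
for first = 1:colsPerBlock:nCols
    last = min(first + colsPerBlock - 1, nCols);
    block = gpuArray(x(:, first:last));            % transfer one slice
    y(:, first:last) = gather(fft(block, [], 1));  % compute on GPU, copy back
end
```

If the FFT has to run across the dimension being sliced, the blocks are no longer independent and this simple loop does not apply.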

1 comment

Joss Knight, 1 September 2022

Just FYI, MATLAB won't allow in-place computation on a workspace variable because it needs to hold onto the original array in case of error (or user Ctrl-C). Computation inside a function on local variables will be more optimized.
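A minimal sketch of what "computation inside a function on local variables" could look like is below. The function name scaleAndSum is hypothetical, and whether MATLAB actually avoids the temporary copy here depends on the release and is not guaranteed; the sketch only shows the structure.

```matlab
function total = scaleAndSum(n)
% Hypothetical example (save as scaleAndSum.m): all GPU work is done
% on a local variable rather than on a base-workspace variable.
a = zeros(n, 'gpuArray');        % create the array directly on the GPU
a = a + 1;                       % update of a local variable
total = gather(sum(a, 'all'));   % bring only the scalar result back
end
```

Calling scaleAndSum(n) with n close to the limit worked out above, and watching AvailableMemory before and after, is one way to check how this behaves on a particular setup.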
