Overcoming VRAM limitations on Nvidia A100
I have access to a cluster with several Nvidia A100 40GB GPUs. I am training a deep learning network on these GPUs; however, trainNetwork() only makes use of around 10GB of each GPU's VRAM. I believe this is a limitation of Nvidia CUDA, see here.
I have two related questions:
- Other cluster users are writing in Python with the 'DistributedDataParallel' module in PyTorch and are able to load 40GB of data (over the CUDA limitation) onto the GPUs; is there a similar workaround for MATLAB?
- If this isn't the case, is there any way to use Multi-Instance GPUs, essentially splitting the physical card into several smaller virtual GPUs and computing in parallel?
Ideally I would like to speed up computation, so having 3/4 of the VRAM empty, when it could otherwise be used for mini-batches, is a little heartbreaking.
Accepted Answer
Joss Knight
14 Mar 2023
Just increase the MiniBatchSize and it'll use more memory.
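As a rough illustration of the suggestion above, here is a minimal sketch of raising MiniBatchSize through trainingOptions. The solver choice and the value 512 are assumptions for illustration only (the default is 128), and data and layers stand in for your own datastore and network, which are not shown in this thread.

```matlab
% Hypothetical value: raise it step by step until GPU memory is well used.
miniBatchSize = 512;   % assumption; the trainingOptions default is 128

options = trainingOptions('sgdm', ...
    'MiniBatchSize', miniBatchSize, ...
    'ExecutionEnvironment', 'auto', ...   % uses a GPU if one is available
    'Verbose', true);

% Report free GPU memory so you can compare before and during training.
if canUseGPU
    gpu = gpuDevice;
    fprintf('Available GPU memory: %.1f GB of %.1f GB\n', ...
        gpu.AvailableMemory/1e9, gpu.TotalMemory/1e9);
end

% data and layers are placeholders for your own datastore and network:
% net = trainNetwork(data, layers, options);
```

Memory use grows with the mini-batch size, so checking AvailableMemory while training runs is one way to see how close you are to the card's limit.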
6 Comments
Joss Knight
14 Mar 2023
You may never get that 10% so don't get your hopes up! Also, the best utilization is not necessarily at the highest batch size.
Why not ask a new question where you show the code for your datastore, so one of us can help you make it partitionable?
More Answers (0)