Train a deep learning model (ResNet-50 network) on a remote HPC cluster

EK_47 on 14 Oct 2022
Commented: EK_47 on 14 Oct 2022
I am trying to run code that uses a pre-trained ResNet-50 network on a remote HPC cluster by submitting batch GPU jobs. The following line fails with this error:
net = resnet50
Error using resnet50
resnet50 requires the Deep Learning Toolbox Model for ResNet-50 Network support
package for the pretrained weights. To install this support package, use the <a
href="matlab:
matlab.addons.supportpackage.internal.explorer.showSupportPackages('RESNET50',
'tripwire')">Add-On Explorer</a>. To obtain the untrained layers, use
resnet50('Weights','none'), which does not require the support package.
It seems the Deep Learning Toolbox Model for ResNet-50 Network add-on is not installed on the cluster. How can I install this add-on on it?
Thanks

Accepted Answer

David Willingham on 14 Oct 2022
Just to confirm, you're sending batch jobs to an HPC cluster that has MATLAB Parallel Server installed?
If so, one option to try would be:
  1. Save resnet50 as a MAT file.
  2. Attach the MAT file when submitting the job.
  3. Add a command that loads the MAT file to the function you're submitting.
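
The three steps above might look roughly like the following sketch. It assumes that a cluster profile for the HPC cluster is already set as the default, that Deep Learning Toolbox itself is available on the workers (only the ResNet-50 support package is missing), and that the local machine has the support package installed. The function name trainOnCluster and the file name basenet.mat are placeholders, not taken from the original code.

A hypothetical job function, saved as trainOnCluster.m:

```matlab
function layerCount = trainOnCluster()
% Hypothetical job function (placeholder name).
% Attached files are placed on the worker's path, so load should
% find basenet.mat by name alone.
s = load('basenet.mat', 'basenet');
basenet = s.basenet;

% ... training code that uses basenet would go here ...

% Return something small so the client can confirm the network loaded.
layerCount = numel(basenet.Layers);
end
```

A submission script, run on the local machine:

```matlab
% Step 1: load the pretrained network locally and save it to a MAT file.
basenet = resnet50;
save('basenet.mat', 'basenet');

% Step 2: submit the job and attach the MAT file.
c = parcluster;   % default profile, assumed to point at the HPC cluster
job = batch(c, @trainOnCluster, 1, {}, ...
    'AttachedFiles', {'basenet.mat'});

% Step 3 happens inside trainOnCluster; wait and collect its output.
wait(job);
out = fetchOutputs(job);
disp(out{1});
```

Loading the saved network this way avoids calling resnet50 on the cluster, which is the call that needs the support package.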
  1 Comment
EK_47 on 14 Oct 2022
Brilliant! Thank you for your answer. It solved my problem.
Yes, the HPC cluster has MATLAB Parallel Server installed.
In point 1 you said "save resnet50 as a MAT file", and I was not sure what "save resnet50" meant. What I did was call it in MATLAB on my local machine:
basenet = resnet50;
then saved it as
save('basenet.mat','basenet');
and then transferred this MAT file to the remote cluster and loaded it there.
Thanks

Release

R2022a
