Neural net training is crashing my win10 computer completely
23 views (last 30 days)
I'm very new to neural nets, but I've been trying to follow this tutorial (and others) to load a pretrained network (VGG-19) and edit the output layers to use it for regression rather than classification. The training and validation data are labeled 224×224 images, loaded as 4-D arrays trn_4d and val_4d as specified in the tutorial. The variables trn_L and val_L are the normalized label vectors.
When I run this code, my computer spontaneously reboots itself immediately after the neural net training window opens. I can't find any error log for Matlab, and my PC system logger doesn't show anything crazy happening. This is making it very difficult for me to track down the problem. I'm hoping my mistake is obvious to someone else.
% Load pretrained convnet
net = vgg19;
layers = net.Layers;
% Remove the classification head (fc8, softmax, classification output)
layers = layers(1:44);
% Add in new output layers for regression
layers = [layers
    fullyConnectedLayer(1,"Name","fc8","WeightL2Factor",0)
    regressionLayer("Name","regressionoutput")];
% Convnet training settings
miniBatchSize = 4;
validationFrequency = floor(numel(trn_L)/miniBatchSize);
options = trainingOptions('sgdm', ...
    'MiniBatchSize',miniBatchSize, ...
    'MaxEpochs',30, ...
    'InitialLearnRate',1e-5, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropFactor',0.1, ...
    'LearnRateDropPeriod',20, ...
    'Shuffle','every-epoch', ...
    'ValidationData',{val_4d, val_L}, ...
    'ValidationFrequency',validationFrequency, ...
    'Plots','training-progress', ...
    'Verbose',false);
% Convnet object
net = trainNetwork(trn_4d,trn_L,layers,options);
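Since the reboot happens right as training starts, one way to narrow it down (a sketch, assuming Parallel Computing Toolbox and a supported CUDA GPU; if you have no GPU, trainNetwork already runs on the CPU) is to force CPU execution and see whether the crash persists:

```matlab
% Sketch: rule out the GPU by forcing CPU training for a single epoch.
% If training survives on the CPU, the reboot likely points to the GPU,
% its driver, or the power supply rather than the network definition.
optionsCPU = trainingOptions('sgdm', ...
    'MiniBatchSize',miniBatchSize, ...
    'MaxEpochs',1, ...                    % one epoch is enough for a crash test
    'InitialLearnRate',1e-5, ...
    'ExecutionEnvironment','cpu', ...     % bypass the GPU entirely
    'Plots','none', ...                   % skip the training-progress window too
    'Verbose',true);
net = trainNetwork(trn_4d,trn_L,layers,optionsCPU);
```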
For reference, here are the original 47 layers for VGG-19:
layers = [
imageInputLayer([224 224 3],"Name","input")
convolution2dLayer([3 3],64,"Name","conv1_1","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu1_1")
convolution2dLayer([3 3],64,"Name","conv1_2","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu1_2")
maxPooling2dLayer([2 2],"Name","pool1","Stride",[2 2])
convolution2dLayer([3 3],128,"Name","conv2_1","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu2_1")
convolution2dLayer([3 3],128,"Name","conv2_2","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu2_2")
maxPooling2dLayer([2 2],"Name","pool2","Stride",[2 2])
convolution2dLayer([3 3],256,"Name","conv3_1","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu3_1")
convolution2dLayer([3 3],256,"Name","conv3_2","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu3_2")
convolution2dLayer([3 3],256,"Name","conv3_3","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu3_3")
convolution2dLayer([3 3],256,"Name","conv3_4","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu3_4")
maxPooling2dLayer([2 2],"Name","pool3","Stride",[2 2])
convolution2dLayer([3 3],512,"Name","conv4_1","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu4_1")
convolution2dLayer([3 3],512,"Name","conv4_2","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu4_2")
convolution2dLayer([3 3],512,"Name","conv4_3","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu4_3")
convolution2dLayer([3 3],512,"Name","conv4_4","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu4_4")
maxPooling2dLayer([2 2],"Name","pool4","Stride",[2 2])
convolution2dLayer([3 3],512,"Name","conv5_1","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu5_1")
convolution2dLayer([3 3],512,"Name","conv5_2","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu5_2")
convolution2dLayer([3 3],512,"Name","conv5_3","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu5_3")
convolution2dLayer([3 3],512,"Name","conv5_4","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu5_4")
maxPooling2dLayer([2 2],"Name","pool5","Stride",[2 2])
fullyConnectedLayer(4096,"Name","fc6","WeightL2Factor",0)
reluLayer("Name","relu6")
dropoutLayer(0.5,"Name","drop6")
fullyConnectedLayer(4096,"Name","fc7","WeightL2Factor",0)
reluLayer("Name","relu7")
dropoutLayer(0.5,"Name","drop7")
fullyConnectedLayer(1000,"Name","fc8","WeightL2Factor",0)
softmaxLayer("Name","prob")
classificationLayer("Name","output")];
7 comments
Uday Pradhan
15 Oct 2020
You can still keep learning about deep learning using shallower networks (around 10 to 12 layers). Feel free to visit the documentation pages for more information.
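As a sketch of what such a shallower regression network might look like (the layer sizes here are illustrative, not taken from the thread):

```matlab
% Minimal CNN for image regression, roughly 10-12 layers (illustrative sizes).
% Far lighter than VGG-19, so it trains on modest hardware.
layersSmall = [
    imageInputLayer([224 224 3])
    convolution2dLayer(3,16,'Padding','same')
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    convolution2dLayer(3,32,'Padding','same')
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(1)         % single regression output
    regressionLayer];
```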
Answers (2)
HayderMU
15 Jan 2022
Hi,
I am having the same problem. The computer suddenly reboots during training, and I could not find any log entry in MATLAB or Windows. I am using MATLAB R2020b on Windows 10 (32 GB RAM, GPU with 8 GB memory). I installed a GPU and CPU temperature logger, which did not record anything unusual, so I suspect it is something related to Windows 10. I have another PC with lower specifications (MATLAB R2020b, 4 GB GPU, Win7); the same training process ran there without any errors. The problem occurred with several different networks, not just one. I still have no reasonable explanation, so for now I have to train the networks on my old PC.
0 comments
Stephen Wolstenholme
4 Mar 2024
Multiple copies of my main application EasyNN-plus have run without any problems for decades, from Windows 7 onwards. I expected it to work on Windows 11, but it crashed. After a lot of work it became obvious that the fault was in Windows 11 itself. I installed the latest release, 23H2, and the fault disappeared. I think it could have been a threading problem, because EasyNN-plus uses many threads and winds down to one thread as the target error comes into range. On 23H2 there is no problem.
0 comments