How do validation checks work in a neural net?

I'm learning about neural networks in MATLAB. In what I have read about neural nets so far, I haven't seen anything about a validation check (usually the data is divided in two: training and testing). But MATLAB has a separate validation part and a "Validation check" value (= 6 in the figure).
So what I want to know is why we need the validation check and how it works.
Accepted Answer
Greg Heath
7 Aug 2017
Edited: Greg Heath
11 Jul 2018
design = train + validate
train : Weight Estimation
validate: Not directly involved in weight estimation. Protects ability to generalize to nontraining data. Stops training when the nontraining val subset error rate increases CONTINUOUSLY for more than 6 (default) epochs.
val subset error rate is therefore SLIGHTLY biased.
test subset error rate is COMPLETELY unbiased
default division ratio = 0.7/0.15/0.15
If val stopping occurs, take a look at the error rate curves and you will see why training was stopped.
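The stopping rule described above can be sketched in a few lines. This is a simplified illustration in Python, not the MATLAB toolbox code: the function name `early_stop_epoch` and the sample error curve are hypothetical, and the sketch assumes a validation check is counted for each consecutive epoch in which the validation error fails to improve on its best value so far.

```python
def early_stop_epoch(val_errors, max_fail=6):
    """Return (stop_epoch, best_epoch) for per-epoch validation errors.

    Assumption: a "validation check" is counted for every consecutive epoch
    whose validation error does not improve on the best error seen so far.
    Training stops once the count reaches max_fail (6 by default).
    """
    best_err = float("inf")
    best_epoch = None
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            # New best validation error: remember it and reset the counter.
            best_err, best_epoch, fails = err, epoch, 0
        else:
            fails += 1
            if fails >= max_fail:
                return epoch, best_epoch
    return None, best_epoch  # stopping rule never triggered


# Hypothetical validation curve: improves until epoch 3, then rises.
val_curve = [0.90, 0.60, 0.45, 0.40, 0.41, 0.43, 0.44, 0.46, 0.49, 0.52, 0.55]
stop_epoch, best_epoch = early_stop_epoch(val_curve)
print(stop_epoch, best_epoch)  # prints "9 3"
```

In this made-up curve the best validation error occurs at epoch 3, and the sixth consecutive non-improving epoch is epoch 9, which is where training would stop. The weights kept would be those from the best epoch, not the last one.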
OBVIOUSLY, the most unbiased approach for constant timestep timeseries prediction is to use DIVIDEBLOCK data division with the validation subset in the middle.
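As a rough illustration of block division with the default 0.7/0.15/0.15 ratios, the Python sketch below splits sample indices into three contiguous blocks, with the validation block between training and test. The function name `divide_block` is hypothetical and the rounding is an assumption; MATLAB's DIVIDEBLOCK may round block sizes differently.

```python
def divide_block(n_samples, train_ratio=0.7, val_ratio=0.15, test_ratio=0.15):
    """Split indices 0..n_samples-1 into contiguous train/val/test blocks.

    Assumption: block sizes are obtained by simple rounding; the test block
    takes whatever remains after the training and validation blocks.
    """
    total = train_ratio + val_ratio + test_ratio
    n_train = round(n_samples * train_ratio / total)
    n_val = round(n_samples * val_ratio / total)
    indices = list(range(n_samples))
    train = indices[:n_train]                # earliest samples
    val = indices[n_train:n_train + n_val]   # middle block
    test = indices[n_train + n_val:]         # latest samples
    return train, val, test


train_idx, val_idx, test_idx = divide_block(100)
print(len(train_idx), len(val_idx), len(test_idx))  # prints "70 15 15"
print(val_idx[0], val_idx[-1])                      # prints "70 84"
```

For a time series, keeping each subset contiguous preserves the time ordering within it, and the validation block sits between the training data and the test data.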
Hope this helps
Thank you for formally accepting my answer
Greg
8 Comments
R G
7 Aug 2017
Sorry, I don't get your point. You mentioned the "nontraining val subset error rate": how can I calculate or obtain it, and why do we need it?
I already tried deleting the validation part, so that training is not stopped by the validation error. The whole process can then train longer and reach higher accuracy. So, is the validation part useless?
Greg Heath
7 Aug 2017
Edited: Greg Heath
7 Aug 2017
Reread my post.
What good does it do to get excellent performance on training data if the net doesn't generalize well and get good performance on ALL ( train + val + tst + unseen ) data ?
The nondesign test subset error rate is the unbiased estimate of net performance on unseen data.
The val subset helps prevent the nontraining error rate from getting too high.
So, why in the world would you want to delete it?
Hope this helps.
Greg
Firstly, I didn't know what validation is, so I deleted it to see the difference. After deleting it, the net generalizes well and performs better on both the training and the testing data (there is no validation part because I deleted it).
Secondly, my data set is small, so I need to try k-fold cross-validation, and by convention k-fold uses only 2 types of data (training and testing).
Thirdly, I don't know how that number (= 6) is obtained. Does it mean there are 6 errors in the validation data, or that 6% of the data is in error, or something else?
One more thing: do you have any reference for your point (paper, book, etc.) so that I can read about it in more detail and also convince my advisor not to delete it (he is the main reason I have to delete the validation part)? It would be very helpful.
Greg Heath
9 Aug 2017
1. Validation is a guard against overtraining an overfit net.
2. It is not necessary. However, it is so useful that it is a MATLAB default.
3. If you use a val subset and it stops training, look at the 2 NONTRAINING error rate curves and it will become obvious why training was stopped.
4. k-fold crossval can be done with a val set. I have posted several examples in the NEWSGROUP and/or ANSWERS.
5. Reread my post! Training stops if the val subset error increases CONTINUOUSLY for 6 (default) epochs. This is interpreted as the net becoming unable to accurately estimate outputs for nontraining (val + tst + unseen) data.
6. If you want to delete the val subset, then use BAYESIAN REGULARIZATION via TRAINBR.
7. This is information that has been known and used for decades. ANY decent book on NNs will explain it.
8. I'm sure you will find many discussions on it in the COMP.AI.NEURAL-NETS NEWSGROUP as well as any decent NN Text.
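Point 4 can be illustrated with one possible index scheme, sketched in Python. This is an assumption for illustration only, not the examples referred to in the post: in each of the k rounds one fold is the test subset, the next fold is the validation subset, and the remaining k-2 folds are the training subset. The function name `kfold_with_val` is hypothetical.

```python
def kfold_with_val(n_samples, k):
    """Return k (train, val, test) index splits.

    Assumed scheme: fold i is the test subset, fold (i + 1) mod k is the
    validation subset, and the remaining k - 2 folds form the training subset.
    """
    if k < 3:
        raise ValueError("need k >= 3 for separate train, val and test folds")
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]  # interleaved folds
    splits = []
    for i in range(k):
        j = (i + 1) % k
        test, val = folds[i], folds[j]
        train = sorted(
            idx for f, fold in enumerate(folds) if f not in (i, j) for idx in fold
        )
        splits.append((train, val, test))
    return splits


for train, val, test in kfold_with_val(10, 5):
    print(len(train), len(val), len(test))  # prints "6 2 2" five times
```

Every sample serves as test data in exactly one round and as validation data in exactly one round, so validation stopping can still be used even with a small data set.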
Hope this helps.
Greg
R G
9 Aug 2017
It's quite clear to me now, thank you!
ErikaZ
10 Jul 2018
Hi Greg, I am using DIVIDEBLOCK data division for my NARX net. Can you explain briefly why "val stopping is the most important for timeseries design using DIVIDEBLOCK data division"? Thanks.
Greg Heath
11 Jul 2018
Edited: Greg Heath
11 Jul 2018
Thanks for the heads up! Changed to:
OBVIOUSLY, the most UNBIASED approach for constant timestep timeseries prediction is to use DIVIDEBLOCK data division with the validation subset in the middle.
GREG
Moritz Hesse
2 May 2019
There is also a MathWorks article on this here: https://uk.mathworks.com/help/deeplearning/ug/train-and-apply-multilayer-neural-networks.html#bss331l-17