
Selection of Neural Network Training Data

Kamuran Turksoy, 4 May 2017
Answered: Greg Heath, 5 May 2017
Data can be divided into training, validation, and test sets and used to train a neural network model (regression, in my case). My question is: what if some data points in the training set impair the model's performance? Are there good ways to find such points and remove them from the training set?
I was thinking of using something similar to leave-one-out cross-validation:
1. Leave one data point out of the training set.
2. Train the model on the rest of the training set.
3. If the error on the validation (or test) set improves, discard the point.
4. Repeat for all data points until no further improvement is observed.
There are two problems with this method:
1. It will take a long time for large data sets.
2. Random initial weights make it harder to decide whether a point should be discarded. Fixing the initial weights with a seed may not give an optimal starting point either.
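The procedure described above can be sketched roughly as follows. This is only an illustration under loud assumptions: a deterministic least-squares line fit (`fit_line`) stands in for training the network, the data are made up, and the names `loo_screen` and `tol` are hypothetical. It also differs slightly from the steps as written: each pass drops the single point whose removal lowers the validation error the most, rather than the first point that helps.

```python
def fit_line(xs, ys):
    """Least-squares fit y = a*x + b. ASSUMPTION: stand-in for training a network.
    Needs at least two distinct x values."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def mse(model, xs, ys):
    """Mean squared error of the fitted line on (xs, ys)."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def loo_screen(train_x, train_y, val_x, val_y, tol=1e-6):
    """Repeatedly drop the training point whose removal most reduces validation MSE.

    Stops when no removal improves the validation error by more than tol.
    Returns the indices of the points kept and the final validation MSE.
    """
    keep = list(range(len(train_x)))

    def val_error(idx):
        model = fit_line([train_x[j] for j in idx], [train_y[j] for j in idx])
        return mse(model, val_x, val_y)

    best = val_error(keep)
    while len(keep) > 2:
        # One retraining per remaining point: this is the expensive part.
        trials = [(val_error([j for j in keep if j != i]), i) for i in keep]
        err, worst = min(trials)
        if best - err <= tol:
            break
        keep.remove(worst)
        best = err
    return keep, best


# Hypothetical data: y = x, except training point 4, which is corrupted.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.0, 1.0, 2.0, 3.0, 10.0, 5.0]
val_x = [0.5, 1.5, 2.5, 3.5]
val_y = [0.5, 1.5, 2.5, 3.5]

keep, best = loo_screen(train_x, train_y, val_x, val_y)
print("kept indices:", keep)
print("validation MSE:", best)
```

Each pass retrains once per remaining point, so the number of trainings grows roughly quadratically with the training set size, which is the first problem listed above. The stand-in fit is deterministic, so the second problem does not show up here; with a real network, one possible workaround (an assumption, not something tested in this thread) is to reuse the same seed for every retraining or to average the validation error over a few seeds. Scoring on the validation set rather than the test set keeps the test set available for an unbiased final check.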

Answers (1)

Greg Heath, 5 May 2017
Before training, obtain the mean and standard deviation of each input and target variable. Plot each variable and overlay horizontal lines at mean +/- m*std for m = 1:4.
Remove or modify the outliers.
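As a numeric counterpart to the overlay plots, here is a minimal sketch that flags values lying more than m standard deviations from the mean. The function name `flag_outliers` and the sample values are hypothetical, and it uses the sample standard deviation, which is an assumption about which estimate is meant.

```python
import statistics


def flag_outliers(values, m=3.0):
    """Return indices of values more than m sample standard deviations from the mean."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample standard deviation (n - 1 in the denominator)
    if sigma == 0:
        return []  # all values identical, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mu) > m * sigma]


# Hypothetical target values; the last one sits far from the rest.
targets = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.1, 0.9, 1.0, 1.2, 9.0]

# Mirror the m = 1:4 bands suggested above.
for m in (1, 2, 3, 4):
    print(m, flag_outliers(targets, m))
```

An extreme value inflates the mean and standard deviation it is compared against, so in a small sample it may fall inside the wider bands. Looking at several values of m, as suggested above, helps with that.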
Hope this helps
Thank you for formally accepting my answer
Greg.
