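Optimize a Boosted Regression Ensemble
This example shows how to optimize the hyperparameters of a boosted regression ensemble. The optimization minimizes the cross-validation loss of the model.
The problem is to model the fuel economy of a car, in miles per gallon, from its acceleration, engine displacement, horsepower, and weight. Load the carsmall data, which contains these and other predictors.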
load carsmall
X = [Acceleration Displacement Horsepower Weight];
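Y = MPG;
Fit a regression ensemble to the data using the LSBoost algorithm and surrogate splits. Optimize the resulting model by varying the number of learning cycles (NumLearningCycles), the maximum number of splits per tree (MaxNumSplits), and the learning rate (LearnRate). In addition, allow the optimization to repartition the cross-validation at every iteration.
For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.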
rng('default')
Mdl = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'), ...
    'OptimizeHyperparameters',{'NumLearningCycles','MaxNumSplits','LearnRate'}, ...
    'HyperparameterOptimizationOptions',struct('Repartition',true, ...
    'AcquisitionFunctionName','expected-improvement-plus'))
|====================================================================================================================|
| Iter | Eval | Objective: | Objective | BestSoFar | BestSoFar | NumLearningC-| LearnRate | MaxNumSplits |
| | result | log(1+loss) | runtime | (observed) | (estim.) | ycles | | |
|====================================================================================================================|
| 1 | Best | 3.5219 | 7.7984 | 3.5219 | 3.5219 | 383 | 0.51519 | 4 |
| 2 | Best | 3.4752 | 0.5745 | 3.4752 | 3.4777 | 16 | 0.66503 | 7 |
| 3 | Best | 3.1575 | 0.80103 | 3.1575 | 3.1575 | 33 | 0.2556 | 92 |
| 4 | Accept | 6.3076 | 0.31943 | 3.1575 | 3.1579 | 13 | 0.0053227 | 5 |
| 5 | Accept | 3.4449 | 5.2646 | 3.1575 | 3.1579 | 277 | 0.45891 | 99 |
| 6 | Accept | 3.9806 | 0.28546 | 3.1575 | 3.1584 | 10 | 0.13017 | 33 |
| 7 | Best | 3.059 | 0.22775 | 3.059 | 3.06 | 10 | 0.30126 | 3 |
| 8 | Accept | 3.1707 | 0.25535 | 3.059 | 3.1144 | 10 | 0.28991 | 15 |
| 9 | Accept | 3.0937 | 0.28985 | 3.059 | 3.1046 | 10 | 0.31488 | 13 |
| 10 | Accept | 3.196 | 0.29561 | 3.059 | 3.1233 | 10 | 0.32005 | 11 |
| 11 | Best | 3.0495 | 0.26591 | 3.0495 | 3.1083 | 10 | 0.27882 | 85 |
| 12 | Best | 2.946 | 0.33824 | 2.946 | 3.0774 | 10 | 0.27157 | 7 |
| 13 | Accept | 3.2026 | 0.29892 | 2.946 | 3.0995 | 10 | 0.25734 | 20 |
| 14 | Accept | 5.7151 | 6.9632 | 2.946 | 3.0996 | 376 | 0.001001 | 43 |
| 15 | Accept | 3.207 | 9.7721 | 2.946 | 3.0937 | 499 | 0.027394 | 18 |
| 16 | Accept | 3.8606 | 0.792 | 2.946 | 3.0937 | 36 | 0.041427 | 12 |
| 17 | Accept | 3.2026 | 8.4811 | 2.946 | 3.095 | 443 | 0.019836 | 76 |
| 18 | Accept | 3.4832 | 3.7527 | 2.946 | 3.0956 | 205 | 0.99989 | 8 |
| 19 | Accept | 5.6285 | 3.2631 | 2.946 | 3.0942 | 192 | 0.0022197 | 2 |
| 20 | Accept | 3.0896 | 3.6001 | 2.946 | 3.0938 | 188 | 0.023227 | 93 |
|====================================================================================================================|
| Iter | Eval | Objective: | Objective | BestSoFar | BestSoFar | NumLearningC-| LearnRate | MaxNumSplits |
| | result | log(1+loss) | runtime | (observed) | (estim.) | ycles | | |
|====================================================================================================================|
| 21 | Accept | 3.1408 | 2.6992 | 2.946 | 3.0935 | 156 | 0.02324 | 5 |
| 22 | Accept | 4.691 | 0.32615 | 2.946 | 3.0941 | 12 | 0.076435 | 2 |
| 23 | Accept | 5.4686 | 0.91401 | 2.946 | 3.0935 | 50 | 0.0101 | 58 |
| 24 | Accept | 6.3759 | 0.50062 | 2.946 | 3.0893 | 23 | 0.0014716 | 22 |
| 25 | Accept | 6.1278 | 0.8852 | 2.946 | 3.094 | 47 | 0.0034406 | 2 |
| 26 | Accept | 5.9134 | 0.26115 | 2.946 | 3.0969 | 11 | 0.024712 | 12 |
| 27 | Accept | 3.401 | 2.7279 | 2.946 | 3.0995 | 151 | 0.067779 | 7 |
| 28 | Accept | 3.2757 | 3.4281 | 2.946 | 3.1009 | 198 | 0.032311 | 8 |
| 29 | Accept | 3.2296 | 0.39827 | 2.946 | 3.1023 | 17 | 0.30283 | 19 |
| 30 | Accept | 3.2385 | 1.5376 | 2.946 | 3.1027 | 83 | 0.21601 | 76 |
__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 82.1278 seconds
Total objective function evaluation time: 67.3176
Best observed feasible point:
NumLearningCycles LearnRate MaxNumSplits
_________________ _________ ____________
10 0.27157 7
Observed objective function value = 2.946
Estimated objective function value = 3.1219
Function evaluation time = 0.33824
Best estimated feasible point (according to models):
NumLearningCycles LearnRate MaxNumSplits
_________________ _________ ____________
10 0.30126 3
Estimated objective function value = 3.1027
Estimated function evaluation time = 0.28359

Mdl =
RegressionEnsemble
ResponseName: 'Y'
CategoricalPredictors: []
ResponseTransform: 'none'
NumObservations: 94
HyperparameterOptimizationResults: [1×1 BayesianOptimization]
NumTrained: 10
Method: 'LSBoost'
LearnerNames: {'Tree'}
ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
FitInfo: [10×1 double]
FitInfoDescription: {2×1 cell}
Regularization: []
Properties, Methods
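The Objective column in the optimization display is labeled log(1+loss), so its values are not on the same scale as a mean squared error. The following is a minimal sketch of the conversion back to the loss scale. The objective values are copied from the display above, and the variable names (bestObserved, bestEstimated, and so on) are illustrative assumptions, not part of the original example.

```matlab
% Objective values copied from the optimization display above.
bestObserved  = 2.946;    % "Observed objective function value"
bestEstimated = 3.1027;   % "Estimated objective function value" at the best estimated point

% The objective is log(1 + loss), so invert it with expm1(x) = exp(x) - 1.
mseObserved  = expm1(bestObserved);
mseEstimated = expm1(bestEstimated);
fprintf('Observed loss:  %.4f\n', mseObserved);
fprintf('Estimated loss: %.4f\n', mseEstimated);

% If Mdl from the example is in the workspace, the same minimum can be
% read from the stored BayesianOptimization object instead of hard-coding it.
if exist('Mdl','var')
    results = Mdl.HyperparameterOptimizationResults;
    fprintf('Loss from MinObjective: %.4f\n', expm1(results.MinObjective));
end
```

These converted values come from the partitions drawn during the optimization. A later cross-validation of the same model uses a new random partition, so its loss need not match them exactly.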
Compare the loss of the optimized model with the loss of a boosted model that is not optimized and with the loss of the default ensemble.
loss = kfoldLoss(crossval(Mdl,'kfold',10))
loss = 19.2667
Mdl2 = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'));
loss2 = kfoldLoss(crossval(Mdl2,'kfold',10))
loss2 = 30.4083
Mdl3 = fitrensemble(X,Y);
loss3 = kfoldLoss(crossval(Mdl3,'kfold',10))
loss3 = 29.0495
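To put the three cross-validated losses on a common footing, it can help to express the improvement as a relative reduction. The sketch below uses the loss values printed above. The variable names are illustrative assumptions. Each loss comes from a single random 10-fold partition, so the percentages are rough indications rather than precise estimates.

```matlab
% Cross-validated losses copied from the output above.
lossOptimized = 19.2667;  % optimized LSBoost ensemble (Mdl)
lossBoosted   = 30.4083;  % LSBoost with default hyperparameters (Mdl2)
lossDefault   = 29.0495;  % default fitrensemble model (Mdl3)

% Relative reduction in loss achieved by the optimized model.
reductionVsBoosted = 1 - lossOptimized/lossBoosted;
reductionVsDefault = 1 - lossOptimized/lossDefault;

fprintf('Reduction vs. unoptimized LSBoost: %.1f%%\n', 100*reductionVsBoosted);
fprintf('Reduction vs. default ensemble:    %.1f%%\n', 100*reductionVsDefault);
```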
For a different way of optimizing this ensemble, see Optimize Regression Ensemble Using Cross-Validation.