Optimize a Boosted Regression Ensemble

This example shows how to optimize hyperparameters of a boosted regression ensemble. The optimization minimizes the cross-validation loss of the model.

The problem is to model the fuel efficiency, in miles per gallon, of an automobile based on its acceleration, engine displacement, horsepower, and weight. Load the carsmall data, which contains these and other predictors.

load carsmall
X = [Acceleration Displacement Horsepower Weight];
Y = MPG;
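
The carsmall variables include missing (NaN) entries, and fitrensemble drops observations whose response is NaN, which is why the fitted ensemble below reports 94 observations rather than 100. (Surrogate splits, enabled below, let the trees use rows whose predictors contain NaN.) As an optional check before training, a minimal sketch such as the following counts the missing values; the variable names are purely illustrative.

numMissingResponse = sum(isnan(Y))                    % responses that fitrensemble drops
numRowsWithMissingPredictor = sum(any(isnan(X),2))    % rows with any missing predictor value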

Fit a regression ensemble to the data using the LSBoost algorithm and surrogate splits. Optimize the resulting model by varying the number of learning cycles, the maximum number of decision splits per tree (MaxNumSplits), and the learning rate. In addition, allow the optimization to repartition the cross-validation at every iteration.

For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.

rng('default')
Mdl = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'), ...
    'OptimizeHyperparameters',{'NumLearningCycles','MaxNumSplits','LearnRate'}, ...
    'HyperparameterOptimizationOptions',struct('Repartition',true, ...
    'AcquisitionFunctionName','expected-improvement-plus'))
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|    1 | Best   |      3.5219 |      21.016 |      3.5219 |      3.5219 |          383 |      0.51519 |            4 |
|    2 | Best   |      3.4752 |     0.98008 |      3.4752 |      3.4777 |           16 |      0.66503 |            7 |
|    3 | Best   |      3.1575 |      1.3284 |      3.1575 |      3.1575 |           33 |       0.2556 |           92 |
|    4 | Accept |      6.3076 |     0.54695 |      3.1575 |      3.1579 |           13 |    0.0053227 |            5 |
|    5 | Accept |      3.4449 |       10.44 |      3.1575 |      3.1579 |          277 |      0.45891 |           99 |
|    6 | Accept |      3.9806 |      2.0242 |      3.1575 |      3.1584 |           10 |      0.13017 |           33 |
|    7 | Best   |       3.059 |     0.38494 |       3.059 |        3.06 |           10 |      0.30126 |            3 |
|    8 | Accept |      3.1707 |      1.9261 |       3.059 |      3.1144 |           10 |      0.28991 |           15 |
|    9 | Accept |      3.0937 |     0.77032 |       3.059 |      3.1046 |           10 |      0.31488 |           13 |
|   10 | Accept |       3.196 |     0.48281 |       3.059 |      3.1233 |           10 |      0.32005 |           11 |
|   11 | Best   |      3.0495 |     0.49506 |      3.0495 |      3.1083 |           10 |      0.27882 |           85 |
|   12 | Best   |       2.946 |     0.59174 |       2.946 |      3.0774 |           10 |      0.27157 |            7 |
|   13 | Accept |      3.2026 |      0.4401 |       2.946 |      3.0995 |           10 |      0.25734 |           20 |
|   14 | Accept |       5.595 |      14.031 |       2.946 |      3.0996 |          440 |    0.0010008 |           36 |
|   15 | Accept |      3.1976 |      16.678 |       2.946 |      3.0935 |          496 |     0.027133 |           18 |
|   16 | Accept |      3.9809 |      1.1919 |       2.946 |      3.0927 |           34 |     0.041016 |           18 |
|   17 | Accept |      3.0512 |      12.984 |       2.946 |      3.0939 |          428 |     0.019766 |            3 |
|   18 | Accept |      3.4832 |      7.0974 |       2.946 |      3.0946 |          205 |      0.99989 |            8 |
|   19 | Accept |      3.3389 |      2.9724 |       2.946 |      3.0956 |           95 |     0.021453 |            2 |
|   20 | Accept |      3.2818 |      16.374 |       2.946 |      3.0979 |          494 |     0.020773 |           12 |
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|   21 | Accept |      3.4367 |      18.473 |       2.946 |      3.0962 |          480 |      0.27412 |            7 |
|   22 | Accept |      6.2247 |     0.47359 |       2.946 |      3.0995 |           10 |     0.010965 |           15 |
|   23 | Accept |      3.2847 |      6.0674 |       2.946 |      3.0991 |          181 |     0.057422 |           22 |
|   24 | Accept |       3.142 |      7.4902 |       2.946 |      3.0997 |          222 |     0.025594 |           25 |
|   25 | Accept |      3.2174 |     0.88694 |       2.946 |       3.106 |           18 |      0.32203 |           37 |
|   26 | Accept |       3.064 |      3.6397 |       2.946 |      3.1057 |          108 |      0.18554 |            1 |
|   27 | Accept |      3.4532 |      3.7091 |       2.946 |      3.1038 |           93 |      0.22441 |            3 |
|   28 | Accept |      3.1992 |      7.8344 |       2.946 |      3.1038 |          252 |     0.020628 |            3 |
|   29 | Best   |      2.9432 |     0.55603 |      2.9432 |      3.0766 |           10 |      0.36141 |           86 |
|   30 | Best   |       2.891 |     0.53217 |       2.891 |           3 |           10 |      0.38339 |            2 |

Figure: Min objective vs. Number of function evaluations, plotting the minimum observed objective and the estimated minimum objective.

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 227.0038 seconds
Total objective function evaluation time: 162.4163

Best observed feasible point:
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           10             0.38339          2      

Observed objective function value = 2.891
Estimated objective function value = 2.9674
Function evaluation time = 0.53217

Best estimated feasible point (according to models):
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           10             0.30126          3      

Estimated objective function value = 3
Estimated function evaluation time = 0.65359
Mdl = 
  RegressionEnsemble
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                      NumObservations: 94
    HyperparameterOptimizationResults: [1x1 BayesianOptimization]
                           NumTrained: 10
                               Method: 'LSBoost'
                         LearnerNames: {'Tree'}
                 ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                              FitInfo: [10x1 double]
                   FitInfoDescription: {2x1 cell}
                       Regularization: []


  Properties, Methods
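
The per-iteration details of the optimization are stored in the HyperparameterOptimizationResults property as a BayesianOptimization object. If you want to query them programmatically, a minimal sketch such as the following retrieves the best points and redraws the minimum-objective trace; bestPoint, the XAtMinObjective and XAtMinEstimatedObjective properties, and the plotMinObjective plot function are part of the BayesianOptimization interface.

results = Mdl.HyperparameterOptimizationResults;   % BayesianOptimization object

xObs = results.XAtMinObjective             % best observed point (as in the summary above)
xEst = results.XAtMinEstimatedObjective    % best estimated point (as in the summary above)
xBest = bestPoint(results)                 % best point under the default bestPoint criterion

plot(results,@plotMinObjective)            % redraw the min-objective trace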

Compare the cross-validation loss of the optimized ensemble to that of an unoptimized boosted model and to that of the default ensemble.

loss = kfoldLoss(crossval(Mdl,'kfold',10))
loss = 20.6082
Mdl2 = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'));
loss2 = kfoldLoss(crossval(Mdl2,'kfold',10))
loss2 = 36.4539
Mdl3 = fitrensemble(X,Y);
loss3 = kfoldLoss(crossval(Mdl3,'kfold',10))
loss3 = 36.6756

For an alternative way to optimize this ensemble, see Optimize Regression Ensemble Using Cross-Validation.
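
That approach searches over tree complexity and learning rate with explicitly cross-validated ensembles instead of Bayesian optimization. The following is only a rough sketch of that pattern, not the code from that page, and the grid values are purely illustrative.

% Sketch: grid search over tree depth and learning rate using 10-fold cross-validation.
maxNumSplits = [1 5 20 80];        % illustrative candidate values
learnRate    = [0.1 0.25 0.5 1];   % illustrative candidate values
cvLoss = zeros(numel(maxNumSplits),numel(learnRate));
for i = 1:numel(maxNumSplits)
    for j = 1:numel(learnRate)
        t = templateTree('MaxNumSplits',maxNumSplits(i),'Surrogate','on');
        cvMdl = fitrensemble(X,Y,'Method','LSBoost','Learners',t, ...
            'LearnRate',learnRate(j),'CrossVal','on');
        cvLoss(i,j) = kfoldLoss(cvMdl);   % 10-fold cross-validation loss by default
    end
end
[minCVLoss,idx] = min(cvLoss(:));
[iBest,jBest] = ind2sub(size(cvLoss),idx);   % indices of the best grid point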