Main Content

このページの翻訳は最新ではありません。ここをクリックして、英語の最新版を参照してください。

ブースティング回帰アンサンブル回帰の最適化

この例では、ブースティング回帰アンサンブルのハイパーパラメーターを最適化する方法を示します。最適化によりモデルの交差検証損失を最小化します。

問題は、自動車の加速度、エンジン排気量、馬力および重量に基づいてガロンあたりの走行マイル数で燃費をモデル化することです。これらの予測子および他の予測子が含まれている carsmall データを読み込みます。

load carsmall
X = [Acceleration Displacement Horsepower Weight];
Y = MPG;

LSBoost アルゴリズムと代理分岐を使用してアンサンブル回帰をデータに当てはめます。学習サイクル数、代理分岐の最大数および学習率を変化させることにより、生成されたモデルを最適化します。さらに、最適化で反復ごとに交差検証を再分割できるようにします。

再現性を得るために、乱数シードを設定し、'expected-improvement-plus' の獲得関数を使用します。

rng('default')
Mdl = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'), ...
    'OptimizeHyperparameters',{'NumLearningCycles','MaxNumSplits','LearnRate'}, ...
    'HyperparameterOptimizationOptions',struct('Repartition',true, ...
    'AcquisitionFunctionName','expected-improvement-plus'))
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|    1 | Best   |      3.5219 |      23.094 |      3.5219 |      3.5219 |          383 |      0.51519 |            4 |
|    2 | Best   |      3.4752 |       1.058 |      3.4752 |      3.4777 |           16 |      0.66503 |            7 |
|    3 | Best   |      3.1575 |      1.7661 |      3.1575 |      3.1575 |           33 |       0.2556 |           92 |
|    4 | Accept |      6.3076 |     0.75453 |      3.1575 |      3.1579 |           13 |    0.0053227 |            5 |
|    5 | Accept |      3.4449 |      12.089 |      3.1575 |      3.1579 |          277 |      0.45891 |           99 |
|    6 | Accept |      3.9806 |     0.53574 |      3.1575 |      3.1584 |           10 |      0.13017 |           33 |
|    7 | Best   |       3.059 |     0.50434 |       3.059 |        3.06 |           10 |      0.30126 |            3 |
|    8 | Accept |      3.1707 |     0.55084 |       3.059 |      3.1144 |           10 |      0.28991 |           15 |
|    9 | Accept |      3.0937 |      1.7613 |       3.059 |      3.1046 |           10 |      0.31488 |           13 |
|   10 | Accept |       3.196 |      0.5447 |       3.059 |      3.1233 |           10 |      0.32005 |           11 |
|   11 | Best   |      3.0495 |     0.54281 |      3.0495 |      3.1083 |           10 |      0.27882 |           85 |
|   12 | Best   |       2.946 |     0.95546 |       2.946 |      3.0774 |           10 |      0.27157 |            7 |
|   13 | Accept |      3.2026 |     0.56077 |       2.946 |      3.0995 |           10 |      0.25734 |           20 |
|   14 | Accept |      5.7151 |      14.815 |       2.946 |      3.0996 |          376 |     0.001001 |           43 |
|   15 | Accept |       3.207 |      19.833 |       2.946 |      3.0937 |          499 |     0.027394 |           18 |
|   16 | Accept |      3.8606 |      2.9077 |       2.946 |      3.0937 |           36 |     0.041427 |           12 |
|   17 | Accept |      3.2026 |      18.062 |       2.946 |       3.095 |          443 |     0.019836 |           76 |
|   18 | Accept |      3.4832 |      8.5124 |       2.946 |      3.0956 |          205 |      0.99989 |            8 |
|   19 | Accept |      5.6285 |      7.2496 |       2.946 |      3.0942 |          192 |    0.0022197 |            2 |
|   20 | Accept |      3.0896 |      9.8514 |       2.946 |      3.0938 |          188 |     0.023227 |           93 |
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|   21 | Accept |      3.1408 |      6.2065 |       2.946 |      3.0935 |          156 |      0.02324 |            5 |
|   22 | Accept |       4.691 |     0.64575 |       2.946 |      3.0941 |           12 |     0.076435 |            2 |
|   23 | Accept |      5.4686 |      2.3084 |       2.946 |      3.0935 |           50 |       0.0101 |           58 |
|   24 | Accept |      6.3759 |      1.0393 |       2.946 |      3.0893 |           23 |    0.0014716 |           22 |
|   25 | Accept |      6.1278 |      1.9418 |       2.946 |       3.094 |           47 |    0.0034406 |            2 |
|   26 | Accept |      5.9134 |     0.61777 |       2.946 |      3.0969 |           11 |     0.024712 |           12 |
|   27 | Accept |       3.401 |      6.0447 |       2.946 |      3.0995 |          151 |     0.067779 |            7 |
|   28 | Accept |      3.2757 |      8.3803 |       2.946 |      3.1009 |          198 |     0.032311 |            8 |
|   29 | Accept |      3.2296 |      1.0905 |       2.946 |      3.1023 |           17 |      0.30283 |           19 |
|   30 | Accept |      3.2385 |      3.4949 |       2.946 |      3.1027 |           83 |      0.21601 |           76 |

Figure contains an axes object. The axes object with title Min objective vs. Number of function evaluations contains 2 objects of type line. These objects represent Min observed objective, Estimated min objective.

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 192.7954 seconds
Total objective function evaluation time: 157.7177

Best observed feasible point:
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           10             0.27157          7      

Observed objective function value = 2.946
Estimated objective function value = 3.1219
Function evaluation time = 0.95546

Best estimated feasible point (according to models):
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           10             0.30126          3      

Estimated objective function value = 3.1027
Estimated function evaluation time = 0.66784
Mdl = 
  RegressionEnsemble
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                      NumObservations: 94
    HyperparameterOptimizationResults: [1x1 BayesianOptimization]
                           NumTrained: 10
                               Method: 'LSBoost'
                         LearnerNames: {'Tree'}
                 ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                              FitInfo: [10x1 double]
                   FitInfoDescription: {2x1 cell}
                       Regularization: []


  Properties, Methods

この損失を、最適化されていないブースティングされたモデルの損失および既定のアンサンブルの損失と比較します。

loss = kfoldLoss(crossval(Mdl,'kfold',10))
loss = 19.2667
Mdl2 = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'));
loss2 = kfoldLoss(crossval(Mdl2,'kfold',10))
loss2 = 30.4083
Mdl3 = fitrensemble(X,Y);
loss3 = kfoldLoss(crossval(Mdl3,'kfold',10))
loss3 = 29.0495

このアンサンブルを最適化する別の方法については、交差検証の使用によるアンサンブル回帰の最適化を参照してください。