Optimize a Boosted Regression Ensemble

This example shows how to optimize the hyperparameters of a boosted regression ensemble. The optimization minimizes the cross-validation loss of the model.

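The problem is to model the fuel efficiency of an automobile, in miles per gallon, based on its acceleration, engine displacement, horsepower, and weight. Load the carsmall data, which contains these and other predictors.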

load carsmall
X = [Acceleration Displacement Horsepower Weight];  % predictors, one column each
Y = MPG;                                            % response: miles per gallon

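Fit a regression ensemble to the data using the LSBoost algorithm and surrogate splits. Optimize the resulting model by varying the number of learning cycles (NumLearningCycles), the maximum number of splits per tree (MaxNumSplits), and the learning rate (LearnRate). In addition, allow the optimization to repartition the cross-validation at every iteration.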
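Surrogate splits are typically used so that trees can handle observations with missing predictor values, so it can help to check how much of carsmall is missing before fitting. The following is a minimal sketch. The variable names numMissingResponse, numMissingPerPredictor, and numUsable are illustrative and are not part of the original example.

```matlab
% Recreate the predictor matrix and response used in this example.
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Y = MPG;

% Count missing values (NaN) in the response and in each predictor column.
numMissingResponse     = sum(isnan(Y))
numMissingPerPredictor = sum(isnan(X),1)   % order: Acceleration, Displacement, Horsepower, Weight

% Observations that have a response and can therefore be used for training.
numUsable = sum(~isnan(Y))
```

fitrensemble does not train on observations whose response is missing, so numUsable is an upper bound on the NumObservations value shown in the model display.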

For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.

rng('default')  % fix the random seed for reproducibility
Mdl = fitrensemble(X,Y, ...
    'Method','LSBoost', ...                        % least-squares boosting
    'Learner',templateTree('Surrogate','on'), ...  % trees with surrogate splits
    'OptimizeHyperparameters',{'NumLearningCycles','MaxNumSplits','LearnRate'}, ...
    'HyperparameterOptimizationOptions',struct( ...
        'Repartition',true, ...                    % repartition cross-validation at every iteration
        'AcquisitionFunctionName','expected-improvement-plus'))
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|    1 | Best   |      3.5219 |      10.455 |      3.5219 |      3.5219 |          383 |      0.51519 |            4 |
|    2 | Best   |      3.4752 |     0.69741 |      3.4752 |      3.4777 |           16 |      0.66503 |            7 |
|    3 | Best   |      3.1575 |     0.98093 |      3.1575 |      3.1575 |           33 |       0.2556 |           92 |
|    4 | Accept |      6.3076 |     0.44059 |      3.1575 |      3.1579 |           13 |    0.0053227 |            5 |
|    5 | Accept |      3.4449 |      7.1181 |      3.1575 |      3.1579 |          277 |      0.45891 |           99 |
|    6 | Accept |      3.9806 |      0.3954 |      3.1575 |      3.1584 |           10 |      0.13017 |           33 |
|    7 | Best   |       3.059 |      0.3028 |       3.059 |        3.06 |           10 |      0.30126 |            3 |
|    8 | Accept |      3.1707 |     0.39215 |       3.059 |      3.1144 |           10 |      0.28991 |           15 |
|    9 | Accept |      3.0937 |     0.33979 |       3.059 |      3.1046 |           10 |      0.31488 |           13 |
|   10 | Accept |       3.196 |     0.29743 |       3.059 |      3.1233 |           10 |      0.32005 |           11 |
|   11 | Best   |      3.0495 |      0.3101 |      3.0495 |      3.1083 |           10 |      0.27882 |           85 |
|   12 | Best   |       2.946 |     0.35846 |       2.946 |      3.0774 |           10 |      0.27157 |            7 |
|   13 | Accept |      3.2026 |     0.35964 |       2.946 |      3.0995 |           10 |      0.25734 |           20 |
|   14 | Accept |      5.7151 |      8.3193 |       2.946 |      3.0996 |          376 |     0.001001 |           43 |
|   15 | Accept |       3.207 |       11.35 |       2.946 |      3.0937 |          499 |     0.027394 |           18 |
|   16 | Accept |      3.8606 |     0.95907 |       2.946 |      3.0937 |           36 |     0.041427 |           12 |
|   17 | Accept |      3.2026 |      10.153 |       2.946 |       3.095 |          443 |     0.019836 |           76 |
|   18 | Accept |      3.4832 |      4.7346 |       2.946 |      3.0956 |          205 |      0.99989 |            8 |
|   19 | Accept |      5.6285 |      4.3078 |       2.946 |      3.0942 |          192 |    0.0022197 |            2 |
|   20 | Accept |      3.0896 |      4.4109 |       2.946 |      3.0938 |          188 |     0.023227 |           93 |
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|   21 | Accept |      3.1408 |      3.3598 |       2.946 |      3.0935 |          156 |      0.02324 |            5 |
|   22 | Accept |       4.691 |     0.39154 |       2.946 |      3.0941 |           12 |     0.076435 |            2 |
|   23 | Accept |      5.4686 |      1.2156 |       2.946 |      3.0935 |           50 |       0.0101 |           58 |
|   24 | Accept |      6.3759 |     0.64429 |       2.946 |      3.0893 |           23 |    0.0014716 |           22 |
|   25 | Accept |      6.1278 |      1.2505 |       2.946 |       3.094 |           47 |    0.0034406 |            2 |
|   26 | Accept |      5.9134 |     0.38206 |       2.946 |      3.0969 |           11 |     0.024712 |           12 |
|   27 | Accept |       3.401 |      3.4613 |       2.946 |      3.0995 |          151 |     0.067779 |            7 |
|   28 | Accept |      3.2757 |      4.4521 |       2.946 |      3.1009 |          198 |     0.032311 |            8 |
|   29 | Accept |      3.2296 |     0.60026 |       2.946 |      3.1023 |           17 |      0.30283 |           19 |
|   30 | Accept |      3.2385 |      1.9849 |       2.946 |      3.1027 |           83 |      0.21601 |           76 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 100.935 seconds
Total objective function evaluation time: 84.4244

Best observed feasible point:
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           10             0.27157          7      

Observed objective function value = 2.946
Estimated objective function value = 3.1219
Function evaluation time = 0.35846

Best estimated feasible point (according to models):
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           10             0.30126          3      

Estimated objective function value = 3.1027
Estimated function evaluation time = 0.34872

[Figure: Min objective vs. Number of function evaluations. x-axis: Function evaluations; y-axis: Min objective; lines: Min observed objective, Estimated min objective.]

Mdl = 
  RegressionEnsemble
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                      NumObservations: 94
    HyperparameterOptimizationResults: [1×1 BayesianOptimization]
                           NumTrained: 10
                               Method: 'LSBoost'
                         LearnerNames: {'Tree'}
                 ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                              FitInfo: [10×1 double]
                   FitInfoDescription: {2×1 cell}
                       Regularization: []


  Properties, Methods
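The Objective column in the iteration display is log(1 + loss), not the cross-validation loss itself. As a sketch, assuming Mdl from the optimized fitrensemble call is still in the workspace, the stored BayesianOptimization object can be queried and its objective values converted back to the loss scale. The names results, bestObservedLoss, and bestEstimatedLoss are illustrative.

```matlab
% Assumes Mdl was returned by the optimized fitrensemble call above.
results = Mdl.HyperparameterOptimizationResults;   % BayesianOptimization object

% Hyperparameter values at the best observed and best estimated points.
results.XAtMinObjective
results.XAtMinEstimatedObjective

% Undo the log(1 + loss) transform to get values on the loss scale.
bestObservedLoss  = exp(results.MinObjective) - 1
bestEstimatedLoss = exp(results.MinEstimatedObjective) - 1
```

For the run shown here, the best observed objective of 2.946 corresponds to a loss of about exp(2.946) - 1 ≈ 18.0.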

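Compare the 10-fold cross-validation loss of the optimized model with the loss of a boosted model that is not optimized and with the loss of the default ensemble.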

loss = kfoldLoss(crossval(Mdl,'kfold',10))
loss = 
19.2667
Mdl2 = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'));
loss2 = kfoldLoss(crossval(Mdl2,'kfold',10))
loss2 = 
30.4083
Mdl3 = fitrensemble(X,Y);
loss3 = kfoldLoss(crossval(Mdl3,'kfold',10))
loss3 = 
29.0495
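To put the three numbers on a common footing, the relative reduction in cross-validation loss can be computed directly. This is a minimal sketch that uses the values printed above as literals so that it runs on its own. The names reductionVsBoosted and reductionVsDefault are illustrative. Each crossval call draws its own random partition, so the values from another run can differ.

```matlab
% Losses reported above (10-fold cross-validation, kfoldLoss default loss).
loss  = 19.2667;   % optimized LSBoost ensemble
loss2 = 30.4083;   % LSBoost ensemble with surrogate splits, not optimized
loss3 = 29.0495;   % default ensemble

% Relative reduction achieved by the optimized model, in percent.
reductionVsBoosted = 100*(loss2 - loss)/loss2
reductionVsDefault = 100*(loss3 - loss)/loss3
```

For these values the reductions are roughly 37% and 34%, respectively.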

For a different way of optimizing this ensemble, see Optimize Regression Ensemble Using Cross-Validation.