Automated Regression Model Selection Using Bayesian and ASHA Optimization
This example shows how to use the fitrauto function to automatically try a selection of regression model types with different hyperparameter values, given training predictor and response data. By default, the function uses Bayesian optimization to select and assess models. If your training data set contains many observations, you can use an asynchronous successive halving algorithm (ASHA) instead. After the optimization is complete, fitrauto returns the model, trained on the entire data set, that is expected to best predict the responses for new data. Check the model performance on a test data set.
Prepare Data
Load the sample data set NYCHousing2015, which contains 10 variables with information on the sales of properties in New York City in 2015. This example uses some of these variables to analyze the sale prices.
load NYCHousing2015
Instead of loading the sample data set NYCHousing2015, you can download the data from the NYC Open Data website and import it as follows.
folder = 'Annualized_Rolling_Sales_Update';
ds = spreadsheetDatastore(folder,"TextType","string","NumHeaderLines",4);
ds.Files = ds.Files(contains(ds.Files,"2015"));
ds.SelectedVariableNames = ["BOROUGH","NEIGHBORHOOD","BUILDINGCLASSCATEGORY","RESIDENTIALUNITS", ...
    "COMMERCIALUNITS","LANDSQUAREFEET","GROSSSQUAREFEET","YEARBUILT","SALEPRICE","SALEDATE"];
NYCHousing2015 = readall(ds);
Preprocess the data set to choose the predictor variables of interest. Some of the preprocessing steps match those in the example Train Linear Regression Model.
First, change the variable names to lowercase for readability.
NYCHousing2015.Properties.VariableNames = lower(NYCHousing2015.Properties.VariableNames);
Next, remove samples that have certain problematic values. For example, keep only those samples where at least one of the area measurements grosssquarefeet or landsquarefeet is nonzero. Assume that a saleprice of $0 indicates an ownership transfer without a cash consideration, and remove the samples with that saleprice value. Also assume that a yearbuilt value of 1500 or less is a typo, and remove the corresponding samples.
NYCHousing2015(NYCHousing2015.grosssquarefeet == 0 & NYCHousing2015.landsquarefeet == 0,:) = [];
NYCHousing2015(NYCHousing2015.saleprice == 0,:) = [];
NYCHousing2015(NYCHousing2015.yearbuilt <= 1500,:) = [];
Convert the saledate variable, specified as a datetime array, into two numeric columns MM (month) and DD (day), and then remove the saledate variable. Ignore the year values because all the samples are from 2015.
[~,NYCHousing2015.MM,NYCHousing2015.DD] = ymd(NYCHousing2015.saledate);
NYCHousing2015.saledate = [];
The numeric values in the borough variable indicate the names of the boroughs. Change the variable to a categorical variable that uses the borough names.
NYCHousing2015.borough = categorical(NYCHousing2015.borough,1:5, ...
    ["Manhattan","Bronx","Brooklyn","Queens","Staten Island"]);
The neighborhood variable has 254 categories. Remove this variable for simplicity.
NYCHousing2015.neighborhood = [];
Convert the buildingclasscategory variable to a categorical variable, and explore the variable by using the wordcloud function.
NYCHousing2015.buildingclasscategory = categorical(NYCHousing2015.buildingclasscategory);
wordcloud(NYCHousing2015.buildingclasscategory);
Assume that you are interested only in one-, two-, and three-family dwellings. Find the sample indices for these dwellings and delete the other samples. Then, change the buildingclasscategory variable to an ordinal categorical variable with integer-valued category names.
idx = ismember(string(NYCHousing2015.buildingclasscategory), ...
    ["01 ONE FAMILY DWELLINGS","02 TWO FAMILY DWELLINGS","03 THREE FAMILY DWELLINGS"]);
NYCHousing2015 = NYCHousing2015(idx,:);
NYCHousing2015.buildingclasscategory = categorical(NYCHousing2015.buildingclasscategory, ...
    ["01 ONE FAMILY DWELLINGS","02 TWO FAMILY DWELLINGS","03 THREE FAMILY DWELLINGS"], ...
    ["1","2","3"],'Ordinal',true);
The buildingclasscategory variable now indicates the number of families in one dwelling.
Explore the response variable saleprice by using the summary function.
s = summary(NYCHousing2015);
s.saleprice
ans = struct with fields:
Size: [24972 1]
Type: 'double'
Description: ''
Units: ''
Continuity: []
Min: 1
Median: 515000
Max: 37000000
NumMissing: 0
Create a histogram of the saleprice variable.
histogram(NYCHousing2015.saleprice)
Because the distribution of saleprice values is right-skewed and all the values are greater than 0, log-transform the saleprice variable.
NYCHousing2015.saleprice = log(NYCHousing2015.saleprice);
Similarly, transform the grosssquarefeet and landsquarefeet variables. Add 1 to each variable before taking the logarithm, in case the variable is equal to 0.
NYCHousing2015.grosssquarefeet = log(1 + NYCHousing2015.grosssquarefeet);
NYCHousing2015.landsquarefeet = log(1 + NYCHousing2015.landsquarefeet);
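As an optional check (not part of the original example), you can plot a histogram of one of the log-transformed predictors, for example grosssquarefeet, to confirm that the transform reduced the skew:
% Optional: check the distribution of a log-transformed predictor
histogram(NYCHousing2015.grosssquarefeet)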
Partition Data and Remove Outliers
Partition the data set into a training set and a test set by using cvpartition. Use approximately 80% of the observations for the model selection and hyperparameter tuning process, and the remaining 20% to test the performance of the final model returned by fitrauto.
rng("default") % For reproducibility of the partition
c = cvpartition(length(NYCHousing2015.saleprice),"Holdout",0.2);
trainData = NYCHousing2015(training(c),:);
testData = NYCHousing2015(test(c),:);
Identify and remove the outliers of saleprice, grosssquarefeet, and landsquarefeet from the training data by using the isoutlier function.
[priceIdx,priceL,priceU] = isoutlier(trainData.saleprice);
trainData(priceIdx,:) = [];
[grossIdx,grossL,grossU] = isoutlier(trainData.grosssquarefeet);
trainData(grossIdx,:) = [];
[landIdx,landL,landU] = isoutlier(trainData.landsquarefeet);
trainData(landIdx,:) = [];
Remove the outliers of saleprice, grosssquarefeet, and landsquarefeet from the test data by using the same lower and upper thresholds computed on the training data.
testData(testData.saleprice < priceL | testData.saleprice > priceU,:) = [];
testData(testData.grosssquarefeet < grossL | testData.grosssquarefeet > grossU,:) = [];
testData(testData.landsquarefeet < landL | testData.landsquarefeet > landU,:) = [];
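As a quick sanity check (not part of the original example), you can confirm how many observations remain in each partition after removing the outliers:
% Optional: number of remaining training and test observations
height(trainData)
height(testData)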
Use Automated Model Selection with Bayesian Optimization
Find an appropriate regression model for the data in trainData by using fitrauto. By default, fitrauto uses Bayesian optimization to select models and their hyperparameter values, and computes the log(1 + valLoss) value for each model, where valLoss is the cross-validation mean squared error (MSE). fitrauto provides a plot of the optimization and an iterative display of the optimization results. For more information on how to interpret these results, see Verbose Display.
Specify to run the Bayesian optimization in parallel, which requires Parallel Computing Toolbox™. Because of the nonreproducibility of parallel timing, parallel Bayesian optimization does not necessarily yield reproducible results. Depending on the complexity of the optimization, this process can take some time, especially for larger data sets.
bayesianOptions = struct("UseParallel",true);
[bayesianMdl,bayesianResults] = fitrauto(trainData,"saleprice", ...
    "HyperparameterOptimizationOptions",bayesianOptions);
Warning: Data set has more than 10000 observations. Because ASHA optimization often finds good solutions faster than Bayesian optimization for data sets with many observations, try specifying the 'Optimizer' field value as 'asha' in the 'HyperparameterOptimizationOptions' value structure.
Copying objective function to workers... Done copying objective function to workers. Learner types to explore: ensemble, svm, tree Total iterations (MaxObjectiveEvaluations): 90 Total time (MaxTime): Inf |==========================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Estimated min | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | validation loss | | | |==========================================================================================================================================================| | 1 | 8 | Best | 0.25922 | 8.7966 | 0.25922 | 0.25922 | svm | BoxConstraint: 0.0055914 | | | | | | | | | | KernelScale: 0.0056086 | | | | | | | | | | Epsilon: 17.88 | | 2 | 7 | Accept | 0.19644 | 67.356 | 0.19314 | 0.19521 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 232 | | | | | | | | | | MinLeafSize: 8 | | 3 | 7 | Best | 0.19314 | 67.33 | 0.19314 | 0.19521 | svm | BoxConstraint: 529.96 | | | | | | | | | | KernelScale: 813.67 | | | | | | | | | | Epsilon: 0.0014318 | | 4 | 8 | Accept | 0.19662 | 75.495 | 0.19314 | 0.19521 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 271 | | | | | | | | | | MinLeafSize: 53 | | 5 | 8 | Best | 0.18769 | 79.998 | 0.18769 | 0.1877 | svm | BoxConstraint: 23.501 | | | | | | | | | | KernelScale: 37.99 | | | | | | | | | | Epsilon: 0.0072166 | | 6 | 8 | Accept | 0.20198 | 67.278 | 0.18769 | 0.1877 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 246 | | | | | | | | | | MinLeafSize: 1114 | | 7 | 8 | Accept | 0.20227 | 71.042 | 0.18769 | 0.1877 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 246 | | | | | | | | | | MinLeafSize: 1114 | | 8 | 8 | Accept | 0.29931 | 30.061 | 0.18769 | 0.1877 | tree | MinLeafSize: 2 | | 9 | 8 | Best | 0.18737 | 101.93 | 0.18737 | 0.1874 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 297 | | | | | | | | | | MinLeafSize: 3220 | | 10 | 8 | Accept | 0.25922 | 8.4803 | 0.18737 | 0.1874 | svm | BoxConstraint: 0.31228 | | | | | | | | | | KernelScale: 73.3 | | | | | | | | | | Epsilon: 2.1891 | | 11 | 8 | Accept | 0.25922 | 7.6613 | 0.18737 | 0.1874 | svm | BoxConstraint: 107.75 | | | | | | | | | | KernelScale: 414.93 | | | | | | | | | | Epsilon: 27.903 | | 12 | 8 | Accept | 0.19582 | 62.053 | 0.18737 | 0.18742 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 247 | | | | | | | | | | MinLeafSize: 4243 | | 13 | 8 | Accept | 0.18795 | 1.6154 | 0.18737 | 0.18742 | tree | MinLeafSize: 219 | | 14 | 8 | Best | 0.17764 | 256.31 | 0.17764 | 0.17767 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 275 | | | | | | | | | | MinLeafSize: 4 | | 15 | 8 | Accept | 0.1971 | 59.641 | 0.17764 | 0.17767 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 208 | | | | | | | | | | MinLeafSize: 210 | | 16 | 8 | Accept | 0.19855 | 1.8433 | 0.17764 | 0.17767 | tree | MinLeafSize: 895 | | 17 | 8 | Accept | 0.18966 | 78.082 | 0.17764 | 0.17767 | svm | BoxConstraint: 18.072 | | | | | | | | | | KernelScale: 48.632 | | | | | | | | | | Epsilon: 0.014558 | | 18 | 8 | Accept | 0.18558 | 1.0007 | 0.17764 | 0.17767 | tree | MinLeafSize: 81 | | 19 | 8 | Accept | 0.21098 | 3.0171 | 0.17764 | 0.17767 | tree | MinLeafSize: 12 | | 20 | 8 | Best | 0.17762 | 292.86 | 0.17762 | 0.17765 | ensemble | Method: LSBoost | | | | | | | | | | 
NumLearningCycles: 299 | | | | | | | | | | MinLeafSize: 161 | |==========================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Estimated min | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | validation loss | | | |==========================================================================================================================================================| | 21 | 8 | Accept | 0.23354 | 76.519 | 0.17762 | 0.17765 | svm | BoxConstraint: 0.0045714 | | | | | | | | | | KernelScale: 31.869 | | | | | | | | | | Epsilon: 0.0072361 | | 22 | 8 | Accept | 0.27791 | 16.397 | 0.17762 | 0.17765 | tree | MinLeafSize: 3 | | 23 | 8 | Accept | 0.20705 | 0.56716 | 0.17762 | 0.17765 | tree | MinLeafSize: 1381 | | 24 | 8 | Accept | 0.25951 | 8.5641 | 0.17762 | 0.17765 | tree | MinLeafSize: 4 | | 25 | 8 | Accept | 0.1853 | 103.97 | 0.17762 | 0.17765 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 218 | | | | | | | | | | MinLeafSize: 2260 | | 26 | 8 | Best | 0.17748 | 234.83 | 0.17748 | 0.17795 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 227 | | | | | | | | | | MinLeafSize: 161 | | 27 | 8 | Accept | 0.21866 | 47.523 | 0.17748 | 0.17756 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 239 | | | | | | | | | | MinLeafSize: 2731 | | 28 | 8 | Best | 0.17744 | 209.05 | 0.17744 | 0.17723 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 209 | | | | | | | | | | MinLeafSize: 12 | | 29 | 8 | Accept | 0.23155 | 5.0007 | 0.17744 | 0.17723 | tree | MinLeafSize: 7 | | 30 | 8 | Accept | 0.25922 | 9.2475 | 0.17744 | 0.17723 | svm | BoxConstraint: 404.64 | | | | | | | | | | KernelScale: 3.2648 | | | | | | | | | | Epsilon: 1.9718 | | 31 | 8 | Accept | 0.1856 | 223.47 | 0.17744 | 0.17723 | svm | BoxConstraint: 169.91 | | | | | | | | | | KernelScale: 27.071 | | | | | | | | | | Epsilon: 0.0098403 | | 32 | 8 | Accept | 0.23949 | 8.5208 | 0.17744 | 0.17723 | tree | MinLeafSize: 6 | | 33 | 8 | Accept | 0.25922 | 7.5558 | 0.17744 | 0.17723 | svm | BoxConstraint: 1.3089 | | | | | | | | | | KernelScale: 0.051591 | | | | | | | | | | Epsilon: 10.5 | | 34 | 8 | Accept | 0.29931 | 49.086 | 0.17744 | 0.17723 | tree | MinLeafSize: 2 | | 35 | 8 | Accept | 0.19293 | 2.0938 | 0.17744 | 0.17723 | tree | MinLeafSize: 421 | | 36 | 8 | Accept | 0.2433 | 44.756 | 0.17744 | 0.17745 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 213 | | | | | | | | | | MinLeafSize: 5333 | | 37 | 8 | Accept | 0.21113 | 0.58255 | 0.17744 | 0.17745 | tree | MinLeafSize: 2018 | | 38 | 8 | Accept | 0.178 | 196.2 | 0.17744 | 0.17745 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 200 | | | | | | | | | | MinLeafSize: 530 | | 39 | 8 | Accept | 0.25922 | 0.15808 | 0.17744 | 0.17745 | tree | MinLeafSize: 9074 | | 40 | 8 | Accept | 0.18727 | 1.3591 | 0.17744 | 0.17745 | tree | MinLeafSize: 46 | |==========================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Estimated min | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | validation loss | | | 
|==========================================================================================================================================================| | 41 | 8 | Accept | 0.18556 | 1.1831 | 0.17744 | 0.17745 | tree | MinLeafSize: 106 | | 42 | 8 | Accept | 0.18534 | 1.2318 | 0.17744 | 0.17745 | tree | MinLeafSize: 91 | | 43 | 8 | Accept | 0.18634 | 0.78251 | 0.17744 | 0.17745 | tree | MinLeafSize: 69 | | 44 | 8 | Accept | 0.18657 | 0.66041 | 0.17744 | 0.17745 | tree | MinLeafSize: 127 | | 45 | 8 | Accept | 0.1859 | 1.1918 | 0.17744 | 0.17745 | tree | MinLeafSize: 71 | | 46 | 8 | Accept | 0.19423 | 89.074 | 0.17744 | 0.17745 | svm | BoxConstraint: 111.04 | | | | | | | | | | KernelScale: 660.47 | | | | | | | | | | Epsilon: 0.011798 | | 47 | 8 | Accept | 0.18592 | 1.2115 | 0.17744 | 0.17745 | tree | MinLeafSize: 111 | | 48 | 8 | Accept | 0.18682 | 1.6234 | 0.17744 | 0.17745 | tree | MinLeafSize: 143 | | 49 | 8 | Best | 0.17736 | 276.94 | 0.17736 | 0.17735 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 254 | | | | | | | | | | MinLeafSize: 330 | | 50 | 8 | Accept | 0.18845 | 2.9137 | 0.17736 | 0.17735 | tree | MinLeafSize: 41 | | 51 | 8 | Accept | 0.18563 | 2.2093 | 0.17736 | 0.17735 | tree | MinLeafSize: 80 | | 52 | 8 | Accept | 0.18529 | 0.84567 | 0.17736 | 0.17735 | tree | MinLeafSize: 82 | | 53 | 8 | Accept | 0.18529 | 0.98317 | 0.17736 | 0.17735 | tree | MinLeafSize: 83 | | 54 | 8 | Accept | 0.19472 | 1.9906 | 0.17736 | 0.17735 | tree | MinLeafSize: 25 | | 55 | 8 | Accept | 0.22651 | 0.65124 | 0.17736 | 0.17735 | tree | MinLeafSize: 4236 | | 56 | 8 | Accept | 0.33688 | 103.3 | 0.17736 | 0.17735 | tree | MinLeafSize: 1 | | 57 | 8 | Accept | 0.18636 | 1.2646 | 0.17736 | 0.17735 | tree | MinLeafSize: 67 | | 58 | 8 | Best | 0.17725 | 212.81 | 0.17725 | 0.17725 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 221 | | | | | | | | | | MinLeafSize: 63 | | 59 | 8 | Accept | 0.18521 | 1.2055 | 0.17725 | 0.17725 | tree | MinLeafSize: 99 | | 60 | 8 | Accept | 0.18521 | 1.5858 | 0.17725 | 0.17725 | tree | MinLeafSize: 97 | |==========================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Estimated min | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | validation loss | | | |==========================================================================================================================================================| | 61 | 8 | Accept | 0.18545 | 1.5226 | 0.17725 | 0.17725 | tree | MinLeafSize: 96 | | 62 | 8 | Accept | 0.18547 | 0.87251 | 0.17725 | 0.17725 | tree | MinLeafSize: 95 | | 63 | 8 | Accept | 0.19011 | 1.0096 | 0.17725 | 0.17725 | tree | MinLeafSize: 291 | | 64 | 8 | Accept | 0.1949 | 1.1552 | 0.17725 | 0.17725 | tree | MinLeafSize: 598 | | 65 | 8 | Accept | 0.18745 | 1.2691 | 0.17725 | 0.17725 | tree | MinLeafSize: 175 | | 66 | 8 | Accept | 0.1867 | 1.1783 | 0.17725 | 0.17725 | tree | MinLeafSize: 56 | | 67 | 8 | Accept | 0.18534 | 1.4406 | 0.17725 | 0.17725 | tree | MinLeafSize: 91 | | 68 | 8 | Accept | 0.18592 | 1.183 | 0.17725 | 0.17725 | tree | MinLeafSize: 111 | | 69 | 8 | Accept | 0.18535 | 1.0641 | 0.17725 | 0.17725 | tree | MinLeafSize: 89 | | 70 | 8 | Accept | 0.18535 | 1.2021 | 0.17725 | 0.17725 | tree | MinLeafSize: 89 | | 71 | 8 | Accept | 0.19073 | 2.2491 | 0.17725 | 0.17725 | tree | MinLeafSize: 35 | | 72 | 8 | 
Accept | 0.18662 | 1.7733 | 0.17725 | 0.17725 | tree | MinLeafSize: 57 | | 73 | 8 | Accept | 0.18534 | 1.7077 | 0.17725 | 0.17725 | tree | MinLeafSize: 91 | | 74 | 8 | Accept | 0.17749 | 237.6 | 0.17725 | 0.17725 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 234 | | | | | | | | | | MinLeafSize: 291 | | 75 | 8 | Accept | 0.1854 | 1.3993 | 0.17725 | 0.17725 | tree | MinLeafSize: 93 | | 76 | 8 | Accept | 0.18516 | 2.5983 | 0.17725 | 0.17725 | tree | MinLeafSize: 85 | | 77 | 8 | Accept | 0.18519 | 1.0102 | 0.17725 | 0.17725 | tree | MinLeafSize: 100 | | 78 | 8 | Accept | 0.18518 | 0.85859 | 0.17725 | 0.17725 | tree | MinLeafSize: 87 | | 79 | 8 | Accept | 0.18545 | 0.74629 | 0.17725 | 0.17725 | tree | MinLeafSize: 96 | | 80 | 8 | Accept | 0.18516 | 0.93654 | 0.17725 | 0.17725 | tree | MinLeafSize: 84 | |==========================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Estimated min | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | validation loss | | | |==========================================================================================================================================================| | 81 | 8 | Accept | 0.18523 | 1.1649 | 0.17725 | 0.17725 | tree | MinLeafSize: 88 | | 82 | 8 | Accept | 0.18719 | 1.7177 | 0.17725 | 0.17725 | tree | MinLeafSize: 157 | | 83 | 8 | Accept | 0.18545 | 1.704 | 0.17725 | 0.17725 | tree | MinLeafSize: 96 | | 84 | 8 | Accept | 0.18529 | 0.95989 | 0.17725 | 0.17725 | tree | MinLeafSize: 82 | | 85 | 8 | Accept | 0.18535 | 0.95307 | 0.17725 | 0.17725 | tree | MinLeafSize: 89 | | 86 | 8 | Accept | 0.18596 | 1.1768 | 0.17725 | 0.17725 | tree | MinLeafSize: 110 | | 87 | 8 | Accept | 0.18518 | 1.3797 | 0.17725 | 0.17725 | tree | MinLeafSize: 86 | | 88 | 8 | Accept | 0.18535 | 0.89804 | 0.17725 | 0.17725 | tree | MinLeafSize: 89 | | 89 | 8 | Accept | 0.18572 | 303.75 | 0.17725 | 0.17725 | svm | BoxConstraint: 205.71 | | | | | | | | | | KernelScale: 26.184 | | | | | | | | | | Epsilon: 0.0010342 | | 90 | 8 | Accept | 0.18562 | 1.5575 | 0.17725 | 0.17725 | tree | MinLeafSize: 79 |
__________________________________________________________ Optimization completed. Total iterations: 90 Total elapsed time: 940.6075 seconds Total time for training and validation: 3869.0047 seconds Best observed learner is an ensemble model with: Learner: ensemble Method: LSBoost NumLearningCycles: 221 MinLeafSize: 63 Observed log(1 + valLoss): 0.17725 Time for training and validation: 212.8107 seconds Best estimated learner (returned model) is an ensemble model with: Learner: ensemble Method: LSBoost NumLearningCycles: 221 MinLeafSize: 63 Estimated log(1 + valLoss): 0.17725 Estimated time for training and validation: 212.9539 seconds Documentation for fitrauto display
The Total elapsed time value shows that the Bayesian optimization took a while to run (about 16 minutes).
The final model returned by fitrauto corresponds to the best estimated learner. Before returning the model, the function retrains it using the entire training data set (trainData), the listed Learner (or model) type, and the displayed hyperparameter values.
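If you want to inspect the returned model directly (an optional step, not part of the original example), you can display it; the displayed properties should be consistent with the learner type and hyperparameter values reported in the optimization summary:
% Optional: display the model returned by the Bayesian optimization run
disp(bayesianMdl)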
Use Automated Model Selection with ASHA Optimization
When fitrauto with Bayesian optimization takes a long time to run because of the number of observations in the training set, consider using fitrauto with ASHA optimization instead. Because trainData contains more than 10,000 observations, try using fitrauto with ASHA optimization to automatically find an appropriate regression model. When you use fitrauto with ASHA optimization, the function randomly chooses several models with different hyperparameter values and trains them on a small subset of the training data. If the log(1 + valLoss) value for a particular model is promising, where valLoss is the cross-validation MSE, the model is promoted and trained on a larger amount of the training data. This process repeats, and successful models are trained on progressively larger amounts of data. By default, fitrauto provides a plot of the optimization and an iterative display of the optimization results. For more information on how to interpret these results, see Verbose Display.
Specify to run the ASHA optimization in parallel. Note that ASHA optimization often involves more iterations than the default Bayesian optimization. If you have a time constraint, you can specify the MaxTime field of the HyperparameterOptimizationOptions structure to limit the number of seconds that fitrauto runs.
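For example, a time-limited call might use options like the following sketch, where the 3600-second budget is an illustrative value rather than part of the original example:
% Sketch only: limit the ASHA optimization to roughly one hour (3600 seconds)
timedOptions = struct("Optimizer","asha","UseParallel",true,"MaxTime",3600);
% Pass timedOptions to fitrauto in place of ashaOptions used below.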
ashaOptions = struct("Optimizer","asha","UseParallel",true);
[ashaMdl,ashaResults] = fitrauto(trainData,"saleprice", ...
    "HyperparameterOptimizationOptions",ashaOptions);
Copying objective function to workers... Done copying objective function to workers. Learner types to explore: ensemble, svm, tree Total iterations (MaxObjectiveEvaluations): 340 Total time (MaxTime): Inf |=======================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Training set | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | size | | | |=======================================================================================================================================================| | 1 | 7 | Error | NaN | 0.74354 | 0.25939 | 228 | svm | BoxConstraint: 0.75271 | | | | | | | | | | KernelScale: 11.791 | | | | | | | | | | Epsilon: 0.70708 | | 2 | 7 | Best | 0.25939 | 0.6809 | 0.25939 | 228 | svm | BoxConstraint: 322.3 | | | | | | | | | | KernelScale: 183.2 | | | | | | | | | | Epsilon: 18.839 | | 3 | 4 | Error | NaN | 1.3032 | 0.20407 | 228 | svm | BoxConstraint: 0.097665 | | | | | | | | | | KernelScale: 15.388 | | | | | | | | | | Epsilon: 0.0088338 | | 4 | 4 | Error | NaN | 0.99145 | 0.20407 | 228 | svm | BoxConstraint: 0.23529 | | | | | | | | | | KernelScale: 0.0053637 | | | | | | | | | | Epsilon: 0.19924 | | 5 | 4 | Error | NaN | 0.96507 | 0.20407 | 228 | svm | BoxConstraint: 0.22674 | | | | | | | | | | KernelScale: 80.959 | | | | | | | | | | Epsilon: 1.3516 | | 6 | 4 | Best | 0.20407 | 1.2793 | 0.20407 | 228 | tree | MinLeafSize: 7 | | 7 | 7 | Accept | 0.26031 | 0.20035 | 0.20407 | 228 | svm | BoxConstraint: 0.020147 | | | | | | | | | | KernelScale: 172.03 | | | | | | | | | | Epsilon: 23.989 | | 8 | 7 | Accept | 0.21268 | 0.5432 | 0.20407 | 228 | tree | MinLeafSize: 2 | | 9 | 8 | Best | 0.19076 | 1.3514 | 0.19076 | 910 | tree | MinLeafSize: 7 | | 10 | 8 | Accept | 0.20199 | 1.5091 | 0.19076 | 228 | svm | BoxConstraint: 0.0010751 | | | | | | | | | | KernelScale: 1.1093 | | | | | | | | | | Epsilon: 0.0079776 | | 11 | 8 | Accept | 0.25956 | 0.4022 | 0.19076 | 228 | tree | MinLeafSize: 6369 | | 12 | 8 | Accept | 0.1994 | 0.19641 | 0.19076 | 228 | tree | MinLeafSize: 20 | | 13 | 8 | Accept | 0.24111 | 11.229 | 0.19076 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 209 | | | | | | | | | | MinLeafSize: 95 | | 14 | 8 | Best | 0.19072 | 0.40043 | 0.19072 | 910 | tree | MinLeafSize: 20 | | 15 | 8 | Accept | 0.25943 | 0.18893 | 0.19072 | 228 | tree | MinLeafSize: 239 | | 16 | 8 | Accept | 0.25931 | 14.082 | 0.19072 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 234 | | | | | | | | | | MinLeafSize: 3498 | | 17 | 7 | Accept | 0.2316 | 22.039 | 0.19072 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 289 | | | | | | | | | | MinLeafSize: 65 | | 18 | 7 | Accept | 0.19145 | 21.99 | 0.19072 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 221 | | | | | | | | | | MinLeafSize: 2 | | 19 | 8 | Accept | 0.25944 | 20.756 | 0.19072 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 239 | | | | | | | | | | MinLeafSize: 4727 | | 20 | 8 | Accept | 0.2593 | 0.28174 | 0.19072 | 228 | svm | BoxConstraint: 235.91 | | | | | | | | | | KernelScale: 152.29 | | | | | | | | | | Epsilon: 18.94 | |=======================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time 
for training | Observed min | Training set | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | size | | | |=======================================================================================================================================================| | 21 | 8 | Accept | 0.4238 | 16.204 | 0.19072 | 228 | svm | BoxConstraint: 159.02 | | | | | | | | | | KernelScale: 809.99 | | | | | | | | | | Epsilon: 0.037815 | | 22 | 7 | Accept | 0.19826 | 26.317 | 0.19072 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 260 | | | | | | | | | | MinLeafSize: 4 | | 23 | 7 | Accept | 0.25943 | 0.15328 | 0.19072 | 228 | tree | MinLeafSize: 469 | | 24 | 7 | Accept | 0.19506 | 21.139 | 0.19072 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 289 | | | | | | | | | | MinLeafSize: 2 | | 25 | 8 | Best | 0.18635 | 16.44 | 0.18635 | 910 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 221 | | | | | | | | | | MinLeafSize: 2 | | 26 | 8 | Accept | 0.20324 | 23.523 | 0.18635 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 293 | | | | | | | | | | MinLeafSize: 1 | | 27 | 8 | Accept | 0.2593 | 0.41755 | 0.18635 | 228 | svm | BoxConstraint: 71.635 | | | | | | | | | | KernelScale: 360.15 | | | | | | | | | | Epsilon: 1.6391 | | 28 | 8 | Accept | 0.1979 | 20.326 | 0.18635 | 910 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 260 | | | | | | | | | | MinLeafSize: 4 | | 29 | 8 | Best | 0.18429 | 27.503 | 0.18429 | 910 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 289 | | | | | | | | | | MinLeafSize: 2 | | 30 | 8 | Error | NaN | 0.85989 | 0.18429 | 228 | svm | BoxConstraint: 0.0015051 | | | | | | | | | | KernelScale: 153.62 | | | | | | | | | | Epsilon: 0.39629 | | 31 | 8 | Accept | 0.25996 | 0.33645 | 0.18429 | 228 | svm | BoxConstraint: 26.844 | | | | | | | | | | KernelScale: 0.0013803 | | | | | | | | | | Epsilon: 0.63605 | | 32 | 8 | Accept | 0.21217 | 0.65386 | 0.18429 | 228 | tree | MinLeafSize: 2 | | 33 | 8 | Error | NaN | 61.857 | 0.18429 | 228 | svm | BoxConstraint: 0.76664 | | | | | | | | | | KernelScale: 0.26621 | | | | | | | | | | Epsilon: 0.0062126 | | 34 | 8 | Accept | 0.2595 | 0.54994 | 0.18429 | 228 | tree | MinLeafSize: 452 | | 35 | 8 | Accept | 3.9362 | 72.511 | 0.18429 | 228 | svm | BoxConstraint: 0.16539 | | | | | | | | | | KernelScale: 0.10362 | | | | | | | | | | Epsilon: 0.0028173 | | 36 | 8 | Accept | 0.19261 | 2.1563 | 0.18429 | 910 | svm | BoxConstraint: 0.0010751 | | | | | | | | | | KernelScale: 1.1093 | | | | | | | | | | Epsilon: 0.0079776 | | 37 | 8 | Accept | 0.2592 | 0.17352 | 0.18429 | 228 | tree | MinLeafSize: 5784 | | 38 | 8 | Accept | 0.25932 | 0.33198 | 0.18429 | 228 | svm | BoxConstraint: 9.1472 | | | | | | | | | | KernelScale: 0.0014485 | | | | | | | | | | Epsilon: 0.013142 | | 39 | 8 | Accept | 4.0201 | 95.532 | 0.18429 | 228 | svm | BoxConstraint: 0.0034677 | | | | | | | | | | KernelScale: 0.024607 | | | | | | | | | | Epsilon: 0.51376 | | 40 | 8 | Accept | 0.25946 | 14.029 | 0.18429 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 233 | | | | | | | | | | MinLeafSize: 1217 | |=======================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Training set | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| 
validation loss | size | | | |=======================================================================================================================================================| | 41 | 8 | Best | 0.17949 | 55.465 | 0.17949 | 3639 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 221 | | | | | | | | | | MinLeafSize: 2 | | 42 | 8 | Accept | 0.25919 | 0.47484 | 0.17949 | 228 | svm | BoxConstraint: 0.0012342 | | | | | | | | | | KernelScale: 1.9096 | | | | | | | | | | Epsilon: 10.912 | | 43 | 8 | Accept | 0.19872 | 16.731 | 0.17949 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 284 | | | | | | | | | | MinLeafSize: 5 | | 44 | 8 | Accept | 8.752 | 78.427 | 0.17949 | 228 | svm | BoxConstraint: 0.0038233 | | | | | | | | | | KernelScale: 0.1099 | | | | | | | | | | Epsilon: 0.021148 | | 45 | 8 | Accept | 0.25934 | 13.151 | 0.17949 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 291 | | | | | | | | | | MinLeafSize: 1016 | | 46 | 8 | Accept | 0.25921 | 10.067 | 0.17949 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 227 | | | | | | | | | | MinLeafSize: 8012 | | 47 | 8 | Error | NaN | 0.83663 | 0.17949 | 228 | svm | BoxConstraint: 2.8936 | | | | | | | | | | KernelScale: 7.6973 | | | | | | | | | | Epsilon: 0.010032 | | 48 | 8 | Error | NaN | 93.522 | 0.17949 | 228 | svm | BoxConstraint: 0.0057789 | | | | | | | | | | KernelScale: 0.024173 | | | | | | | | | | Epsilon: 0.0019218 | | 49 | 8 | Accept | 0.19661 | 26.107 | 0.17949 | 910 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 293 | | | | | | | | | | MinLeafSize: 1 | | 50 | 8 | Accept | 0.25921 | 0.27531 | 0.17949 | 228 | svm | BoxConstraint: 0.058053 | | | | | | | | | | KernelScale: 14.827 | | | | | | | | | | Epsilon: 13.791 | | 51 | 8 | Error | NaN | 0.59973 | 0.17949 | 228 | svm | BoxConstraint: 0.023521 | | | | | | | | | | KernelScale: 5.596 | | | | | | | | | | Epsilon: 0.0014762 | | 52 | 8 | Accept | 4.3906 | 99.781 | 0.17949 | 228 | svm | BoxConstraint: 96.756 | | | | | | | | | | KernelScale: 0.010139 | | | | | | | | | | Epsilon: 0.13254 | | 53 | 8 | Error | NaN | 2.0696 | 0.17949 | 228 | svm | BoxConstraint: 0.006626 | | | | | | | | | | KernelScale: 0.70401 | | | | | | | | | | Epsilon: 0.0054568 | | 54 | 8 | Accept | 0.25924 | 15.37 | 0.17949 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 290 | | | | | | | | | | MinLeafSize: 2231 | | 55 | 8 | Error | NaN | 0.31071 | 0.17949 | 228 | svm | BoxConstraint: 361.12 | | | | | | | | | | KernelScale: 52.988 | | | | | | | | | | Epsilon: 0.43709 | | 56 | 8 | Error | NaN | 2.0388 | 0.17949 | 228 | svm | BoxConstraint: 16.409 | | | | | | | | | | KernelScale: 3.8514 | | | | | | | | | | Epsilon: 0.023638 | | 57 | 8 | Accept | 0.20898 | 0.93287 | 0.17949 | 910 | tree | MinLeafSize: 2 | | 58 | 8 | Accept | 0.20038 | 0.35381 | 0.17949 | 228 | tree | MinLeafSize: 8 | | 59 | 8 | Accept | 0.25945 | 17.341 | 0.17949 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 273 | | | | | | | | | | MinLeafSize: 688 | | 60 | 8 | Error | NaN | 64.494 | 0.17949 | 228 | svm | BoxConstraint: 0.11582 | | | | | | | | | | KernelScale: 0.34549 | | | | | | | | | | Epsilon: 0.16015 | |=======================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Training set | Learner | Hyperparameter: Value | | | workers | result | | 
& validation (sec)| validation loss | size | | | |=======================================================================================================================================================| | 61 | 7 | Accept | 0.25938 | 11.039 | 0.17949 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 207 | | | | | | | | | | MinLeafSize: 2893 | | 62 | 7 | Accept | 0.22949 | 0.29853 | 0.17949 | 228 | tree | MinLeafSize: 77 | | 63 | 8 | Accept | 0.19119 | 0.70442 | 0.17949 | 910 | tree | MinLeafSize: 8 | | 64 | 8 | Accept | 0.18582 | 25.838 | 0.17949 | 910 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 284 | | | | | | | | | | MinLeafSize: 5 | | 65 | 8 | Accept | 0.21762 | 20.878 | 0.17949 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 202 | | | | | | | | | | MinLeafSize: 38 | | 66 | 8 | Error | NaN | 73.825 | 0.17949 | 228 | svm | BoxConstraint: 913.22 | | | | | | | | | | KernelScale: 0.38887 | | | | | | | | | | Epsilon: 0.15596 | | 67 | 8 | Accept | 0.25935 | 0.33883 | 0.17949 | 228 | tree | MinLeafSize: 150 | | 68 | 8 | Accept | 0.20006 | 23.908 | 0.17949 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 249 | | | | | | | | | | MinLeafSize: 2 | | 69 | 8 | Accept | 0.20364 | 33.513 | 0.17949 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 287 | | | | | | | | | | MinLeafSize: 15 | | 70 | 8 | Accept | 0.20016 | 20.232 | 0.17949 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 259 | | | | | | | | | | MinLeafSize: 6 | | 71 | 8 | Accept | 0.25946 | 0.16791 | 0.17949 | 228 | tree | MinLeafSize: 6893 | | 72 | 8 | Accept | 0.35187 | 0.63625 | 0.17949 | 228 | svm | BoxConstraint: 0.19105 | | | | | | | | | | KernelScale: 84.991 | | | | | | | | | | Epsilon: 0.073344 | | 73 | 8 | Accept | 0.20327 | 15.236 | 0.17949 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 211 | | | | | | | | | | MinLeafSize: 2 | | 74 | 8 | Error | NaN | 72.02 | 0.17949 | 228 | svm | BoxConstraint: 0.23518 | | | | | | | | | | KernelScale: 0.53603 | | | | | | | | | | Epsilon: 0.011066 | | 75 | 8 | Accept | 0.26049 | 0.33939 | 0.17949 | 228 | svm | BoxConstraint: 0.0013512 | | | | | | | | | | KernelScale: 0.0015726 | | | | | | | | | | Epsilon: 24.722 | | 76 | 8 | Error | NaN | 0.85688 | 0.17949 | 228 | svm | BoxConstraint: 843.32 | | | | | | | | | | KernelScale: 98.622 | | | | | | | | | | Epsilon: 0.0013207 | | 77 | 8 | Accept | 0.25939 | 0.24487 | 0.17949 | 228 | svm | BoxConstraint: 288.52 | | | | | | | | | | KernelScale: 0.0011806 | | | | | | | | | | Epsilon: 0.12918 | | 78 | 8 | Accept | 0.19746 | 21.241 | 0.17949 | 910 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 249 | | | | | | | | | | MinLeafSize: 2 | | 79 | 8 | Accept | 0.25967 | 0.36212 | 0.17949 | 228 | svm | BoxConstraint: 0.86126 | | | | | | | | | | KernelScale: 0.80732 | | | | | | | | | | Epsilon: 3.6131 | | 80 | 8 | Error | NaN | 30.648 | 0.17949 | 228 | svm | BoxConstraint: 0.014789 | | | | | | | | | | KernelScale: 10.262 | | | | | | | | | | Epsilon: 0.00053097 | |=======================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Training set | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | size | | | 
|=======================================================================================================================================================| | 81 | 8 | Best | 0.17835 | 69.425 | 0.17835 | 3639 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 289 | | | | | | | | | | MinLeafSize: 2 | | 82 | 8 | Error | NaN | 0.70287 | 0.17835 | 228 | svm | BoxConstraint: 0.044119 | | | | | | | | | | KernelScale: 725.24 | | | | | | | | | | Epsilon: 0.067068 | | 83 | 8 | Accept | 0.25922 | 0.20654 | 0.17835 | 228 | tree | MinLeafSize: 5151 | | 84 | 8 | Accept | 0.18422 | 21.378 | 0.17835 | 910 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 259 | | | | | | | | | | MinLeafSize: 6 | | 85 | 8 | Accept | 0.25956 | 15.603 | 0.17835 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 220 | | | | | | | | | | MinLeafSize: 398 | | 86 | 8 | Accept | 0.25925 | 16.649 | 0.17835 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 287 | | | | | | | | | | MinLeafSize: 3704 | | 87 | 8 | Accept | 0.19717 | 20.535 | 0.17835 | 910 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 211 | | | | | | | | | | MinLeafSize: 2 | | 88 | 8 | Accept | 0.25922 | 14.481 | 0.17835 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 215 | | | | | | | | | | MinLeafSize: 4480 | | 89 | 8 | Accept | 0.25923 | 0.31075 | 0.17835 | 228 | svm | BoxConstraint: 93.534 | | | | | | | | | | KernelScale: 0.0012628 | | | | | | | | | | Epsilon: 0.00070881 | | 90 | 8 | Error | NaN | 105.27 | 0.17835 | 228 | svm | BoxConstraint: 0.002754 | | | | | | | | | | KernelScale: 0.030396 | | | | | | | | | | Epsilon: 0.0049664 | | 91 | 8 | Accept | 0.38786 | 1.3545 | 0.17835 | 228 | svm | BoxConstraint: 59.578 | | | | | | | | | | KernelScale: 7.0125 | | | | | | | | | | Epsilon: 0.048114 | | 92 | 8 | Error | NaN | 20.814 | 0.17835 | 228 | svm | BoxConstraint: 16.856 | | | | | | | | | | KernelScale: 0.0069656 | | | | | | | | | | Epsilon: 0.00079872 | | 93 | 7 | Accept | 0.25921 | 16.582 | 0.17835 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 275 | | | | | | | | | | MinLeafSize: 779 | | 94 | 7 | Accept | 0.2592 | 0.15883 | 0.17835 | 228 | tree | MinLeafSize: 5053 | | 95 | 7 | Accept | 0.29146 | 1.0903 | 0.17835 | 228 | svm | BoxConstraint: 0.0029396 | | | | | | | | | | KernelScale: 35.64 | | | | | | | | | | Epsilon: 0.0034305 | | 96 | 8 | Accept | 0.41923 | 0.56162 | 0.17835 | 228 | svm | BoxConstraint: 0.034261 | | | | | | | | | | KernelScale: 9.1273 | | | | | | | | | | Epsilon: 0.04355 | | 97 | 8 | Accept | 0.20525 | 0.70228 | 0.17835 | 910 | tree | MinLeafSize: 2 | | 98 | 8 | Accept | 0.20139 | 0.2252 | 0.17835 | 228 | tree | MinLeafSize: 12 | | 99 | 8 | Accept | 0.25923 | 0.21183 | 0.17835 | 228 | svm | BoxConstraint: 0.076547 | | | | | | | | | | KernelScale: 1.3896 | | | | | | | | | | Epsilon: 5.7928 | | 100 | 8 | Error | NaN | 1.1784 | 0.17835 | 228 | svm | BoxConstraint: 103.69 | | | | | | | | | | KernelScale: 380.67 | | | | | | | | | | Epsilon: 0.023201 | |=======================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Training set | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | size | | | 
|=======================================================================================================================================================| | 101 | 8 | Accept | 0.44687 | 0.86774 | 0.17835 | 228 | svm | BoxConstraint: 0.011037 | | | | | | | | | | KernelScale: 464.93 | | | | | | | | | | Epsilon: 0.01088 | | 102 | 8 | Accept | 0.19127 | 0.46502 | 0.17835 | 910 | tree | MinLeafSize: 12 | | 103 | 8 | Accept | 3.9177 | 105.84 | 0.17835 | 228 | svm | BoxConstraint: 0.18091 | | | | | | | | | | KernelScale: 0.0093375 | | | | | | | | | | Epsilon: 0.0046786 | | 104 | 8 | Error | NaN | 0.33372 | 0.17835 | 228 | svm | BoxConstraint: 0.3297 | | | | | | | | | | KernelScale: 60.67 | | | | | | | | | | Epsilon: 1.522 | | 105 | 8 | Accept | 0.21268 | 0.30804 | 0.17835 | 228 | tree | MinLeafSize: 46 | | 106 | 8 | Accept | 0.19508 | 0.63479 | 0.17835 | 228 | svm | BoxConstraint: 141.35 | | | | | | | | | | KernelScale: 51.798 | | | | | | | | | | Epsilon: 0.0064846 | | 107 | 8 | Accept | 0.25922 | 0.28154 | 0.17835 | 228 | svm | BoxConstraint: 111.07 | | | | | | | | | | KernelScale: 0.010862 | | | | | | | | | | Epsilon: 2.691 | | 108 | 8 | Accept | 0.2592 | 13.479 | 0.17835 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 289 | | | | | | | | | | MinLeafSize: 163 | | 109 | 7 | Accept | 0.19161 | 1.6643 | 0.17835 | 910 | svm | BoxConstraint: 141.35 | | | | | | | | | | KernelScale: 51.798 | | | | | | | | | | Epsilon: 0.0064846 | | 110 | 7 | Accept | 0.25926 | 0.23349 | 0.17835 | 228 | svm | BoxConstraint: 0.0014645 | | | | | | | | | | KernelScale: 0.37849 | | | | | | | | | | Epsilon: 2.0091 | | 111 | 8 | Accept | 0.25923 | 0.21702 | 0.17835 | 228 | svm | BoxConstraint: 46.088 | | | | | | | | | | KernelScale: 0.0015015 | | | | | | | | | | Epsilon: 0.30073 | | 112 | 8 | Accept | 0.19687 | 25.947 | 0.17835 | 910 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 287 | | | | | | | | | | MinLeafSize: 15 | | 113 | 8 | Accept | 0.17871 | 51.27 | 0.17835 | 3639 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 259 | | | | | | | | | | MinLeafSize: 6 | | 114 | 8 | Accept | 0.20081 | 17.879 | 0.17835 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 278 | | | | | | | | | | MinLeafSize: 2 | | 115 | 8 | Accept | 0.20322 | 17.346 | 0.17835 | 228 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 255 | | | | | | | | | | MinLeafSize: 4 | | 116 | 8 | Error | NaN | 0.95447 | 0.17835 | 228 | svm | BoxConstraint: 2.9117 | | | | | | | | | | KernelScale: 16.756 | | | | | | | | | | Epsilon: 0.0023456 | | 117 | 8 | Accept | 0.19387 | 13.294 | 0.17835 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 215 | | | | | | | | | | MinLeafSize: 1 | | 118 | 8 | Accept | 0.19425 | 15.035 | 0.17835 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 212 | | | | | | | | | | MinLeafSize: 8 | | 119 | 8 | Accept | 0.25924 | 0.37346 | 0.17835 | 228 | svm | BoxConstraint: 0.0209 | | | | | | | | | | KernelScale: 9.3689 | | | | | | | | | | Epsilon: 26.54 | | 120 | 8 | Accept | 0.26066 | 0.27477 | 0.17835 | 228 | tree | MinLeafSize: 272 | |=======================================================================================================================================================| | Iter | Active | Eval | log(1+valLoss)| Time for training | Observed min | Training set | Learner | Hyperparameter: Value | | | workers | result | | & validation (sec)| validation loss | size | | | 
|=======================================================================================================================================================| | 121 | 8 | Accept | 0.19484 | 18.004 | 0.17835 | 228 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 261 | | | | | | | | | | MinLeafSize: 14 | | 122 | 8 | Accept | 0.2592 | 0.23541 | 0.17835 | 228 | tree | MinLeafSize: 133 | | 123 | 8 | Accept | 0.25921 | 0.29134 | 0.17835 | 228 | svm | BoxConstraint: 1.5995 | | | | | | | | | | KernelScale: 2.8676 | | | | | | | | | | Epsilon: 15.471 | | 124 | 8 | Accept | 0.23223 | 0.43343 | 0.17835 | 228 | tree | MinLeafSize: 1 | | 125 | 8 | Accept | 0.25972 | 0.203 | 0.17835 | 228 | svm | BoxConstraint: 0.0086335 | | | | | | | | | | KernelScale: 400.4 | | | | | | | | | | Epsilon: 2.0501 | | 126 | 8 | Error | NaN | 0.2949 | 0.17835 | 228 | svm | BoxConstraint: 7.4426 | | | | | | | | | | KernelScale: 0.002509 | | | | | | | | | | Epsilon: 0.0026332 | | 127 | 8 | Accept | 0.26011 | 0.29631 | 0.17835 | 228 | svm | BoxConstraint: 0.11427 | | | | | | | | | | KernelScale: 567.97 | | | | | | | | | | Epsilon: 17.13 | | 128 | 8 | Accept | 0.25923 | 0.32762 | 0.17835 | 228 | svm | BoxConstraint: 83.085 | | | | | | | | | | KernelScale: 0.0012722 | | | | | | | | | | Epsilon: 0.0023782 | | 129 | 8 | Accept | 0.19582 | 20.926 | 0.17835 | 910 | ensemble | Method: Bag | | | | | | | | | | NumLearningCycles: 278 | | | | | | | | | | MinLeafSize: 2 | | 130 | 8 | Accept | 0.21135 | 0.35596 | 0.17835 | 228 | tree | MinLeafSize: 3 | | 131 | 8 | Error | NaN | 87.153 | 0.17835 | 228 | svm | BoxConstraint: 358.5 | | | | | | | | | | KernelScale: 0.081127 | | | | | | | | | | Epsilon: 0.002852 | | 132 | 8 | Accept | 0.25922 | 0.18321 | 0.17835 | 228 | tree | MinLeafSize: 4593 | | 133 | 7 | Error | NaN | 0.80608 | 0.17835 | 228 | svm | BoxConstraint: 0.0082359 | | | | | | | | | | KernelScale: 64.836 | | | | | | | | | | Epsilon: 0.25191 | | 134 | 7 | Accept | 0.2592 | 0.1831 | 0.17835 | 228 | svm | BoxConstraint: 0.029216 | | | | | | | | | | KernelScale: 8.6693 | | | | | | | | | | Epsilon: 14.283 | | 135 | 8 | Accept | 0.21864 | 0.42231 | 0.17835 | 228 | tree | MinLeafSize: 66 | | 136 | 8 | Accept | 4.0359 | 106.74 | 0.17835 | 228 | svm | BoxConstraint: 97.5 | | | | | | | | | | KernelScale: 0.013998 | | | | | | | | | | Epsilon: 0.04939 | | 137 | 8 | Accept | 0.1864 | 18.298 | 0.17835 | 910 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 215 | | | | | | | | | | MinLeafSize: 1 | | 138 | 8 | Error | NaN | 0.93006 | 0.17835 | 228 | svm | BoxConstraint: 0.0092347 | | | | | | | | | | KernelScale: 496.16 | | | | | | | | | | Epsilon: 0.11821 | | 139 | 8 | Accept | 0.18463 | 21.544 | 0.17835 | 910 | ensemble | Method: LSBoost | | | | | | | | | | NumLearningCycles: 212 | | | | | | | | | | MinLeafSize: 8 | | 140 | 8 | Error | NaN | 4.9749 | 0.17835 | 228 | svm | BoxConstraint: 0.24317 | | | | | ...
__________________________________________________________ Optimization completed. Total iterations: 340 Total elapsed time: 725.1069 seconds Total time for training and validation: 5193.6688 seconds Best observed learner is an ensemble model with: Learner: ensemble Method: LSBoost NumLearningCycles: 289 MinLeafSize: 2 Observed log(1 + valLoss): 0.17753 Time for training and validation: 295.4965 seconds Documentation for fitrauto display
The Total elapsed time value shows that the ASHA optimization took less time to run than the Bayesian optimization (about 12 minutes).
The final model returned by fitrauto corresponds to the best observed learner. Before returning the model, the function retrains it using the entire training data set (trainData), the listed Learner (or model) type, and the displayed hyperparameter values.
Evaluate Test Set Performance
Evaluate the performance of the returned bayesianMdl and ashaMdl models on the test set testData. For each model, compute the test set mean squared error (MSE), and take a log transform of the MSE to match the values in the verbose display of fitrauto. Smaller MSE (and log-transformed MSE) values indicate better performance.
bayesianTestMSE = loss(bayesianMdl,testData,"saleprice");
bayesianTestError = log(1 + bayesianTestMSE)
bayesianTestError = 0.1782
ashaTestMSE = loss(ashaMdl,testData,"saleprice");
ashaTestError = log(1 + ashaTestMSE)
ashaTestError = 0.1795
For each model, compare the predicted test set response values to the true response values. Plot the predicted sale price along the vertical axis and the true sale price along the horizontal axis. Points on the reference line indicate correct predictions. A good model produces predictions that are scattered near the line. Use a 1-by-2 tiled layout to compare the results for the two models.
bayesianTestPredictions = predict(bayesianMdl,testData);
ashaTestPredictions = predict(ashaMdl,testData);

tiledlayout(1,2)

nexttile
plot(testData.saleprice,bayesianTestPredictions,".")
hold on
plot(testData.saleprice,testData.saleprice) % Reference line
hold off
xlabel(["True Sale Price","(log transformed)"])
ylabel(["Predicted Sale Price","(log transformed)"])
title("Bayesian Optimization Model")

nexttile
plot(testData.saleprice,ashaTestPredictions,".")
hold on
plot(testData.saleprice,testData.saleprice) % Reference line
hold off
xlabel(["True Sale Price","(log transformed)"])
ylabel(["Predicted Sale Price","(log transformed)"])
title("ASHA Optimization Model")
The log-transformed MSE values and the prediction plots indicate that the bayesianMdl and ashaMdl models perform similarly well on the test set.
For each model, use box plots to compare the distributions of the predicted and true sale prices by borough. Create the box plots by using the boxchart function. Each box plot displays the median, the first and third quartiles, any outliers (computed using the interquartile range), and the minimum and maximum values that are not outliers. In particular, the line inside each box is the sample median, and the circular markers indicate outliers.
For each borough, compare the red box plot (showing the distribution of the predicted sale prices) to the blue box plot (showing the distribution of the true sale prices). Similar distributions for the predicted and true sale prices indicate good predictions. Use a 1-by-2 tiled layout to compare the results for the two models.
tiledlayout(1,2)

nexttile
boxchart(testData.borough,testData.saleprice)
hold on
boxchart(testData.borough,bayesianTestPredictions)
hold off
legend(["True Sale Prices","Predicted Sale Prices"])
xlabel("Borough")
ylabel(["Sale Price","(log transformed)"])
title("Bayesian Optimization Model")

nexttile
boxchart(testData.borough,testData.saleprice)
hold on
boxchart(testData.borough,ashaTestPredictions)
hold off
legend(["True Sale Prices","Predicted Sale Prices"])
xlabel("Borough")
ylabel(["Sale Price","(log transformed)"])
title("ASHA Optimization Model")
For both models, the predicted median sale price closely matches the median true sale price in each borough. The predicted sale prices seem to vary less than the true sale prices.
For each model, display box charts that compare the distributions of the predicted and true sale prices by the number of families in each dwelling. Use a 1-by-2 tiled layout to compare the results for the two models.
tiledlayout(1,2)

nexttile
boxchart(testData.buildingclasscategory,testData.saleprice)
hold on
boxchart(testData.buildingclasscategory,bayesianTestPredictions)
hold off
legend(["True Sale Prices","Predicted Sale Prices"])
xlabel("Number of Families in Dwelling")
ylabel(["Sale Price","(log transformed)"])
title("Bayesian Optimization Model")

nexttile
boxchart(testData.buildingclasscategory,testData.saleprice)
hold on
boxchart(testData.buildingclasscategory,ashaTestPredictions)
hold off
legend(["True Sale Prices","Predicted Sale Prices"])
xlabel("Number of Families in Dwelling")
ylabel(["Sale Price","(log transformed)"])
title("ASHA Optimization Model")
For both models, the predicted median sale price closely matches the median true sale price for each dwelling type. The predicted sale prices seem to vary less than the true sale prices.
For each model, plot a histogram of the test set residuals, and check that the residuals are approximately normally distributed. (Recall that the sale prices are log transformed.) Use a 1-by-2 tiled layout to compare the results for the two models.
bayesianTestResiduals = testData.saleprice - bayesianTestPredictions;
ashaTestResiduals = testData.saleprice - ashaTestPredictions;

tiledlayout(1,2)

nexttile
histogram(bayesianTestResiduals)
title("Test Set Residuals (Bayesian)")

nexttile
histogram(ashaTestResiduals)
title("Test Set Residuals (ASHA)")
Although the histograms are slightly left-skewed, both are approximately symmetric about 0.
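For a closer look at normality than a histogram provides (an optional check, not part of the original example), you can create normal probability plots of the residuals by using the qqplot function:
% Optional: normal probability plots of the test set residuals
figure
qqplot(bayesianTestResiduals)
title("Test Set Residuals (Bayesian)")

figure
qqplot(ashaTestResiduals)
title("Test Set Residuals (ASHA)")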
See Also
fitrauto | boxchart | histogram | BayesianOptimization