ActiveSetMethod: entropy | GPR

Question

Marius Marinescu on 3 Dec 2021

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/1602400-activesetmethod-entropy-gpr

Answered: Aditya on 4 Feb 2025 at 4:52

Hello,

I was wondering which option to select for ActiveSetMethod when fitting a Gaussian process model. Since I have too much data, I use the subset of data points option ('FitMethod','sd') together with 'ActiveSetSize',2000 to select only two thousand points. As far as I understand, fitrgp then randomly selects 2000 points from the data set. Some questions arise:

Does GPR use the other points in the data set (for training)? Where? I saw that the RegressionGP object stores all the data, and some matrices have the size of the full data set (for example the matrices W, Alpha, ...).
Instead of choosing the points randomly, MATLAB offers the option 'ActiveSetMethod' with four possible values: random (default), sgma, entropy, likelihood. Is there any documentation of what each option does specifically? When I choose entropy, fitrgp takes much longer than with random (21 min vs. less than 5). Why is it so different?


Answer 1

Aditya on 4 Feb 2025 at 4:52

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/1602400-activesetmethod-entropy-gpr#answer_1558943

Hi Marius,

When you fit a Gaussian process regression (GPR) model to a large dataset in MATLAB, the 'FitMethod','sd' option estimates the model parameters from a subset of the data points, known as the active set. This keeps the computational cost manageable by reducing the number of points used in training. Here is a breakdown of the available options:

ActiveSetMethod Options

random: Selects data points randomly for the active set. This is the fastest option because it doesn't involve any optimization or criterion-based selection.
sgma (sparse greedy matrix approximation): Adds points greedily so that the low-rank approximation built from the active set matches the full kernel matrix as closely as possible. This method is more computationally intensive than random selection but aims to choose a more representative subset.
entropy (differential entropy-based selection): Adds points greedily, each time choosing the candidate expected to give the largest reduction in differential entropy, that is, the most informative point under the current model. Scoring candidates at every step is computationally expensive, which explains the longer runtime compared to random selection.
likelihood (subset of regressors log likelihood-based selection): Adds points greedily to maximize the log marginal likelihood under the subset of regressors approximation. This method is also computationally intensive because the likelihood criterion has to be re-evaluated as the active set grows.
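
If you want to check the runtime difference on your own machine, here is a minimal sketch that fits the same model with each ActiveSetMethod value and times it. The synthetic data, the sizes n and m, and the variable names are all assumptions made up for illustration (your data set and 'ActiveSetSize',2000 would go in their place). Setting 'PredictMethod','sd' is also my addition, not something from your post: as far as I know, 'FitMethod' only controls parameter estimation while 'PredictMethod' separately controls which points are used for prediction, which may be why you see arrays sized to the full data set in the RegressionGP object.

```matlab
% Hypothetical synthetic data standing in for a large data set.
rng(0);                          % make the run reproducible
n = 5000;                        % number of observations (assumed)
X = linspace(0, 10, n)';
y = sin(X) + 0.1*randn(n, 1);

m = 200;                         % active set size (2000 in the question)
selMethods = {'random', 'sgma', 'entropy', 'likelihood'};
fitTime    = zeros(numel(selMethods), 1);
numActive  = zeros(numel(selMethods), 1);

for k = 1:numel(selMethods)
    tic;
    mdl = fitrgp(X, y, ...
        'FitMethod', 'sd', ...           % estimate parameters on the active set
        'PredictMethod', 'sd', ...       % assumption: also predict from the active set
        'ActiveSetSize', m, ...
        'ActiveSetMethod', selMethods{k});
    fitTime(k) = toc;                    % wall-clock fitting time in seconds

    % IsActiveSetVector flags which training rows ended up in the active set.
    numActive(k) = sum(mdl.IsActiveSetVector);
end

% Summarize fitting time and active set size per selection method.
disp(table(selMethods(:), fitTime, numActive, ...
    'VariableNames', {'Method', 'FitTimeSeconds', 'ActivePoints'}));
```

The absolute times depend on your hardware and data, so treat the numbers only as a relative comparison between the four methods.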
