bayesian logistic regression - slicesample - finding Machine learning parameters

Question

Matthias 2016 年 7 月 27 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/297328-bayesian-logistic-regression-slicesample-finding-machine-learning-parameters

コメント済み: Tom Lane 2016 年 7 月 28 日

GT.mat

MATLAB Online で開く

since I have problems with separation for logistic regression I would like to use bayesian logistic regression

I follow this script bayesian logistic regression

However it is for 1D and my problem has 4 features, not 1. I run into problems where the slicesample function is used, and I can't solve the error message "INITIAL is not in the domain of the target distribution."

I tried different initial, but somehow my posterior distribution "post" is always zero. Did I define it wrong? Shall I use another distribution than the binomial one? Or maybe I made something wrong for "logitp" and its use in "post". I feed in a matrix "featAdd1s", while in the link mentioned above, a vector "weight" is used.

Any help highly appreciated. I attached the GT.mat needed for the code.

% features = predictors
featOr = GT(:,1:end-1); % original feature
% recenter and rescale input features as suggested by Gelman2008
% "A weakly informative default prior distribution for logistic and other regression models"
% such that each input feature has mean 0 and std = 0.5
scaleFe = @(feat) (feat.*(0.5/std(feat)))-mean(feat)*(0.5/std(feat));
% http://stats.stackexchange.com/questions/46429/transform-data-to-desired-mean-and-standard-deviation
% add ones, necessary?
featAdd1s = [ones(size(featOr,1),1), scaleFe(featOr(:,1)), scaleFe(featOr(:,2)), scaleFe(featOr(:,3)), scaleFe(featOr(:,4))];
% The number of tests
total = ones(size(featAdd1s,1),1);
% The number of success
poor = GT(:,end);
%%Logistic Regression Model 
logitp = @(b,X) 1./(1+exp(-X*b));
%%use priors according to Gelman2008 - "A weakly informative default prior distribution for logistic and other regression models"
% for intercept
pdIn = makedist('tLocationScale','mu',0,'sigma',10,'nu',1);
prior1 = @(b0) pdf(pdIn,b0);
pd = makedist('tLocationScale','mu',0,'sigma',2.5,'nu',1); % for predictors
prior2 = @(b1) pdf(pd,b1);
prior3 = @(b2) pdf(pd,b2);
prior4 = @(b3) pdf(pd,b3);
prior5 = @(b4) pdf(pd,b4);

By Bayes' theorem, the joint posterior distribution of the model parameters is proportional to the product of the likelihood and priors.

post = @(b) prod(binopdf(poor,total,logitp(b',featAdd1s)))*prior1(b(1))*prior2(b(2))*prior3(b(3))*prior4(b(4))*prior5(b(5)) ;                  % priors
%%Slice Sampling
% initial = [1 1 1 1 1];
initial = [1 1 1 1 1];
post(initial)
% initial = [0.02 0.02 0.02 0.02 0.02];
nsamples = 1000;
% trace = slicesample(initial,nsamples,'pdf',post);
trace = slicesample(initial,nsamples,'pdf',post,'width',[10, 10, 10, 10,10]); % ?! No idea...10 is default...

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Tom Lane 2016 年 7 月 27 日

1
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/297328-bayesian-logistic-regression-slicesample-finding-machine-learning-parameters#answer_229937

MATLAB Online で開く

It looks like you have the right idea. But I suspect the calculations are underflowing. You're multiplying thousands of probability values, many perhaps quite small. I was able to make your problem work by using the log posterior instead of the posterior itself:

logpost = @(b) sum(log(binopdf(poor,total,logitp(b',featAdd1s)))) + ...
    log(prior1(b(1))) + log(prior2(b(2))) + log(prior3(b(3))) + log(prior4(b(4))*prior5(b(5)))
trace = slicesample(initial,nsamples,'logpdf',logpost,'width',[10, 10, 10, 10,10]);

I may try to get the published example changed to use this technique.

By the way, it's not necessary in your problem, but sometimes setting the slope coefficients to 0 as an initial value, and the intercept coefficient to some moderate value, can give a starting point that will at least be feasible.

2 件のコメント
なしを表示なしを非表示

Matthias 2016 年 7 月 27 日

Hi Tom

Thank you for your answer and the hint with the starting point. Yes you are right, I noticed that if I use fewer values, and hence fewer terms in the posterior probability, it work. (500 values worked, 1'000 not).

But does this mean the Bayesian approach is limited to a number of observations?

Once I have the model parameters by taking the mean of the slicesample output, can I use them like in a classical logistic regression (sigmoid function) way to predict? (Also note that I scaled the input features first, somehow I have the feeling the found parameters can not be used for an observation with unscaled features)

If not, how else do I do prediction in the Bayesian case?

Tom Lane 2016 年 7 月 28 日

1. Bayes methods are not limited by the number of samples. If you use the 'logpdf' approach that I illustrated, you should be okay.

2. If you only want to get estimates and use them for prediction, you could take the mean of the trace values, possibly omitting some top rows to avoid the effects of the initial values before the traces settle down. You are right that you would have to transform the new X features using the same scaling that you used during fitting. That is, scale using the mean and std of the X from fitting, not by separately scaling new X values based on their own mean and std.

サインインしてコメントする。

bayesian logistic regression - slicesample - finding Machine learning parameters

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

回答 (1 件)

2 件のコメント
なしを表示なしを非表示

参考

カテゴリ

タグ

製品

Community Treasure Hunt

bayesian logistic regression - slicesample - finding Machine learning parameters

0 件のコメント -2 件の古いコメントを表示-2 件の古いコメントを非表示

回答 (1 件)

2 件のコメント なしを表示なしを非表示

参考

カテゴリ

タグ

製品

Community Treasure Hunt

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

2 件のコメント
なしを表示なしを非表示