Variational Bayes one-dimensional Gaussian mixture model

Rini on 25 Sep 2016
Commented: Rini on 27 Sep 2016
Hi, I am trying to develop a variational Bayes one-dimensional Gaussian mixture model following the Penny and Roberts paper (there are N-D implementations on the MATLAB File Exchange, but their results do not always converge, so I decided to write my own code). The function I have written so far sometimes converges in 4-5 iterations and sometimes just gives NaN values. I debugged and cross-checked my code against the equations but haven't figured out what is causing it to behave so erratically. Would someone on this forum be willing to have a look and give me some feedback? I would really appreciate the help!
Thank you
X = 1D data drawn from a Gaussian mixture
function [label, model, L] = mixGaussVb_SB(X)
% Variational Bayesian inference for a 1D Gaussian mixture.
% Input:
%   X: n x 1 data vector (column)
% Output:
%   label: 1 x n cluster label
%   model: trained model structure
fprintf('Variational Bayesian Gaussian mixture: running ... \n');
N = length(X);
%% assign priors
prior.m0 = mean(X); prior.v0 = ((max(X)-min(X))/3).^2; % prior over the component means
prior.b0 = 1e3; prior.c0 = 1e-3; % Gamma scale/shape prior over the precisions
prior.lambda0 = 5; % Dirichlet prior over the mixing weights
%% VBGM
tol = 1e-8;
maxiter = 2000;
L = -inf(1,maxiter);
model = init(X,prior,N);
for iter = 2:maxiter
    fprintf('iteration %d\n', iter);
    prevmu = model.m;
    model = expect(X,model); % E-step: update responsibilities
    model = maximize(X,model,N,prior); % M-step: update hyperparameters
    if sum((model.m - prevmu).^2) < tol % stop when the posterior means stop moving
        break;
    end
    % L(iter) = bound(X,model,prior,N)/N;
    % if abs(L(iter)-L(iter-1)) < tol*abs(L(iter)); break; end
end
L = L(2:iter);
label = zeros(1,N);
[~,label(:)] = max(model.R,[],2); % hard assignment to the most responsible component
[~,~,label(:)] = unique(label); % relabel as consecutive integers
function model = init(X,prior,N)
idx = kmeans(X,2); % k-means initialisation with two components
for s = 1:length(unique(idx))
    Xs = X(idx==s); % data assigned to component s
    pi_hat(s) = numel(Xs)/N; % empirical mixing weight (pi_hat avoids shadowing the built-in pi)
    mu(s) = mean(Xs); % empirical mean
    beta(s) = 1./var(Xs); % empirical precision
end
m = mu; % initial posterior mean of each component mean
v = (1./beta)./(N.*pi_hat); % initial posterior variance of each component mean
b = var(beta)./beta; % initial Gamma scale for the precisions
c = beta./b; % initial Gamma shape for the precisions
lambda = 100.*pi_hat; % initial Dirichlet counts for the weights
model.R = full(sparse(1:N,idx,1,N,max(idx),N)); % one-hot responsibilities from k-means
model.m = m;
model.v = v;
model.lambda = lambda;
model = maximize(X,model,N,prior);
function model = maximize(X, model,N,prior)
R = model.R;
m = model.m;
v = model.v;
for s = 1:length(m)
    pi_bar(s) = (1/N)*sum(R(:,s)); % expected mixing proportion
    N_bar(s) = N*pi_bar(s); % expected number of points in component s
    y_est(s) = (1/N)*(R(:,s)'*X); % responsibility-weighted first moment
    y_est2(s) = (1/N)*(R(:,s)'*X.^2); % responsibility-weighted second moment
    % update hyperparameters
    % mixing weights (Dirichlet counts)
    lambda(s) = N_bar(s) + prior.lambda0;
    % precision (Gamma scale b and shape c)
    sigma_bar2(s) = y_est2(s) + pi_bar(s)*(m(s).^2+v(s)) - 2*(m(s)*y_est(s)); % weighted E[(x-mu_s)^2]
    b(s) = 1/((N/2)*sigma_bar2(s) + (1/prior.b0));
    c(s) = 0.5*N_bar(s) + prior.c0;
    % mean (Gaussian posterior)
    tau0 = 1/prior.v0;
    beta_bar(s) = b(s)*c(s); % posterior mean of the precision
    tau(s) = tau0 + N_bar(s)*beta_bar(s); % posterior precision of the mean
    m(s) = (tau0/tau(s))*prior.m0 + ((N_bar(s)*beta_bar(s))/tau(s))*(y_est(s)/pi_bar(s));
end
v = 1./tau;
% draw a single posterior sample of each parameter (stored for inspection only)
tmp1 = gamrnd(c,b); % precisions ~ Gamma(shape c, scale b)
tmp2 = gamrnd(lambda,1); % unnormalised Dirichlet draw
model.mean = normrnd(m,sqrt(v)); % normrnd expects a standard deviation, not a variance
model.sigma2 = 1./tmp1;
model.weight = tmp2./sum(tmp2);
model.lambda = lambda;
model.b = b;
model.c = c;
model.v = v;
model.m = m;
model.beta_bar = beta_bar;
model.N_bar = N_bar;
% Done
function model = expect(X, model)
lambda = model.lambda;
c = model.c;
b = model.b;
m = model.m;
v = model.v;
beta_bar = model.beta_bar;
for s = 1:length(m)
    pi_est(s) = exp(psi(lambda(s)) - psi(sum(lambda))); % exp(E[log pi_s]) under the Dirichlet posterior
    beta_est(s) = exp(psi(c(s)) + log(b(s))); % exp(E[log beta_s]) under the Gamma posterior
    pdf_w(:,s) = pi_est(s)*sqrt(beta_est(s))*exp(-0.5*beta_bar(s)*(X.^2 + m(s)^2 + v(s) - 2*m(s).*X));
end
% If every entry in a row of pdf_w underflows to 0, sum(pdf_w,2) is 0 and
% the division below yields 0/0 = NaN.
R = bsxfun(@rdivide, pdf_w, sum(pdf_w, 2));
model.logR = log(R);
model.R = R;
model.pi_est = pi_est;
model.beta_est = beta_est;
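For reference, a minimal call, assuming the data is stored in a variable X (the function needs a column vector, hence the X(:)):
[label, model, L] = mixGaussVb_SB(X(:));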
2 Comments
Walter Roberson on 26 Sep 2016
We need two example inputs: one for which it converges, and one for which it gives NaN.
Rini on 26 Sep 2016
Thank you for the reply. Please find attached two inputs. Both are 1D data generated from 3-component mixture models.
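In case the attachments are not accessible, data of the same shape can be generated along these lines; the weights, means, and standard deviations below are invented for illustration and are not the actual attached data:
% Synthetic 3-component 1D Gaussian mixture (made-up parameters)
n = 500;
w = [0.5 0.3 0.2]; % mixing weights
mu = [-4 0 5]; % component means
sd = [1.0 0.7 1.5]; % component standard deviations
k = randsample(3, n, true, w); % component membership for each point
mu_k = mu(k); sd_k = sd(k);
X = mu_k(:) + sd_k(:).*randn(n,1); % n x 1 column vector, as the code expects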


Answers (1)

Walter Roberson on 26 Sep 2016
I notice that the code you attached is not exactly the same as the code you posted. The attached code appears to be more heavily commented.
I loaded your X from nan_input.mat and ran
GMMVb_SB(X.')
By iteration #12, R(:,1) has become all zeros in maximize. That makes N_bar(1) zero, which in turn makes c(1) equal to prior.c0. When you then reach beta_est(s)=exp(psi(c(s))+log(b(s))); in expect, that small c(s) gives psi(c(1)) = -789.025866305206, log(b(1)) only adds about 6.5 to that, and exp() of the result is 0. Your pdf_w(:,1) becomes entirely 0. Where sum(pdf_w, 2) is also 0, the rdivide computes 0/0 = NaN, and from then on the entire rest of your calculation is polluted.
The entries of R(:,1) do not all start out as 0, though some of them do.
I do not know anything about the calculation algorithm you are using (I am not familiar with the math of it). At the moment I do not see any inherent reason why it should not be possible for all of the values to "migrate" to one side or the other. When you are working with exp() and probabilities, underflow to 0 or overflow to inf is a perpetual risk.
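One standard remedy (not something from the Penny and Roberts paper, just the usual log-sum-exp device) is to compute the responsibilities in the log domain and subtract each row's maximum before exponentiating: every row then contains at least one exp(0) = 1, so the normalising sum can never be 0. A sketch of how expect could be restructured this way; expect_logdomain and logw are names invented here:
function model = expect_logdomain(X, model)
% Same E-step as expect, but in the log domain so the per-row sum over
% components cannot underflow to 0 (the 0/0 = NaN path described above).
lambda = model.lambda; c = model.c; b = model.b;
m = model.m; v = model.v; beta_bar = model.beta_bar;
logw = zeros(length(X), length(m));
for s = 1:length(m)
    log_pi_est = psi(lambda(s)) - psi(sum(lambda)); % E[log pi_s]
    log_beta_est = psi(c(s)) + log(b(s)); % E[log beta_s]
    logw(:,s) = log_pi_est + 0.5*log_beta_est ...
        - 0.5*beta_bar(s)*(X.^2 - 2*m(s).*X + m(s)^2 + v(s));
end
logw = bsxfun(@minus, logw, max(logw, [], 2)); % each row's maximum becomes 0
R = exp(logw);
R = bsxfun(@rdivide, R, sum(R, 2)); % row sums are >= 1 here, so no 0/0
model.R = R;
model.logR = log(R);
model.pi_est = exp(psi(lambda) - psi(sum(lambda)));
model.beta_est = exp(psi(c) + log(b));
A component that has collapsed (N_bar near 0) still ends up with negligible responsibilities, but the remaining components keep finite values instead of the whole matrix turning NaN.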
1 Comment
Rini on 27 Sep 2016
Thank you for looking into the code. It is strange that the equations are not that hard to implement in one dimension, yet I cannot figure out what I missed. Maybe I will use some of the N-dimensional files posted on the File Exchange.
Thank you once again!
