EM algorithm for Gaussian mixture model with background noise

version (3.07 KB) by Andrew
Standard EM algorithm to fit a GMM with the (optional) consideration of background noise.


Updated 16 May 2012

This is the standard EM algorithm for GMMs, as presented in Chapter 9 of Bishop's book "Pattern Recognition and Machine Learning", with one small addition: a uniform distribution is added to the mixture to pick up background noise/speckle, i.e. data points that one would not want to associate with any cluster.
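To make the role of the uniform term concrete, here is a minimal sketch of a single E-step for k Gaussians plus one background component. It is not taken from GM_EM: the variable names, the fixed parameter values, and the choice of a uniform density over the bounding box of the data are all assumptions made for illustration.

```matlab
% Sketch of one E-step for k Gaussians plus a uniform background component.
% All names and values are illustrative; they are not taken from GM_EM.
X = [randn(50,2); randn(50,2) + 5; 20*rand(10,2) - 5];  % N x n toy data
[N, n] = size(X);
k = 2;
mu    = [0 0; 5 5]';                      % n x k component means
Sigma = repmat(eye(n), [1 1 k]);          % n x n x k covariances
w     = [0.45 0.45 0.10];                 % mixing weights; last is background

% ASSUMPTION: the uniform density is constant over the data's bounding box.
vol = prod(max(X,[],1) - min(X,[],1));

p = zeros(N, k+1);
for j = 1:k
    d = bsxfun(@minus, X, mu(:,j)');      % N x n deviations from mean j
    q = sum((d / Sigma(:,:,j)) .* d, 2);  % squared Mahalanobis distances
    p(:,j) = w(j) * exp(-0.5*q) / sqrt((2*pi)^n * det(Sigma(:,:,j)));
end
p(:,k+1) = w(k+1) / vol;                  % weighted background density

gamma = bsxfun(@rdivide, p, sum(p,2));    % responsibilities; rows sum to 1
[~, idx] = max(gamma, [], 2);             % label k+1 marks background points
```

Points far from every Gaussian have tiny Gaussian densities, so the constant background term dominates their responsibilities and they end up with label k+1.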

NOTE: This function requires the MATLAB Statistics Toolbox and, for plotting the ellipses, the function error_ellipse (available separately). It also requires at least MATLAB 7.9 (R2009b).

For a demo example simply run GM_EM();
Plots are produced automatically for 1D/2D data with five or fewer Gaussian components.

Usage:

% GM_EM - fit a Gaussian mixture model to N points located in n-dimensional space.
% GM_EM(X,k) - fit a GMM to X, where X is N x n and k is the number of
%              clusters. Algorithm follows steps outlined in Bishop
%              (2009) 'Pattern Recognition and Machine Learning', Chapter 9.

% Optional inputs
%   bn_noise  - allow for uniform background noise term ('T' or 'F',
%               default 'T'). If 'T', relevant classification uses the
%               (k+1)th cluster
%   reps      - number of repetitions with different initial conditions
%               (default = 10). Note: only the best fit (in a likelihood
%               sense) is returned.
%   max_iters - maximum iteration number for EM algorithm (default = 100)
%   tol       - tolerance value (default = 0.01)

% Outputs
%   idx - classification/labelling of data in X
%   mu  - GM centres
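
A possible call, sketched from the help text above. Two things are assumptions rather than documented facts: that the optional inputs are passed positionally in the order listed, and that the outputs come back as [idx, mu]. The call is guarded so the snippet still runs if GM_EM is not on the path.

```matlab
% Toy data: two 2-D clusters plus uniformly scattered background points.
% X is N x n (here 220 x 2); it does not need to be square.
X = [randn(100,2); bsxfun(@plus, randn(100,2), [6 6]); 16*rand(20,2) - 5];
k = 2;                                   % number of Gaussian clusters

% ASSUMPTION: optional inputs are positional, in the order listed in the
% help text, and the outputs are returned as [idx, mu].
if exist('GM_EM', 'file') == 2
    [idx, mu] = GM_EM(X, k, 'T', 10, 100, 0.01);
    is_noise = (idx == k+1);             % points assigned to the background
    fprintf('%d of %d points labelled as background\n', ...
            nnz(is_noise), size(X,1));
end
```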

Cite As

Andrew (2021). EM algorithm for Gaussian mixture model with background noise (, MATLAB Central File Exchange. Retrieved .

Comments and Ratings (5)

Anders Ueland

Thank you! A very nice contribution.

I used your program on a feature vector with 20 000 samples and I tried to make it faster. By replacing the matrix product by a vectorized implementation, avoiding the diag function, I achieved a speedup of a factor of 40.

Current matrix product implementation:
% tot_sum = (X'-repmat(mu(:,j),1,N)) * diag(gamma_znk(:,j)) * (X'-repmat(mu(:,j),1,N))';

Suggested implementation:
% tot_sum = bsxfun(@times, X'-repmat(mu(:,j),1,N), gamma_znk(:,j)') * (X'-repmat(mu(:,j),1,N))';
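
The two expressions above should agree up to rounding, which can be checked on random data. The sketch below uses stand-in variables (D, g, mu_j) in place of the function's internals. The saving comes from not forming diag(gamma_znk(:,j)), an N x N matrix that would take roughly 3.2 GB as dense doubles for N = 20 000.

```matlab
% Compare the diag-based and bsxfun-based weighted scatter matrices.
% Variable names are stand-ins, not the internals of GM_EM.
N = 500; n = 3;
X = randn(N, n);
mu_j = mean(X, 1)';                   % n x 1 mean of one component
g = rand(N, 1);                       % stand-in for gamma_znk(:,j)
D = X' - repmat(mu_j, 1, N);          % n x N deviations from the mean

S_diag = D * diag(g) * D';            % forms an N x N diagonal matrix
S_bsx  = bsxfun(@times, D, g') * D';  % scales the columns of D directly

rel_err = norm(S_diag - S_bsx, 'fro') / norm(S_diag, 'fro');
fprintf('relative difference: %g\n', rel_err);
```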

Muhammad Jawaid

David Provencher

I'm trying to run the code, but I keep getting this warning :

'Warning: chol failed, algorithm abandoned';

because the cholcov(Sigma(:,:,j),0); line always fails at the 2nd iteration (bn_noise='T') or 3rd iteration (bn_noise='F').

FYI, I have no NaN values in my data, and I get coherent results with kmeans() and emgm() [the submission that inspired this one]. Actually, no matter what data I feed into the function (e.g. square matrix, rand(m,n), ...), this step always fails.

Any insight on this?

Jin Wang

The input has to be square, right?
If my input data is not square, like 200x10, what should I do?


An "unknown" cluster: this is what we have been looking for. Thanks a lot.

MATLAB Release Compatibility
Created with R2009b
Compatible with any release
Platform Compatibility
Windows macOS Linux

Inspired by: EM Algorithm for Gaussian Mixture Model (EM GMM)
