Get parameters of Gaussian distributions from the ksdensity function

Hey there,
I am looking for a way to get an analytical description of the distribution of my numbers. Since my numbers are generated by a simulation, I can't say for sure which distribution would describe them best at any given time.
The best description of my data so far comes from the ksdensity function, but ksdensity only returns the x and y points of a curve that fits the data.
Is there a way to get the parameters of the Gaussian distributions from the ksdensity function? Like in the first example here:
where you can clearly see the bimodality of the data. Would it be possible to get the parameters of the two Gaussian distributions that are superimposed here?


Thiago Henrique Gomes Lobato on 29 Sep 2019
Edited: Thiago Henrique Gomes Lobato on 29 Sep 2019
ksdensity uses a nonparametric representation to estimate the probability density, so there are no parameters to get from the function itself. If, however, you know which distribution may underlie your data (or can make a good visual estimate), you can run a parametric optimization afterwards to get the parameters. An example based on the two Gaussians that you mentioned:
rng('default') % For reproducibility
x = [randn(30,1); 5+randn(30,1)];
[f,xi] = ksdensity(x);
% Here I generate a function from two Gaussians and output
% the rms of the estimation error from the values obtained from ksdensity.
% Parameter vector xx = [sigma1, mu1, sigma2, mu2, weight1, weight2]
fun = @(xx,t,y)rms(y-(xx(5)*1./sqrt(xx(1)^2*2*pi).*exp(-(t-xx(2)).^2/(2*xx(1)^2))+...
xx(6)*1./sqrt(xx(3)^2*2*pi).*exp(-(t-xx(4)).^2/(2*xx(3)^2)) ) );
% Get the parameters with the minimum error. To improve convergence, choose reasonable initial values
% Note: from here on x holds the fitted parameters and no longer the data
[x,fval] = fminsearch(@(r)fun(r,xi,f),[2,0.5,2,4,0.5,0.5]);
% Make sure sigmas are positive
x([1,3]) = abs(x([1,3]));
% Generate the parametric functions
pd1 = makedist('Normal','mu',x(2),'sigma',x(1));
pd2 = makedist('Normal','mu',x(4),'sigma',x(3));
% Get the probability values
y1 = pdf(pd1,xi)*x(5); % x(5) is the participation factor from pdf1
y2 = pdf(pd2,xi)*x(6); % x(6) is the participation factor from pdf2
% Plot the ksdensity curve, both weighted Gaussians, and their sum
figure;
plot(xi,f);
hold on;
plot(xi,y1);
plot(xi,y2);
plot(xi,y1+y2);
legend({'ksdensity',['\mu : ',num2str(x(2)),'. \sigma :',num2str(x(1))],...
['\mu : ',num2str(x(4)),'. \sigma :',num2str(x(3))],'pdf1+pdf2'})
You can see a list of the distributions available in MATLAB here, under the parameter 'name':
A good approach for you might then be:
  1. Plot the distribution of the data with ksdensity
  2. Check what it looks like and search for the distribution that most resembles it
  3. Do a parametric fit to the data as shown above
If you want to fully automate this, you can write optimization functions for multiple distributions and then choose the one with the lowest fit error. I hope this helps; if something is not clear, feel free to ask.
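The automated variant of the steps above could look roughly like the sketch below. It is only an illustration under a few assumptions: the candidate list is arbitrary, the names data, candidates and fitErr are made up for this example, and each candidate is fitted with fitdist (a maximum-likelihood fit to the raw data) instead of a fminsearch curve fit, simply because that keeps the loop short. The fit error is still measured against the ksdensity curve.

```matlab
rng('default') % For reproducibility
data = [randn(30,1); 5+randn(30,1)];
[f,xi] = ksdensity(data);

% Illustrative candidates only; all of them accept negative data values
candidates = {'Normal','Logistic','ExtremeValue'};
fitErr = zeros(size(candidates));

for k = 1:numel(candidates)
    % Assumption: maximum-likelihood fit to the data, not a curve fit
    pd = fitdist(data,candidates{k});
    % rms error between the fitted pdf and the ksdensity curve
    fitErr(k) = sqrt(mean((f - pdf(pd,xi)).^2));
end

% Pick the candidate with the lowest error
[~,best] = min(fitErr);
fprintf('Lowest fit error: %s (rms = %.4f)\n',candidates{best},fitErr(best));
```

For clearly bimodal data like this, none of these single distributions will fit well, so a sum of Gaussians would have to be one of the candidates in practice.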
  2 Comments
Thiago Henrique Gomes Lobato on 29 Sep 2019
Edited: Thiago Henrique Gomes Lobato on 29 Sep 2019
You can adjust your optimization as you get new data. If you know from experience that the data will always be 2-4 superimposed Gaussians, you don't need to test Poisson distributions, for example. One idea would be to use findpeaks to get initial guesses for the means and then perform the optimization based on the number of peaks you find:
% Peaks of the ksdensity curve give initial guesses for the means
[PeakValues,meanGuesses] = findpeaks(f);
NOfGaussians = length(meanGuesses);
meanGuesses = xi(meanGuesses); % convert peak indices to x positions
% fun must accept the number of Gaussians as a fourth argument;
% this initial vector assumes that two peaks were found
[x,fval] = fminsearch(@(r)fun(r,xi,f,NOfGaussians),[2,meanGuesses(1),2,meanGuesses(2),0.5,0.5]);
Then just make sure the function does the right thing with the number-of-Gaussians parameter, and adjust the vector of initial values accordingly.
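One possible way to make the objective handle any number of Gaussians is sketched below. This is not the only way to do it: the parameter layout [sigmas, means, weights], the helper names col, gaussSum and funN, and the initial guesses (sigma = 1, equal weights) are all assumptions made for this example. It needs findpeaks from the Signal Processing Toolbox and implicit expansion (R2016b or later).

```matlab
rng('default') % For reproducibility
data = [randn(30,1); 5+randn(30,1)];
[f,xi] = ksdensity(data);

% Peak positions give the number of Gaussians and the initial means
[~,locs] = findpeaks(f);
nG = numel(locs);
mu0 = xi(locs);

% Assumed parameter layout: p = [sigma_1..sigma_n, mu_1..mu_n, w_1..w_n]
col = @(v) v(:); % force a column vector
% Each row of the n-by-m matrix is one weighted Gaussian; sum over rows
gaussSum = @(p,t,n) sum( col(p(2*n+1:3*n)) ./ sqrt(2*pi*col(p(1:n)).^2) ...
    .* exp(-(t(:).' - col(p(n+1:2*n))).^2 ./ (2*col(p(1:n)).^2)), 1);
% rms error between the ksdensity values y and the Gaussian sum
funN = @(p,t,y,n) sqrt(mean((y(:).' - gaussSum(p,t,n)).^2));

% Initial values: sigma = 1, means from the peaks, equal weights
p0 = [ones(1,nG), mu0(:).', ones(1,nG)/nG];
[p,fval] = fminsearch(@(r) funN(r,xi,f,nG), p0);

% Unpack the result; make sure sigmas are positive
sigma = abs(p(1:nG));
mu = p(nG+1:2*nG);
w = p(2*nG+1:3*nG);
```

With more Gaussians the parameter vector grows quickly, so fminsearch may need better initial values or a higher iteration limit to converge.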


More Answers (0)



