Create a Rayleigh distribution fitting experimental data
古いコメントを表示
Hi, i'm trying to fit a rayleigh distribution to experimental data, but even if I've found the optimal parameter B for the distribution, it results in a completely different one.
I've tried using histfit (which works but I can't use in my assignment), makedist and the distributionFitter app.
Here's the code i'm using:
peaks=findpeaks(Data(:,i),'MinPeakHeight',0.01);
peaks=sort(peaks);
B=raylfit(peaks); %find parameter for Rayleigh distribution
ray=makedist('Rayleigh','B',B);
raydist=pdf(ray,peaks);
peaksdist=histcounts(peaks,1000,'Normalization','pdf'); %experimental distribution
hold on
x=linspace(0,max(peaks),length(raydist));
plot(x,raydist,'LineWidth',2)

When using the distributionFitter app I get this result instead:

But the parameter B used is the same in both (0.031).
How can I obtain the same distribution using the makedist function?
2 件のコメント
William Rose
2022 年 1 月 16 日
@DANIELE PASINI, can you provide the data please?
DANIELE PASINI
2022 年 1 月 16 日
採用された回答
その他の回答 (4 件)
the cyclist
2022 年 1 月 16 日
According to the documentation for raylfit, it takes the raw data as input. Is there a reason you are not simply using
B=raylfit(Data(:,i));
to find the fit parameter?
5 件のコメント
DANIELE PASINI
2022 年 1 月 16 日
Image Analyst
2022 年 1 月 16 日
So those points are obviously bogus. Just remove them first
data(data < 0) = [];
DANIELE PASINI
2022 年 1 月 16 日
Here are some things I have observed in your data:
- There are 65,536 data points in the original set
- 32,722 of those -- HALF! -- are positive, so these are not just tiny anomalies
- After using peaks, there are only 3,383 points
I am pretty certain that you are throwing away a lot of data that you do not intend to. I don't think peaks does what you think it does. (It does not just trim away the very small and negative values.)
What @Image Analyst sugggests might be what you want, but I have no idea why you think you should be fitting a Rayleigh distribution to data in which half the data are negative.
DataTable = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/864220/Cartel1.xlsx');
histogram(DataTable{:,1},75)
DANIELE PASINI
2022 年 1 月 16 日
William Rose
2022 年 1 月 16 日
編集済み: William Rose
2022 年 1 月 16 日
@DANIELE PASINI, here is code with simulated data to demonstrate. As others have said, your data may not be appropriate for Rayleigh (whih is square root of the sum of two independent normal random variables). Your samples should be in vector x. Compute N=number of data points before computing numExp.
I did not use histfit, since you said it was not allowed. I did use raylfit(). If raylfit() is not allowed, then compute the maximum likelihood estimate for b^2, and take the square root:
bsqdhat=dot(x,x)/(2*N); . %MLE for b^2
The square toot of bsqdhat is not necessarily a MLE for b, and it may not be unbiased, but this seems to be what raylfit() does, because the result is the same to 4 decimal places, whenever I compare the estimates.
%RayleighFitTest.m
%WCR 20220116
%Generate simulated data to fit.
%Replace this section with your data.
b=0.031;
N=1000; %number of samples
x=b*sqrt(-2*log(rand(1,N))); %x=vector of samples whose distribution we wish to esitmate
%fit the data
bhat=raylfit(x); %Matlab's fit
bsqdhat=dot(x,x)/(2*N); %MLE for b^2:
fprintf('Best fit Rayleigh parameter: bhat=%.4f, sqrt(b^2hat)=%.4f\n',bhat,sqrt(bsqdhat));
%plot the histogram of the data, and the number expected from the fit
h=histogram(x);
BinCenters=(h.BinEdges(1:end-1)+h.BinEdges(2:end))/2;
numExp=N*h.BinWidth*raylpdf(BinCenters,b); %expected number at each bin
hold on; grid on
ylabel('Number of Occurences');
titlestr=sprintf('$Histogram, N=%d, \\hat{b}=%.4f$',N,bhat);
title(titlestr,'Interpreter','latex');
plot(BinCenters,numExp,'-bx');
Good luck!
William Rose
2022 年 1 月 18 日
編集済み: William Rose
2022 年 1 月 18 日
0 投票
[I edited this comment to correct typographical errors.]
histogram(peaks) plots the number of occurences of values within the range of x-values for that bin. The expected number of counts for that bin is the integral of the probability density function over the bin, times the total number of samples.
Since each bin is reasonably narrow, we can approximate the integral with the midpoint value times the width:
where p(x)=probability density at x, and
, and
= bin width. Therefore
The command
raydist=pdf(ray,peaks);
does not give the correct result because it does not multiply by the number of samples, and does not multiply by the bin width, and because the argument peaks is not the correct argument to use. Instead of peaks, you should pass the x-values at which you want to evaluate the probability density function. And those desired x values are the bin center points. You can see and understand from my code how I computed the bin center points.
I hope your work goes well. What is the source of your signal? Seismogram? Biological recording?
William Rose
2022 年 1 月 18 日
0 投票
@DANIELE PASINI, very interesting. The autocorrelation shows some ringing at a period of about 16 samples, i.e. some kind of resonance at that frequency.
カテゴリ
ヘルプ センター および File Exchange で Rayleigh Distribution についてさらに検索
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!





