Why does my CDF not match cumsum(pdf)?

x is a 1-D array of 100 discrete values, for which I calculate the Rayleigh pdf and cdf as below:
pdfrayl = 2 * x .* exp(-x .^ 2);
cdfrayl = 1 - exp(-x .^ 2);
Now I plot the following two lines:
plot(x, cdfrayl)
hold on;
plot(x, cumsum(pdfrayl )/ sum(pdfrayl))
I expected them to match exactly, but they don't. Can anybody please explain why they don't match?

Accepted Answer

Wayne King on 15 Sep 2012

1 vote

It does not appear to me that you are approximating the integral of the PDF with a Riemann sum correctly here.
x = 0:0.01:10;
y = x/4.*exp(-x.^2/8);
% \Delta x for the Riemann sum
dx = (max(x)-min(x))/(length(x)-1); % interval length over number of intervals, 0.01 here
yc = cumsum(y).*dx;
yct = 1-exp(-x.^2/8);
plot(x,yc,'r-.','linewidth',2);
hold on;
plot(x,yct,'b');
legend('Approximation of CDF','True CDF', ...
'Location','SouthEast');
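The same check can be run on the form of the PDF used in the question. This is only a sketch: the question does not give x, so the grid 0 to 5 with 100 points is an assumption, as are the variable names cdfRiemann, cdfNormed, errRiemann and errNormed. It compares the Riemann sum and the normalized cumsum against the analytic CDF and reports the largest gap for each.

```matlab
% Assumed grid: the question only says x has 100 values
x = linspace(0, 5, 100);
dx = x(2) - x(1);

% PDF and CDF exactly as written in the question
pdfrayl = 2 * x .* exp(-x .^ 2);
cdfrayl = 1 - exp(-x .^ 2);

% Right-endpoint Riemann sum of the PDF
cdfRiemann = cumsum(pdfrayl) * dx;
% Normalization used in the question
cdfNormed = cumsum(pdfrayl) / sum(pdfrayl);

% Largest deviation from the analytic CDF for each approximation
errRiemann = max(abs(cdfRiemann - cdfrayl));
errNormed = max(abs(cdfNormed - cdfrayl));
fprintf('max error, cumsum*dx:      %g\n', errRiemann);
fprintf('max error, cumsum/sum(pdf): %g\n', errNormed);
```

Both curves are approximations built from a finite number of samples, so a small, nonzero gap from the analytic CDF is expected; it shrinks as the grid gets finer.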

More Answers (1)

Russ Adheaux on 15 Sep 2012

0 votes

Thanks. What if my x vector is a random array that is not equally spaced (no single dx)? I guess I will have to sort x and use the dx between consecutive x values in cumsum. Correct?
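A minimal sketch of that idea, under some assumptions: the unsorted, unevenly spaced x below is a deterministic stand-in for the random array (it is not the poster's data), and the names xs, dxs, cdfApprox and maxErr are made up for illustration.

```matlab
% Assumed stand-in for an unsorted, unevenly spaced x
x = fliplr(5 * linspace(0, 1, 200) .^ 2);

xs = sort(x);                        % ascending order is required
pdfrayl = 2 * xs .* exp(-xs .^ 2);   % PDF from the question

% Width of the interval ending at each point (zero for the first point)
dxs = [0, diff(xs)];

% Riemann sum with a per-point dx
cdfApprox = cumsum(pdfrayl .* dxs);

cdfTrue = 1 - exp(-xs .^ 2);
maxErr = max(abs(cdfApprox - cdfTrue));
fprintf('max error with per-point dx: %g\n', maxErr);
```

Note that the sum starts at xs(1), so it approximates the CDF only up to the probability mass lying below the smallest sample; here xs(1) is 0, so nothing is lost.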

3 Comments

Wayne King on 15 Sep 2012
The dx comes from the length of the interval, max(x)-min(x), divided by the number of intervals (one less than the number of points). Yes, you would have to sort x. Do you know the ecdf function?
Star Strider on 15 Sep 2012
For unevenly spaced, monotonically-increasing x-data, I suggest trapz or cumtrapz, specifying both x and y data as arguments. It then allows you to integrate with respect to any x-variable spacing, and is more accurate than cumsum.
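As a rough illustration of the cumtrapz suggestion, the sketch below integrates the question's PDF on an unevenly spaced grid and compares it with the cumsum approach. The grid is an assumed example, and the names cdfTrap, cdfSum, errTrap and errSum are illustrative only.

```matlab
% Assumed unevenly spaced, monotonically increasing grid
xs = 5 * linspace(0, 1, 200) .^ 2;
pdfrayl = 2 * xs .* exp(-xs .^ 2);
cdfTrue = 1 - exp(-xs .^ 2);

% Cumulative trapezoidal integration with respect to xs
cdfTrap = cumtrapz(xs, pdfrayl);

% Riemann sum with per-point spacing, for comparison
cdfSum = cumsum(pdfrayl .* [0, diff(xs)]);

errTrap = max(abs(cdfTrap - cdfTrue));
errSum = max(abs(cdfSum - cdfTrue));
fprintf('max error, cumtrapz: %g\n', errTrap);
fprintf('max error, cumsum:   %g\n', errSum);
```

Passing xs as the first argument is what lets cumtrapz account for the varying spacing; on this example the trapezoidal error comes out well below the cumsum error.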
Russ Adheaux on 19 Sep 2012
Hmmm...just learned about ecdf. I was using the following code to get the cdf of my x data. Is there an advantage to using ecdf directly? My ultimate goal is to fit a user-defined distribution function to this data.
xbins = 0:dx:5;
[n, xout] = hist(x, xbins);
yout = n / sum(n) / dx;
cdfy = cumsum(yout * dx);
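For comparison with the binned version, here is a sketch of an empirical CDF built directly from sorted samples, with no bin width to choose. It is a hand-rolled illustration and does not call ecdf itself (which needs the Statistics Toolbox). The sample data are an assumption: they are drawn by inverse transform from the CDF 1 - exp(-x^2) used in the question, and n, u, Femp, Ftrue and maxDev are illustrative names.

```matlab
% Assumed test data: inverse-transform samples with CDF 1 - exp(-x^2)
n = 5000;
u = rand(1, n);
x = sqrt(-log(1 - u));

% Empirical CDF: fraction of samples less than or equal to each sorted value
xs = sort(x);
Femp = (1:n) / n;

% Compare with the analytic CDF at the same points
Ftrue = 1 - exp(-xs .^ 2);
maxDev = max(abs(Femp - Ftrue));
fprintf('max deviation from analytic CDF: %g\n', maxDev);
```

Because each sample contributes its own step, the result does not depend on a choice of dx or bin edges, which is the main practical difference from the hist-based estimate.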
