Normalization pdf histogram and cdf

Hi, I am using this code in MATLAB:
histogram(mydata,'Normalization','pdf');
After plotting the pdf histogram, the y-axis ranges from 0 to 100. I need the y-axis to range from 0 to 1, because when I plot histogram(mydata,'Normalization','cdf') the y-axis ranges from 0 to 1. Please help me get both 'pdf' and 'cdf' on the same y-axis scale (0 to 1) in one graph. Thank you.

1 Comment

Thi Lan Anh DINH on 28 Jun 2022
Use 'probability' instead of 'pdf' so that the y-axis ranges from 0 to 1.
histogram(mydata,'Normalization','probability');


Answers (2)

Walter Roberson on 2 May 2020

2 votes

You could use ylim() to simply prevent the peak from being drawn.
Or you could increase the bar widths, such as by decreasing the number of bars you ask for.

5 Comments

Farshad Daraei Ghadikolaei on 2 May 2020
Thanks for your answer, Roberson. Actually, I don't want to limit the y-axis. I want the y-axis to run from 0 to 1 instead of 0 to 100, and I don't want to change the bar widths either. In other words, I don't understand why the y-axis ranges from 0 to 1 with 'cdf' normalization but from 0 to 100 with 'pdf' normalization. I want both the cdf and pdf graphs on the same range, 0 to 1.
Walter Roberson on 2 May 2020
% 'pdf' Probability density function estimate. The height
% of each bar is, (number of observations in bin)
% / (total number of observations * width of bin).
% The area of each bar is the relative number of
% observations, and the sum of the bar areas is
% less than or equal to 1.
Therefore, pdf is not just the fraction of the observations; it is that fraction scaled by the width of the bin. If a large fraction of the observations falls in a narrow enough bin, the pdf will be greater than 1. If you force the pdf to be less than 1, then you are no longer displaying a pdf.
If you have a histogram pdf going up to 100 then you have some seriously distorted statistics, but it is valid.
It might perhaps make more sense for you to use 'probability' as your normalization rather than pdf.
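The bar-height formula quoted from the help text can be checked by hand. The following is a minimal sketch, assuming synthetic data `x` tightly clustered around zero (made up for illustration, not the asker's data) and a bin width of 0.001 chosen arbitrarily:

```matlab
% Synthetic, tightly clustered data (an assumption for illustration only).
rng(0);                          % reproducible random numbers
x = 0.005 * randn(10000, 1);     % standard deviation of 0.005

% Raw counts and bin edges, using fixed-width bins of 0.001.
[counts, edges] = histcounts(x, 'BinWidth', 0.001);
binWidth = diff(edges);

% Bar height under 'pdf' normalization, following the help text:
% counts / (total number of observations * bin width).
pdfHeight = counts ./ (numel(x) * binWidth);

peakHeight = max(pdfHeight);               % well above 1 for data this narrow
totalArea  = sum(pdfHeight .* binWidth);   % bar areas still sum to 1
```

With data this narrow the peak is far above 1 (a normal density with standard deviation 0.005 peaks near 80), yet the total area remains 1, which is what makes it a density.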
Farshad Daraei Ghadikolaei on 2 May 2020
Edited: Farshad Daraei Ghadikolaei on 2 May 2020
Yes, you are right. My data are unusual in both number and type, and I have to use small bin widths. Is there any way to make the y-axis of the 'cdf' plot range from 0 to 100 instead of 0 to 1? Or is it possible to show the cdf, ranging from 0 to 1, on a second axis on the right-hand side of the pdf plot?
Jeff Miller on 3 May 2020
Yes, that is possible. Look at the command 'yyaxis' or (if you have an older version of MATLAB) 'plotyy'.
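A minimal sketch of the `yyaxis` approach, assuming synthetic data `x` in place of the asker's data. The pdf histogram goes on the left axis with its own scale, and the cdf goes on the right axis, fixed to 0 to 1:

```matlab
% Synthetic data standing in for the real measurements (an assumption).
rng(0);
x = 0.005 * randn(10000, 1);

figure;

% Left axis: probability density, which may be much larger than 1.
yyaxis left
histogram(x, 'Normalization', 'pdf');
ylabel('Probability density');

% Right axis: cumulative distribution, always between 0 and 1.
yyaxis right
histogram(x, 'Normalization', 'cdf', 'DisplayStyle', 'stairs');
ylabel('Cumulative probability');
ylim([0 1]);

xlabel('x');
```

Drawing the cdf with `'DisplayStyle','stairs'` is only a readability choice, so that its outline does not cover the pdf bars; filled bars would work as well.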
Farshad Daraei Ghadikolaei on 3 May 2020
Thank you Jeff. I will try.


Farshad Daraei Ghadikolaei on 6 May 2020

0 votes

Hi Walter, could you please let me know the difference between 'pdf' and 'probability' plots?

3 Comments

Walter Roberson on 6 May 2020
pdf is Probability Density.
Suppose, for example, that you had a uniform distribution with "probability" p over 0 to 1/2. Then the total would be the area under the curve, p*(1/2 - 0) = p/2. But the total probability must be 1, so p/2 == 1, which requires p = 2. But how can a "probability" be 2? The answer is that what you are integrating is not probability itself but "probability density", which has to be multiplied by the width of the interval it covers to give a probability.
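The uniform example can be verified numerically. This is a small sketch in which the variable names are chosen here for illustration only:

```matlab
% Uniform density on [0, 1/2]: constant height p over an interval of width 1/2.
a = 0;
b = 1/2;
p = 1 / (b - a);                 % height required for unit area, i.e. 2

% Integrating the density over its support gives the total probability.
density = @(x) p * ones(size(x));
totalProbability = integral(density, a, b);

% A probability is density times width, so it stays at or below 1.
probFirstTenth = p * (0.1 - a);  % probability of landing in [0, 0.1]
```

Here the density is 2 everywhere on the interval, while the probability of any sub-interval (0.2 for the first tenth of a unit) never exceeds 1.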
Farshad Daraei Ghadikolaei on 8 May 2020
Thanks.
Steven Lord on 28 Jun 2022
If you look at the description of the Name-Value argument 'Normalization' on the histogram documentation page, the table in that section lists how the bin values are computed.
For probability Normalization, the bin values are the bin counts divided by the number of elements in the input data and so they must all be less than or equal to 1.
For pdf Normalization, the bin values are the probability values (= bin counts divided by number of elements) divided by the bin width. If you have bins of width less than 1 the pdf bin values will be greater than the probability bin values.
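The relationship between the two normalizations can be confirmed with `histcounts`. A minimal sketch, assuming synthetic uniform data and bins of width 0.05 (both chosen only for illustration):

```matlab
% Synthetic data on [0, 1] and bins narrower than 1 (assumptions).
rng(1);
x = rand(5000, 1);
edges = 0:0.05:1;

% Bin values under the two normalizations, using identical bins.
prob = histcounts(x, edges, 'Normalization', 'probability');
dens = histcounts(x, edges, 'Normalization', 'pdf');

% 'pdf' values equal 'probability' values divided by the bin width.
binWidth = diff(edges);
maxDiff = max(abs(dens - prob ./ binWidth));   % zero up to round-off

% With bin width 0.05, each pdf value is about 20 times the probability value.
ratio = dens ./ prob;
```

Because the bins are narrower than 1, every `dens` value exceeds the matching `prob` value, while the `prob` values themselves sum to 1.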


Release: R2019a

