ksdensity general help, what bandwidth and other settings to choose?

Right Grievous on 25 May 2014
Commented: the cyclist on 26 May 2014
Hi everybody,
I have some general questions about the ksdensity function. I'm trying to compare the cumulative distribution functions of two samples. When I plot the data as histograms, they don't appear to be that different.
When I use the ksdensity (cdf) function and test for a significant difference using a K-S test, significance seems to depend a lot on the bandwidth and bounding of my distributions.
My question is: what are the rules for choosing a bandwidth and the other settings? If significance depends on the settings I choose, I want to have good reasons for choosing them.
I am constraining the distributions between 0 and 500 because my maximum value is 460 and the data are all positive. However, the default bandwidth results in minor differences towards the upper end of the distributions, whereas a bandwidth of 1.5 does not.
I have attached the data: column 1 is the group and column 2 contains the measurements.
Thank you for any help,
Rod.

Accepted Answer

the cyclist on 26 May 2014
You should be able to apply the K-S statistical test directly to your discrete data. You don't need to (and shouldn't) apply ksdensity() before testing for significance between your two distributions.
(I did not see any data attached to your question.)
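As a minimal sketch of testing the raw measurements directly, the snippet below splits a two-column matrix by group and passes the two vectors to kstest2 (Statistics Toolbox). The variable name data and its values are assumptions: the original file was not available, so synthetic numbers in the 0 to 460 range stand in for it.

```matlab
% Hypothetical stand-in for the unattached data set:
% column 1 = group label (1 or 2), column 2 = measurement.
% The values are synthetic and are NOT the poster's data.
rng(0);                                   % reproducible random numbers
data = [ones(120,1),   460*rand(120,1);   % group 1, 120 measurements
        2*ones(240,1), 460*rand(240,1)];  % group 2, 240 measurements

% Split the measurements by group label.
g1 = data(data(:,1) == 1, 2);
g2 = data(data(:,1) == 2, 2);

% Two-sample K-S test on the raw values: no smoothing, no bandwidth.
[h, p, ksstat] = kstest2(g1, g2);
```

Because no kernel smoothing is involved, the result does not depend on any bandwidth or support setting.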
2 Comments
Right Grievous on 26 May 2014
I'm at work and don't have the data with me here. I definitely attached it, but you are right that it's not there.
I thought I should use a density function because my sample sizes are quite different: both are large (>100 samples), but one is twice the size of the other. I could bin the data and calculate the percentage of the sample in each bin? I basically want to show that both groups are drawn from the same distribution.
Thanks for your help,
Rod.
the cyclist on 26 May 2014
The data vectors do not need to be the same size. For example,
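x1 = randn(100,1);   % first sample, 100 observations
x2 = randn(200,1);   % second sample, 200 observations
[h,p,ksstat] = kstest2(x1,x2)   % h = 0: equal distributions not rejected at the default 5% level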
will work just fine.
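Unequal sizes are not a problem because the two-sample K-S statistic is the largest vertical gap between the two empirical CDFs, and each empirical CDF is normalised by its own sample size. The sketch below recomputes that gap by hand and compares it with the ksstat output of kstest2. The names pts, F1, F2 and dManual are illustrative only, and bsxfun is used so the code also runs on older releases.

```matlab
rng(1);                      % reproducible random numbers
x1 = randn(100,1);
x2 = randn(200,1);

% Evaluate both empirical CDFs at every pooled observation.
pts = sort([x1; x2]);
F1 = mean(bsxfun(@le, x1', pts), 2);   % fraction of x1 <= each point
F2 = mean(bsxfun(@le, x2', pts), 2);   % fraction of x2 <= each point

% K-S statistic: largest absolute difference between the two ECDFs.
dManual = max(abs(F1 - F2));

% Compare with the statistic reported by kstest2.
[~, ~, ksstat] = kstest2(x1, x2);
```

Plotting stairs(pts, F1) and stairs(pts, F2) on the same axes gives a bandwidth-free visual comparison of the two groups.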
