Meaning of the p-value calculation for Kolmogorov-Smirnov test in KSTEST2 function

Question

Aharon Renick, 29 December 2024

Answered: Abhas, 29 December 2024

I am using the Kolmogorov-Smirnov test to determine whether two samples of data come from the same distribution. I understand how the KS statistic is calculated: basically, it is the maximum difference between the empirical CDFs of the two samples. The null hypothesis is that the samples are from the same distribution, so if the p-value is lower than alpha (say, alpha=0.05), we reject the null hypothesis and state that the two samples are from different distributions. What I don't understand is how the p-value is calculated. I am using the MATLAB KSTEST2 function. The p-value calculation is (relevant parts):

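n      = n1 * n2 / (n1 + n2);                                    % effective sample size
lambda = max((sqrt(n) + 0.12 + 0.11/sqrt(n)) * KSstatistic, 0);  % scaled KS statistic
j      = (1:101)';                                               % indices of the series terms
pValue = 2 * sum((-1).^(j-1) .* exp(-2*lambda*lambda*j.^2));     % alternating series
pValue = min(max(pValue, 0), 1);                                 % clamp to [0, 1]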

Setting aside the max and min, which are probably just making sure that lambda isn't negative and that the p-value isn't above 1 or below 0, I don't understand how and why the p-value is calculated this way. Can anyone explain?

Answer 1

Abhas, 29 December 2024

Hi @Aharon Renick,

The p-value is derived from the limiting distribution of the KS statistic under the null hypothesis. The KS test compares the empirical CDFs of the two samples, and the p-value is the probability of observing a KS statistic at least as extreme as the one calculated, assuming the null hypothesis is true.

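The term "exp(-2*lambda*lambda*j.^2)" comes from the theory of Brownian bridges: under the null hypothesis, the suitably scaled difference between the two empirical CDFs behaves asymptotically like a Brownian bridge, and the scaled KS statistic like its maximum absolute value.
The alternating signs "(-1).^(j-1)" arise from the inclusion-exclusion (reflection) argument that accounts for paths crossing the upper or the lower boundary.
The summation gives the upper-tail probability, that is, the probability that the scaled KS statistic exceeds lambda under the null hypothesis, which is exactly the p-value. The series converges very quickly, so 101 terms are more than enough.
The factor "(sqrt(n) + 0.12 + 0.11/sqrt(n))", with n = n1*n2/(n1+n2), is a correction that makes this asymptotic formula usable for small samples, as described in the references below.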
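
Written out as a formula, the code in the question appears to correspond to the expression below. This is only a sketch of the mapping: the symbols D (for KSstatistic) and Q_KS (for the tail function of the limiting distribution) are notation assumed here and do not appear in the MATLAB source, and the code truncates the infinite sum at j = 101.

```latex
\[
n = \frac{n_1 n_2}{n_1 + n_2}, \qquad
\lambda = \left(\sqrt{n} + 0.12 + \frac{0.11}{\sqrt{n}}\right) D, \qquad
p \approx Q_{KS}(\lambda) = 2 \sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^{2} \lambda^{2}}
\]
```

For any lambda that is not tiny the terms shrink extremely fast, so the first term, 2*exp(-2*lambda^2), already dominates the result.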

After extensive searching, I came across some valuable resources on the topic. One is Numerical Recipes by Press et al. (2007), pages 736–740: http://e-maxx.ru/bookz/files/numerical_recipes.pdf
It references another, albeit more complex, resource: the paper by Stephens in the Journal of the Royal Statistical Society: Series B (Methodological), 1970, pages 115–122: https://www.jstor.org/stable/2984408?seq=1#page_scan_tab_contents

To interpret the result:

If pValue < α (e.g., α = 0.05): reject the null hypothesis. The data give evidence that the two samples come from different distributions.
If pValue ≥ α: fail to reject the null hypothesis. There is insufficient evidence to claim that the distributions differ.
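
To see the whole calculation in one place, here is a minimal sketch that computes the KS statistic by hand, applies the p-value formula quoted in the question, and makes the decision described above. The two samples and the variable names x1, x2, alpha, and rejectNull are made up for illustration; this is not the kstest2 source, and it does not need the Statistics and Machine Learning Toolbox.

```matlab
% Two hypothetical samples (illustrative data only)
x1 = [0.11 0.35 0.48 0.52 0.67 0.73 0.89 0.94];
x2 = [0.42 0.58 0.77 0.81 1.05 1.12 1.26 1.33 1.41 1.58];
alpha = 0.05;                          % significance level

% Empirical CDFs of both samples, evaluated at every observed point
pts = sort([x1(:); x2(:)]);
F1  = mean(x1(:)' <= pts, 2);          % fraction of x1 values <= each point
F2  = mean(x2(:)' <= pts, 2);          % fraction of x2 values <= each point

% KS statistic: largest absolute gap between the two empirical CDFs
KSstatistic = max(abs(F1 - F2));

% Asymptotic p-value, following the lines quoted in the question
n1 = numel(x1);
n2 = numel(x2);
n      = n1 * n2 / (n1 + n2);
lambda = max((sqrt(n) + 0.12 + 0.11/sqrt(n)) * KSstatistic, 0);
j      = (1:101)';
pValue = 2 * sum((-1).^(j-1) .* exp(-2*lambda*lambda*j.^2));
pValue = min(max(pValue, 0), 1);

% Decision rule: reject the null hypothesis when pValue < alpha
rejectNull = pValue < alpha;
fprintf('D = %.4f, p = %.4f, reject H0: %d\n', KSstatistic, pValue, rejectNull);
```

If the toolbox is available, the result can be compared with [h, p, ks2stat] = kstest2(x1, x2, 'Alpha', alpha). Note that for lambda = 0 the truncated sum evaluates to 2 rather than 1, which is one reason the final clamp to [0, 1] is needed.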

For more details, see the MathWorks documentation for kstest2: https://www.mathworks.com/help/stats/kstest2.html

I hope this helps!
