Question about Anderson-Darling (adtest)

Question

BN, 31 July 2020

Direct link to this question

https://jp.mathworks.com/matlabcentral/answers/573160-question-about-anderson-darling-adtest

Answered: Stan Driggs, 2 November 2021

I want to check whether two data sets have similar distributions. I would like to use the Anderson-Darling test for this, but adtest() in MATLAB returns a test decision for the null hypothesis that the data in vector x come from a population with a normal distribution. My question is how to check whether two data sets have similar distributions (without specifying what those distributions are).

So is it possible to do that in MATLAB?

Thanks


Answer 1

John D'Errico, 31 July 2020


Direct link to this answer

https://jp.mathworks.com/matlabcentral/answers/573160-question-about-anderson-darling-adtest#answer_473464

Edited: John D'Errico, 31 July 2020


Not using adtest. Like the Ford Model T, which Henry Ford sold in any color as long as the color you wanted was black, adtest tests whether your distribution is any distribution, as long as normal is the distribution you want to test against.

Instead, you probably want to use a Kolmogorov-Smirnov test.

http://www.mit.edu/~6.s085/notes/lecture5.pdf

From the help for kstest2:

kstest2 Two-sample Kolmogorov-Smirnov goodness-of-fit hypothesis test.
    H = kstest2(X1,X2) performs a Kolmogorov-Smirnov (K-S) test 
    to determine if independent random samples, X1 and X2, are drawn from 
    the same underlying continuous population. 
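As a minimal sketch of how that call might look for the two-data-set case in the question: the samples `x1` and `x2` below are made-up illustration data (one normal, one exponential), not the asker's data, and the default 5% significance level is assumed.

```matlab
% Two-sample Kolmogorov-Smirnov test on two illustrative samples.
% x1 and x2 are hypothetical data invented for this sketch.
rng(1);                          % fix the seed so the run is repeatable
x1 = randn(200, 1);              % sample 1: standard normal
x2 = -log(rand(200, 1));         % sample 2: exponential with mean 1

% h = 1 means "reject the hypothesis that both samples come from the
% same continuous distribution" at the default 5% level.
% p is the p-value, ks2stat is the maximum distance between the two
% empirical CDFs.
[h, p, ks2stat] = kstest2(x1, x2);

fprintf('h = %d, p = %.3g, KS statistic = %.3f\n', h, p, ks2stat);
```

Because `p` is returned as an output argument, it can be stored or passed on rather than only printed.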

1 comment

BN, 31 July 2020

Thank you. In addition to the Kolmogorov-Smirnov test you mentioned, I found a function on the File Exchange (FEX) that does something similar using the Anderson-Darling test. However, that function only prints its results and does not let me save the p-value. I have asked how to overcome this issue in another question.

Anyway, the Kolmogorov-Smirnov test you mentioned works perfectly too.

Thank you

Best Regards


Answer 2

Stan Driggs, 2 November 2021


Direct link to this answer

https://jp.mathworks.com/matlabcentral/answers/573160-question-about-anderson-darling-adtest#answer_822275

I know this is a stale question, but John's answer vis-à-vis Henry Ford is a bit misleading. You can use Anderson-Darling to test for ANY continuous distribution. The adtest function allows you to specify different distributions by name or with a distribution object. Note that if you create a distribution object, you must specify the parameters of the distribution, which might be unknown. The problem is that the critical values of the test statistic can be slightly different if you use estimates of the parameters for that particular distribution (e.g. sample mean and variance) instead of the true values. The critical values also differ slightly depending on the number of samples in your data.
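To illustrate the two ways of specifying the distribution, here is a small sketch. The data `x` and the parameter value `mu = 2` are invented for the example; with the name `'exp'` the parameters are treated as unknown, while the distribution object fixes them.

```matlab
% Anderson-Darling test against a non-normal null distribution.
% x is hypothetical data drawn from an exponential with mean 2.
rng(2);                                   % repeatable illustration
x = -2 * log(rand(100, 1));

% (a) Distribution given by name: parameters are estimated from x.
[hName, pName, adstat, cv] = adtest(x, 'Distribution', 'exp');

% (b) Distribution given as an object: parameters must be supplied.
% The value mu = 2 is an assumption made for this example.
pd = makedist('Exponential', 'mu', 2);
[hObj, pObj] = adtest(x, 'Distribution', pd);

% For comparison, the default call tests against the normal family.
hNorm = adtest(x);

fprintf('by name: h = %d, p = %.3g, A2 = %.3f, cv = %.3f\n', ...
        hName, pName, adstat, cv);
fprintf('by object: h = %d, p = %.3g\n', hObj, pObj);
fprintf('default (normal): h = %d\n', hNorm);
```

Note that all of these are one-sample tests: each compares a single data set with a named or fully specified distribution, not two data sets with each other.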

In general, if you know the CDF and can generate random data that follows a specific distribution, then you can generate thousands of cases, calculate the AD test statistic for each case, histogram the AD values, and determine the critical values for various levels of significance yourself. If you really want to understand Anderson-Darling, you should go through this exercise. The adtest function does this Monte Carlo process for you when you pass in a distribution object. For the built-in supported distributions (norm, exp, ev, logn, weibull) it probably uses precomputed critical value tables and adjusts for the number of samples. These published tables were originally generated by Monte Carlo analysis back in the mainframe days, and some of the published tables have been found to have errors. I believe the table values assume the distribution parameters are unknown. YMMV.
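The exercise described above could be sketched roughly as follows. This is only an illustration under assumptions of my own choosing: a normal null hypothesis, mean and standard deviation estimated from each simulated sample, a sample size of 50, and 5000 replicates. It uses only base MATLAB functions and the usual formula A² = −n − (1/n) Σ (2i−1)[ln F(x₍ᵢ₎) + ln(1 − F(x₍ₙ₊₁₋ᵢ₎))].

```matlab
% Monte Carlo estimate of Anderson-Darling critical values for a
% normal null with parameters estimated from the data.
% n, nSim and alpha are illustrative choices, not recommendations.
rng(0);                          % fix the seed so the run is repeatable
n     = 50;                      % observations per simulated data set
nSim  = 5000;                    % number of simulated data sets
alpha = [0.10 0.05 0.01];        % significance levels of interest

idx = (1:n)';                    % ranks 1..n of the sorted sample
A2  = zeros(nSim, 1);            % AD statistic of each replicate

for k = 1:nSim
    x = sort(randn(n, 1));                 % data that truly follow the null
    z = (x - mean(x)) / std(x);            % standardize with estimated parameters
    F = 0.5 * erfc(-z / sqrt(2));          % standard normal CDF at each point
    F = min(max(F, realmin), 1 - eps);     % keep both logarithms finite
    A2(k) = -n - sum((2*idx - 1) .* (log(F) + log(1 - flipud(F)))) / n;
end

% Critical value at level alpha = empirical (1 - alpha) quantile of A2.
A2sorted = sort(A2);
critVals = A2sorted(ceil((1 - alpha) * nSim));

histogram(A2, 60);               % distribution of the statistic under the null
xlabel('A^2'); ylabel('count');
disp([alpha(:), critVals(:)]);   % columns: alpha, estimated critical value
```

Replacing the `randn` draw and the CDF line with another distribution's generator and CDF gives the corresponding values for that distribution; dropping the standardization step (using the true parameters instead) shows how much the critical values shift when parameters are estimated.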

Note that the AD test is sensitive because it pays more attention to the tails of the distribution, since the tails are where distributions differ the most. This makes the AD test very sensitive to outliers. You will be tempted to start removing outliers from your data, but be careful. If you remove enough outliers, all distributions end up looking uniform!
