Using chi2gof to test two distributions

Allie on 6 Feb 2019
Edited: Sim on 14 Aug 2024
I want to use chi2gof to test whether two distributions come from a common distribution (null hypothesis) or not (alternative hypothesis). I have binned observational data (x), binned model data (y), and the bin edges (bins). Both the observational and the model data are counts per bin.
x= [41 22 11 10 9 5 2 3 2]
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665]
bins=[0:9:81]
Because the data are already binned and I'm testing x against y, I used the following code:
[h,p,stat]=chi2gof(x,'Edges',bins,'Expected',y)
Manual calculation of the chi2 test statistic gives 4.6861 with a probability of p = 0.7905. The above function call, however, produces a very different result: the resulting stats show different bin edges than the ones I specified, the observed counts per bin do not match x, the chi2 test statistic is ~87, and p < 0.001. Could someone please explain why I'm getting such dramatically different results?
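For reference, the manual calculation mentioned above can be sketched as follows. This assumes the usual Pearson statistic and 8 degrees of freedom (9 bins minus 1, no estimated parameters); the post does not state either, although both reproduce the quoted values. chi2cdf requires the Statistics and Machine Learning Toolbox.

```matlab
% Observed counts (x) and model counts (y) per bin, as given in the question.
x = [41 22 11 10 9 5 2 3 2];
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];

% Pearson chi-square statistic: sum over bins of (O - E)^2 / E.
chi2stat = sum((x - y).^2 ./ y);        % about 4.686

% Assumed degrees of freedom: number of bins minus 1.
df = numel(x) - 1;

% Upper-tail probability of the chi-square distribution.
p = chi2cdf(chi2stat, df, 'upper');     % about 0.79
```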

Accepted Answer

Jeff Miller on 7 Feb 2019
Sorry, the x's really do have to be the data values. Try this:
bins=[0:9:81]
xvals = bins(1:end-1)+4.5; % Here are some fake data values that belong in each bin.
xcounts= [41 22 11 10 9 5 2 3 2] % These are the counts of the data values in each bin.
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
[h,p,stat]=chi2gof(xvals,'Edges',bins,'Expected',y,'Frequency',xcounts,'EMin',1)
This will give you your 4.68. By default, chi2gof pools bins whose expected count is less than 5; setting 'EMin' to 1 lowers that threshold so that none of these bins are pooled.
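An equivalent way to see what 'Frequency' does is to expand the counts into an explicit vector of stand-in data values and pass that to chi2gof without 'Frequency'. This is only a sketch: the variable xraw is introduced here for illustration, and the bin midpoints are placeholder values as in the code above, not the original observations.

```matlab
bins    = 0:9:81;
xvals   = bins(1:end-1) + 4.5;            % one stand-in value per bin (bin midpoints)
xcounts = [41 22 11 10 9 5 2 3 2];        % observed counts per bin
y       = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];

% Repeat each stand-in value as many times as it was observed.
xraw = repelem(xvals, xcounts);           % 105 values in total

% Same test as above, with the counts now implicit in the data vector.
[h, p, stat] = chi2gof(xraw, 'Edges', bins, 'Expected', y, 'EMin', 1);
```

With either form, stat.O should match xcounts and stat.edges should match bins, which is a quick check that chi2gof has interpreted the inputs as intended.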
  2 Comments
Allie on 7 Feb 2019
This worked! Thank you
Sim on 29 Jul 2024


More Answers (2)

Jeff Miller on 6 Feb 2019
It looks like chi2gof expects the values in x to be the actual, original scores, not the bin counts. Try adding 'Frequency',x to the parameter list.
  1 Comment
Allie on 7 Feb 2019
Edited: Allie on 7 Feb 2019
This did not work. The stat output is below. As you can see, it changed the edges and expected values from what I originally input and the chi2stat became even bigger.
stat =
chi2stat: 234.4383
df: 5
edges: [0 9 18 27 36 45 81]
O: [12 30 22 0 41 0]
E: [38.0520 24.2655 15.4665 9.8595 6.2895 11.0670]
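The O and E values above are consistent with chi2gof treating the entries of x as data values, each weighted by itself through 'Frequency', and then pooling the sparse upper bins. The sketch below reproduces the numbers under that reading. The pooled edges are copied from the output above; this illustrates the arithmetic and is not a description of chi2gof's internals.

```matlab
x = [41 22 11 10 9 5 2 3 2];
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];

% Pooled edges as reported in stat.edges.
edges = [0 9 18 27 36 45 81];

% Treat each entry of x as a data VALUE and find the bin it falls in.
idx = discretize(x, edges);

% Weight each value by itself (the effect of 'Frequency',x) and sum per bin.
O = accumarray(idx(:), x(:), [numel(edges)-1, 1])';

% Expected counts: first five bins unchanged, last four pooled into one.
E = [y(1:5), sum(y(6:end))];

chi2stat = sum((O - E).^2 ./ E);
```

For example, the values 5, 2, 3 and 2 all fall in the first bin [0, 9), so its "observed" count becomes 5 + 2 + 3 + 2 = 12 rather than 41.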



Sim on 14 Aug 2024
Edited: Sim on 14 Aug 2024
Shouldn't you use the two-sample chi-square test?
The chi-squared test needs binned data. However, as far as I understand, CHI2TEST2 expects the raw data, not the binned data, as inputs.
Indeed, CHI2TEST2 places the raw data into bins itself:
bins = unique([x1(:,1); x2(:,1)]); % create a bin for each unique value
