Testing for a Poisson Process

Question

Berk, 27 June 2012

https://jp.mathworks.com/matlabcentral/answers/42127-testing-for-a-poisson-process

Hi,

Would you please help me understand how to evaluate the outcome of chi-square tests? I want to use the chi2gof function to test whether data generated from a Poisson distribution really do follow a Poisson distribution.

I am using these lecture notes, http://ocw.mit.edu/courses/mathematics/18-443-statistics-for-applications-fall-2006/lecture-notes/lecture11.pdf, to understand the concept, and I followed the instructions there.

I used the following code to run 1000 simulations, each drawing a Poisson sample of 1000 values with an expected value of 10.

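H = zeros(1000,1); P = zeros(1000,1);   % preallocate test decisions and p-values
for sim = 1:1000
  X = poissrnd(10, [1000 1]);           % 1000 Poisson draws with mean 10
  [H(sim), P(sim), STATS] = chi2gof(X, 'cdf', @(z) poisscdf(z,10));
end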

I have two questions regarding this issue:

1. When I run the code like this, more hypotheses are rejected (i.e. sum(H) is larger) than if I had used

@(z)poisscdf(z,mean(X))

I would expect the first version to give fewer rejections, since it uses the same expected value for both generating the Poisson values and testing them.

2. Can you please help me understand the meaning of 'z' which is used in the function handle?

Kind Regards, Berk

Answer 1

Tom Lane, 27 June 2012

https://jp.mathworks.com/matlabcentral/answers/42127-testing-for-a-poisson-process#answer_51876

I find the same thing. The first way, 52 of the 1000 tests rejected, which is close to the 5% you would expect at the default 0.05 significance level. The second way gave fewer rejections. That's because, by estimating the parameter from the same data used in the test, you are using a fit that is "too good." Imagine estimating the p parameter of a binomial distribution from a sample of 0's and 1's: the expected counts of 1's and 0's would always match the observed counts exactly. This case is not so extreme, but it is similar.

The chi2gof function does provide a way to do an approximate adjustment for that, if you tell it that you have estimated one parameter:

[H(sim), P(sim), STATS] = chi2gof(X, 'cdf', @(z) poisscdf(z,mean(X)), 'nparams', 1);

The concept is described further in lecture 12 of the MIT course notes.
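The three variants discussed above can be compared side by side. The following is only a sketch, assuming the Statistics and Machine Learning Toolbox is available; the variable names (nSim, nObs, lambdaHat, rejectionRate) and the fixed random seed are choices made for this illustration and are not part of the original code.

```matlab
% Sketch: rejection rates of chi2gof for three ways of specifying the Poisson null.
rng(0);                      % arbitrary fixed seed so the run is repeatable
nSim   = 1000;               % number of simulated data sets
nObs   = 1000;               % observations per data set
lambda = 10;                 % true Poisson mean
H = zeros(nSim, 3);          % one column of reject flags per variant

for sim = 1:nSim
    X = poissrnd(lambda, [nObs 1]);
    lambdaHat = mean(X);     % estimate of the mean from the same data
    % (1) parameter known in advance
    H(sim,1) = chi2gof(X, 'cdf', @(z) poisscdf(z, lambda));
    % (2) parameter estimated from the data, no adjustment
    H(sim,2) = chi2gof(X, 'cdf', @(z) poisscdf(z, lambdaHat));
    % (3) parameter estimated, degrees of freedom reduced by one
    H(sim,3) = chi2gof(X, 'cdf', @(z) poisscdf(z, lambdaHat), 'nparams', 1);
end

rejectionRate = mean(H, 1);  % fraction of rejections per variant
fprintf('known: %.3f  estimated: %.3f  estimated+nparams: %.3f\n', rejectionRate);
```

If the reasoning above holds, the first and third rates should sit near 0.05 and the second should be noticeably lower, though the exact values depend on the random draws.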

As for your other question, the chi2gof function defines a set of bin edges and calls your function to compute the cdf values at these edges. The "@(z)" indicates that your function accepts one input to be represented by the symbol z, and the rest of the expression defines how the cdf is to be computed at the provided z values.
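A small sketch may make the role of z clearer. The bin edges below are hypothetical values picked for illustration; chi2gof chooses its own edges internally, and the names cdfFun, edges and sameFun are not from the original post.

```matlab
lambda = 10;                         % value captured when the handle is created
cdfFun = @(z) poisscdf(z, lambda);   % z is only a placeholder name for the input

edges = [5 10 15];                   % hypothetical bin edges, for illustration only
p = cdfFun(edges);                   % same as calling poisscdf([5 10 15], 10)

% Renaming the placeholder changes nothing about the function:
sameFun = @(t) poisscdf(t, lambda);
tf = isequal(p, sameFun(edges));     % true
```

So z has no meaning outside the handle; it is simply the name given to whatever values chi2gof passes in.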

Answer 2

Berk, 28 June 2012

https://jp.mathworks.com/matlabcentral/answers/42127-testing-for-a-poisson-process#answer_52002

Dear Tom,

Thanks for the answer. However, I still cannot get this idea clear in my mind. MIT lecture 12 says that we cannot use the mean and standard deviation of the data to test our hypothesis. Then, if we do not know in advance whether the data were generated by a particular distribution, how are we going to handle this problem? Let me illustrate with my own problem:

I need to test whether a random variable (journey time) follows a Poisson distribution on a given road segment at a given time (for the moment, let's assume that journey time is measured in discrete units). I have historical data covering 20 weekdays, and I want to test whether these 20 observations were generated by a Poisson distribution. This test then has to be repeated 424*288 times (number of road segments * number of time intervals).

As far as I understand, if I use the mean of the data, I will accept the hypothesis (that the data were generated from a Poisson distribution) more often than I should?

Thanks,

Kind Regards,

Berk

Comment

Tom Lane, 28 June 2012

Look again later in the lecture 12 notes, near equation (11.0.1). There is an explanation that while we cannot use the test as originally defined with estimated parameters and r-1 degrees of freedom, we can adjust the degrees of freedom to r-s-1. This will bring the number of accepted hypotheses back in line with the rejection rate you would expect. The notes gloss over the fact that this is not exact, but you may find it satisfactory for your purposes.
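The adjustment can be written out as follows. This is a sketch in notation assumed for this illustration, which may differ from the symbols in the lecture notes: r is the number of bins, s the number of estimated parameters, n the sample size, \nu_j the observed count in bin j, and p_j(\hat\theta) the bin probability under the fitted distribution.

```latex
T = \sum_{j=1}^{r} \frac{\bigl(\nu_j - n\,p_j(\hat\theta)\bigr)^2}{n\,p_j(\hat\theta)},
\qquad
T \;\overset{\text{approx.}}{\sim}\; \chi^2_{\,r-s-1} \quad \text{under } H_0 .
```

With a fully specified Poisson mean, s = 0 and the reference distribution has r-1 degrees of freedom. With the mean estimated from the data, s = 1 and it has r-2, which is what 'nparams',1 requests. The result is only approximate when the estimate is computed from the raw data rather than from the binned counts.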
