corrcoef : p value interpretation

Question

0 votes

Hi!

According to the corrcoef help page: "If P(i,j) is small, say less than 0.05, then the correlation R(i,j) is significant."

Here is a snippet:

clear all;close all;clc;
N = 1000;
X = 1*rand(N,2);
Y = [2*X(:,1),-2*X(:,2)];
[r p] = corrcoef(X,Y)

By construction, each column of Y is a scaled copy of the corresponding column of X, so X and Y are clearly correlated. Here is the output of one run of this script:

r =

    1.0000   -0.0108
   -0.0108    1.0000

p =

    1.0000    0.6278
    0.6278    1.0000

We have p(1,2) = p(2,1) > .05, which means that R(1,2) = R(2,1) is not significant. That is fine with me.

However, p(1,1) = p(2,2) > .05 as well, which would mean that R(1,1) = R(2,2) shouldn't be trusted either. This result seems inconsistent to me.

Indeed, the correlation coefficient of a random vector with itself is 1, which reflects a perfect correlation. So why is p(i,i) not zero?

Many thanks for your help.

Sylvain

1 Comment

José-Luis on 15 Jul 2014

This is annoying. Whoever had answered this question deleted their answer, and with it all the comments of people who were trying to contribute.

You are right, Sylvain: it should be zero, and you should ignore the values on the diagonal.


Answer 1

Alfonso Nieto-Castanon on 15 Jul 2014

Edited: Alfonso Nieto-Castanon on 16 Jul 2014

4 votes

There are several issues here:

1) Regarding why you are getting non-significant values in your example:

 N = 1000;
 X = 1*rand(N,2);
 Y = [2*X(:,1),-2*X(:,2)];
 [r p] = corrcoef(X,Y);

The last line is NOT computing the correlation between each column of X and each column of Y, as you might expect. Instead (see help corrcoef), the two-input form treats X and Y as two single variables and converts each one to a column vector, so it computes the correlation:

[r p] = corrcoef([X(:),Y(:)]);

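The first half of X(:) is positively associated with the first half of Y(:), while the second half of X(:) is negatively associated with the second half of Y(:). The two effects cancel out, so you get a very low, non-significant correlation.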

If you try instead:

 [r p] = corrcoef([X Y]);
 r = r(1:2,3:4);
 p = p(1:2,3:4);

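you will see that each column of X is very strongly associated with the corresponding column of Y (the diagonal of the extracted r is 1 and -1 up to rounding, with p-values of essentially zero), as you would expect from your definitions of X and Y. The off-diagonal entries, which pair X(:,1) with Y(:,2) and X(:,2) with Y(:,1), stay close to zero because those columns are independent.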
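Both points can be checked numerically. The following is only a sketch: the fixed seed and the variable names (rA, rB, rXY, nx, and so on) are mine and not part of the original example, and the thresholds mentioned in the comments are loose expectations rather than guarantees.

```matlab
rng(0);                                % fixed seed, only for reproducibility
N = 1000;
X = rand(N,2);
Y = [2*X(:,1), -2*X(:,2)];

% Two-input form versus the explicit column-vector form
[rA, pA] = corrcoef(X, Y);
[rB, pB] = corrcoef([X(:), Y(:)]);
maxDiffR = max(abs(rA(:) - rB(:)));    % should be zero or negligible
maxDiffP = max(abs(pA(:) - pB(:)));

% Column-by-column correlations between X and Y
nx = size(X,2);
[rAll, pAll] = corrcoef([X Y]);
rXY = rAll(1:nx, nx+1:end);            % rows: columns of X, columns: columns of Y
pXY = pAll(1:nx, nx+1:end);

disp(rXY)                              % diagonal near 1 and -1, off-diagonal near 0
disp(pXY)                              % diagonal essentially 0
```

Indexing with nx instead of hard-coded 1:2 and 3:4 keeps the extraction valid if X and Y have a different number of columns.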

2) Regarding what is going on with the diagonals of p:

The diagonals of p returned by corrcoef are always set to 1 (looking at the corrcoef.m code, it might seem that they were intended to be NaN's instead). In any case, you should simply disregard those values, as they never represent any meaningful test (the diagonals of r are by definition 1's). To check this, you may do the following:

 x = randn(10,2);
 [r,p] = corrcoef(x);

(the diagonal elements of p are 1's)

But then,

[r,p] = corrcoef(x(:,1),x(:,1));

(and the value of p(1,2) is 0, as expected from two perfectly correlated series)
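If the 1's on the diagonal get in the way (for example when thresholding p), one option is to mask them out yourself. This is just a sketch of that idea; pMasked, offDiag and nSignificantPairs are illustrative names of mine, and the seed and matrix size are arbitrary.

```matlab
rng(1);                                  % fixed seed, only for reproducibility
x = randn(10,3);
[r, p] = corrcoef(x);

% Replace the diagonal of p with NaN so it is never read as a test result
pMasked = p;
pMasked(logical(eye(size(p)))) = NaN;

% Count significant pairs using off-diagonal entries only
offDiag = ~eye(size(p));
nSignificantPairs = nnz(p(offDiag) < 0.05) / 2;   % p is symmetric, so each pair appears twice
```

Comparisons against NaN are always false, so pMasked < 0.05 can also be used directly without the diagonal ever being flagged.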

3) Regarding the interpretation of p-values:

Your interpretation is perfectly correct: small p-values (e.g. p < .05) mean that you can reject the null hypothesis. The null hypothesis for these analyses is that r = 0, i.e. that the samples are uncorrelated.
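To make the null hypothesis concrete, the p-value for one pair can be reproduced by hand. This sketch assumes that corrcoef uses the usual two-sided t-test for a Pearson correlation with n-2 degrees of freedom, which is what its documentation describes. The two-sided tail probability is written with the base-MATLAB function betainc so that no toolbox is needed; r12, t and pManual are names of mine.

```matlab
rng(2);                                  % fixed seed, only for reproducibility
n = 20;
x = randn(n,2);
[r, p] = corrcoef(x);

r12 = r(1,2);
t = r12 * sqrt((n-2) / (1 - r12^2));     % t statistic under H0: true correlation is 0

% Two-sided p-value of a t statistic with n-2 degrees of freedom,
% expressed through the regularized incomplete beta function
pManual = betainc((n-2) / ((n-2) + t^2), (n-2)/2, 1/2);

fprintf('corrcoef p = %.6g, manual p = %.6g\n', p(1,2), pManual);
```

As r12 approaches 1 or -1 the t statistic grows without bound and the p-value goes to 0, which is why two perfectly correlated series give p(1,2) = 0.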

1 Comment

Sylvain Rousseau on 16 Jul 2014

Dear Alfonso,

I am very grateful to you for this crystal-clear explanation.

All the best

Sylvain
