Some insight on mvnrnd function

Question

1 投票

Hi, I am trying to simulate a simple bivariate VAR(1) process with Gaussian errors and I use mvnrnd function to draw from a multivariate normal with mean [0;0] and variance matrix Σ.

I was creating my errors outside of my for loop, but I realised the two approaches below give completely different results:

mu = [0;0]; sgm = [1 0.7; 0.7 1]; iter = 100;
rng(1)
E = mvnrnd(mu, sgm, iter);
E_e = zeros(iter,2);
rng(1)
for i = 1:iter
    E_e(i,:) = mvnrnd(mu, sgm);
end

Can you give me some insight on why these two are different - I have some ideas but cannot formalise my thoughts.

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

サインインしてコメントする。

サインインしてこの質問に回答する。

Follow Question

Answer 1

John D'Errico 2019 年 10 月 16 日

編集済み: John D'Errico 2019 年 10 月 16 日

MATLAB Online で開く

2 投票

For once, someone asks a good question, that says they saw something they did not understand, and want to understand what happened. :)

(Don't feel bad, I had to think for a second myself about it.) So what did happen? The answer is to change what you did, to a simpler problem. You had:

mu = [0;0]; sgm = [1 0.7; 0.7 1];

But let me changes sgm to an identity.

sgm = eye(2);
rng(1)
mvnrnd(mu, sgm,3)
ans =
     -0.64901      -1.1096
       1.1812     -0.84555
     -0.75845     -0.57266

So 3 sets of normal events, as rows of that matrix. A standard normal, so every one comes from a normal N(0,1) distribution.

But now lets generate them one set at a time. Look carefully at the sequence.

>> rng(1)
>> mvnrnd(mu, sgm)
ans =
     -0.64901       1.1812
>> mvnrnd(mu, sgm)
ans =
     -0.75845      -1.1096
>> mvnrnd(mu, sgm)
ans =
     -0.84555     -0.57266

Do you see that MATLAB generates the same set of numbers, but in a different sequence.

>> rng(1)
>> mvnrnd(0,1,6)
ans =
     -0.64901
       1.1812
     -0.75845
      -1.1096
     -0.84555
     -0.57266

The same 6 numbers. Now, think about how MATLAB stores numbers in an array. It goes down the columns.

>> reshape(1:6,[3,2])
ans =
     1     4
     2     5
     3     6

So, when you generate the sets one pair at a time, it gives you the same sequence, but ONE pair at a time. Consider the difference between these next two calls, then look back at what you saw when you generated them one pair at a time.

>> rng(1)
>> reshape(mvnrnd(0,1,6),[3,2])
ans =
     -0.64901      -1.1096
       1.1812     -0.84555
     -0.75845     -0.57266
>> rng(1)
>> reshape(mvnrnd(0,1,6),[2,3])'
ans =
     -0.64901       1.1812
     -0.75845      -1.1096
     -0.84555     -0.57266

All well and good, but what did you do now? You changed sgm. This gets into how you generate a set of normal deviates with a non-identity covariance matrix. So how is that done? You generate a set of iid N(0,1) deviates, then you transform them with a matrix multiply. So effectively, mvnrnd(mu,sgm,N) calls randn, generating an array of the right size. Then it multiplies that Nx2 matrix with a 2x2 matrix derived from the supplied covariance matrix. (The transformation is derived from a Cholesky factorization of your covariance matrix. I can go into more depth there, but I think it may only confuse things, rather than add information.)

Now does it make more sense? When you generate them one pair at a time, you generate them in a different sequence.

2 件のコメント
なしを表示なしを非表示

Ezgi Kurt 2019 年 10 月 16 日

MATLAB Online で開く

Wow.. thank you, that was actually very intuitive and very informative!

When the variance matrix is different then identity though, this equality does not hold (they return different numbers) - which must make sense but I seem to be failing to connect the dots.

So when we create the matrices, as you wrote down, this is what's going on:

% first we do the cholesky decompostion on the variance matrix
[T,err] = cholcov(sgm);
%create a randn and multiply by T
r = randn(N,2) * T + mu;

We are multiplying by the same T in both cases - but in one the vector created by randn has length N, and in the other 1. Both are coming from the same normal distribution, ~N(0,1)

What i still cannot comprehend is that are these two approaches doing fundamentally different things (as it seems to be the case)? And why is that? Is it because we do randn(N,1) the elements in this vector are drawn from the same distribution while we do randn(1,1) N times, they are drawn from different normal distributions , all with ~N(0,1)?

I apologize if my questions are bordering on idiotic, I just want to make sure I understand.

John D'Errico 2019 年 10 月 17 日

編集済み: John D'Errico 2019 年 10 月 17 日

MATLAB Online で開く

Not remotely idiotic. A valid question that says you are thinking is a good thing. I welcome intelligent thought.

First, lets see if we can reproduce what mvnrnd does.

% Note that I created mu as a row vector.
mu = [0,0]; sgm = [1 0.7; 0.7 1];
% cholesky factor
L = chol(sgm);
rng(1)
mvnrnd(mu, sgm)
ans =
     -0.64901      0.38921
rng(1)
randn(1,2)*L + mu
ans =
     -0.64901      0.38921

So they are identical.

That should tell you I got it right. Well, it should suggest I may have. ;-) Now, can we create the same sequence that you get by calling mvnrnd one pair at a time? Of course. I just need to be careful in how I sequence the random deviates, and where they are stored in the array, BEFORE the matrix multiply is performed.

>> rng(1)
>> randn(2,3)'*L + mu
ans =
     -0.64901      0.38921
     -0.75845      -1.3233
     -0.84555      -1.0009
     
>> rng(1)
>> mvnrnd(mu, sgm)
ans =
     -0.64901      0.38921
>> mvnrnd(mu, sgm)
ans =
     -0.75845      -1.3233
>> mvnrnd(mu, sgm)
ans =
     -0.84555      -1.0009
     

Now look carefully at the call to randn. I created it as 2x3, THEN I transposed it. That means the elements will be generated in the same sequence as they are generated when you call it one pair at a time. THEN I transformed them, exactly as mvnrnd would do. Now you should see they are the same, via mvnrnd one pair at a time, or using randn, with the appropriate transformation. That works, because mvnrnd is not really much more than a wrapper around randn, but then with a transformation as needed.

But when we call mvnrnd, asking for many sets of numbers at once, it does this:

rng(1)
mvnrnd(mu, sgm,3)
ans =
     -0.64901      -1.2467
       1.1812      0.22297
     -0.75845     -0.93988
rng(1)
randn(3,2)*L + mu
ans =
     -0.64901      -1.2467
       1.1812      0.22297
     -0.75845     -0.93988

And those are the same. But you can see they are different from what I did before. The difference is subtle, but it is important.

サインインしてコメントする。

Some insight on mvnrnd function

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

採用された回答

2 件のコメント
なしを表示なしを非表示

その他の回答 (0 件)

カテゴリ

製品

タグ

Community Treasure Hunt

Some insight on mvnrnd function

0 件のコメント -2 件の古いコメントを表示 -2 件の古いコメントを非表示

採用された回答

2 件のコメント なしを表示 なしを非表示

その他の回答 (0 件)

カテゴリ

製品

タグ

参考

Community Treasure Hunt

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

2 件のコメント
なしを表示なしを非表示