Expanding Sample Covariance Matrix

Lemar DeSalis on 21 Aug 2011
Hello!
I need to calculate the mean vector and the covariance matrix for sampled data. For example, I have a matrix with NumFeatures columns and NumSamples rows, so I can simply use "mean(MyMatrix)" and "cov(MyMatrix)".
However, what should I do if I want to update the covariance matrix obtained this way?
That is, I have a covariance matrix calculated from the old samples; how can I add the influence of new samples?
Is there an easy MATLAB way to do that?
Thanks in advance!
  1 Comment
Oleg Komarov on 21 Aug 2011
The terminology you're using is not clear. Could you give an example?
For reference: http://www.mathworks.com/matlabcentral/answers/6200-tutorial-how-to-ask-a-question-on-answers-and-get-a-fast-answer


Answers (2)

Lemar DeSalis on 22 Aug 2011
% MyMatrix is a Matrix containing samples, in this case random data:
MyMatrix = rand( [NumSamples NumFeatures] );
% I need the mean vector and the covariance matrix:
MyMean = mean(MyMatrix);
MyCov = cov(MyMatrix);
% Now I got some new data:
MyLargerMatrix = vertcat(MyMatrix, SomeNewData);
% Calculate new values:
MyMean_New1 = mean(MyLargerMatrix)
MyCov_New1 = cov(MyLargerMatrix);
%%%%HERE IS MY QUESTION:
% But what to do when the old data is not available anymore?
clear MyLargerMatrix MyMatrix
MyCov_New2 = ... ?
% How to update the covariance matrix if you only have the old
% mean vector "MyMean", the old covariance matrix "MyCov", the number
% of old samples "NumSamples" and the new samples "SomeNewData"?
%
% MyCov_New2 should be identical to MyCov_New1, but MyCov_New2
% should be computed WITHOUT access to the old data.
% For the mean vector, this is easily possible, but how to do so for the covariance matrix?

Oleg Komarov on 22 Aug 2011
% Example inputs
A = rand(100,2);
B = randn(20,2);
C = [A;B];
% Sample covariances (normalized by N-1)
c1 = cov(A);
c2 = cov(B);
c3 = cov(C);
% Means
m1 = mean(A);
m2 = mean(B);
m3 = mean(C);
% Number of samples
nA = size(A,1);
nB = size(B,1);
nC = nA + nB;
% The question is: how to get c3 having only c1, c2, m1, m2, nA and nB?
% Keep in mind that:
  • cov(x,y) = E(xy) - E(x)E(y)
  • m3 = (m1*nA + m2*nB)/nC
  • E(xy) can be pooled in the same way
  • cov is the sample covariance, thus we have to adjust for N-1
  • the following formula is valid only for the covariance (the off-diagonal elements), not for the variances on the diagonal
ExEy12 = prod((m1*nA + m2*nB)/nC);
adj = nC/(nC-1);
(c1*(nA-1) + c2*(nB-1) + prod(m1)*nA + prod(m2)*nB)/nC*adj - ExEy12 * adj
c3
How to derive the variances (the diagonal elements) is up to you, but you really just need paper and pencil.
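
For the full matrix, diagonal included, one common approach is to pool the two scatter matrices, (n-1)*cov, and add a correction term for the difference between the two means. The sketch below is not part of the original answer. It assumes the means are row vectors as returned by mean(X,1) and that each batch has at least two rows; the names d, S, maxErrCov and maxErrMean are illustrative only.

```matlab
% Sketch: merge two sample covariance matrices without the raw old data.
rng(0);                      % fixed seed so the check is reproducible
A = rand(100,3);             % "old" samples (kept here only to verify)
B = randn(20,3);             % "new" samples

% Summary statistics retained from the old data
nA = size(A,1);  m1 = mean(A,1);  c1 = cov(A);
% Statistics of the new batch
nB = size(B,1);  m2 = mean(B,1);  c2 = cov(B);

nC = nA + nB;
m3 = (nA*m1 + nB*m2) / nC;   % updated mean vector

% Pool the scatter matrices and add the between-batch term
d  = m1 - m2;                                   % 1-by-p difference of means
S  = (nA-1)*c1 + (nB-1)*c2 + (nA*nB/nC) * (d' * d);
c3 = S / (nC-1);                                % updated sample covariance

% Compare against the direct computation on the stacked data
C = [A; B];
maxErrCov  = max(max(abs(c3 - cov(C))));
maxErrMean = max(abs(m3 - mean(C,1)));
```

Only nA, m1 and c1 need to be stored between updates; A appears here just for the comparison. If the new batch is a single row, cov(B) treats it as a vector and returns a scalar, so c2 would have to be set to a zero matrix by hand in that case.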
  1 Comment
Lemar DeSalis on 23 Aug 2011
Thanks, I was able to find a solution based on your code!
