Correlation between two differently formatted datasets

Melissa on 7 Mar 2015
Answered: Chad Greene on 9 Mar 2015
Hello,
I want to calculate the R^2 correlation between two different datasets.
The first one, A, is 192x288 (lat,lon), and I can visualize the values on a 2D colormap.
The second one, B, is 555x2 (lat,lon). This data came from an Excel file, in column format. The points are randomly spread throughout the globe and do not lie on the same grid cells as A. The data is far too sparse to interpolate.
I am having trouble figuring out how I can possibly find a correlation between these two different data formats. Is there a way to convert B into a map that I can visualize with a colormap like A? Also, how would the resolution affect this calculation?
Any help would be highly appreciated
Thank you,
Melissa

Accepted Answer

Chad Greene on 9 Mar 2015
Melissa,
Without knowing anything about your project, my gut feeling is that it does not seem prudent to grid your B dataset, because you'll end up interpolating over long, long distances between data points. I suppose you could use TriScatteredInterp or gridfit, but you'd probably want to then mask out any grid boxes that are far away from the B data points.
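If you do want to try gridding anyway, the following is a rough sketch of the grid-then-mask idea, not a recommendation. It swaps in scatteredInterpolant, which newer MATLAB releases provide in place of TriScatteredInterp. The point data, the 20 degree threshold `maxDist`, and the plain degree-space distance (which ignores great-circle geometry and the wrap at +/-180 longitude) are all assumptions made for illustration.

```matlab
% Made-up scattered measurements (illustrative values only)
rng(1)                                   % fixed seed so the sketch is repeatable
latB = 180*(rand(30,1)-.5);
lonB = 360*(rand(30,1)-.5);
B    = .1*latB + rand(size(latB));

% Target grid to interpolate onto
[lonG,latG] = meshgrid(-180:2:180,-90:1:90);

% Linear interpolation; 'none' leaves NaN outside the convex hull of the points
F  = scatteredInterpolant(lonB,latB,B,'linear','none');
Bg = F(lonG,latG);

% Distance (in degrees, a crude assumption) from each grid cell to its nearest data point
d = inf(size(lonG));
for k = 1:numel(latB)
    d = min(d, hypot(lonG-lonB(k), latG-latB(k)));
end

% Mask out grid cells that are far from every measurement
maxDist = 20;                            % assumed threshold, in degrees
Bg(d > maxDist) = NaN;
```

With data as sparse as yours, most of `Bg` will likely end up masked, which is part of why gridding B is not very attractive here.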
You can, however, get a correlation between these data sets. I'm going to make up a gridded dataset A and a point dataset B:
% Some gridded dataset A:
[lonA,latA] = meshgrid(-180:2:180,90:-1:-90);
A = peaks(181)+.1*latA;
% Some measurements B at specific points:
latB = 180*(rand(30,1)-.5);
lonB = 360*(rand(30,1)-.5);
B = .1*latB+rand(size(latB));
% Plot the points atop the gridded dataset:
pcolor(lonA,latA,A)
hold on
plot(lonB,latB,'rp','markersize',15)
shading interp
xlabel('longitude')
ylabel('latitude')
Then get A values at points B by interpolating the A dataset:
A_interp = interp2(lonA,latA,A,lonB,latB);
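One caveat on this step: interp2 returns NaN for query points that fall outside the grid, and NaNs in the gridded data (a land or ocean mask, say) propagate into the interpolated values. A minimal sketch of dropping those pairs before correlating, using made-up values and a hypothetical gap south of 60S:

```matlab
% Made-up grid with a gap (NaN) south of 60S
[lonA,latA] = meshgrid(-180:2:180,-90:1:90);
A = .1*latA;
A(latA < -60) = NaN;

% Made-up point data; the last point sits inside the gap
latB = [45; 10; -30; -80];
lonB = [-120; 0; 60; 150];
B    = [4.6; 1.2; -2.7; -8.1];

% Linear interpolation (the interp2 default) of A at the B locations
A_interp = interp2(lonA,latA,A,lonB,latB);

% Keep only the pairs where both values are valid
ok = ~isnan(A_interp) & ~isnan(B);
R  = corrcoef(A_interp(ok),B(ok));
```

Here `ok` is false for the last point only, so the correlation is computed from the remaining three pairs.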
You can then use corrcoef to get a correlation coefficient, which for this fake data is 0.89:
R = corrcoef([A_interp B])
R =
1.0000 0.8936
0.8936 1.0000
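Since the question asked for R^2: the off-diagonal element of that matrix is r, so R^2 is just its square. The second output of corrcoef gives p-values for testing the hypothesis of no correlation. A small self-contained sketch, where `x` and `y` are made-up stand-ins for A_interp and B:

```matlab
rng(0)                          % fixed seed so the sketch is repeatable
x = (1:30)';                    % stand-in for A_interp
y = 0.5*x + randn(30,1);        % stand-in for B

[R,P] = corrcoef(x,y);          % R: correlation matrix, P: matching p-values
r  = R(1,2);                    % correlation coefficient between x and y
R2 = r^2;                       % coefficient of determination for a straight-line fit
p  = P(1,2);                    % p-value for the no-correlation hypothesis

fprintf('r = %.3f, R^2 = %.3f, p = %.3g\n', r, R2, p)
```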
But note that the correlation coefficient does not depend on the data means or scaling, so it tells you nothing about the slope or offset of the relationship. Below I'm going to use polyplot to plot the linear regression:
plot(A_interp,B,'b*')
hold on
polyplot(A_interp,B,'k-')
axis tight; box off
xlabel('dataset A')
ylabel('dataset B')
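polyplot is not a built-in MATLAB function (as far as I know it is a File Exchange submission), so if you don't have it, here is a sketch of the same kind of plot using only the built-in polyfit and polyval. The data are made up, and `x` and `y` stand in for A_interp and B:

```matlab
rng(0)                                   % fixed seed so the sketch is repeatable
x = 10*rand(30,1);                       % stand-in for A_interp
y = 0.8*x + 1 + 0.5*randn(30,1);         % stand-in for B

% First-order least-squares fit: p(1) is the slope, p(2) the intercept
p = polyfit(x,y,1);

% Evaluate the fitted line across the range of x
xfit = linspace(min(x),max(x),100);
yfit = polyval(p,xfit);

plot(x,y,'b*')
hold on
plot(xfit,yfit,'k-')
axis tight; box off
xlabel('dataset A')
ylabel('dataset B')
```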
