Sorry for the long delay since you posted this question. Hopefully a few points of clarification and a bit of code will still help.
1) If you think about this problem in the spatial domain as a convolution, the size of the resulting correlation matrix does not depend on any chosen "zero padding". By definition, the full convolution of a 2-D matrix A of size [Ma,Na] with a matrix B of size [Mb,Nb] produces an output matrix C of size [Ma+Mb-1,Na+Nb-1], and the full cross-correlation has the same size.
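As a quick sanity check of that size rule, here is a minimal sketch. It assumes the Image Processing Toolbox is installed (for normxcorr2); conv2 is in base MATLAB. The variable names B and expectedSize are my own, chosen just for illustration.

```matlab
% Check that full convolution / correlation output is [Ma+Mb-1, Na+Nb-1].
A = magic(5);                          % 5-by-5 "image"
B = A(2:4,3:5);                        % 3-by-3 template (must not be constant for normxcorr2)
C = conv2(A,B);                        % full 2-D convolution (base MATLAB)
c = normxcorr2(B,A);                   % normalized cross-correlation (Image Processing Toolbox)
expectedSize = size(A) + size(B) - 1;  % [5 5] + [3 3] - 1
disp(isequal(size(C), expectedSize))   % size of the convolution matches the rule
disp(isequal(size(c), expectedSize))   % size of the correlation matrix matches too
```

Both checks should display 1 (true), regardless of any padding you had in mind.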
2) The easiest way to see that the offset computation is correct is to set up a small problem: slice the template from a known location in the original image and check whether you recover that location accurately. To be clear, "offset" here means the translation that must be applied to the template so that its content lines up with the image, starting from a position in which the first row and first column (the upper-left corner) of both images coincide.
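A = magic(5);                             % small test image
template = A(2:4,3:5);                    % template cut from rows 2:4, columns 3:5
c = normxcorr2(template,A);               % full normalized cross-correlation
[ypeak, xpeak] = find(c==max(c(:)))       % peak = bottom-right corner of the match in c
yoffSet = ypeak-size(template,1)          % row offset of the template's top-left corner
xoffSet = xpeak-size(template,2)          % column offset of the template's top-left corner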
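If you run this example, you would expect xoffSet to be 2 and yoffSet to be 1, because the template starts at row 2, column 3 of A, which is 1 row down and 2 columns to the right of the upper-left corner. That's what you get. If you change the definition of template to be a different subset of A, you will again see that the recovered offset is correct.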
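If you want to try several template positions at once, something like the following sketch should do it. This is my own variation rather than anything from the documentation: it uses a random 20-by-20 image instead of magic(5) so that the peak is very unlikely to be ambiguous, and the names r0, c0, tsz and allOk are made up for illustration. It also uses max with ind2sub instead of find so that exactly one peak is returned.

```matlab
% Recover the offset for templates cut from several known locations.
rng(0);                                   % fixed seed so the run is repeatable
A = rand(20);                             % random 20-by-20 test image (assumption)
tsz = [5 5];                              % template size (assumption)
allOk = true;
for r0 = [1 4 16]                         % top row of the template in A
    for c0 = [1 7 16]                     % left column of the template in A
        template = A(r0:r0+tsz(1)-1, c0:c0+tsz(2)-1);
        c = normxcorr2(template, A);      % Image Processing Toolbox
        [~, idx] = max(c(:));             % single best match
        [ypeak, xpeak] = ind2sub(size(c), idx);
        yoffSet = ypeak - size(template,1);
        xoffSet = xpeak - size(template,2);
        % Expected offset is the top-left corner position minus 1.
        ok = (yoffSet == r0-1) && (xoffSet == c0-1);
        allOk = allOk && ok;
        fprintf('cut at (%d,%d): offset (%d,%d), match = %d\n', ...
                r0, c0, yoffSet, xoffSet, ok);
    end
end
```

Each line should report an offset equal to the cut position minus one in each dimension, which is the same relationship as in the magic(5) example.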
3) I'm not clear on how R2011b factors into this. However, the definition of the normalized cross-correlation matrix and the way you compute offsets have not changed from release to release. The current example is the correct way to recover the offset. I can't easily run R2011b at the moment, so hopefully this will be enough.