Matrix multiply slices of 3D matrices

Given two 3D matrices, A and B, with
size(A) = (n, m, k)
and
size(B) = (m, p, k)
perform matrix multiplications on each slice obtained by fixing the last index, yielding a matrix C with
size(C) = (n, p, k).
To clarify, we would have
C(:, :, 1) = A(:, :, 1)*B(:, :, 1), ..., C(:, :, k) = A(:, :, k)*B(:, :, k).
I need to do this with gpuArrays in the most efficient manner possible.

Accepted Answer

Jill Reese on 9 Sep 2013

1 vote

If you have MATLAB R2013b, you can use the new gpuArray pagefun function like so:
C = pagefun(@mtimes, A, B);
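
As a quick sanity check, the pagefun call can be compared against a plain loop. This is only a minimal sketch: it assumes Parallel Computing Toolbox, a supported GPU, and R2013b or later, and the sizes and the names Ah, Bh, Cref, and maxErr are made up for illustration.

```matlab
% Assumption: Parallel Computing Toolbox and a supported GPU are available.
n = 3; m = 4; p = 2; k = 5;           % small illustrative sizes
A = gpuArray(rand(n, m, k));
B = gpuArray(rand(m, p, k));

% One call multiplies every page: C(:,:,i) = A(:,:,i)*B(:,:,i)
C = pagefun(@mtimes, A, B);

% Reference result from a page-by-page loop on the CPU
Ah = gather(A);
Bh = gather(B);
Cref = zeros(n, p, k);
for ii = 1:k
    Cref(:, :, ii) = Ah(:, :, ii) * Bh(:, :, ii);
end

% Largest absolute difference between the two results
maxErr = max(abs(gather(C(:)) - Cref(:)));
```

maxErr should be at the level of floating-point round-off, since the GPU and CPU may accumulate the sums in a different order.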

1 Comment

Dan Ryan on 9 Sep 2013
Brilliant! Thanks!


More Answers (2)

James Tursa on 14 Feb 2013
Edited: James Tursa on 14 Feb 2013

2 votes

If you are not restricted to gpuArrays you can do this:
C = mtimesx(A,B);
The MTIMESX function passes pointers to the slice data to BLAS library functions in the background, so it is pretty fast. You can find MTIMESX here:
MTIMESX is not yet multi-threaded across the third dimension (but an update is in the works). An nD matrix multiply called MMX, which is multi-threaded across the third dimension, can also be used:
C = MMX('mult', A, B);
MMX can be found here:
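
For readers on a much newer release, a built-in alternative for ordinary (CPU) arrays may be worth trying. The sketch below assumes R2020b or later, where pagemtimes is available; it is not part of MTIMESX or MMX, and the sizes and the names Cref and maxErr are made up for illustration.

```matlab
% Assumption: MATLAB R2020b or later (pagemtimes is not available before that).
n = 3; m = 4; p = 2; k = 5;           % small illustrative sizes
A = rand(n, m, k);
B = rand(m, p, k);

% Page-wise matrix product: C(:,:,i) = A(:,:,i)*B(:,:,i)
C = pagemtimes(A, B);

% Reference result from a page-by-page loop
Cref = zeros(n, p, k);
for ii = 1:k
    Cref(:, :, ii) = A(:, :, ii) * B(:, :, ii);
end
maxErr = max(abs(C(:) - Cref(:)));
```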

1 Comment

Dan Ryan on 14 Feb 2013
great suggestion, I will keep an eye on this project


Azzi Abdelmalek on 5 Feb 2013
Edited: Azzi Abdelmalek on 5 Feb 2013

0 votes

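n = 3;
m = 4;
k = 5;
p = 2;
A = rand(n, m, k);
B = rand(m, p, k);
C = zeros(n, p, k);   % preallocate the result
for ii = 1:k
    % multiply the ii-th page of A by the ii-th page of B
    C(:, :, ii) = A(:, :, ii) * B(:, :, ii);
end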

7 Comments

Sean de Wolski on 5 Feb 2013
Could also use a parfor loop since each iteration is independent of others.
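
A minimal sketch of the parfor variant, assuming ordinary CPU arrays and the same illustrative sizes as in the answer above. With Parallel Computing Toolbox and an open pool the iterations are distributed across workers; without it, parfor simply runs as a serial loop. Whether this helps depends on the page sizes, since small pages may be dominated by communication overhead.

```matlab
n = 3; m = 4; k = 5; p = 2;           % illustrative sizes
A = rand(n, m, k);
B = rand(m, p, k);
C = zeros(n, p, k);                   % preallocate the sliced output

% Each iteration reads one page of A and B and writes one page of C,
% so the iterations are independent of each other.
parfor ii = 1:k
    C(:, :, ii) = A(:, :, ii) * B(:, :, ii);
end
```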
Dan Ryan on 5 Feb 2013
I was hoping to see some fully parallel version that could be implemented in one shot on the gpu. This would mean no looping constructs.
Azzi Abdelmalek on 5 Feb 2013
I don't see how you could do it without a for loop.
Sean de Wolski on 5 Feb 2013
What's wrong with a loop?
Dan Ryan on 5 Feb 2013
Edited: Dan Ryan on 5 Feb 2013
Big slowdown... for example:
A = gpuArray.rand(1000, 100, 100, 'single');
B = gpuArray.rand(1000, 100, 'single');
C = gpuArray.zeros(1000, 100, 100, 'single');
Compare
for idx = 1:100
C(:, :, idx) = A(:, :, idx).*B;
end
with
C = bsxfun(@times, A, B);
There is about a factor of 50 slowdown with the for loop.
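
One way to reproduce this comparison is sketched below. It assumes Parallel Computing Toolbox and a supported GPU; the names dev, C1, C2, tLoop, and tBsx are made up for illustration. GPU calls can return before the work has finished, so the sketch calls wait on the device before reading the timer. The measured ratio will vary with hardware and release, and the first GPU call in a session may include one-time start-up cost.

```matlab
% Assumption: Parallel Computing Toolbox and a supported GPU are available.
dev = gpuDevice;
A = gpuArray(rand(1000, 100, 100, 'single'));
B = gpuArray(rand(1000, 100, 'single'));

% Version 1: element-wise multiply, one page per loop iteration
C1 = gpuArray(zeros(1000, 100, 100, 'single'));
wait(dev); tic;
for idx = 1:100
    C1(:, :, idx) = A(:, :, idx) .* B;
end
wait(dev); tLoop = toc;

% Version 2: a single call, B is expanded along the third dimension
wait(dev); tic;
C2 = bsxfun(@times, A, B);
wait(dev); tBsx = toc;

fprintf('loop: %.4f s, bsxfun: %.4f s, ratio: %.1f\n', tLoop, tBsx, tLoop / tBsx);
```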
Azzi Abdelmalek on 5 Feb 2013
Edited: Azzi Abdelmalek on 5 Feb 2013
bsxfun doesn't work with
n=3;
m=4;
k=5;
p=2;
A=rand(n,m,k)
B=rand(m,p,k)
C = bsxfun(@times, A, B);
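
The call above fails because bsxfun(@times, ...) is an element-wise operation, and the sizes 3x4x5 and 4x2x5 do not match in the first two dimensions. A loop-free page-wise matrix product can still be built from bsxfun by reshaping so that the shared dimension m lines up and then summing over it. This is only a sketch with made-up names (A4, B4, P, Cref); it creates an n-by-m-by-p-by-k temporary, so it may be impractical for large problems.

```matlab
n = 3; m = 4; k = 5; p = 2;
A = rand(n, m, k);
B = rand(m, p, k);

A4 = reshape(A, [n m 1 k]);           % n x m x 1 x k
B4 = reshape(B, [1 m p k]);           % 1 x m x p x k
P  = bsxfun(@times, A4, B4);          % n x m x p x k, P(i,j,l,s) = A(i,j,s)*B(j,l,s)
C  = reshape(sum(P, 2), [n p k]);     % sum over the shared index j

% Check against the page-by-page loop
Cref = zeros(n, p, k);
for ii = 1:k
    Cref(:, :, ii) = A(:, :, ii) * B(:, :, ii);
end
maxErr = max(abs(C(:) - Cref(:)));
```

The same reshape-and-sum idea should carry over to gpuArray inputs, at the cost of the larger temporary array.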
Jill Reese on 14 Feb 2013
Dan,
Can you elaborate on the sizes of m, n, k, and p that you are interested in? It would be useful to know a ballpark number for the size of the problem you want to solve. Do you have many small pages, a few large pages, or something else?
Thanks,
Jill
