Fast multiplication of rows of one matrix and columns of the second matrix

Mateusz on 5 Oct 2011
I would like to compute v(k) = A(k,:)*B(:,k) for every k as fast as possible (without loops). Currently I use diag(A*B), but that has unnecessary overhead in both computation and storage, since the full product A*B is formed and only its diagonal is kept.
  2 Comments
Daniel Shub on 5 Oct 2011
Do you want to do it as fast as possible or without loops? The JIT accelerator means that those two things are not necessarily the same.
Daniel Shub on 5 Oct 2011
How big is A? Is it sparse or distributed or anything funky like that?


Accepted Answer

Teja Muppirala on 5 Oct 2011
The fastest way to do something generally depends on the size and structure of your data.
Don't assume loops are slower. For simple linear algebra, loops are generally very fast. In fact, for large matrices (1000x1000 and larger), I think a loop is probably the fastest way:
v = zeros(1,size(A,1));      % preallocate the result
for k = 1:size(A,1)
    v(k) = A(k,:)*B(:,k);    % row k of A times column k of B
end
For smaller matrices, you are probably better off doing this:
v = sum(A.'.*B, 1);   % column sums; .' does not conjugate a complex A
The best thing to do is just to try things out and see what works best for your data.
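As a rough way of trying things out, the three variants discussed here can be timed side by side. This is only a sketch: the size n and the random test matrices are assumptions, and a single tic/toc measurement is noisy, so timeit (available in newer MATLAB releases) or repeated runs would give more reliable numbers.

```matlab
% Timing sketch for v(k) = A(k,:)*B(:,k).
% n and the random data are assumptions chosen for illustration.
n = 1000;
A = rand(n);
B = rand(n);

% Variant 1: full product, then keep only the diagonal.
tic; v1 = diag(A*B); tDiag = toc;

% Variant 2: elementwise product and column sums.
% For real A with more than one column this equals sum(A'.*B).
tic; v2 = sum(A.'.*B, 1); tSum = toc;

% Variant 3: explicit loop over k.
tic;
v3 = zeros(1, n);
for k = 1:n
    v3(k) = A(k,:)*B(:,k);
end
tLoop = toc;

fprintf('diag(A*B): %.4f s\n', tDiag);
fprintf('sum:       %.4f s\n', tSum);
fprintf('loop:      %.4f s\n', tLoop);
```

Note that v1 is a column vector while v2 and v3 are row vectors, so transpose one of them before comparing results.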
  3 Comments
Teja Muppirala on 5 Oct 2011
Ah. Yeah I forgot the dot. Thanks James
Dr. Seis on 5 Oct 2011
I created two random 10000x10000 matrices and the "for loop" took 2 seconds to compute what "diag" took over 20 seconds to compute. However, and I will have to make a correction to the above, this took only 1 second to execute:
v = sum(A.*B',2);
Note: I added the "dot" so that each element of A is multiplied by the corresponding element of B-transpose before each row is summed. This should give the same result as v = diag(A*B);
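A small worked example may make the equivalence easier to see. The matrices below are made up for illustration, and the sketch writes the transpose as B.' (non-conjugate), which is identical to B' for real data.

```matlab
% Tiny check that the row-sum form matches diag(A*B).
% A and B are arbitrary example values, not data from this thread.
A = [1 2 3; 4 5 6];      % 2x3
B = [1 0; 0 1; 2 2];     % 3x2

vDiag = diag(A*B);       % forms the full 2x2 product first: [7; 17]
vSum  = sum(A.*B.', 2);  % row sums of A .* B-transpose:     [7; 17]

same = isequal(vDiag, vSum);   % true for this integer-valued example
disp(same)
```

For an m-by-n matrix A and an n-by-m matrix B, the row-sum form only touches the m*n products that actually contribute to the diagonal, whereas A*B computes all m*m entries. With floating-point data the two results can differ in the last bits, so a tolerance is safer than isequal in general.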


More Answers (1)

Daniel Shub on 5 Oct 2011
Yair's blog has a nice post on memory issues and array operations:
It is not always obvious what the best solution is.
If your matrices are really big, you might be better off moving them to a graphics card (GPU). If your machine has a lot of cores, the for loop could be replaced by a parfor loop, or even distributed to a cluster. It is silly to worry about slight inefficiencies if you can access 1000+ cores.
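The parfor and GPU suggestions could look roughly like the following. This is a sketch under assumptions: it presumes Parallel Computing Toolbox is installed (and, for the second option, a supported GPU), and the matrix sizes are invented for illustration. Whether either option beats the plain loop depends on how much time is spent moving data to the workers or the device.

```matlab
% Assumed example data: A is m-by-n, B is n-by-m.
A = rand(2000, 500);
B = rand(500, 2000);
m = size(A, 1);

% Option 1: parfor. A is sliced by rows, B by columns, v by elements,
% so each iteration only needs its own row and column.
v = zeros(m, 1);
parfor k = 1:m
    v(k) = A(k,:) * B(:,k);
end

% Option 2: GPU. Copy the data to the device, compute, and copy back.
Ag = gpuArray(A);
Bg = gpuArray(B);
vGpu = gather(sum(Ag .* Bg.', 2));   % gather returns the result to host memory
```

Both v and vGpu are m-by-1 column vectors here, matching the shape returned by diag(A*B).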
