Computing the outer product (BLAS 2 operation) raised to the power k using the least number of flops

Explain how (xy^T)^k can be computed using the least number of flops, where x, y in R^n are vectors. Then write the corresponding MATLAB algorithm.
Here is my algorithm:
function M = outer(x, y, k)
n = length(x);           % assumes length(y) == n as well
A = zeros(n, n);
M = zeros(n, n);         % preallocate the result
for j = 1 : n            % loop over columns first (MATLAB is column-major)
    for i = 1 : n
        A(i, j) = x(i) * y(j);
        M(i, j) = A(i, j)^k;
    end
end
So, for example, if I have x = [1 2] and y = [2 4] with k = 2, then xy^T = [2 4; 4 8] and (xy^T)^2 should equal [20 40; 40 80], but my algorithm gives (xy^T)^2 = [4 16; 16 64]. What is wrong with my algorithm? (And I looped over the columns first and computed M(i,j) in the same pass, rather than in a separate loop, to use the least number of flops as the question wants. Is that correct?)
  3 Comments
Pascale Bou Chahine on 19 Sep 2020
Yeah, I should end up with a matrix, i.e. if k = 2, I should have (xy^T)^k = (xy^T)^2 = [20 40; 40 80], but with my algorithm I'm getting the element-wise power, i.e. (xy^T).^2 = [4 16; 16 64].
Bjorn Gustavsson on 19 Sep 2020
If you want the [20 40; 40 80] result, then you should take the matrix power, not the element-wise power. The matrix power is (for exponents of 2 and 3):
M2 = M*M; % k = 2
M3 = M*M*M; % k = 3
You have to modify your algorithm and read up on efficient calculation of matrix powers...
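A minimal repair along those lines, keeping the column-first loops from the question and swapping the elementwise power for the matrix power (the name outer_k and the final A^k step are my additions, sketched here rather than taken from the thread):
function M = outer_k(x, y, k)
% Build the outer product A = x*y', then take the matrix power A^k
% (note: A^k, not the elementwise A.^k).
n = length(x);
A = zeros(n, n);
for j = 1 : n
    for i = 1 : n
        A(i, j) = x(i) * y(j);
    end
end
M = A^k;   % matrix power: A multiplied by itself k-1 times
With x = [1 2], y = [2 4], k = 2 this returns [20 40; 40 80], as expected.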


Answers (2)

Gaurav Garg on 22 Sep 2020
Hi Pascale,
MATLAB supports LAPACK and BLAS functions/operations, which can help you reduce the number of FLOPs in your function(s). To learn more about LAPACK and BLAS, kindly refer to the doc here, and to learn how you can use MATLAB and BLAS operations in your script/function, kindly refer here.

Walter Roberson on 22 Sep 2020
LAPACK and BLAS are designed for computational efficiency, which is not the same thing as reducing the number of FLOPs; the least number of FLOPs is not necessarily the most efficient computation. The textbook algorithm for matrix multiplication can involve a lot of cache misses on the right-hand matrix. It is a lot more efficient in processor time to do block calculations, which are still "pretty" efficient but might have more FLOPs: a block multiplication might require (say) 2*n+m additional additions compared to the theoretical minimum, and yet be 100 times faster due to not having as many cache misses.
A matrix raised to an integer power can have the exponent decomposed into binary. For example, if k = 9, then instead of the 8 matrix multiplications in M*M*M*M*M*M*M*M*M you can take M2 = M*M; M4 = M2*M2; M8 = M4*M4; M9 = M8*M, which is only 4 matrix multiplies. Or M2 = M*M; M3 = M2*M; M6 = M3*M3; M9 = M6*M3, which is also only 4 multiplies.
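As a sketch, here is that square-and-multiply idea written out in MATLAB (the helper name matpow_binary is mine, not from the thread; MATLAB's built-in M^k already computes the matrix power for integer k):
function P = matpow_binary(M, k)
% Matrix power via the binary decomposition of k:
% roughly 2*log2(k) matrix multiplies instead of k-1.
P = eye(size(M, 1));       % accumulator, starts as the identity
while k > 0
    if mod(k, 2) == 1      % current binary digit of k is 1
        P = P * M;         % fold the current power of M into the result
    end
    M = M * M;             % square for the next binary digit
    k = floor(k / 2);
end
For k = 9 this walks through the same M^2, M^4, M^8 powers as the chain above.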
However, for large enough k even this is not necessarily the most efficient. Instead you can do [U,S,V] = svd(M); U * S^k * V', and S is guaranteed to be diagonal, so S^k is just S.^k. (Caution: U*S^k*V' equals M^k when M is symmetric positive semidefinite, which happens to be the case in this question because y is a multiple of x; for a general diagonalizable matrix you would use the eigendecomposition instead: [X,D] = eig(M); X*D^k/X.)
But to know whether that is going to be more efficient than the binary decomposition, you have to know the cost of the SVD.
I have a suspicion that the problem might have been structured in a way that makes the SVD easy. Experimentally, in the tests I did, S was always non-zero only in its (1,1) entry (as you would expect, since xy^T has rank 1)... though the values of the matrices are not at all obvious to me at the moment, so I do not know whether there would be an "inexpensive" SVD that could be done.
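A sketch of the SVD route on the example from the question; this works here only because y = 2*x makes xy^T symmetric positive semidefinite, so U*S^k*V' really is M^k:
x = [1; 2];  y = [2; 4];  k = 2;
M = x * y';                 % [2 4; 4 8], rank 1 and symmetric since y = 2*x
[U, S, V] = svd(M);         % S = diag(10, 0): only the (1,1) entry is nonzero
Mk = U * S^k * V';          % S is diagonal, so S^k costs only n scalar powers
% Mk is [20 40; 40 80], the same as M*M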
  1 Comment
Bjorn Gustavsson on 22 Sep 2020
Well, since y is simply 2*x, I think we can cook up the eigenvectors in 2-D space rather easily (x/norm(x) and one perpendicular to that), and the nonzero eigenvalue I get to be the product norm(x)*norm(y). But since the problem is clearly designed for someone to work out, surely it is a homework task?
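Building on that observation, the rank-1 structure gives an O(n^2) shortcut that avoids matrix-matrix multiplication entirely (this formulation is mine, not part of the original thread): since (xy^T)(xy^T) = x(y^T x)y^T, induction gives (xy^T)^k = (y^T x)^(k-1) xy^T, so the whole power costs one dot product, one scalar power, and one scaled outer product.
function M = outer_power(x, y, k)
% (x*y')^k = (y'*x)^(k-1) * (x*y') for column vectors x, y and integer k >= 1,
% because (x*y')*(x*y') collapses to the scalar y'*x times x*y'.
x = x(:);  y = y(:);           % force column vectors
c = (y' * x)^(k - 1);          % scalar (y^T x)^(k-1): about 2n flops plus one power
M = c * (x * y');              % one scaled outer product: about 2*n^2 flops
For x = [1 2]', y = [2 4]', k = 2 this gives c = 10 and M = [20 40; 40 80], matching the matrix power.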

