Vec-trick implementation (multiple times)

2 ビュー (過去 30 日間)

ConvexHull 2021 年 8 月 21 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/1438154-vec-trick-implementation-multiple-times

編集済み: Bruno Luong 2023 年 9 月 14 日

Dear all,

the question is related to Tensorproduct. Since the question was not answered as intended, i want to revisit the question.

Introduction:

Suppose you have a matrix vector multiplication, where a matrix C with size (np x mq) is constructed by a Kronecker product of matrices A with size (n x m) and B with size (p x q). The vector is denoted v with size (mp x 1) or its vectorized version X with size (m x p).

In two dimensions this operation can be performed with O(npq+qnm) operations instead of O(mqnp) operations, see Wikipedia.

Expensive variant (in case of flops):

Cheap variant (in case of flops):

Main question:

I want to perform many of these operations at ones, e.g. 2500000. Example: n=m=p=q=7 with A=size(7x7), B=size(7x7), v=size(49x2500000).

In Tensorproduct i have implemented a MeX-C version of the cheap variant which is quite slower than a Matlab version of the expensive variant provided by Bruno Luong.

Is it possible to implement the cheap version in Matlab without looping?

5 件のコメント
3 件の古いコメントを表示3 件の古いコメントを非表示

Bruno Luong 2021 年 8 月 23 日

Because smaller flops doesn't mean necessary faster. Memory access, cache, thread management are as well important, and which is fatest method probably depends on n=m=p=q.

ConvexHull 2021 年 8 月 23 日

編集済み: ConvexHull 2021 年 8 月 23 日

Yeah that's definitly the case here.

The main problem is that, if you want to perform the Vec-trick multiple times in a vectorized fashion you have to reorder the datastructure. After applying AX you cannot perform a Matrix-Matrix multiplication directly with B.

Stupid Memory access O.o!

サインインしてコメントする。

サインインしてこの質問に回答する。

採用された回答

ConvexHull 2021 年 8 月 24 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/1438154-vec-trick-implementation-multiple-times#answer_773279

編集済み: ConvexHull 2021 年 8 月 25 日

MATLAB Online で開く

Here is a pure intrinsic Matlab version without loops, however with two transpose operations and quite slow.

n=7;m=7;p=7;q=7;
A = rand(n,m);
B = rand(p,q);
v = rand(m*p,500000,5);
n = 5;
C = kron(B,A);
tic
for i=1:n
    v1 = reshape(C*reshape(v,49,[]),size(v));
end
toc % Elapsed time is 0.456353 seconds
tic
for i=1:n
    v2 = reshape(reshape(B*reshape((A*reshape(v,7,[])).',7,[]),7*2500000,[]).',7,[]);
end
toc % Elapsed time is 3.879752 seconds
max(abs(v1(:)-v2(:))) 
% 1.4211e-14

22 件のコメント
20 件の古いコメントを表示20 件の古いコメントを非表示

ConvexHull 2021 年 8 月 24 日

編集済み: ConvexHull 2021 年 8 月 24 日

MATLAB Online で開く

I don't know what you mean.

The ().' is far the most expensive operation no matter what is being done in the background.
Reshape is for free.
The small 7er matrix-matrix multiplication is cheaper than the 49er big one.
By the way ()' or ().' are nearly same expensive.

n=7;m=7;p=7;q=7;
A = rand(n,m);
B = rand(p,q);
v = rand(m*p,500000,5);
n = 5;
tic
for i=1:n
    vv = reshape(v,7,[]); %#ok<*NASGU>
end
toc % Elapsed time is 0.000186 seconds
tic
for i=1:n
    vvv = A*vv;
end
toc % Elapsed time is 0.350487 seconds
tic
for i=1:n
    vvvv = (vvv).';
end
toc % Elapsed time is 1.682334 seconds
tic
for i=1:n
    vvvvv = reshape(vvvv,7,[]);
end
toc % Elapsed time is 0.000181 seconds
tic
for i=1:n
    vvvvvvv = B*vvvvv;
end
toc % Elapsed time is 0.347840 seconds
tic
for i=1:n
    vvvvvvvv = reshape(vvvvvvv,7*2500000,[]);
end
toc % Elapsed time is 0.000174 seconds
tic
for i=1:n
    vvvvvvvvv = (vvvvvvvv).';
end
toc % Elapsed time is 1.470868 seconds
tic
for i=1:n
    vvvvvvvvvv = reshape(vvvvvvvvv,7,[]);
end
toc % Elapsed time is 0.000148 seconds

Bruno Luong 2021 年 8 月 26 日

MATLAB Online で開く

Add benchmark with mtimesx

Conclusion

For version before R2020b, use expensive method for s < 44, use mtimesx otherwise;
For version R2020b or later, use expensive method for s < 27, use pagemtimes otherwise.

stab = 5:5:100;
t1 = zeros(size(stab));
t2 = zeros(size(stab));
t3 = zeros(size(stab));
t4 = zeros(size(stab));
for i = 1:length(stab)
    fprintf('%d/%d\n', i, length(stab));
    s = stab(i);
    n=s;
    m=s;
    p=s;
    q=s;
    
    A = rand(n,m);
    B = rand(p,q);
    v = rand(m*p,100000);
    
    tic
    C = kron(B,A);
    v1 = reshape(C*reshape(v,s*s,[]),size(v));
    t1(i) = toc;
    
    tic
    v2 = reshape(reshape(B*reshape((A*reshape(v,s,[])).',s,[]),[],s).',s,[]);
    t2(i) = toc;
    
    tic
    X = reshape(v, size(A,2), size(B,1), []);
    v3 = pagemtimes(pagemtimes(A, X), 'none', B, 'transpose');
    t3(i) = toc;
    
    tic
    X = reshape(v, size(A,2), size(B,1), []);
    v4 = mtimesx(mtimesx(A, X), 'N', B, 'T');
    t4(i) = toc;
end
close all
semilogy(stab, [t1; t2; t3; t4]');
legend('Expensive method', ...
    'Cheap method using transposition', ...
    'Cheap method using pagemtimes', ...
    'Cheap method using mtimesx');
xlabel('s');
ylabel('time [sec]');
grid on;

Stefano Cipolla 2023 年 9 月 14 日

編集済み: Stefano Cipolla 2023 年 9 月 14 日

Hi there! May I ask if you are aware of implementation of functions similar to "pagemtimes" but able to work with at least one sparse input? Alternatively do you see any easy workaround? More precisely I need someting like

pagemtimes(A, V)

where A is a nxnxn sparse real tensor and V is a real dense nxn matrix...

Bruno Luong 2023 年 9 月 14 日

編集済み: Bruno Luong 2023 年 9 月 14 日

MATLAB Online で開く

@Stefano Cipolla "sparse real tensor"

I'm not aware this native MATLAB class.

But you can put the A as diagonal block of n^2 x n^2 sparse matrix

SA = [A(:,:,1) 0         0 ... 0
      0       A(:,:,2)  0 ... 0
      ...
      9=0      0 ...           A(:,:,n)]
    

Do the same expansion for V (with the same block) then solve it

サインインしてコメントする。

その他の回答 (0 件)

サインインしてこの質問に回答する。

カテゴリ

MATLAB Language Fundamentals Data Types Data Type Identification

Help Center および File Exchange で Data Type Identification についてさらに検索

製品

MATLAB

リリース

R2018a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by

Vec-trick implementation (multiple times)

5 件のコメント
3 件の古いコメントを表示3 件の古いコメントを非表示

採用された回答

22 件のコメント
20 件の古いコメントを表示20 件の古いコメントを非表示

その他の回答 (0 件)

参考

カテゴリ

タグ

製品

リリース

Community Treasure Hunt

Vec-trick implementation (multiple times)

5 件のコメント 3 件の古いコメントを表示3 件の古いコメントを非表示

採用された回答

22 件のコメント 20 件の古いコメントを表示20 件の古いコメントを非表示

その他の回答 (0 件)

参考

カテゴリ

タグ

製品

リリース

Community Treasure Hunt

5 件のコメント
3 件の古いコメントを表示3 件の古いコメントを非表示

22 件のコメント
20 件の古いコメントを表示20 件の古いコメントを非表示