Fast implementation of max-plus matrix multiplication

Question

Davide Zorzenon 2021 年 4 月 13 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/800691-fast-implementation-of-max-plus-matrix-multiplication

コメント済み: Seth Younger 2021 年 9 月 14 日

I am trying to implement a fast max-plus algebra multiplication of square matrices in MATLAB. The max-plus multiplication ⊗ between two matrices

returns the matrix

such that

.

A naive implementation of the max-plus multiplication in MATLAB is:

function C = mp_prod(A,B)
n = size(A,1);
C = -inf*ones(n);
for i = 1:n
    for j = 1:n
        for k = 1:n
            C(i,j) = max(C(i,j), A(i,k) + B(k,j));
        end
    end
end

However, this is not particularly fast. The fastest implementation I could come up with is:

function C = fast_mp_prod(A,B)
n = size(A,1);
C = -inf*ones(n);
A = transpose(A);
for i = 1:n
    C(i,:) = max(A(:,i) + B);
end

which takes about 0.18 seconds to multiply two 100×100 matrices 100 times on my computer. Since performing the same test using a "standard" multiplication takes about 0.004 seconds (45 times faster), I was wondering if there were a way to speed up the code and obtain more comparable timings using MATLAB.

If this were not possible, would it be worth to try to create a package similar to LAPACK for max-plus algebra to get faster performances? Or is this the maximum achievable speed for reasons like "the CPU is inherently slower to calculate the maximum than the product of two numbers"?

4 件のコメント
2 件の古いコメントを表示2 件の古いコメントを非表示

Jan 2021 年 4 月 13 日

The naive max() implementation contains a branch. This slows down the processing in general, because it impedes the pipelining in the CPU, which decides for one of the branches using heuristics. In case of max() about half of the predictions are false.

The matrix multiplication is performed in highly optimized library. Even the naive implementation in Matlab profits from Matlab's JIT acceleration, which does not handle max() with the same efficiency.

Do you have a C compiler installed?

Davide Zorzenon 2021 年 4 月 13 日

Thank you for the explanation! Yes, I have it.

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Jan 2021 年 4 月 13 日

3
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/800691-fast-implementation-of-max-plus-matrix-multiplication#answer_674391

編集済み: Jan 2021 年 4 月 15 日

MATLAB Online で開く

A C-mex version:

[EDITED: Check of type double is added]

// mp_prod_mex.c
// C = mp_prod_mex(A, B)
// INPUT:  A, B: Real double [m x k] and [k x n] matrices.
// OUTPUT: Real double [m x n] matrix.
//
// Equivalent Matlab code:
//   [m, p] = size(A);  n = size(B, 2);
//   C      = -inf(m, n);
//   for i = 1:m
//      for j = 1:n
//         for k = 1:p
//            C(i,j) = max(C(i,j), A(i,k) + B(k,j));
//         end
//      end
//   end
//
// COMPILE:
//   mex -O -R2018a mp_prod_mex.c
//
// Jan, Heidelberg, (C) 2021, License: CC BY-SA 3.0
// Handling of rectangular matrices: Bruno Luong
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
  double *A, *Ak, *B, *Bj, *Bk, *C, mInf = -mxGetInf();
  register double s, t;
  mwSize m, n, p, i, j, k;
  
  // Get inputs:
  A = mxGetDoubles(prhs[0]);
  B = mxGetDoubles(prhs[1]);
  m = mxGetM(prhs[0]);  // size(A, 1)
  n = mxGetN(prhs[1]);  // size(B, 2)
  p = mxGetN(prhs[0]);  // size(A, 2) and size(B, 1)
          
  if (!mxIsDouble(prhs[0]) || !mxIsDouble(prhs[1]) ||
      mxIsComplex(prhs[0]) || mxIsComplex(prhs[1])) {
     mexErrMsgIdAndTxt("Jan:mp_prod_mex:BadInput", 
             "mp_prod_mex: Inputs must be real double matrices.");
  }
  if (mxGetM(prhs[1]) != p) {
     mexErrMsgIdAndTxt("Jan:mp_prod_mex:BadInput", 
             "mp_prod_mex: Size of A and B do not match.");
  }
  
  // Create output:
  plhs[0] = mxCreateDoubleMatrix(m, n, mxREAL);
  C = mxGetDoubles(plhs[0]);
  
  // Calculate result:
  for (j = 0; j < n; j++) {    // Loop over B at 1st for linear access to C
     Bj = B + j * n;
     for (i = 0; i < m; i++) {
        Ak = A + i;
        Bk = Bj;
        
        t  = mInf;
        for (k = 0; k < p; k++) {
           s   = *Ak + *Bk++;
           Ak += m;
           if (s > t) {
              t = s;
           }
        }
        *C++ = t;
     }
  }
  
  return;
}

Timings, 100x100 input, 100 iterations:

% Matlab R2018b, i5 mobile:
Elapsed time is 0.365667 seconds.   % 1st version from question
Elapsed time is 0.253912 seconds.   % 2nd version from question
Elapsed time is 0.337319 seconds.   % Bruno's version
Elapsed time is 0.077329 seconds.   % C mex

Transposing A for a contiguous access was no significant advantage.

12 件のコメント
10 件の古いコメントを表示10 件の古いコメントを非表示

Jan 2021 年 4 月 14 日

編集済み: Jan 2021 年 4 月 15 日

MATLAB Online で開く

Are you sure, that -Inf is replied even for empty input matrices? Neither -inf(0,n) nor -inf(m, 0) would contain any -inf value, but both contain no values. So what is the size of the output for empty inputs? Is this the wanted Matlab code:

[m, p] = size(A);
n      = size(B, 2);
C      = -inf(m, n);
for i = 1:m
   for j = 1:n
      for k = 1:p
         C(i,j) = max(C(i,j), A(i,k) + B(k,j));
      end
   end
end

Then

mp_prod(zeros(2,0), zeros(0,3))
%  -Inf  -Inf  -Inf
%  -Inf  -Inf  -Inf

If so, let me mention that your code throws an error for empty inputs and does not reply Inf's also:

bl_mp_prod(zeros(2,0), zeros(0,3))
% Error using reshape
% To RESHAPE the number of elements must not change.
% Error in kgd>bl_mp_prod (line 82)
% C=reshape(C,[m n])

The C-mex code to accept non-square inputs and to reply Inf's:

[EDITED] Version, which handles rectangular ipnuts, moved to the answer.

Bruno Luong 2021 年 4 月 14 日

MATLAB Online で開く

I'm allow my self to modify Jan code to extend it for non-rectangular matrices

// mp_gprod_mex.c
// C = mp_prod_mex(A, B)
// INPUT:   A: Real double [m x p] matrices.
//          B: Real double [p x n] matrices.
// OUTPUT:  C:    Real double [m x n] matrix.
//
// COMPILE:
//   mex -O -R2018a mp_gprod_mex.c
//
// Jan, Heidelberg, (C) 2021,  License: CC BY-SA 3.0
// Modified by Bruno Luong
#include "mex.h"
#include "math.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
  double *A, *Ak, *B, *Bj, *Bk, *C;
  register double s, t;
  mwSize m, n, p, i, j, k;
  
  // Get inputs:
  A = mxGetDoubles(prhs[0]);
  B = mxGetDoubles(prhs[1]);
  m = mxGetM(prhs[0]);
  p = mxGetN(prhs[0]);
  n = mxGetN(prhs[1]);
  
  // Check validity of inputs:
  if (mxIsComplex(prhs[0]) || mxIsComplex(prhs[1]) ||
      mxGetM(prhs[1]) != p ||
      !mxIsDouble(prhs[0]) || !mxIsDouble(prhs[1]) )  {
     mexErrMsgIdAndTxt("Jan:mp_prod_mex:BadInput", 
             "mp_gprod_mex: Inputs must be real double matrices");
  }
  
  // Create output:
  plhs[0] = mxCreateDoubleMatrix(m, n, mxREAL);
  C = mxGetDoubles(plhs[0]);
  
  // Calculate result:
  for (j = 0; j < n; j++) {
     Bj = B + j * p;
     for (i = 0; i < m; i++) {
        Ak = A + i;
        Bk = Bj;
        
        t  = -INFINITY; // -1/0;
        for (k = 0; k < p; k++) {  // Max of following elements
           s   = *Ak + *Bk++;
           Ak += m;
           if (s > t) {
              t = s;
           }
        }
        *C++ = t;
     }
  }
  
  return;
}

Davide Zorzenon 2021 年 4 月 14 日

編集済み: Davide Zorzenon 2021 年 4 月 14 日

@Bruno Luong: thanks for pointing out the case with matrices of dimensions

and

. I had never thought about that but to be consistent the result of the max-plus multiplication shoud be -inf(m,n) as you wrote.

Seth Younger 2021 年 9 月 14 日

@Jan I would love to connect with you and learn more about how this implementation works. Are you slicing the matrix up? I am confused by the (A,1) and (A,2).

サインインしてコメントする。

Answer 2

Bruno Luong 2021 年 4 月 13 日

2
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/800691-fast-implementation-of-max-plus-matrix-multiplication#answer_673946

編集済み: Bruno Luong 2021 年 4 月 13 日

MATLAB Online で開く

function C = mp_prod(A,B)
m=size(A,1);
n=size(B,2);
AA=reshape(A,m,1,[]);
BB=reshape(B.',1,n,[]);
C=max(AA+BB,[],3);

tic/toc result

>> A=rand(100);
>> B=rand(100);
>> tic; C = mp_prod(A,B); toc
Elapsed time is 0.005206 seconds.

8 件のコメント
6 件の古いコメントを表示6 件の古いコメントを非表示

Jan 2021 年 4 月 14 日

編集済み: Jan 2021 年 4 月 14 日

@Davide Zorzenon: Which machine and Matlab version are you using?

The different timings can mean, that Davide's setup is more efficient for the loop, or Bruno's setup is more efficient for the vectorized solution.

Davide Zorzenon 2021 年 4 月 14 日

@Jan: I use Matlab R2019a and my computer is a Windows 10 x64 intel core i7 (2.20GHz), with 16GB RAM and 6 cores.

サインインしてコメントする。

Fast implementation of max-plus matrix multiplication

4 件のコメント
2 件の古いコメントを表示2 件の古いコメントを非表示

採用された回答

12 件のコメント
10 件の古いコメントを表示10 件の古いコメントを非表示

その他の回答 (1 件)

8 件のコメント
6 件の古いコメントを表示6 件の古いコメントを非表示

参考

カテゴリ

タグ

製品

Community Treasure Hunt

Fast implementation of max-plus matrix multiplication

4 件のコメント 2 件の古いコメントを表示2 件の古いコメントを非表示

採用された回答

12 件のコメント 10 件の古いコメントを表示10 件の古いコメントを非表示

その他の回答 (1 件)

8 件のコメント 6 件の古いコメントを表示6 件の古いコメントを非表示

参考

カテゴリ

タグ

製品

Community Treasure Hunt

4 件のコメント
2 件の古いコメントを表示2 件の古いコメントを非表示

12 件のコメント
10 件の古いコメントを表示10 件の古いコメントを非表示

8 件のコメント
6 件の古いコメントを表示6 件の古いコメントを非表示