Is there a way to speed up matrix calculations by Parallel Computing Toolbox?

Question

Kyle 2013 年 10 月 2 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/88981-is-there-a-way-to-speed-up-matrix-calculations-by-parallel-computing-toolbox

コメント済み: Kyle 2013 年 10 月 6 日

For example, I have the following matrix calculation:

for j = 1: 1000000
    O(i,i,j) =  M(i,i).* N(i,i).* J(i,i);
end

where i is, say, 200. To speed up the calculation, can I take the advantage of Parallel Computing Toolbox? I have three macbooks, can I connect them together to execute the above codes together in an parallel way? Many thanks

2 件のコメント
なしを表示なしを非表示

Kyle 2013 年 10 月 3 日

MATLAB Online で開く

Thank you for your suggestions, Matt J and Walter Roberson. I think I need to detail my question. I was actually going to convert my codes to a parallel version running on multiple computers so that the the efficiency could be dramatically enhanced.

My codes are about solving a pair of differential equations (the Brusselator model), in other words, 2D gird-simulation in high resolution.

    % Simulate 2D Brusselator model by Euler algorithm
    clear;clc;close all
    a = 5;
    b = 19;
    Dx = 5;
    Dy = 40;
    time_end = 150;    % length of simulation, in sec
    %%dimensions for the full-resolution grid
    [Nx Ny] = deal(60);  % no. of sampling points along each axis
    [Lx Ly] = deal(60);        % square cortex (cm)
    [dx dy] = deal(Lx/Nx, Ly/Ny);  % spatial resolution (cm)
    %%time resolution and time-base
    dt = 1*1e-3;      % Nx = Ny =  60; D2 = 1.0 cm^2   
    Nsteps = round(time_end/dt)
    %%3x3 Laplacian matrix (used in grid convolution calculations)
    Laplacian = [0 1 0; 1 -4 1; 0 1 0];
    %%set up storage vectors and grids
    [U_grid  V_grid]               = deal(0.001*randn(Nx, Ny));
    % save UV_inis U_grid  V_grid
    U0 = a; V0 = b/a;
    % initialize the grids at steady-state values
    [U_grid  V_grid]             = deal(U0 + U_grid, V0 + V_grid);
    % diffusion multipliers (depend on step size)
    Dx = Dx/dx^2;
    Dy = Dy/dx^2;
    %%Simulation
    stride2 = 100;          % iterations per screen update
    time = [0:Nsteps-1]'*dt;  % timebase
    ii = 1;
    for i = 1: Nsteps
        i
        U_grid = U_grid + dt*(a-(b+1)*U_grid + U_grid.^2.*V_grid + ...
            Dx*convolve2(U_grid, Laplacian, 'wrap')); 
        V_grid = V_grid + dt*(b*U_grid - U_grid.^2.*V_grid + ...
            Dy*convolve2(V_grid, Laplacian, 'wrap'));
        if (mod(i, stride2) == 1 || i == Nsteps)        
            mesh(x, y, U_grid);
            drawnow;
            U_save(:,:, ii) = U_grid;
            ii = ii + 1;
        end
    end

In the codes, "U_save" is the final results I want to have. If the "Nx" and "Ny" are set to 300, meaning high resolution, the computing speed will be very slow.

I have Parallel Computing Toolbox and Distributed Computing Toolbox. May I have your suggestions for an efficient calculation? Many thanks

Kyle 2013 年 10 月 4 日

As all experts suggested, parallel computation may not be much helpful for my case. In the other hand, I'm thinking GPU-assisted calculation. Since my computer doesn't have a sound GPU, I need your suggestions that if it worths to get a modern graphic card to boost my simulation? Can GPU help to enhance my grid simulation? Many thanks

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Teja Muppirala 2013 年 10 月 4 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/88981-is-there-a-way-to-speed-up-matrix-calculations-by-parallel-computing-toolbox#answer_98602

MATLAB Online で開く

As others have pointed out, this will be difficult to parallelize for a single run, since you are integrating forward in time, and each successive time step depends on the previous one.

Parallelization will help if you need to run this for many different initial conditions though.

Putting parallelization aside for the moment, you will find that if you write the convolution out explicitly using matrix manipulations, your simulation will run much faster. In other words, replace these two lines:

U_grid = U_grid + dt*(a-(b+1)*U_grid + U_grid.^2.*V_grid + ...
          Dx*convolve2(U_grid, Laplacian, 'wrap'));
V_grid = V_grid + dt*(b*U_grid - U_grid.^2.*V_grid + ...
          Dy*convolve2(V_grid, Laplacian, 'wrap'));

by this instead:

UL = -4*U_grid+U_grid([2:60 1],:)+U_grid([60 1:59],:)+U_grid(:,[2:60 1])+U_grid(:,[60 1:59]);
U_grid = U_grid + dt*(a-(b+1)*U_grid + U_grid.^2.*V_grid + Dx*UL); 
VL = -4*V_grid+V_grid([2:60 1],:)+V_grid([60 1:59],:)+V_grid(:,[2:60 1])+V_grid(:,[60 1:59]);
V_grid = V_grid + dt*(b*U_grid - U_grid.^2.*V_grid + Dy*VL);

and you will see quite a speedup.

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

Walter Roberson 2013 年 10 月 4 日

I had written that into my posting and edited it out again when I went the route of iterating symbolically. Part of the reason I dropped it was that I didn't want to try to work out what the edge effects were, but it looks like they were exactly what my first thought was ;-)

サインインしてコメントする。

Answer 2

Matt J 2013 年 10 月 3 日

1
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/88981-is-there-a-way-to-speed-up-matrix-calculations-by-parallel-computing-toolbox#answer_98531

編集済み: Matt J 2013 年 10 月 3 日

MATLAB Online で開く

It is wasteful to repeatedly compute the same fixed M(i,i).* N(i,i).* J(i,i) for every j. Also, matrix operations are already coded to take advantage of your system's multi-threading capabilities. I would just skip the for-loop and do

O(i,i,1:1000000)=M(i,i).* N(i,i).* J(i,i);

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

Kyle 2013 年 10 月 3 日

Thank you, Matt J. Matrices M, N and J are dynamical actually. I have updated my question for your reference. Thank you

サインインしてコメントする。

Answer 3

Walter Roberson 2013 年 10 月 3 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/88981-is-there-a-way-to-speed-up-matrix-calculations-by-parallel-computing-toolbox#answer_98532

MATLAB Online で開く

for j = 1: 1000000
    O(i,i,j) =  M(i,i).* N(i,i).* J(i,i);
end

can be rewritten as

MNO = M(i,i) .* N(i,i) .* J(i,i);
for j = 1 : 1000000
   O(i,i,j) = MNO;
end

which in turn can be rewritten as

joff = sub2ind( size(O), [i, i, 1] );
jskip = size(O,2) .* size(O,3);
O(joff : jskip : end) = M(i,i) .* N(i,i) .* J(i,i);

It probably is not worthwhile to convert this for use with the parallel toolbox, but if you did then you would calculate M(i,i) .* N(i,i) .* J(i,i) ahead of time, and then use a parfor or distributed array to set every jskip'th element of the O array to be the precalculated value.

In order to hook multiple computers up for use in the parallel computing toolbox, you would also need to use the distributed computing toolbox, as by itself the parallel computing can only be local.

If the above is not your pattern, then we will need to see your real pattern to advise you as to whether the parallel computing will help.

5 件のコメント
3 件の古いコメントを表示3 件の古いコメントを非表示

Walter Roberson 2013 年 10 月 3 日

MATLAB Online で開く

Your U_grid and V_grid are both incrementally updated based upon the previous iteration's result, and thus upon the history of the calculation right back to the initialization. Such calculations cannot be parallelized except in-so-far as you can parallelize the execution of the individual statement,

U_grid = U_grid + dt*(a-(b+1)*U_grid + U_grid.^2.*V_grid + ...
          Dx*convolve2(U_grid, Laplacian, 'wrap'));

and then likewise for V_grid. (You cannot update V_grid in parallel with updating U_grid because the V_grid calculation depends upon the updated U_grid.)

Your U_grid and V_grid are only 60 x 60, so they are too small to benefit from multithreading or multiple processors for those two individual statements.

What would be possible (feasible) would be to run multiple independent simulations, each with a different starting random number seed. If the simulation is plausibly going to get trapped in a local minimum that could be worthwhile, but if the simulation is expected to robustly find a (single) global minimum no matter the starting point, then there would probably be no value in doing that.

Walter Roberson 2013 年 10 月 3 日

MATLAB Online で開く

That is applicable if the order of the partial factors does not make any difference -- if the reduction operation is commutative. It is not immediately clear that it would be the case for the above code.

A 1D convolution can be written as a matrix multiplication involving a topelitz matrix, and I have read that a 2D convolution can be done as two 1D convolutions. That would seem to suggest that a 2D convolution could be done as a single matrix multiplication by the product of the two topelitz matrices, but I have not yet found anything that says that explicitly, and I have not found anything that talks about the numeric stability if that were the case.

If X and Y represent the initial U_grid and V_grid respectively, it appears that each iteration goes something like

X = X + dt * (a - (b+1) * X + X^2 * Y + X*Conv)
Y = Y - (X^2 * Y + (Conv - b - 1) * X + a)^2 * Y * dt^3 - (2 * (X^2 * Y + (Conv - b - 1) * X + a)) * (-(1/2) * b + X * Y) * dt^2 + (X * b - X^2 * Y + Dy * Conv) * dt

where Conv is the hypothetical multiplication of the two topelitz matrices as needed to represent the 2D convolution.

It does not at all appear to me that the reduction over addition would involve terms that are independent of the order of calculations, so I doubt the particular code here can be done as a parallel reduction.

Kyle 2013 年 10 月 6 日

Thank you Walter for your suggestions. I'm testing your algorithms.

サインインしてコメントする。

Is there a way to speed up matrix calculations by Parallel Computing Toolbox?

2 件のコメント
なしを表示なしを非表示

採用された回答

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

その他の回答 (2 件)

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

5 件のコメント
3 件の古いコメントを表示3 件の古いコメントを非表示

参考

カテゴリ

タグ

Community Treasure Hunt

Is there a way to speed up matrix calculations by Parallel Computing Toolbox?

2 件のコメント なしを表示なしを非表示

採用された回答

1 件のコメント -1 件の古いコメントを表示-1 件の古いコメントを非表示

その他の回答 (2 件)

1 件のコメント -1 件の古いコメントを表示-1 件の古いコメントを非表示

5 件のコメント 3 件の古いコメントを表示3 件の古いコメントを非表示

参考

カテゴリ

タグ

Community Treasure Hunt

2 件のコメント
なしを表示なしを非表示

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

5 件のコメント
3 件の古いコメントを表示3 件の古いコメントを非表示