Solving multiple systems of equations using GPU and iterative methods

I am trying to solve an inverse by solving multiple systems of linear equations. The coefficient matrix is large and sparse, and direct methods produce significant rounding errors. I have used the incomplete LU factorization as preconditioner for the iterative methods, and parallelization using CPU is quite simple as follows:
[L1,U1] = ilu(U,struct('type','ilutp','droptol',tolILU));
E3aux = speye(size(U));
parfor j=1:size(U,1),
[E3(:,j),~] = bigs(U,E3aux(:,j),[],[],L1,U1);
end
However, since bigs function also works with GPUarrays, I wonder if it would be possible to make something similar using the GPU. I know that it is possible to do it for one system of equations as follows:
cpuTime = timeit(@()bicg(U,b,[],[],L1,U1));
fprintf('It takes %3.4f seconds on the CPU to execute a systems of equations.\n',cpuTime);
Ugpu = gpuArray(full(U));
bgpu = gpuArray(full(b));
L1gpu = gpuArray(full(L1));
U1gpu = gpuArray(full(U1));
gpuTime = gputimeit(@()bicg(Ugpu,bgpu,[],[],L1gpu,U1gpu));
fprintf('It takes %3.4f seconds on the GPU to execute system.\n',gpuTime);
fprintf(['Unvectorized GPU code is %3.2f times slower ',...
'than the CPU version.\n'], gpuTime/cpuTime);
Getting the following output:
It takes 0.0195 seconds on the GPU to execute system.
Unvectorized GPU code is 25.29 times slower than the CPU version.
However, I do not know:
  1. How to use preconditioners without transforming them into full matrices.
  2. How to paralellize for multiple independent vectors.

 採用された回答

Joss Knight
Joss Knight 2018 年 3 月 14 日
編集済み: Joss Knight 2018 年 3 月 14 日

0 投票

Sparse gpuArrays have been supported since R2015b and bicg since R2016b, so you should just call bicg with the original sparse matrices and you should get good performance. If you pass your system matrix U as a preconditioner then ILU is the technique used to apply it so you will get the same behaviour. You cannot pass two matrix preconditioners to the sparse gpuArray version of bicg.
The iterative solvers cannot (by their very nature) solve for multiple right-hand-sides at the same time. You can only do that by calling the solver repeatedly - or by using a direct solver.

2 件のコメント

rominsol
rominsol 2018 年 3 月 14 日
Thanks for your quick response Joss.
Regarding your last comment. I understand that iterative solvers cannot solve for multiple right-hand-sides at the same time, but I am not sure if what you are claiming is that it is not possible to parallelize the solution for multiple systems, like I did using the CPU (see the code below), but using the GPU instead.
E3aux = speye(size(U));
parfor j=1:size(U,1),
[E3(:,j),~] = bigs(U,E3aux(:,j),[],[],L1,U1);
end
Joss Knight
Joss Knight 2018 年 3 月 14 日
Sure you can, but this only makes sense if you have more than one GPU. Otherwise you're gaining nothing.

サインインしてコメントする。

その他の回答 (0 件)

カテゴリ

ヘルプ センター および File ExchangeCreating and Concatenating Matrices についてさらに検索

質問済み:

2018 年 3 月 13 日

コメント済み:

2018 年 3 月 14 日

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by