
Parallel is slower than sequential?

Viviana Arrigoni on 19 Apr 2018
Answered: Edric Ellis on 19 Apr 2018

I am new to the Parallel Computing Toolbox and have many doubts. I was implementing a parallel Jacobi algorithm, and it turned out to be slower than the sequential version with the same precision threshold parameters. I tried several parallel approaches, and none seemed fast enough. So I tried some simpler code, such as the one below:

     tic;
     ticBytes(gcp);
     n = 500;
     n_mat = 50;
     C = cell(1, n_mat);
     parfor i = 1:n_mat
         A = rand(n);
         B = rand(n);
         C{i} = A * B;
     end
     tocBytes(gcp);
     toc

and it is slower than the same code with 'for' instead of parfor. I got, respectively:

             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________
    1              16016                  5.2018e+07       
    2              18152                  4.8021e+07       
    Total          34168                  1.0004e+08       

Elapsed time is 1.590726 seconds.

for the parallel version,

and: Elapsed time is 0.674556 seconds.

for the sequential version.

What am I doing wrong? I also don't really understand what sliced variables are. Furthermore, I noticed that using cell arrays instead of numeric arrays inside parfor doesn't trigger the overhead warning, so I have tended to prefer them, but things usually still run faster with arrays.

Answers (1)

Edric Ellis on 19 Apr 2018

There are a couple of reasons that your parfor loop is slower than the for loop equivalent. Firstly, there's the data transfer overhead - you're transferring a substantial amount of data back to the client from the workers (about 100 MB in total, according to your tocBytes output) - this has to be serialized on the worker (basically like calling save on the data, but without using a file), sent to the client, and then deserialized (the equivalent of load).
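As a rough sanity check on the transfer volume, the sketch below estimates the size of the results from the values of n and n_mat in the question, assuming each element is a standard 8-byte double. The variable names bytesPerDouble, bytesPerMatrix and totalBytes are only illustrative. The estimate comes out at 1e8 bytes, which is close to the 1.0004e+08 total that tocBytes reported as received from the workers.

```matlab
% Back-of-envelope estimate of the result data returned by the parfor loop.
n = 500;                                  % matrix dimension, from the question
n_mat = 50;                               % number of result matrices, from the question
bytesPerDouble = 8;                       % assumption: results are double precision
bytesPerMatrix = n^2 * bytesPerDouble;    % 2,000,000 bytes per 500x500 result
totalBytes = n_mat * bytesPerMatrix;      % 100,000,000 bytes for all 50 results
fprintf('Estimated result payload: %.4e bytes\n', totalBytes);
```

The small difference from the measured figure would be consistent with some per-message overhead, although that is not broken down in the tocBytes output.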

Secondly, and probably most importantly for this case, if you're using only the local cluster type, then unfortunately this particular loop is pretty much guaranteed to be slower using parfor than for. That's because the for loop version is already pretty efficiently multi-threaded using mtimes - essentially, it's already taking full advantage of all the cores on your computer. The workers in a parfor loop default to running in single-threaded mode, to avoid overloading your computer, so each individual call to mtimes will be slower on a worker.
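One way to see the multi-threading effect without a parallel pool is to time the same multiplication on the client with the default thread count and then with a single thread. This is only a sketch: it uses maxNumCompThreads to limit the client to one thread as a stand-in for what a single-threaded worker would do, and the actual timings depend on the machine.

```matlab
% Compare multithreaded and single-threaded matrix multiplication on the client.
n = 500;                                   % same matrix size as in the question
A = rand(n);
B = rand(n);

defaultThreads = maxNumCompThreads;        % current number of computational threads
tMulti = timeit(@() A * B);                % time mtimes with the default thread count

previousThreads = maxNumCompThreads(1);    % restrict to one thread; returns the old setting
tSingle = timeit(@() A * B);               % time mtimes with a single thread
maxNumCompThreads(previousThreads);        % restore the original setting

fprintf('Threads: %d, multi: %.4f s, single: %.4f s\n', ...
    defaultThreads, tMulti, tSingle);
```

On a multi-core machine you would expect the single-threaded time to be noticeably larger, which is roughly the per-call penalty each worker pays inside the parfor loop.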
