Is there a way to vectorize these loops for speed?

Question

Mohamed Abdalmoaty 2018 年 2 月 28 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/385459-is-there-a-way-to-vectorize-these-loops-for-speed

コメント済み: Matt J 2018 年 3 月 1 日

I need to construct symmetric matrix M as shown in the following code:

theta = 0.5;
lambda =  0.1; % usually a number between zero and one
N = 1000   % or larger  number
M = zeros(N,N); % allocate initial value 
for t = 1:N
    firstTerm = sum(theta.^(4.*(t-1:-1:0)));
    secondTerm = 0;
    for k = 1:t-1
        secondTerm = secondTerm + sum(theta.^(2*((2*t)-(2*k)-1:-1:t-k)));
    end
    M(t,t) =  3*lambda^2*firstTerm + 6*lambda^2*secondTerm;
    for s=t+1:N
        thirdTerm = 0;
        for r = t+1:s
            thirdTerm = thirdTerm + sum(theta.^(2*(t+s-r-1:-1:s-r)));
        end
        M(t,s) = 3*lambda^2*theta^(2*(s-t))*firstTerm + 6*lambda^2*theta^(2*(s-t))*secondTerm +  lambda^2*thirdTerm;
        M(s,t) = M(t,s);
    end
end

The code takes very long when N is large; is there any better way to compute M? I was hoping that it might be possible to vectorize the computations?

Observe that the diagonal elements are easy to compute, but the off diagonals required a third term in the sum. However, they can be constructed given the diagonal as shown in the code.

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Matt J 2018 年 2 月 28 日

2
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/385459-is-there-a-way-to-vectorize-these-loops-for-speed#answer_307672

編集済み: Matt J 2018 年 2 月 28 日

MATLAB Online で開く

Summations like

sum(theta.^(2*((2*t)-(2*k)-1:-1:t-k)));

can be replaced with closed-form formulas for geometric series, see for example,

https://en.wikipedia.org/wiki/Geometric_progression#Geometric_series

This will save you the overhead of both constructing the series itself and the sum() command.

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

Mohamed Abdalmoaty 2018 年 3 月 1 日

Thanks Matt! I was able to reduce all the sums as well as replacing the inner loop in r with closed form expressions. I think the code is much better now! +1

Matt J 2018 年 3 月 1 日

Thanks, guys.

サインインしてコメントする。

Answer 2

Jan 2018 年 2 月 28 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/385459-is-there-a-way-to-vectorize-these-loops-for-speed#answer_307654

MATLAB Online で開く

You compute expensive powers of theta repeatedly. Better do this once and store the values in a vector. What is the largest power of theta? Let's assume it is 4*N.

theta = 0.5;
N = 1000   % or larger  number
M = zeros(N,N); % allocate initial value 
lambda2 = lambda ^ 2;
tpow = theta .^ (0:4*N);
% Or cheaper:
% tpow = cumprod([1, repmat(theta, 1, N-1)]);
for t = 1:N
  firstTerm = sum(tpow(1 + 4.*(t-1:-1:0)));
  secondTerm = 0;
  for k = 1:t-1
      secondTerm = secondTerm + sum(tpow(1 + 2*((2*t)-(2*k)-1:-1:t-k)));
  end
  M(t,t) =  lambda2*(3*firstTerm + 6*secondTerm);
  for s=t+1:N
      thirdTerm = 0;
      for r = t+1:s
          thirdTerm = thirdTerm + sum(tpow(1 + 2*(t+s-r-1:-1:s-r)));
      end
      M(t,s) = lambda2*(3*tpow(1 + 2*(s-t))*firstTerm + ...
                        6*tpow(1 + 2*(s-t))*secondTerm + thirdTerm);
      M(s,t) = M(t,s);
  end
end

Unfortunately I cannot test this. Please provide the value of lambda.

0.5^4000 is smaller than realmin. Therefore the corresponding terms in the sum are 0 and could be omitted.

I've removed some multiplications by lambda^2 and many square operations of lambda by storing it once before the loop. The idea is to avoid all repeated calculations.

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

Jan 2018 年 3 月 1 日

MATLAB Online で開く

@Mohamed: Please post the timings for the original code and my suggestion. Then show us the size of the difference of the methods. Maybe this is caused by rounding only:

t1 = lambda2*(3*tpow(1 + 2*(s-t))*firstTerm + ...
              6*tpow(1 + 2*(s-t))*secondTerm + thirdTerm)
t2 = lambda2*3*tpow(1 + 2*(s-t))*firstTerm + ...
     lambda2*6*tpow(1 + 2*(s-t))*secondTerm + 
     lambda2*thirdTerm

This can slightly differ, but this is not a bug, but an effect of rounding. But maybe I made a mistake during editing. You can find it by your own using the debugger: Compare the intermediate terms.

Mohamed Abdalmoaty 2018 年 3 月 1 日

編集済み: Mohamed Abdalmoaty 2018 年 3 月 1 日

When I compared the results yesterday, there was a big difference which I guess cannot be a rounding error! But, I'll have another look when I can. Thanks

サインインしてコメントする。

Is there a way to vectorize these loops for speed?

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

採用された回答

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

その他の回答 (1 件)

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

参考

カテゴリ

タグ

製品

Community Treasure Hunt

Is there a way to vectorize these loops for speed?

0 件のコメント -2 件の古いコメントを表示-2 件の古いコメントを非表示

採用された回答

3 件のコメント 1 件の古いコメントを表示1 件の古いコメントを非表示

その他の回答 (1 件)

3 件のコメント 1 件の古いコメントを表示1 件の古いコメントを非表示

参考

カテゴリ

タグ

製品

Community Treasure Hunt

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示