How to vectorize 4 for loops with several index combinations?

Euclides on 13 Feb 2024
Edited: Euclides on 14 Feb 2024
I have the following 4 for loops, which recombine the components of a tensor B into a new tensor A, and I need to vectorize them to cut runtime. How can that be done? This should be a good exercise for finally understanding how to vectorize arbitrary loops. I think we should probably use the function ndgridVecs to create a grid. What is not yet clear to me is how to recombine the indices according to the rule in the innermost loop. I would probably get there by myself eventually, but I don't have much time available for this right now, so I would be grateful for any help.
B = rand(3,3,3,3);
A = zeros(size(B));
for i=1:3
    for j=1:3
        for k=1:3
            for l=1:3
                A(i,j,k,l) = 10*...
                    (B(k,i,j,l) ...
                    + B(k,j,i,l) ...
                    + B(l,i,j,k) ...
                    + B(l,j,i,k));
            end
        end
    end
end
  1 Comment
Matt J on 13 Feb 2024
It would be advisable to show the typical dimensions rather than a 3x3x3x3 example. The recommendation can depend on what the true sizes will be.


Accepted Answer

Hassaan on 13 Feb 2024
Edited: Hassaan on 13 Feb 2024
B = rand(3,3,3,3);
% Preallocating A for the output
A_loops = zeros(size(B));
tic; % Start timing
for i=1:3
    for j=1:3
        for k=1:3
            for l=1:3
                A_loops(i,j,k,l) = 10*...
                    (B(k,i,j,l) ...
                    + B(k,j,i,l) ...
                    + B(l,i,j,k) ...
                    + B(l,j,i,k));
            end
        end
    end
end
toc; % End timing and print elapsed time
A_loops
tic; % Start timing
% Vectorized operation
% permute(B, dimorder) builds an array whose n-th dimension is dimension
% dimorder(n) of B. For each term, read off which dimension of B each of
% the output indices (i,j,k,l) occupies.
% B(k,i,j,l) + B(k,j,i,l): i,j,k,l sit in dims [2 3 1 4] and [3 2 1 4] of B
temp1 = permute(B, [2,3,1,4]) + permute(B, [3,2,1,4]);
% B(l,i,j,k) + B(l,j,i,k): i,j,k,l sit in dims [2 3 4 1] and [3 2 4 1] of B
temp2 = permute(B, [2,3,4,1]) + permute(B, [3,2,4,1]);
% temp1 and temp2 are both indexed as (i,j,k,l), like A, so they can be summed directly.
A = 10 * (temp1 + temp2);
toc; % End timing and print elapsed time
A
For large arrays, the vectorized version is typically much faster than the loops, because MATLAB's array operations are optimized. For an array as small as 3x3x3x3, a single tic/toc measurement may be too short to show the difference reliably.
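To get a steadier timing than a single tic/toc pair, the vectorized expression can be wrapped in a function handle and passed to timeit, which runs it several times. This is only a sketch: the size n = 30 is an assumption chosen for illustration (the thread does not state the real dimensions), and f_vec and t_vec are illustrative names.

```matlab
n = 30;                          % assumed size, not taken from the thread
B = rand(n,n,n,n);

% Vectorized form of A(i,j,k,l) = 10*(B(k,i,j,l)+B(k,j,i,l)+B(l,i,j,k)+B(l,j,i,k))
f_vec = @() 10 * (permute(B,[2 3 1 4]) + permute(B,[3 2 1 4]) ...
                + permute(B,[2 3 4 1]) + permute(B,[3 2 4 1]));

t_vec = timeit(f_vec);           % typical time per call, in seconds
fprintf('vectorized: %.3g s per call\n', t_vec);
```

The loop version can be timed the same way once it is moved into its own function, since an anonymous function cannot contain a for loop.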
Ensure correct indexing: the vectorized expression must use the indices exactly as the loop does, term by term.
Use permute (and reshape if necessary): MATLAB's permute function rearranges the dimensions of an array, so that the terms line up for elementwise operations. reshape can adjust the dimensions of the result if a different shape is expected.
This approach permutes B once for each of the required index orders, (k,i,j,l), (k,j,i,l), (l,i,j,k) and (l,j,i,k), then sums the four permuted arrays and applies the factor of 10, as in the loop-based method.
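One way to check the dimension order for a single index rule is to test it on an array whose dimensions all differ, so that a wrong order shows up as a size mismatch rather than passing silently on a cubic array. The sketch below does this for the first term only; Bt, C_loop, C_perm and ok are illustrative names, not part of the original code.

```matlab
% Rule for one term: C(i,j,k,l) = Bt(k,i,j,l).
% In Bt, i runs over dim 2, j over dim 3, k over dim 1 and l over dim 4,
% so the output dims (i,j,k,l) are taken from Bt's dims [2 3 1 4].
Bt = rand(2,3,4,5);                  % all dimensions differ on purpose
[nk, ni, nj, nl] = size(Bt);

C_loop = zeros(ni, nj, nk, nl);
for i = 1:ni
    for j = 1:nj
        for k = 1:nk
            for l = 1:nl
                C_loop(i,j,k,l) = Bt(k,i,j,l);
            end
        end
    end
end

C_perm = permute(Bt, [2 3 1 4]);
ok = isequal(C_perm, C_loop);        % true only if the dimension order is right
```

The same check can be repeated for each of the other three terms before summing them.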
Note:
  • There are also other methods/ways that may help in vectorization
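As one example of another route, the index rule can be written almost literally with index grids from ndgrid and linear indices from sub2ind. This is a sketch rather than a recommendation: it is not part of the answer above, the names I, J, K, L and A_idx are illustrative, and it may well be slower than permute because it builds full index arrays.

```matlab
n = 3;
B = rand(n,n,n,n);
sz = size(B);

% Index grids: I(i,j,k,l) = i, J(i,j,k,l) = j, and so on.
[I, J, K, L] = ndgrid(1:n);

% Each sub2ind call turns one index pattern of the loop into linear indices.
A_idx = 10 * ( B(sub2ind(sz, K, I, J, L)) + B(sub2ind(sz, K, J, I, L)) ...
             + B(sub2ind(sz, L, I, J, K)) + B(sub2ind(sz, L, J, I, K)) );
```

If the function is called many times with the same n, the four linear-index arrays could be computed once and reused.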
  7 Comments
Euclides on 13 Feb 2024
Edited: Euclides on 14 Feb 2024
I understand. I'll test it. I still have another set of 4 for loops, with yet another index recombination rule, but I'll try to vectorize that set by myself. Thank you so much for your help guys!
Euclides on 14 Feb 2024
Edited: Euclides on 14 Feb 2024
This vectorization, in combination with another one I had done earlier with help from @Matt J and some further code optimization, allowed me to cut the average runtime of my overall function from an initial 12 s to 2.5 s, then 0.5 s, and now 0.15 s! I still have one last set of loops, which I just realized I don't even need to vectorize (I can bypass the set completely), so I hope to bring the runtime down further to 0.05 s or less. I need to evaluate the function in question approximately 100 thousand to 10 million times, so this runtime issue really is critical. Big thank you!

Release: R2023b
