Why are vectorized calculations faster than for loops?

Mikhail on 26 Oct 2014
Moved: Walter Roberson on 24 Apr 2025
Why is it faster in MATLAB? Is it because of better memory handling, or parallelization? If it is only due to parallel computation, will there be no difference on a single-core laptop? Thanks.

Answers (2)

Jan on 26 Oct 2014
Edited: Jan on 26 Oct 2014
First of all: there is no evidence that vectorized code is faster in general.
If a built-in function can be applied to a complete array, vectorization is much faster than a loop approach. When large temporary arrays are required, however, the benefit of vectorization can be outweighed by the expensive memory allocation, especially when the data does not fit into the processor cache.
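To make the point about temporary arrays concrete, here is a minimal sketch (not from the original answer). The size `n` and the names `sVec`, `sLoop`, and `relErr` are assumptions chosen only for illustration. The vectorized line has to materialize at least one temporary array of `n` elements before `sum` reduces it, while the loop only ever holds scalars. Which one wins depends on `n`, the MATLAB release, and the machine, so no timings are claimed here.

```matlab
% Hypothetical illustration: sum of squared differences of two large vectors.
n = 1e6;                 % assumed size, chosen only for illustration
x = rand(n, 1);
y = rand(n, 1);

% Vectorized: (x - y).^2 needs at least one temporary array of n elements
% before sum() reduces it to a scalar.
sVec = sum((x - y).^2);

% Loop: accumulates into a scalar, so no large temporary array is needed.
sLoop = 0;
for k = 1:n
    d = x(k) - y(k);
    sLoop = sLoop + d * d;
end

% Both approaches agree up to floating-point rounding.
relErr = abs(sVec - sLoop) / abs(sVec);
```

Wrapping each variant in `tic`/`toc` (or `timeit`) for several values of `n` should show whether the temporaries matter on a given machine.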
A secondary effect of vectorizing is that the code looks clearer, at least as a rule of thumb. A trivial example:
% Loops:
A = rand(10);
B = rand(10);
C = zeros(size(A));   % Pre-allocate the result
for i2 = 1:size(A, 2)       % Outer loop over columns
    for i1 = 1:size(A, 1)   % Inner loop over rows, i.e. down each column (column-major order)
        C(i1, i2) = A(i1, i2) + B(i1, i2);
    end
end
% Vectorized:
C = A + B;
The second method is faster at runtime, and it also saves programming and debugging time. There is almost no chance of introducing a bug, and the code will be very easy to understand when the program needs changes in the future.
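To check the runtime claim on your own machine, a sketch along the following lines could be used. It is not part of the original answer: the matrix size `n` and the variable names `C1`, `C2`, `tLoop`, and `tVec` are assumptions, and a single `tic`/`toc` measurement is noisy, so `timeit` is the better tool for serious measurements.

```matlab
% Assumed problem size, larger than the 10x10 example so the time is measurable.
n = 2000;
A = rand(n);
B = rand(n);

% Loop version, pre-allocated and traversed in column-major order.
tic
C1 = zeros(size(A));
for i2 = 1:size(A, 2)
    for i1 = 1:size(A, 1)
        C1(i1, i2) = A(i1, i2) + B(i1, i2);
    end
end
tLoop = toc;

% Vectorized version: one call to the built-in plus operator.
tic
C2 = A + B;
tVec = toc;

fprintf('Loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);

% Both versions perform the same element-wise additions.
sameResult = isequal(C1, C2);
```

The measured ratio varies with the MATLAB release and the hardware, so treat any single run as a rough indication only.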
3 Comments
Mikhail on 27 Oct 2014
Edited: Mikhail on 27 Oct 2014
I still don't understand why this happens. Why are built-in functions faster? What is the fundamental reason?
Keldon Alleyne on 1 Oct 2018
Multiplying matrices in loops is O(N^3), while the fastest algorithms using other methods are O(N^2.3) - O(N^2.8), which can easily explain the differences in performance.
On my laptop with MATLAB R2018b I get:
Elapsed time is 0.112362 seconds.
Elapsed time is 0.214544 seconds.
Elapsed time is 0.007935 seconds.
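The code behind the three timings above is not shown in this thread. As a stand-in, here is a minimal sketch of the kind of comparison being described: a naive triple loop against the built-in `A * B`. The size `n` and the names `C1`, `C2`, `tLoop`, `tBuiltin`, and `maxDiff` are assumptions, and the sketch is not claimed to reproduce the numbers quoted.

```matlab
% Assumed size; the naive loop does n^3 multiply-adds.
n = 200;
A = rand(n);
B = rand(n);

% Naive O(n^3) matrix product, innermost loop running down a column.
tic
C1 = zeros(n);
for j = 1:n
    for k = 1:n
        b = B(k, j);
        for i = 1:n
            C1(i, j) = C1(i, j) + A(i, k) * b;
        end
    end
end
tLoop = toc;

% Built-in matrix product.
tic
C2 = A * B;
tBuiltin = toc;

fprintf('Triple loop: %.4f s, built-in: %.4f s\n', tLoop, tBuiltin);

% The results differ only by floating-point rounding.
maxDiff = max(abs(C1(:) - C2(:)));
```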



Vibhav on 24 Apr 2025
Moved: Walter Roberson on 24 Apr 2025
Looks like your question about why vectorizing is fundamentally faster (in most cases) was never answered.
The answer is complicated and I won't provide all the details. In addition to optimized memory access and multithreading, vectorized code uses Single Instruction Multiple Data (SIMD) instructions: specific CPU instructions that perform the same computation on multiple data elements at once, which increases the computational throughput of a single CPU core. You can look up "SIMD" for more information on how this is done, but the TL;DR is that vectorized MATLAB code gets "Just-In-Time" (JIT) compiled to BLAS and LAPACK subroutines that exploit SIMD instructions.
Vectorization also exploits multithreaded routines on machines that support them. It is possible to run "multithreaded" code on a single-core CPU, but that is not true parallelism, since only one instruction can execute at a time; from a user's standpoint it merely appears to be concurrent.
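One way to probe the single-core part of the original question is to restrict MATLAB to one computational thread with `maxNumCompThreads` and time a built-in operation with and without the restriction. The sketch below assumes a matrix size `n` and uses the illustrative names `nOld`, `tSingle`, and `tMulti`. If the single-threaded built-in is still far faster than an equivalent loop, the remaining gap cannot come from multithreading.

```matlab
% Assumed size, large enough for multithreading to be worthwhile.
n = 2000;
A = rand(n);
B = rand(n);

% Restrict MATLAB to one computational thread; the call returns the old setting.
nOld = maxNumCompThreads(1);
tic
C1 = A * B;
tSingle = toc;

% Restore the previous thread count and repeat.
maxNumCompThreads(nOld);
tic
C2 = A * B;
tMulti = toc;

fprintf('1 thread: %.4f s, %d thread(s): %.4f s\n', tSingle, nOld, tMulti);

% Results may differ slightly because the summation order can change.
maxDiff = max(abs(C1(:) - C2(:)));
```

On a machine with a single core both timings should be about the same, while the built-in should still benefit from SIMD and cache-friendly memory access.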
