How to accelerate the execution of this code?
In the example below I have a matrix, J_inv(pos(:,j)), which is different in each iteration of the parfor-loop. I tried to vectorize it somehow, but had no success. The fastest solution I found uses the Parallel Computing Toolbox. This example takes roughly 20 seconds per iteration of the outer for-loop on average.
Hints:
- size of pos is 6x1000000
- self.inv_j is a 3x6 sym
Does anyone know how I can accelerate the code execution without decreasing the size of pos, so that I can increase the overall number of iterations?
for i = 1:n_iterations
    % here parameter values are changed to random numbers that affect the inverse jacobian
    % ...
    %
    self.update_inv_jacobian();
    J_inv = matlabFunction(self.inv_j, 'Vars', {[self.all_dofs]});
    parfor j = 1:length(pos)
        Q = J_inv_const(pos(:,j)) * pos(:,k);
        P = lsqminnorm(J_inv(pos(:,j)), Q);
        errors(:,k) = pos(:,k) - P;
    end
    max_errors(:,i) = max(errors, [], 2);
end
2 Comments
Jonas
7 Jul 2023
I would start by looking into the speed of each line. For that, change the parfor to for, add a 'profile on' before the i loop and a 'profile viewer' after the end of the i loop.
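For reference, a minimal sketch of that profiling workflow (the loop bounds and body here are placeholders; substitute the real code with parfor changed to for):
profile on
for i = 1:3
    A = rand(3, 6);                       % stand-in for the per-column work
    P = lsqminnorm(A, A * rand(6, 1));
end
profile off
profile viewer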
Steven Lord
16 Aug 2023
What is J_inv_const in this code? Is it a symbolic expression, is it a function handle (perhaps created using matlabFunction on a symbolic expression), etc.?
I am a little surprised that your errors variable is not a sliced output variable, since its index isn't a function of the parfor loop variable.
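For context, a toy example of what a sliced output variable looks like (not the original code; the key point is that the index is the parfor loop variable, so each worker writes its own columns):
n = 100;
errors = zeros(6, n);              % pre-allocated
parfor j = 1:n
    errors(:, j) = rand(6, 1);     % indexed by j -> sliced output
end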
Accepted Answer
Ashutosh
16 Aug 2023
These steps should help improve the performance of the code (a short sketch combining them follows the list):
- Pre-allocate the "errors" matrix before the loop instead of growing it in each iteration; this avoids unnecessary memory reallocation and improves performance.
- Try adjusting the number of workers with the "parpool" function, and create pools of "Processes" or "Threads" depending on your requirements.
- "parfor" can be tuned by setting up options that determine how the iterations are partitioned; see https://in.mathworks.com/help/parallel-computing/parforoptions.html.
- If you have access to GPUs, you can run the computation in parallel across multiple GPUs with the "spmd" command.
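A minimal, self-contained sketch combining the pre-allocation and pool/partitioning points above. The pool type, subrange size, array sizes, and the stand-in for J_inv are assumptions to adapt to the real problem:
pool = gcp("nocreate");
if isempty(pool)
    pool = parpool("Processes");           % or parpool("Threads")
end
opts = parforOptions(pool, ...
    "RangePartitionMethod", "fixed", ...
    "SubrangeSize", 5000);                 % tune the chunk size to your data

pos    = rand(6, 20000);                   % stand-in for the real 6-by-1000000 pos
n_pos  = size(pos, 2);
errors = zeros(6, n_pos);                  % pre-allocated, sliced output

parfor (j = 1:n_pos, opts)
    A = rand(3, 6);                        % stand-in for J_inv(pos(:,j))
    Q = A * pos(:, j);
    P = lsqminnorm(A, Q);                  % 6-by-1 minimum-norm solution
    errors(:, j) = pos(:, j) - P;          % indexed by the loop variable -> sliced
end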
0 Comments
More Answers (0)