Speed up code by eliminating loop

Question

0 投票

Dear all,

I have to do a maximization of a 3-D array V(e,b,d) over the first-dimension, obtaining something that I call e_opt which will be a b*d matrix, i.e.

e_opt(b,d) = argmax_e {V(e,b,d)}

The non-standard feature is that the grid for e depends on d. So I am not able to eliminate the loop over d and I need to speed up the code. I attach a MWE so that things become clear. On my PC it runs in about 1 secs but this is too slow since I have to do this many times.

Any help is greatly appreciated!

clear
close all
clc
rng("default")
% Size of grids
n_b = 1000;
n_d = 1000;
n_e = 5000;
% Parameters
delta   = 0.2696037674949296; 
%omega_0 = 0.0015534835229589;
%omega_1 = 0.0013023827114628;
% Generate fake data
cost_mat = rand(n_e,n_d);
prob_mat = rand(n_e,n_d);
option_nob = rand(n_b,1);
d_grid = linspace(0,2.3,n_d)';
%% Create a grid for "e". The non-standard feature is that the upper bound
% of the e_grid depends on "d"
e_grid_mat = zeros(n_e,n_d);
e_min = 0;
space = 1.5;
for d_c=1:n_d
    d_val = d_grid(d_c);
    e_max = 0.1+0.2*d_val; %silly example, don't take this literally
    % e_grid is NOT equally spaced
    e_grid_mat(:,d_c) = e_min+(e_max-e_min)*(linspace(0,1,n_e).^space)';
end
%% The code below is the part that I'd like to speed up
tic
e_opt = zeros(n_b,n_d);
for d_c = 1:n_d
    % Effort grid depends on d (upper bound changes, but same no. of elements)
    e_grid = e_grid_mat(:,d_c); %each column of e_grid_mat is different!
    % V(e,b) has dim: (n_e,n_b) and I maximize with respect to the first dimension, e
    V = -cost_mat(:,d_c)-delta*prob_mat(:,d_c).*option_nob';
    [~,max_ind] = max(V,[],1); %maxind is (1,n_b) vector
    e_opt(:,d_c)  = e_grid(max_ind);
end %end d
toc
Elapsed time is 4.942008 seconds.
%To check results
disp(mean(mean(e_opt)))
    0.1285

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

サインインしてコメントする。

サインインしてこの質問に回答する。

Follow Question

Answer 1

Alessandro Maria Marco 2024 年 1 月 3 日

編集済み: Alessandro Maria Marco 2024 年 1 月 3 日

MATLAB Online で開く

0 投票

The vectorized code after eliminating the loop over d is the following. Choose either 'loops' or 'vec' at the beginning of the script to compare the running time of the two methods: loops vs vectorized. It turns out that loops are faster! I found this relatively surprising since loops are generally slow in Matlab.

clear
close all
clc
rng("default")
method = 'loops'; % 'loops','vec'
% Size of grids
n_b = 1000;
n_d = 40;
n_e = 5000;
% Parameters
delta   = 0.2696037674949296; 
%omega_0 = 0.0015534835229589;
%omega_1 = 0.0013023827114628;
% Generate fake data
cost_mat = rand(n_e,n_d);
prob_mat = rand(n_e,n_d);
option_nob = rand(n_b,1);
d_grid = linspace(0,2.3,n_d)';
%% Create a grid for "e". The non-standard feature is that the upper bound
% of the e_grid depends on "d"
e_grid_mat = zeros(n_e,n_d);
e_min = 0;
space = 1.5;
for d_c=1:n_d
    d_val = d_grid(d_c);
    e_max = 0.1+0.2*d_val; %silly example, don't take this literally
    % e_grid is NOT equally spaced
    e_grid_mat(:,d_c) = e_min+(e_max-e_min)*(linspace(0,1,n_e).^space)';
end
%% The code below is the part that I'd like to speed up
tic
switch method
    case 'loops'
        disp('loops')
        e_opt = zeros(n_b,n_d);
        for d_c = 1:n_d
            % Effort grid depends on d (upper bound changes, but same no. of elements)
            e_grid = e_grid_mat(:,d_c); %each column of e_grid_mat is different!
            % V(e,b) has dim: (n_e,n_b) and I maximize with respect to the first dimension, e
            V = -cost_mat(:,d_c)-delta*prob_mat(:,d_c).*option_nob';
            [~,max_ind] = max(V,[],1); %maxind is (1,n_b) vector
            e_opt(:,d_c)  = e_grid(max_ind);
        end %end d
    case 'vec'
        disp('vec')
        % compute the full V, with dim (n_e,n_d,n_b)
        V = -cost_mat-delta*prob_mat.*reshape(option_nob, 1, 1, n_b);
        %V = -cost_mat-delta*prob_mat.*shiftdim(option_nob,-2);
        [~, max_ind] = max(V, [], 1); % max_ind is (1,n_d,n_b)
        max_ind = permute(max_ind, [2 3 1]); % % max_ind is now (n_d,n_b)
        %max_ind = max_ind + (0:n_d-1).'*n_e; % apply linear indexing
        % e_grid_mat is [n_e,n_d], but e_opt gives only "e" indices
        max_ind = sub2ind([n_e,n_d],max_ind,repmat((1:n_d)',1,n_b));
        e_opt = e_grid_mat(max_ind).'; % retrieve maximizing values
end
loops
toc
Elapsed time is 0.263274 seconds.
%To check results 
disp(mean(mean(e_opt)))
    0.1364

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

サインインしてコメントする。

Answer 2

Dinesh 2024 年 1 月 3 日

MATLAB Online で開く

1 投票

Hi user,

Considering the dependency of "e_grid" on "d", full vectorization of your code may not be feasible. To optimize the code, you can explore MATLAB's parallel processing capabilities, particularly the "parfor" loop, which can execute iterations concurrently on multi-core processors. Ensure you have the Parallel Computing Toolbox installed and initiate a parallel pool before the loop to take advantage of this feature. This approach can significantly reduce the runtime if your system has multiple cores.

Here's a brief example using "parfor":

parpool;
tic
e_opt = zeros(n_b,n_d);
parfor d_c = 1:n_d
    e_grid = e_grid_mat(:,d_c);
    V = -cost_mat(:,d_c) - delta*prob_mat(:,d_c).*option_nob';
    [~, max_ind] = max(V, [], 1);
    e_opt(:, d_c) = e_grid(max_ind);
end
toc

In addition to parallelization, profile your code to identify other potential inefficiencies. Utilize MATLAB's "profile on" command to get a detailed report on the runtime of different parts of your script.

https://www.mathworks.com/help/parallel-computing/getting-started-with-parallel-computing-toolbox.html

https://www.mathworks.com/help/matlab/matlab_prog/profiling-for-improving-performance.html

1 件のコメント
-1 件の古いコメントを表示 -1 件の古いコメントを非表示

Alessandro Maria Marco 2024 年 1 月 3 日

@Dinesh many thanks for your answer!

(1) I tried the parfor and it works (i.e. it gives the same results as the serial code) but it is 50% slower, perhaps due to overhead cost

(2) I had already profiled the code using 'profile on' and this is the bottleneck in my project

I managed to do the vectorization (with the help of another user on another forum) but again it does not beat the serial code with loops. I will post it anyways as an answer since it may be useful to others

サインインしてコメントする。

Speed up code by eliminating loop

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

採用された回答

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

その他の回答 (1 件)

1 件のコメント
-1 件の古いコメントを表示 -1 件の古いコメントを非表示

カテゴリ

製品

リリース

タグ

Community Treasure Hunt

Speed up code by eliminating loop

0 件のコメント -2 件の古いコメントを表示 -2 件の古いコメントを非表示

採用された回答

0 件のコメント -2 件の古いコメントを表示 -2 件の古いコメントを非表示

その他の回答 (1 件)

1 件のコメント -1 件の古いコメントを表示 -1 件の古いコメントを非表示

カテゴリ

製品

リリース

タグ

参考

Community Treasure Hunt

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

1 件のコメント
-1 件の古いコメントを表示 -1 件の古いコメントを非表示