Hi,
I am trying to build a 2-D bilinear interpolation function as shown below. While using the profiler, I noticed that the maximum computation time is spent in finding upper and lower bound
temp = x(i,j) <= X;
[idx1, ~] = find(temp, 1);
x , y are scalars
and X, Y, V are gridded data with equal size of (m, n).
My aim is to achieve better computational performance than using the native griddedinterpolant in Matlab
V_fit = griddedInterpolant(X, Y, V, 'linear' )
v = V_fit (x, y)
At the moment, griddedinterpolant is 10 times faster than my user defined function.
Is there a better way to calculate the upper and lower bounds? Possibly, that works also when x , y are matrix of size (i,j).
function [v] = interp2D(X, Y, V, x, y)
% Calculate lower bound in x direction
temp = x <= X;
[idx1, ~] = find(temp, 1);
% Calculate upper bound in x direction
temp = x > X;
[idx2, ~] = find(temp, 1, 'last');
% Calculate lower bound in y direction
temp = y <= Y;
[~, idy1] = find(temp, 1);
% Calculate upper bound in y direction
temp = y > Y;
[~ , idy2] = find(temp, 1, 'last');
% Evaluate the function at four points
V11 = V(idx1 , idy1);
V12 = V(idx1 , idy2);
V21 = V(idx2 , idy1);
V22 = V(idx2 , idy2);
% Interpolate in x-direction
Vx1 = (X(idx2 , 1) - x) * V11 / ( X(idx2 , 1) - X(idx1 , 1)) + ...
(x - X(idx1 , 1)) * V21 / ( X(idx2, 1) - X(idx1, 1));
Vx2 = (X(idx2, 1) - x) * V12 / ( X(idx2, 1) - X(idx1, 1)) + ...
(x - X(idx1, 1)) * V22 / ( X(idx2, 1) - X(idx1, 1));
% Interpolate in y-direction
v = (Y(1, idy2) - y) * Vx1 / ( Y(1 , idy2) - Y(1, idy1)) + (y - Y(1, idy1)) * Vx2 / ( Y(1, idy2) - Y(1, idy1));
end
Edit: In my case, m = 181, n = 181. And, while comparing computational time, I assume that griddedInterpolant(X, Y, V, 'linear' ) is performed before the simulation is run i.e. I compare the time of v = V_fit (x, y) with the execution time of my code.

 採用された回答

Matt J
Matt J 2019 年 1 月 31 日
編集済み: Matt J 2019 年 1 月 31 日

0 投票

Here is a race of griddedInterpolant on the CPU (AMD Ryzen Threadripper 1900X, 3850 Mhz) against gpuArray.interp2 on the GeForce GTX 1080 Ti. As you can see, the latter is almost 5 times faster. This was in R2018a.
dtype='single';
N=512;
V=rand(N,dtype);
x=randi([1,N], [1,N^3]);
y=randi([1,N], [1,N^3]);
%%%%%%%%%%% Using griddedInterpolant on the CPU %%%%%%%%%%%%
F=griddedInterpolant(V);
tic;
F(x,y);
toc
%Elapsed time is 0.567307 seconds.
%%%%%%%%%%% Using the GPU %%%%%%%%%%%%
gd=gpuDevice;
x=gpuArray(x);y=gpuArray(y); V=gpuArray(V);
tic;
interp2(V,x,y);
wait(gd)
toc;
%Elapsed time is 0.132149 seconds.

1 件のコメント

Nimananda Sharma
Nimananda Sharma 2019 年 2 月 1 日
Thanks Matt. I have decided now to use a combination of griddedInterpolant for scalar qwery points and inter2 with gpuArray for qwery points which are double. I still managed to get 40% boost in computational efficiency. At this moment, I think I will live with it. Thanks a lot for your inputs.

サインインしてコメントする。

その他の回答 (1 件)

Matt J
Matt J 2019 年 1 月 30 日
編集済み: Matt J 2019 年 1 月 30 日

0 投票

I don't think you're going to beat griddedInterpolant in M-code, but a better way of computing of the bounds (and one which works on non-scalars) is,
idx1=discretize(x,X); idx2=idx1+1;
idy1=discretize(y,Y); idy2=idy1+1;

8 件のコメント

Nimananda Sharma
Nimananda Sharma 2019 年 1 月 31 日
Thanks Matt J for the input. I tried discretize and now I can use for scalar and non scalar inputs.
May be you are right, the griddedInterpolant is still 5 times faster. Do you think converting my m-code to MEX file will help achieve faster computation? The interpolation operation is used in a larger simulation which runs multiple serial iterations as in iterations itself can't be vectorize. And computationally, the most time consuming part is interpolation.
% Calculate upper and lower bound in x direction
idx1 = discretize(x,X(:,1),'IncludedEdge','right'); idx2 = idx1 + 1;
% Calculate upper and lower bound in y direction
idy1 = discretize(y,Y(1,:),'IncludedEdge','right'); idy2 = idy1 + 1;
Matt J
Matt J 2019 年 1 月 31 日
編集済み: Matt J 2019 年 1 月 31 日
griddedInterpolant is already MEX driven.
Do you have the Parallel Computing Toolbox and a strong GPU? If so, you could covert your data to gpuArrays and use interp2.
Nimananda Sharma
Nimananda Sharma 2019 年 1 月 31 日
Using histcount instead of discretize improves the speed. Computation is slightly faster than griddedinterpolant. Yes, I have the parallel computing toolbox. I have NVIDIA Quadro K420 GPU with 1 GB memory. Now, I am trying to use directly the MEX files insteading of calling the functions to save time from function overhead. I can also try gpuArrays. However, I have found that interp2 is 50 times slower than griddedInterpolant when griddedInterpolant is created before the simulation. I also tried converting my interpolation code with hiscounts as follows to MEX file. However, the computation was slower than using m-file. Here is the funtion, I am using right now
function [v] = interp2D(X, Y, V, x , y)
% Calculate upper and lower bound in x direction
[~ , ~, idx1] = histcounts(x, X(:,1));
idx1(end,:) = idx1(end,:)+1;
idx2 = idx1 - ones(size(idx1));
% Calculate upper and lower bound in y direction
[~ , ~, idy1] = histcounts(y,Y(1,:));
idy1(:,end) = idy1(:,end)+1;
idy2 = idy1 - ones(size(idy1));
% Evaluate the function at four points
V11 = V(idx1(1): idx1(end), idy1(1): idy1(end));
V12 = V(idx1(1): idx1(end), idy2(1): idy2(end));
V21 = V(idx2(1): idx2(end), idy1(1): idy1(end));
V22 = V(idx2(1): idx2(end), idy2(1): idy2(end));
x1 = X(idx1(1): idx1(end) , 1);
x2 = X(idx2(1): idx2(end) , 1);
y1 = Y(1 , idy1(1): idy1(end));
y2 = Y(1 , idy2(1): idy2(end));
% Interpolate in x-direction
Vx1 = (x2 - x) .* V11 ./ ( x2 - x1) + (x - x1) .* V21 ./( x2 - x1);
Vx2 = (x2 - x) .* V12 ./ ( x2 - x1) + (x - x1) .* V22 ./ ( x2 - x1);
% Interpolate in y-direction
v = (y2 - y) .* Vx1 ./(y2 - y1) + (y - y1) .* Vx2 ./(y2 - y1);
Matt J
Matt J 2019 年 1 月 31 日
編集済み: Matt J 2019 年 1 月 31 日
gpuArrays have a different implementation of interp2 than the one you‘ve tried. In all likelihood, it uses GPU texture memory and so should be much faster.
Nimananda Sharma
Nimananda Sharma 2019 年 1 月 31 日
The comparision between interp2 and griddedinterpolant is here as well. Scroll to the bottom of the page.
Matt J
Matt J 2019 年 1 月 31 日
編集済み: Matt J 2019 年 1 月 31 日
Again, irrelevant for the GPU. However, you will probably need something stronger than the Quadro K420.
Sean Sullivan
Sean Sullivan 2024 年 6 月 11 日
As of R2023b, griddedInterpolant also supports gpuArray input.
When I run similar code to Matt J in R2024a, but time griddedInterpolant running on the GPU as well, I see very little difference between the performance of interp2 and griddedInterpolant.
Matt J
Matt J 2024 年 6 月 13 日
As of R2023b, griddedInterpolant also supports gpuArray input.
Great news!! (Although, I hope we will eventually get the full complement of interpolation/extrapolation methods for 3D data).

サインインしてコメントする。

カテゴリ

ヘルプ センター および File ExchangeMATLAB についてさらに検索

質問済み:

2019 年 1 月 30 日

コメント済み:

2024 年 6 月 13 日

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by