chi squared dist for histograms: vectorized vs forloops

Adrian Szatmari on 30 Nov 2017
Dear all, I was looking at the widely used chi-squared distance between histograms, in the implementation written by Peter Kovesi. Here it is:
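function [D] = chisq_forloop(X,Y)
%%%supposedly it's possible to implement this without a loop!
% X is m-by-bins, Y is n-by-bins; D(i,j) is the distance between X(i,:) and Y(j,:)
m = size(X,1); n = size(Y,1);
mOnes = ones(1,m); D = zeros(m,n);
for i=1:n
    yi = Y(i,:); yiRep = yi( mOnes, : );   % replicate row i of Y m times
    s = yiRep + X; d = yiRep - X;
    D(:,i) = sum( d.^2 ./ (s+eps), 2 );    % eps guards against division by zero
end
D = D/2;
end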
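For reference, the quantity the loop above computes can be written out as a formula. This is read directly off the code; the symbol b (the number of histogram bins, i.e. columns of X and Y) and the index k are notation introduced here, and ε stands for MATLAB's eps:

```latex
D_{ij} \;=\; \frac{1}{2} \sum_{k=1}^{b} \frac{\left(X_{ik} - Y_{jk}\right)^{2}}{X_{ik} + Y_{jk} + \varepsilon},
\qquad i = 1,\dots,m, \quad j = 1,\dots,n .
```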
So I went ahead and implemented it without a for loop. Here it is:
function [D] = chisq_vec(X,Y)
%Xrep and Yrep each have m*n rows, arranged as m blocks of n rows.
%Block i (rows (i-1)*n+1 to i*n) pairs X(i,:) with every row of Y,
%so D(i,j) comes from row j of block i, i.e. row (i-1)*n+j.
m = size(X,1);
n = size(Y,1);
Xrep = kron(X,ones(n,1));   % each row of X repeated n times consecutively
Yrep = repmat(Y,m,1);       % all of Y stacked m times
chi = sum(((Xrep - Yrep).^2)./(Xrep + Yrep + eps),2)/2;
D = vec2mat(chi,n);         % fills row by row, n entries per row, giving m-by-n (Communications Toolbox)
end
But to my surprise, the vectorized version is slower. In particular, most of the computation time seems to be taken by the line chi = sum(((Xrep - Yrep).^2)./(Xrep + Yrep + eps),2)/2; The timing code is here:
clc
clear
bins = 30;
nb_x = 100;
nb_y = 99;
X = rand(nb_x,bins)+eps;
Y = rand(nb_y,bins)+eps;
D_loop = chisq_forloop(X,Y);
D_vec = chisq_vec(X,Y);
isequal(D_loop,D_vec)   % check that both versions return the same matrix
time_vec = zeros(20,1);
time_loop = zeros(20,1);
f_loop = @() chisq_forloop(X,Y);
f_vec = @() chisq_vec(X,Y);
for i = 1:20
    time_vec(i) = timeit(f_vec);
    time_loop(i) = timeit(f_loop);
end
figure
hold on
plot(1:20,time_vec,'b');    % vectorized version in blue
plot(1:20,time_loop,'r');   % for-loop version in red
But the vectorized version is much slower (about 4x)! Is this normal? Does anyone have ideas for salvaging it? I understand that many people use bsxfun or GPU computing for this kind of thing, although I have no experience with either.
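For illustration, a bsxfun-based variant might look like the sketch below. The name chisq_bsx and the permute-based layout are assumptions made here, not part of the original code, and no timing has been measured for it, so it is only a starting point for comparison against the two versions above.

```matlab
function D = chisq_bsx(X, Y)
% Hypothetical bsxfun-based sketch (chisq_bsx is an illustrative name).
% X is m-by-b, Y is n-by-b; D is m-by-n, with D(i,j) the chi-squared
% distance between X(i,:) and Y(j,:).
Xp = permute(X, [1 3 2]);        % m-by-1-by-b
Yp = permute(Y, [3 1 2]);        % 1-by-n-by-b
d  = bsxfun(@minus, Xp, Yp);     % m-by-n-by-b pairwise differences
s  = bsxfun(@plus,  Xp, Yp);     % m-by-n-by-b pairwise sums
D  = sum(d.^2 ./ (s + eps), 3) / 2;   % sum over the bin dimension
end
```

It can be checked against the loop version with max(abs(D_bsx(:) - D_loop(:))), using a small tolerance rather than isequal, since the summation order over bins may differ. Note that d and s each hold m*n*b elements, so the memory footprint is on the same scale as the kron/repmat version.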

Answers (0)
