# Is it possible to calculate the average Manhattan distance between a set of 2D points with one function?

22 ビュー (過去 30 日間)
Bálint Udvardy 2020 年 2 月 23 日
コメント済み: Bálint Udvardy 2020 年 2 月 28 日
Is there an easier way to calculate the average Manhattan distance between a set of points easier than I have it in my code? I have a matrix, which contains a set of 2D points (the columns corespond to the x and y coordinates).
%for the first point
[idx,~]=dsearchn(isec(2:end,:),isec(1,:));
sum=pdist2(isec(1,:), isec(idx+1,:), 'cityblock');
%for the rest of the points
for i=2:size(isec,1)-1
others=[isec(1:i-1,:); isec(i+1:end,:)];
[idx,~]=dsearchn(others,isec(i,:));
man=pdist2(isec((i,:), others(idx,:), 'cityblock');
sum=sum+man;
end
%for the last point
[idx,~]=dsearchn(isec(1:end-1,:),isec(end,:));
sum=sum+pdist2(isec(end,:), isec(idx,:), 'cityblock');
avg_dist=sum/size(isec,1);
As dsearch cannot calculate other distances than euclidean, I use it to find the index of the nearest point. After that I have to call the pdist2 function to measure the cityblock distance. What complicates things even more is the fact, that I have to exclude the current point (to which I'm trying to find the closest one); that's the purpos of my 'others' variable and the 2 pieces of code above and below the for-cycle.
To give a bit of context:
The goal is image segmentation. The points are actually line intersections in a grid. I want to calculate the average Manhattan distance in order to use the result as a treshhold to determine which points are roughly in the given row/column (unlike in the case of crosswords, sudokus, nonograms I'm not interested in the square areas between the lines, rather in the intersections).

#### 2 件のコメント

David Goodmanson 2020 年 2 月 23 日
Hi Balint, are you looking for the average distance for all possible pairs of points, or the average distance for one particular path that visits all the points?
Bálint Udvardy 2020 年 2 月 24 日
Sorry, my question is quite confusing. I'm looking for the former mentioned one. More specifically: I want to calculate the distance of the nearest point (excluding the one I'm currently on) for all of the points separatelly; so there will be just as many 'nearest distances' as many points there are. After that I would like to average those distances. So I want an average nearest distance (one number) for the whole set.

サインイン to comment.

### 採用された回答

Jon 2020 年 2 月 24 日

I think this does what you want. You could probably make it more efficient recognizing the symmetry of the distance matrix, but just wanted to get the main idea across
% just some data to try, you'd put your here
x = [1 8 7 2 6 3]
y = [3 6 2 5 1 4]
numPoints = length(x)
% make array of Manhattan distances (L1 norm) between between each point to all other points
D = zeros(numPoints)
for j = 1:numPoints
for k = 1:numPoints
D(j,k) = norm([x(j),y(j)] -[x(k),y(k)],1);
end
end
% set main diagonal to inf so it won't be included when looking for minimum
% distance
D = D + diag(inf(numPoints,1));
% find minimum distance for each row
dMin = min(D,[],2);
% find average minimum
dAvg = mean(dMin)

#### 3 件のコメント

Bálint Udvardy 2020 年 2 月 25 日
Your code works too - it gives the same results as mine. Hovewer, it is like 25x slower than my original one. Maybe on your dataset the difference is not so significant, but I have 3656 points in total. I guess it's due to the nested for-loop.
I'm looking for a faster, more compact code; which preferably uses built-in matlab functions.
Anyway, thank you for your answer; it is always interesting to see another approach.
Jon 2020 年 2 月 25 日
Here's a way to do it that is very simple without any loops. I benchmarked it using the MATLAB tic toc commands on my machine and found that with a 4000 points the earlier way with loops took 5 seconds and the one below with no loops took only 0.4 seconds!
% just some data to try, you'd put your here
x = [1 8 7 2 6 3];
y = [3 6 2 5 1 4];
numPoints = length(x)
tic
% find component distances
X = x - x';
Y = y - y';
% find total "Manhattan Distance"
D = abs(X) + abs(Y);
% set main diagonal to inf so it won't be included when looking for minimum
% distance
D(1:numPoints+1:end) = inf;
% find minimum distance for each row
dMin = min(D,[],2);
% find average minimum
dAvg = mean(dMin)
Bálint Udvardy 2020 年 2 月 28 日
Thanks, it works even faster than my original one (4x faster).

サインイン to comment.

### その他の回答 (0 件)

サインイン してこの質問に回答します。

Translated by