I have a region with about 144 points. What i want to achieve is to measure the distance of a point with all others and store it in an array. I want to do this for all the points. If possible i would like to store this data in such a manner that there in no repetition. And i should be able to make queries like- All the distances between all the points without repetition, sum of all the distances for point no56 etc.
I have a 3*144 array with two columns storing the coordinates of the points.
A possible solution (I am not really clear with what you mean by no repetition, though):
X are your points with coordinates x = X(:,1), y = X(:,2)
dist = sqrt(bsxfun(#minus,X(:,1),X(:,1)').^2 + bsxfun(#minus,X(:,2),X(:,2)').^2)
so
dist(i,j) is the euclidean distance between i and j
of course the matrix is symmetric. You can easily reduce the complexity involved.
Let's say your array is A, where each column stores the coordinates of a single point. To obtain the combinations of all point pairs (without repetitions), use nchoosek:
pairs = nchoosek(1:size(A, 2), 2)
Then calculate the Euclidean distance like so:
dist = sqrt(sum((A(:, pairs(:, 1)) - A(:, pairs(:, 2))) .^ 2, 1))
If you have the Statistics Toolbox installed, you can use pdist(A) instead for the same effect.
If you have the statistics toolbox, and if you have all your data in the array X, then
D = pdist(X)
gives all the pairwise distances between all points in X.
Related
I have the lists xA, yA, zA and the lists xB, yB, zB in Matlab. The contain the the x,y and z coordinates of points of type A and type B. There may be different numbers of type A and type B points.
I would like to calculate a matrix containing the Euclidean distances between points of type A and type B. Of course, I only need to calculate one half of the matrix, since the other half contains duplicate data.
What is the most efficient way to do that?
When I'm done with that, I want to find the points of type B that are closest to one point of type A. How do I then find the coordinates the closest, second closest, third closest and so on points of type B?
Given a matrix A of size [N,3] and a matrix B of size [M,3], you can use the pdist2 function to get a matrix of size [M,N] containing all the pairwise distances.
If you want to order the points from B by their distance to the rth point in A then you can sort the rth row of the pairwise distance matrix.
% generate some example data
N = 4
M = 7
A = randn(N,3)
B = randn(M,3)
% compute N x M matrix containing pairwise distances
D = pdist2(A, B, 'euclidean')
% sort points in B by their distance to the rth point in A
r = 3
[~, b_idx] = sort(D(r,:))
Then b_idx will contain the indices of the points in B sorted by their distance to the rth point in A. So the actual points in B ordered by b_idx can be obtained with B(b_idx,:), which has the same size as B.
If you want to do this for all r you could use
[~, B_idx] = sort(D, 1)
to sort all the rows of D at the same time. Then the rth row of B_idx will contain b_idx.
If your only objective is to find the closest k points in B for each point in A (for some positive integer k which is less than M), then we would not generally want to compute all the pairwise distances. This is because space-partitioning data structures like K-D-trees can be used to improve the efficiency of searching without explicitly computing all the pairwise distances.
Matlab provides a knnsearch function that uses K-D-trees for this exact purpose. For example, if we do
k = 2
B_kidx = knnsearch(B, A, 'K', k)
then B_kidx will be the first two columns of B_idx, i.e. for each point in A the indices of the nearest two points in B. Also, note that this is only going to be more efficient than the pdist2 method when k is relatively small. If k is too large then knnsearch will automatically use the explicit method from before instead of the K-D-tree approach.
So I need to build a matrix of x and y coordinates. I have the x stored in one matrix called vx=0:6000; and y stored in Vy=repmat(300,1,6000);.
Values in x are 0,1,2,...,5999,6000.
Values in y are 300,300,...,300,300.
How do I build a "vector" with the x,y coordinates above?
It would look like this [(0,300);(1,300);...;(5999,300);(6000,300)].
After I finish doing this, I am going to want to find the distance between another fixed point x,y (that I will replicate 6000 times) and the vector above, in order to make a distance graph over time.
Thank you so much!
You can just use horizontal concatenation with []
X = [Vx(:), Vy(:)];
If you want to compute the distance between another point and every point in this 2D array, you could do the following:
point = [10, 100];
distances = sqrt(sum(bsxfun(#minus, X, point).^2, 2));
If you have R2016b or newer you can simply do
distances = sqrt(sum((X - point).^2, 2));
A slightly more elegant alternative (in my opinion) is the following:
Vx = (0:1:6000).';
C = [Vx 0*Vx+300]; % Just a trick to avoid the overly verbose `repmat`.
p = [10,100]; % Define some point of reference.
d = pdist2(C,p); % The default "distance type" is 'euclidian' - which is what you need.
This uses the pdist2 function, introduced in MATLAB 2010a, and requires the Statistics and Machine Learning Toolbox.
I am trying to calculate the distance between nearest neighbours within a nx2 matrix like the one shown below
point_coordinates =
11.4179 103.1400
16.7710 10.6691
16.6068 119.7024
25.1379 74.3382
30.3651 23.2635
31.7231 105.9109
31.8653 36.9388
%for loop going from the top of the vector column to the bottom
for counter = 1:size(point_coordinates,1)
%current point defined selected
current_point = point_coordinates(counter,:);
%math to calculate distance between the current point and all the points
distance_search= point_coordinates-repmat(current_point,[size(point_coordinates,1) 1]);
dist_from_current_point = sqrt(distance_search(:,1).^2+distance_search(:,2).^2);
%line to omit self subtraction that gives zero
dist_from_current_point (dist_from_current_point <= 0)=[];
%gives the shortest distance calculated for a certain vector and current_point
nearest_dist=min(dist_from_current_point);
end
%final line to plot the u,v vectors and the corresponding nearest neighbour
%distances
matnndist = [point_coordinates nearest_dist]
I am not sure how to structure the 'for' loop/nearest_neighbour line to be able to get the nearest neighbour distance for each u,v vector.
I would like to have, for example ;
for the first vector you could have the coordinates and the corresponding shortest distance, for the second vector another its shortest distance, and this goes on till n
Hope someone can help.
Thanks
I understand you want to obtain the minimum distance between different points.
You can compute the distance for each pair of points with bsxfun; remove self-distances; minimize. It's more computationally efficient to work with squared distances, and take the square root only at the end.
n = size(point_coordinates,1);
dist = bsxfun(#minus, point_coordinates(:,1), point_coordinates(:,1).').^2 + ...
bsxfun(#minus, point_coordinates(:,2), point_coordinates(:,2).').^2;
dist(1:n+1:end) = inf; %// remove self-distances
min_dist = sqrt(min(dist(:)));
Alternatively, you could use pdist. This avoids computing each distance twice, and also avoids self-distances:
dist = pdist(point_coordinates);
min_dist = min(dist(:));
If I can suggest a built-in function, use knnsearch from the statistics toolbox. What you are essentially doing is a K-Nearest Neighbour (KNN) algorithm, but you are ignoring self-distances. The way you would call knnsearch is in the following way:
[idx,d] = knnsearch(X, Y, 'k', k);
In simple terms, the KNN algorithm returns the k closest points to your data set given a query point. Usually, the Euclidean distance is the distance metric that is used. For MATLAB's knnsearch, X is a 2D array that consists of your dataset where each row is an observation and each column is a variable. Y would be the query points. Y is also a 2D array where each row is a query point and you need to have the same number of columns as X. We would also specify the flag 'k' to denote how many closest points you want returned. By default, k = 1.
As such, idx would be a N x K matrix, where N is the total number of query points (number of rows of Y) and K would be those k closest points to the dataset for each query point we have. idx indicates the particular points in your dataset that were closest to each query. d is also a N x K matrix that returns the smallest distances for these corresponding closest points.
As such, what you want to do is find the closest point for your dataset to each of the other points, ignoring self-distances. Therefore, you would set both X and Y to be the same, and set k = 2, discarding the first column of both outputs to get the result you're looking for.
Therefore:
[idx,d] = knnsearch(point_coordinates, point_coordinates, 'k', 2)
idx = idx(:,2);
d = d(:,2);
We thus get for idx and d:
>> idx
idx =
3
5
1
1
7
3
5
>> d
d =
17.3562
18.5316
17.3562
31.9027
13.7573
20.4624
13.7573
As such, this tells us that for the first point in your data set, it matched with point #3 the best. This matched with the closest distance of 17.3562. For the second point in your data set, it matched with point #5 the best with the closest distance being 18.5316. You can continue on with the rest of the results in a similar pattern.
If you don't have access to the statistics toolbox, consider reading my StackOverflow post on how I compute KNN from first principles.
Finding K-nearest neighbors and its implementation
In fact, it is very similar to Luis Mendo's post to you earlier.
Good luck!
I need to generate N random coordinates for a 2D plane. The distance between any two points are given (number of distance is N(N - 1) / 2). For example, say I need to generate 3 points i.e. A, B, C. I have the distance between pair of them i.e. distAB, distAC and distBC.
Is there any built-in function in MATLAB that can do this? Basically, I'm looking for something that is the reverse of pdist() function.
My initial idea was to choose a point (say A is the origin). Then, I can randomly find B and C being on two different circles with radii distAB and distAC. But then the distance between B and C might not satisfy distBC and I'm not sure how to proceed if this happens. And I think this approach will get very complicated if N is a large number.
Elaborating on Ansaris answer I produced the following. It assumes a valid distance matrix provided, calculates positions in 2D based on cmdscale, does a random rotation (random translation could be added also), and visualizes the results:
%Distance matrix
D = [0 2 3; ...
2 0 4; ...
3 4 0];
%Generate point coordinates based on distance matrix
Y = cmdscale(D);
[nPoints dim] = size(Y);
%Add random rotation
randTheta = 2*pi*rand(1);
Rot = [cos(randTheta) -sin(randTheta); sin(randTheta) cos(randTheta) ];
Y = Y*Rot;
%Visualization
figure(1);clf;
plot(Y(:,1),Y(:,2),'.','markersize',20)
hold on;t=0:.01:2*pi;
for r = 1 : nPoints - 1
for c = r+1 : nPoints
plot(Y(r,1)+D(r,c)*sin(t),Y(r,2)+D(r,c)*cos(t));
plot(Y(c,1)+D(r,c)*sin(t),Y(c,2)+D(r,c)*cos(t));
end
end
You want to use a technique called classical multidimensional scaling. It will work fine and losslessly if the distances you have correspond to distances between valid points in 2-D. Luckily there is a function in MATLAB that does exactly this: cmdscale. Once you run this function on your distance matrix, you can treat the first two columns in the first output argument as the points you need.
I'm writing a program for school and I have nested for-loops that create a 4-dimensional array (of the distances between two points with coordinates (x,y) and (x',y')) as below:
pos_x=1:20;
pos_y=1:20;
Lx = length(pos_x);
Ly = length(pos_y);
Lx2 = Lx/2;
Ly2 = Ly/2;
%Distance function, periodic boundary conditions
d_x=abs(repmat(1:Lx,Lx,1)-repmat((1:Lx)',1,Lx));
d_x(d_x>Lx2)=Lx-d_x(d_x>Lx2);
d_y=abs(repmat(1:Ly,Ly,1)-repmat((1:Ly)',1,Ly));
d_y(d_y>Ly2)=Ly-d_y(d_y>Ly2);
for l=1:Ly
for k=1:Lx
for j=1:Ly
for i=1:Lx
distance(l,k,j,i)=sqrt(d_x(k,i).^2+d_y(l,j).^2);
end
end
end
end
d_x and d_y are just 20x20 matrices and Lx=Ly for trial purposes. It's very slow and obviously not a very elegant way of doing it. I tried to vectorize the nested loops and succeeded in getting rid of the two inner loops as:
dx2=zeros(Ly,Lx,Ly,Lx);
dy2=zeros(Ly,Lx,Ly,Lx);
distance=zeros(Ly,Lx,Ly,Lx);
for l=1:Ly
for k=1:Lx
dy2(l,k,:,:)=repmat(d_y(l,:),Ly,1);
dx2(l,k,:,:)=repmat(d_x(k,:)',1,Lx);
end
end
distance=sqrt(dx2.^2+dy2.^2);
which basically replaces the 4 for-loops above. I've now been trying for 2 days but I couldn't find a way to vectorize all the loops. I wanted to ask:
whether it's possible to actually get rid of these 2 loops
if so, i'd appreciate any tips and tricks to do so.
I have so far tried using repmat again in 4 dimensions, but you can't transpose a 4 dimensional matrix so I tried using permute and repmat together in many different combinations to no avail.
Any advice will be greatly appreciated.
thanks for the replies. Sorry for the bad wording, what I basically want is to have a population of oscillators uniformly located on the x-y plane. I want to simulate their coupling and the coupling function is a function of the distance between every oscillator. And every oscillator has an x and a y coordinate, so i need to find the distance between osci(1,1) and osci(1,1),..osci(1,N),osci(2,1),..osci(N,N)... and then the same for osci(1,2) and osci(1,1)...osci(N,N) and so on.. (so basically the distance between all oscillators and all other oscillators plus the self-coupling) if there's an easier way to do it other than using a 4-D array, i'd also definitely like to know it..
If I understand you correctly, you have oscillators all over the place, like this:
Then you want to calculate the distance between oscillator 1 and oscillators 1 through 100, and then between oscillator 2 and oscillators 1 through 100 etc. I believe that this can be represented by a 2D distance matrix, were the first dimension goes from 1 to 100, and the second dimension goes from 1 to 100.
For example
%# create 100 evenly spaced oscillators
[xOscillator,yOscillator] = ndgrid(1:10,1:10);
oscillatorXY = [xOscillator(:),yOscillator(:)];
%# calculate the euclidean distance between the oscillators
xDistance = abs(bsxfun(#minus,oscillatorXY(:,1),oscillatorXY(:,1)')); %'# abs distance x
xDistance(xDistance>5) = 10-xDistance; %# add periodic boundary conditions
yDistance = abs(bsxfun(#minus,oscillatorXY(:,2),oscillatorXY(:,2)')); %'# abs distance y
yDistance(yDistance>5) = 10-yDistance; %# add periodic boundary conditions
%# and we get the Euclidean distance
euclideanDistance = sqrt(xDistance.^2 + yDistance.^2);
I find that imaginary numbers can sometimes help convey coupled information quite well while reducing clutter. My method will double the number of calculations necessary (i.e. I find the distance X and Y then Y and X), and I still need a single for loop
x = 1:20;
y = 1:20;
[X,Y] = meshgrid(x,y);
Z =X + Y*i;
z = Z(:);
leng = length(z);
store = zeros(leng);
for looper = 1:(leng-1)
dummyz = circshift(z,looper);
store(:,looper+1) = z - dummyz;
end
final = abs(store);