Finding the nearest neighbor to a single point in MATLAB

I'm trying to do a nearest neighbor search that yields a single point as the single "nearest neighbor" to another point in matlab.
I've got the following data:
A longitude grid that is size 336x264 "lon"
some random point within the bounds of the longitude grid "dxf"
I've tried using MATLAB's "knnsearch" function
https://www.mathworks.com/help/stats/knnsearch.html
But sadly when I use the command:
idx = knnsearch(lon, dxf)
I am met with the error:
"Y must be a matrix with 264 columns."
Is there an alternative nearest neighbor search I can use to find the nearest neighbor to a single point within MATLAB? Is there a simpler solution I can implement?
I literally just want to find the closest point within the "lon" matrix to point "dxf".
Thanks!
Taylor

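You should first convert your grid to an n-by-2 matrix (if you created it with meshgrid, that is simply G = [XX(:) YY(:)]). You can then use pdist2 if you have the Statistics and Machine Learning Toolbox (which you do, since knnsearch is part of it):
[D,I] = pdist2(G, P, 'euclidean', 'Smallest', 1);
where G is the grid and P is your m-by-2 array of points to test. With 'Smallest', pdist2 returns, for each row of the second argument, the smallest distances to the rows of the first argument, so I holds row indices into G.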

If you're working without Toolboxes, you can construct a simple distance formula yourself:
xx = [0:364]; % Not sure what your limits were so just making some up here
yy = [0:264];
[X, Y] = meshgrid(xx,yy);
dxf = [221.7, 109.1]; % Again just pulling numbers from nether regions
G = [X(:),Y(:)];
d = sqrt( sum( (G-dxf).^2, 2) );
[minDist, idxMinDist] = min(d);
solution = G(idxMinDist,:);
You can modify the limits for xx and yy to fit your specific setup accordingly.
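For the setup in the question itself, where only one matrix (lon) and, presumably, a scalar query value are involved, a minimal toolbox-free sketch could look like the following. The grid values and the query value are made up for illustration; only the names lon and dxf come from the question.

```matlab
% Hypothetical stand-in for the 336x264 longitude grid from the question.
lonVec = linspace(-120, -100, 264);
lon = repmat(lonVec, 336, 1);
dxf = -110.03;                          % hypothetical scalar query value

[minDiff, k] = min(abs(lon(:) - dxf));  % nearest entry over the whole matrix
[row, col] = ind2sub(size(lon), k);     % convert linear index to subscripts
nearestLon = lon(row, col);             % grid value closest to dxf
```

If several entries are equally close, min returns the first one in column-major order.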

Related

What is wrong with my lagrange multiplier matlab code?

I want to find the two farthest points lying on a closed curve with respect to a line. A Lagrange multiplier seems to do the job, but something is wrong with my code: I have 10 solution points (length(xsol)=10, some are repeated so the picture has only 7) and only two of them are what I want (with red ticks). Why would some points not lie on the curve?
syms x y L
g = @(x,y) x^2+2*x*y^2+y^6-1;
h = @(x,y) -4*x+y; % to max or min this such that g=0 is satisfied
gradg = jacobian(g,[x,y]); gradh = jacobian(h,[x,y]);
lagr = gradh - L*gradg;
[L,xsol,ysol]=solve(lagr(1),lagr(2),g);
plot(xsol,ysol,'bo')
What is wrong?
The Lagrange multiplier method yields local maxima, local minima and also complex solutions. To get the two farthest points we need one more step: evaluate the function at all of the 'solutions', find the indices of the maximum and the minimum, and those indices give the two points.
xsolreal=real(double(xsol)); % take the real part for giving accurate results
ysolreal=real(double(ysol));
fnvalue=zeros(1,length(xsol));
for i=1:length(xsol)
fnvalue(i)=-4*xsolreal(i)+ysolreal(i); % function value of the given line
end
gmax=find(fnvalue == max(fnvalue(:)));
gmin=find(fnvalue == min(fnvalue(:)));
x1=xsolreal(gmax);
y1=ysolreal(gmax);
x2=xsolreal(gmin);
y2=ysolreal(gmin);
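One caveat: taking the real part of a complex solution generally gives a point that no longer satisfies g = 0, which is probably why some plotted points sit off the curve (plot, as far as I know, ignores imaginary parts when given two arguments). A possible variant is to discard solutions with a non-negligible imaginary part before comparing function values. The sketch below uses made-up numbers in place of double(xsol) and double(ysol), so they are not actual roots of the system; the tolerance tol is also an assumption.

```matlab
% Hypothetical numeric solutions, standing in for double(xsol) and double(ysol).
xs = [1.0; -0.5 + 0.3i; 0.2; -1.1];
ys = [0.0;  0.7 - 0.1i; 0.9;  0.4];

tol = 1e-10;                                      % assumed tolerance
isRealSol = abs(imag(xs)) < tol & abs(imag(ys)) < tol;
xr = real(xs(isRealSol));                         % keep real solutions only
yr = real(ys(isRealSol));

hval = -4*xr + yr;                                % value of h at each real solution
[~, iMax] = max(hval);
[~, iMin] = min(hval);
pMax = [xr(iMax), yr(iMax)];                      % point maximising h
pMin = [xr(iMin), yr(iMin)];                      % point minimising h
```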

How do I build a matrix using two vectors?

So I need to build a matrix of x and y coordinates. I have the x stored in one matrix called Vx=0:6000; and y stored in Vy=repmat(300,1,6001);.
Values in x are 0,1,2,...,5999,6000.
Values in y are 300,300,...,300,300.
How do I build a "vector" with the x,y coordinates above?
It would look like this [(0,300);(1,300);...;(5999,300);(6000,300)].
After I finish doing this, I am going to want to find the distance between another fixed point x,y (that I will replicate 6000 times) and the vector above, in order to make a distance graph over time.
Thank you so much!
You can just use horizontal concatenation with []
X = [Vx(:), Vy(:)];
If you want to compute the distance between another point and every point in this 2D array, you could do the following:
point = [10, 100];
distances = sqrt(sum(bsxfun(@minus, X, point).^2, 2));
If you have R2016b or newer you can simply do
distances = sqrt(sum((X - point).^2, 2));
A slightly more elegant alternative (in my opinion) is the following:
Vx = (0:1:6000).';
C = [Vx 0*Vx+300]; % Just a trick to avoid the overly verbose `repmat`.
p = [10,100]; % Define some point of reference.
d = pdist2(C,p); % The default "distance type" is 'euclidean' - which is what you need.
This uses the pdist2 function, introduced in MATLAB R2010a, and requires the Statistics and Machine Learning Toolbox.
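Without the toolbox, and on releases before R2016b, the same distances can also be obtained with hypot, which is part of base MATLAB. A small sketch, reusing the reference point from the answers above as an example:

```matlab
Vx = (0:6000).';                 % x coordinates, 6001-by-1
Vy = 300 * ones(size(Vx));       % constant y coordinate, same size as Vx
p  = [10, 100];                  % example reference point

% hypot computes sqrt(a.^2 + b.^2) element-wise
d = hypot(Vx - p(1), Vy - p(2)); % 6001-by-1 distances to p
```

Calling plot(Vx, d) afterwards would give the distance graph mentioned in the question.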

Mahalanobis distance in Matlab

I would like to calculate the mahalanobis distance of input feature vector Y (1x14) to all feature vectors in matrix X (18x14). Each 6 vectors of X represent one class (So I have 3 classes). Then based on mahalanobis distances I will choose the vector that is the nearest to the input and classify it to one of the three classes as well.
My problem is when I use the following code I got only one value. How can I get mahalanobis distance between the input Y and every vector in X. So at the end I have 18 values and then I choose the smallest one. Any help will be appreciated. Thank you.
Note: I know that the Mahalanobis distance is a measure of the distance between a point P and a distribution D, but I don't know how this could be applied in my situation.
Y = test1; % Y: 1x14 vector
S = cov(X); % X: 18x14 matrix
mu = mean(X,1);
d = ((Y-mu)/S)*(Y-mu)'
I also tried splitting the matrix X into 3, so that each part holds the feature vectors of one class. This is the code, but it doesn't work properly: I get 3 distances, and some of them are negative!
Y = test1;
X1 = Action1;
S1 = cov(X1);
mu1 = mean(X1,1);
d1 = ((Y-mu1)/S1)*(Y-mu1)'
X2 = Action2;
S2 = cov(X2);
mu2 = mean(X2,1);
d2 = ((Y-mu2)/S2)*(Y-mu2)'
X3= Action3;
S3 = cov(X3);
mu3 = mean(X3,1);
d3 = ((Y-mu3)/S3)*(Y-mu3)'
d= [d1,d2,d3];
MahalanobisDist= min(d)
One last thing, when I used mahal function provided by Matlab I got this error:
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate.
If you have to implement the distance yourself (a school assignment, for instance), this is of no use to you. But if you just need the distance as an intermediate step for other calculations, I highly recommend d = pdist2(a, b, distance_measure); the documentation is on the MathWorks site.
It computes the pairwise distances between a vector (or even a matrix) b and all rows of a, and stores them in d, where the columns correspond to entries in b and the rows to entries in a. So d(i,j) is the distance between row j in b and row i in a (hope that made sense). It even has parameters to find the k nearest neighbors; it's a great function.
In your case you would use the following code, and you'd end up with the distance between elements and the index as well:
%number of neighbors
K = 1;
% X=18x14, Y=1x14, dist=Kx1 (a single value when K=1)
[dist, iidx] = pdist2(X,Y,'mahalanobis','smallest',K);
%to find the class, you can do something like this
num_samples_per_class = 6;
matching_class = ceil(iidx/ num_samples_per_class);
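Two remarks that may explain the symptoms in the question. With only 6 samples per class in 14 dimensions, each per-class covariance matrix has rank at most 5, so it is singular; dividing by it is ill-conditioned, which would be consistent with both the warning and the negative values. The covariance of the full 18x14 matrix, on the other hand, can be invertible. The sketch below uses random data as a stand-in for X and Y (an assumption, since the real features are not shown) and computes the 18 distances with base MATLAB only. It relies on implicit expansion (R2016b or newer).

```matlab
rng(0);                          % reproducible stand-in data
X = randn(18, 14);               % hypothetical 18 feature vectors
Y = randn(1, 14);                % hypothetical query vector

S = cov(X);                      % 14x14 covariance of the whole data set
D = X - Y;                       % 18x14 differences (implicit expansion)
d = sqrt(sum((D / S) .* D, 2));  % 18x1 Mahalanobis distances

[dMin, iMin] = min(d);           % nearest vector and its row index
matching_class = ceil(iMin / 6); % 6 consecutive rows per class

% A covariance estimated from 6 samples cannot have full rank in 14 dimensions
S1 = cov(X(1:6, :));
r1 = rank(S1);                   % at most 5 for 6 samples
```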

Matlab calculating nearest neighbour distance for all (u, v) vectors in an array

I am trying to calculate the distance between nearest neighbours within a nx2 matrix like the one shown below
point_coordinates =
11.4179 103.1400
16.7710 10.6691
16.6068 119.7024
25.1379 74.3382
30.3651 23.2635
31.7231 105.9109
31.8653 36.9388
%for loop going from the top of the vector column to the bottom
for counter = 1:size(point_coordinates,1)
%current point defined selected
current_point = point_coordinates(counter,:);
%math to calculate distance between the current point and all the points
distance_search= point_coordinates-repmat(current_point,[size(point_coordinates,1) 1]);
dist_from_current_point = sqrt(distance_search(:,1).^2+distance_search(:,2).^2);
%line to omit self subtraction that gives zero
dist_from_current_point (dist_from_current_point <= 0)=[];
%gives the shortest distance calculated for a certain vector and current_point
nearest_dist=min(dist_from_current_point);
end
%final line to plot the u,v vectors and the corresponding nearest neighbour
%distances
matnndist = [point_coordinates nearest_dist]
I am not sure how to structure the 'for' loop/nearest_neighbour line to be able to get the nearest neighbour distance for each u,v vector.
I would like to have, for example ;
for the first vector you could have the coordinates and the corresponding shortest distance, for the second vector another its shortest distance, and this goes on till n
Hope someone can help.
Thanks
I understand you want to obtain the minimum distance between different points.
You can compute the distance for each pair of points with bsxfun; remove self-distances; minimize. It's more computationally efficient to work with squared distances, and take the square root only at the end.
n = size(point_coordinates,1);
dist = bsxfun(@minus, point_coordinates(:,1), point_coordinates(:,1).').^2 + ...
bsxfun(@minus, point_coordinates(:,2), point_coordinates(:,2).').^2;
dist(1:n+1:end) = inf; %// remove self-distances
min_dist = sqrt(min(dist(:)));
Alternatively, you could use pdist. This avoids computing each distance twice, and also avoids self-distances:
dist = pdist(point_coordinates);
min_dist = min(dist(:));
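If what is needed is the nearest-neighbour distance for every point rather than the single smallest distance overall, the same squared-distance matrix can be minimised along each row instead. A sketch with the data from the question, assuming R2016b or newer for implicit expansion:

```matlab
point_coordinates = [11.4179 103.1400
                     16.7710  10.6691
                     16.6068 119.7024
                     25.1379  74.3382
                     30.3651  23.2635
                     31.7231 105.9109
                     31.8653  36.9388];

n  = size(point_coordinates, 1);
dx = point_coordinates(:,1) - point_coordinates(:,1).';   % n-by-n x differences
dy = point_coordinates(:,2) - point_coordinates(:,2).';   % n-by-n y differences
dist2 = dx.^2 + dy.^2;                                    % squared distances
dist2(1:n+1:end) = inf;                                   % remove self-distances

[nearest_d2, nearest_idx] = min(dist2, [], 2);            % row-wise minimum
nearest_dist = sqrt(nearest_d2);                          % one distance per point
matnndist = [point_coordinates nearest_dist];             % coordinates + distance
```

Here nearest_idx(k) is the row of the point closest to point k, and matnndist has the layout the question asked for.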
If I can suggest a built-in function, use knnsearch from the statistics toolbox. What you are essentially doing is a K-Nearest Neighbour (KNN) algorithm, but you are ignoring self-distances. The way you would call knnsearch is in the following way:
[idx,d] = knnsearch(X, Y, 'k', k);
In simple terms, the KNN algorithm returns the k points in your data set that are closest to a given query point. Usually, the Euclidean distance is the distance metric that is used. For MATLAB's knnsearch, X is a 2D array that consists of your dataset where each row is an observation and each column is a variable. Y would be the query points. Y is also a 2D array where each row is a query point and you need to have the same number of columns as X. We would also specify the flag 'k' to denote how many closest points you want returned. By default, k = 1.
As such, idx would be a N x K matrix, where N is the total number of query points (number of rows of Y) and K would be those k closest points to the dataset for each query point we have. idx indicates the particular points in your dataset that were closest to each query. d is also a N x K matrix that returns the smallest distances for these corresponding closest points.
As such, what you want to do is find the closest point for your dataset to each of the other points, ignoring self-distances. Therefore, you would set both X and Y to be the same, and set k = 2, discarding the first column of both outputs to get the result you're looking for.
Therefore:
[idx,d] = knnsearch(point_coordinates, point_coordinates, 'k', 2)
idx = idx(:,2);
d = d(:,2);
We thus get for idx and d:
>> idx
idx =
3
5
1
1
7
3
5
>> d
d =
17.3562
18.5316
17.3562
31.9027
13.7573
20.4624
13.7573
As such, this tells us that for the first point in your data set, it matched with point #3 the best. This matched with the closest distance of 17.3562. For the second point in your data set, it matched with point #5 the best with the closest distance being 18.5316. You can continue on with the rest of the results in a similar pattern.
If you don't have access to the statistics toolbox, consider reading my StackOverflow post on how I compute KNN from first principles.
Finding K-nearest neighbors and its implementation
In fact, it is very similar to Luis Mendo's post to you earlier.
Good luck!

Finding the belonging value of given point on a grid of 3D histogram?

I use 2D dataset like below,
37.0235000000000 18.4548000000000
28.4454000000000 15.7814000000000
34.6958000000000 20.9239000000000
26.0374000000000 17.1070000000000
27.1619000000000 17.6757000000000
28.4101000000000 15.9183000000000
33.7340000000000 17.1615000000000
34.7948000000000 18.2695000000000
34.5622000000000 19.3793000000000
36.2884000000000 18.4551000000000
26.1695000000000 16.8195000000000
26.2090000000000 14.2081000000000
26.0264000000000 21.8923000000000
35.8194000000000 18.4811000000000
to create a 3D histogram.
How can I find the histogram value of a point on a grid? For example, if [34.7948000000000 18.2695000000000] point is given, I would like to find the corresponding value of a histogram for a given point on the grid.
I used this code
point = feat_vec(i,:); % take the point given by the data set
X = centers{1}(1,:); % take center of the bins at one dimension
Y = centers{2}(1,:); % take center of the bins at other dim.
distanceX = abs(X-point(1)); % find distance to all bin centers at one dimension
distanceY = abs(Y-point(2)); % find distance to center points of other dimension
[~,indexX] = min(distanceX); % find the index of minimum distant center point
[~,indexY] = min(distanceY); % find the index of minimum distant center point for other dimension
You could use interp2 to accomplish that!
Suppose X (a 1-D vector of length N) and Y (a 1-D vector of length M) hold the discrete coordinates on the axes where your histogram has defined values Z (a matrix of size M x N). The value at one particular point with coordinates (XI, YI) can then be obtained with:
% generate grid
[XM, YM] = meshgrid(X, Y);
% interpolate desired value
ZI = interp2(XM, YM, Z, XI, YI, 'spline')
In general, this kind of problem is an interpolation problem. If you want values for multiple points, you would have to generate a grid for them in a similar fashion to the code above. You could also use another interpolation method, for example linear (refer to the linked documentation!)
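As a small illustration of the interp2 call, here is a sketch with a made-up grid and made-up values (X, Y and Z below are hypothetical, not the histogram from the question). Note that interpolation returns blended values between bin centers; if the value of the nearest bin is wanted instead, the 'nearest' method is one option.

```matlab
X = 0:3;                         % hypothetical bin centers along x (N = 4)
Y = 0:2;                         % hypothetical bin centers along y (M = 3)
[XM, YM] = meshgrid(X, Y);       % both 3x4, matching the size of Z
Z = 2*XM + 3*YM;                 % hypothetical values, one per grid node

ZL = interp2(XM, YM, Z, 1.5, 0.5, 'linear');   % blends the 4 surrounding nodes
ZN = interp2(XM, YM, Z, 1.4, 0.4, 'nearest');  % value at the closest node (1, 0)
```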
I think you mean this:
[N,C] = hist3(X,...) returns the positions of the bin centers in a
1-by-2 cell array of numeric vectors, and does not plot the histogram.
That being said, if you have a 2D point x=[x1, x2], you only need to look up the closest points in C and take the corresponding value in N.
In Matlab code:
[N, C] = hist3(data); % with your data format...
[~,indX] = min(abs(C{1}-x(1)));
[~,indY] = min(abs(C{2}-x(2)));
result = N(indX,indY);
done. (You can make it into your own function say result = hist_val(data, x).)
EDIT:
I just saw that my answer in essence is just a more detailed version of @Erogol's answer.
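The lookup step on its own does not need the toolbox once N and C are available. A minimal sketch with made-up bin centers and counts (the values of C and N below are hypothetical, not the output of hist3 on the data above):

```matlab
C = {[26 30 34], [15 18 21]};    % hypothetical bin centers per dimension
N = [3 1 1
     1 0 0
     1 5 2];                     % hypothetical counts: N(i,j) for C{1}(i), C{2}(j)
x = [34.7948 18.2695];           % query point from the question

[~, indX] = min(abs(C{1} - x(1)));   % nearest center along dimension 1
[~, indY] = min(abs(C{2} - x(2)));   % nearest center along dimension 2
result = N(indX, indY);              % count of the bin closest to x
```

Rows of N follow the first data column and columns follow the second, which is why the index order is N(indX, indY).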