Is there a fast method to calculate the nearest point in a data set under a point-dependent distance function

I am searching for a (fast) way to calculate the nearest point y in a dataset to a given point x under a distance function that depends on (x,y).
My distance function has the form: d(x,y) = 1/f(x,y) * ||x-y||^2, where ||x|| denotes the standard Euclidean norm. The function f(x,y) fulfills all the properties necessary for d(x,y) to be a distance measure, i.e. positive, symmetric, ...
For a "normal" distance function I could do some transformation on the data itself and use a k-nearest-neighbor approach. But for this case I could not find anything useful. Does anyone have an idea?
Right now, I am using Julia for the implementation.

You should be able to use most standard spatial indexes (kd-tree, r-tree, quadtree, and their derivatives) as long as d(x,y) is "convex".
By "convex" I mean that a curve of equidistant points around a point P is convex. E.g. for Euclidean distance this is a circle, for Manhattan/taxicab distance it is a square.
This is required because these indexes usually partition the data into squares, rectangles or half-spaces (kd-tree), so they rely on calculating the minimum distance to a group of points by calculating the distance to the corners or sides of a bounding rectangle. As long as your distance function is convex (or at least not concave), any of these indexes should work.
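To illustrate the bounding-rectangle argument, here is a minimal Python sketch (not part of the original answer) of the lower bound such an index relies on, written for plain squared Euclidean distance. The function name min_sq_dist_to_box is hypothetical; for a custom d(x,y) you would need an analogous bound that is valid for your particular f.

```python
def min_sq_dist_to_box(point, box_lo, box_hi):
    """Smallest squared Euclidean distance from `point` to any location
    inside the axis-aligned box [box_lo, box_hi] (0.0 if the point is inside)."""
    total = 0.0
    for p, lo, hi in zip(point, box_lo, box_hi):
        closest = min(max(p, lo), hi)  # clamp the coordinate onto the box
        total += (p - closest) ** 2
    return total


# An index can skip a whole box when this bound already exceeds
# the best distance found so far.
print(min_sq_dist_to_box((0.0, 0.0), (1.0, 1.0), (2.0, 2.0)))
```

Clamping works here because the Euclidean balls are convex, which is the property the answer asks of d(x,y).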

Related

Finding length between a lot of elements

I have an image of a cytoskeleton. There are a lot of small objects inside and I want to calculate the length between all of them in every axis and to get a matrix with all this data. I am trying to do this in matlab.
My final aim is to figure out whether there is any axis along which the distance between the objects is constant.
I've tried bwdist and to use connected components without any luck.
Do you have any other ideas?
So, the end goal is to globally stretch this image in a certain direction (linearly) so that the distances between nearest pairs end up as close together as possible, hopefully the same? Or may you do more complex stretching? (Note that with an arbitrarily complex one you can always make it work :) )
If it is a linear global stretch, the distance in x' and y' is simply a multiple of the old distance in x and y, applied to every pair of points. So the final Euclidean distance ends up being sqrt((SX*X)^2 + (SY*Y)^2), with SX being the stretch in x and SY the stretch in y; X and Y are the distances in x and y between a pair of points.
If you are interested in just "the same" part, solution is not so difficult:
Find all objects of interest and put their X and Y coordinates in a N*2 matrix.
Calculate distances between all pairs of objects in X and Y. You will end up with 2 matrices sized N*N (with 0 on the diagonal, symmetric and real, not sure what is the name for that type of matrix).
Find the minimum distance (say this is between A and B).
You probably already have this. Now:
Take C. Make N-1 transformations, each of which ends up with C->nearestToC = A->B. It is a simple system of equations: X1^2*SX^2 + Y1^2*SY^2 = X2^2*SX^2 + Y2^2*SY^2, where (X1, Y1) and (X2, Y2) are the x and y distances of the two pairs being equated.
So, first say A->B = C->A, then A->B = C->B, then A->B = C->D etc etc. Make sure transformation is normalized => SX^2 + SY^2 = 1. If it cannot be found, the only valid transformation is SX = SY = 0 which means you don't have solution here. Obviously, SX and SY need to be real.
Note that this solution is unique except in case where X1 = X2 and Y1 = Y2. In this case, grab some other point than C to find this transformation.
For each transformation check the remaining points and find all nearest neighbours of them. If distance is always the same as these 2 (to a given tolerance), great, you found your transformation. If not, this transformation does not work and you should continue with the next one.
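The single-pair step above can be sketched in Python as follows. This is only an illustration under the stated normalization SX^2 + SY^2 = 1; the function name solve_stretch and the tolerance parameter are hypothetical. With a = SX^2 and b = SY^2 the equation becomes (X1^2 - X2^2)*a + (Y1^2 - Y2^2)*b = 0 together with a + b = 1.

```python
import math


def solve_stretch(pair1, pair2, tol=1e-12):
    """Return (SX, SY) with SX^2 + SY^2 = 1 such that the stretched lengths of
    the displacement vectors pair1 = (X1, Y1) and pair2 = (X2, Y2) are equal.
    Returns None if no unique real solution exists."""
    x1, y1 = pair1
    x2, y2 = pair2
    p = x1 * x1 - x2 * x2
    q = y1 * y1 - y2 * y2
    if abs(p - q) <= tol:
        # Either every stretch works (p = q = 0) or none does.
        return None
    a = q / (q - p)  # a = SX^2, from p*a + q*(1 - a) = 0
    b = 1.0 - a      # b = SY^2
    if a < 0.0 or b < 0.0:
        return None  # SX or SY would not be real
    return math.sqrt(a), math.sqrt(b)


sx, sy = solve_stretch((1.0, 0.0), (0.0, 2.0))
print(sx, sy)
```

Each candidate (SX, SY) returned this way would still have to be checked against the remaining points, as described in the last step.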
If you want a transformation that minimizes variations between distances (but doesn't require them to be nearly equal), I would do some optimization method and search for a minimum - I don't know how to find an exact solution otherwise. I would pick this also in case you don't have linear or global stretch.
If I understand your question correctly, the first step is to obtain the center-of-mass points of all the objects in the image as (x,y) coordinates. Then you can easily compute the distances between all pairs of points. I suggest taking a look at a histogram of those distances, which may provide some information about the nature of the distance distribution (for example, whether it is uniformly random or whether any patterns appear).
Obtaining the center-of-mass points is not an easy task. Consider transforming the image into a binary one, or using some sort of background subtraction with blob detection and/or an edge detector.
To build the histogram you can use MATLAB's histogram function.
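As a rough sketch of the pairwise-distance histogram idea, assuming the centroids have already been extracted, the binning could look like this in Python. The function name distance_histogram and the fixed bin width are illustrative choices, not something from the answer.

```python
import math
from collections import Counter
from itertools import combinations


def distance_histogram(centroids, bin_width):
    """Count pairwise Euclidean distances per bin.
    Bin k covers distances in [k * bin_width, (k + 1) * bin_width)."""
    counts = Counter()
    for p, q in combinations(centroids, 2):
        d = math.dist(p, q)  # requires Python 3.8+
        counts[int(d // bin_width)] += 1
    return dict(counts)


# Three collinear, evenly spaced points: two distances of 5 and one of 10.
print(distance_histogram([(0, 0), (3, 4), (6, 8)], 4.0))
```

A regular spacing along some axis would show up as sharp peaks at multiples of the spacing.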

Distance matrix in kilometres from latitude and longitude data in matlab

In matlab, I have a list of 2410 locations given by their latitude and longitude. I want to create a distance matrix in kilometres. I know how to do this in degrees but how do I do this in kilometres? I have the mapping toolbox, using 2016b. Thanks!
For example, my distance matrix in degrees looks like this:
First you need to ask yourself what you mean by distance.
Do you want the Euclidean distance between the points? Imagine you could tunnel through the earth from one point to the other; that straight line is the Euclidean distance between the points. To calculate this distance you first need to convert each of the lat/long points to ECEF points. You can do this conversion with this code (https://www.mathworks.com/matlabcentral/fileexchange/7942-covert-lat--lon--alt-to-ecef-cartesian). After you've converted each point to an ECEF point you can calculate the Euclidean norm (https://en.wikipedia.org/wiki/Norm_(mathematics)) of the difference between each possible pair of points.
Or do you want to calculate the distance a traveler would cover if they were to walk along the surface of the earth? From the looks of it, this is a much more difficult problem requiring an iterative solver. Fortunately someone has already done the work of implementing an algorithm to do this for you (https://www.mathworks.com/matlabcentral/fileexchange/5379-geodetic-distance-on-wgs84-earth-ellipsoid). Note that, according to the comments on this function, MathWorks appears to have already implemented a different algorithm for the same calculation in the Mapping Toolbox. To calculate the matrix you simply need to iterate over each possible pairing of lat/long points and plug them into the vdist function.
The following should calculate the distance matrix for you using the vdist function above. Note that I have not tested this code, so you may need to correct errors.
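% points: assumed to be a [2 x N] matrix of your points, formatted as follows
% [ lat1 , lat2, ... ]
% [ lon1 , lon2, ... ]
N = size(points, 2); % number of points
dist = zeros(N,N); % the resulting distance matrix
for idx1 = 1:N
    for idx2 = 1:N
        % distance between point idx1 and point idx2
        % (check the units in vdist's help text; if it returns metres, divide by 1000 for kilometres)
        dist(idx1,idx2) = vdist(points(1,idx1), points(2,idx1), points(1,idx2), points(2,idx2));
    end
end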
Note that because the earth's surface is a manifold (https://en.wikipedia.org/wiki/Manifold), the results of the two methods will be similar if the points are close to each other. If speed is important to you and the points are closely grouped, you may want to use the first method to calculate your distance matrix. How close together the points must be for this approximation to be usable depends on how accurate you need the results to be.
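To get a feel for how close the two notions of distance are, here is a small Python sketch that compares the straight-line (chord) distance with the along-the-surface (arc) distance. It assumes a perfectly spherical Earth of radius 6371 km, which is a simplification of the WGS84 ellipsoid used by the linked functions; the names to_unit_vector and chord_and_arc_km are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with mean radius


def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a 3D unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def chord_and_arc_km(lat1, lon1, lat2, lon2):
    """Return (tunnel distance, surface distance) in km on the sphere."""
    a = to_unit_vector(lat1, lon1)
    b = to_unit_vector(lat2, lon2)
    chord_unit = math.dist(a, b)  # chord length on the unit sphere
    # arc = 2 * asin(chord / 2); clamp to guard against rounding above 1
    arc_unit = 2.0 * math.asin(min(1.0, chord_unit / 2.0))
    return EARTH_RADIUS_KM * chord_unit, EARTH_RADIUS_KM * arc_unit


# Two points 0.1 degrees of latitude apart: the values nearly coincide.
print(chord_and_arc_km(50.0, 8.0, 50.1, 8.0))
```

For points on opposite sides of the globe the chord is 2R while the arc is pi*R, so the approximation only holds for closely grouped points.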

MATLAB Genetic Algorithm : Distance of separation between solution points

I need to optimize the location of 10 Transmitters and 10 Receivers (modeled as points on an aperture plane) so as to minimize a certain objective scalar using Genetic Algorithm toolbox in MATLAB. My question is: I have (10+10)*2 = 40 variables (optimizing x and y positions of each point). How do I model the constraints in the form Ax <= b, such that each point is separated by a minimum distance in both x and y directions from all other points?
I'd model the objective function as the Euclidean distance between the points, which you are trying to minimize. The transmitters and receivers must also keep a minimum distance between them, so this distance should be as small as possible while staying greater than the minimum separation of the equipment. I'd look into the dimensions of the plane to identify all the constraints.

How to find the nearest points to given coordinates with MATLAB?

I need to solve a minimization problem with Matlab and I'm wondering which is the easiest solution. All the potential solutions that I've been thinking of require a lot of programming effort.
Suppose that I have a lat/long coordinate point (A,B), what I need is to search for the nearest point to this one in a map of lat/lon coordinates.
In particular, the latitude and longitude arrays are two matrices of 2030x1354 elements (1km distance) and the idea is to find the unique indexes in those matrices that minimize the distance to the coordinates (A,B), i.e., to find the closest values to the given coordinates (A,B).
Any help would be very appreciated.
Thanks!
This is always a fun one :)
First off: Mohsen Nosratinia's answer is OK, as long as
you don't need to know the actual distance
you can guarantee with absolute certainty that you will never go near the polar regions
and will never go near the ±180° meridian
For a given latitude, -180° and +180° longitude are actually the same point, so simply looking at differences between angles is not sufficient. This will be more of a problem in the polar regions, since large longitude differences there will have less of an impact on the actual distance.
Spherical coordinates are very useful and practical for purposes of navigation, mapping, and that sort of thing. For spatial computations however, like the on-surface distances you are trying to compute, spherical coordinates are actually pretty cumbersome to work with.
Although it is possible to do such calculations using the angles directly, I personally don't consider it very practical: you often have to have a strong background in spherical trigonometry, and considerable experience to know its many pitfalls -- very often there are instabilities or "special points" you need to work around (the poles, for example), quadrant ambiguities you need to consider because of trig functions you've introduced, etc.
I've learned to do all this in university, but I also learned that the spherical trig approach often introduces complexity that mathematically speaking is not strictly required, in other words, the spherical trig is not the simplest representation of the underlying problem.
For example, your distance problem is pretty trivial if you convert your latitudes and longitudes to 3D Cartesian X,Y,Z coordinates, and then find the distances through the simple formula
distance (a, b) = R · arccos( a/|a| · b/|b| )
where a and b are two such Cartesian vectors on the sphere. Note that |a| = |b| = R, with R = 6371 km the radius of Earth.
In MATLAB code:
% Some example coordinates (degrees are assumed)
lon = 360*rand(2030, 1354);
lat = 180*rand(2030, 1354) - 90;
% Your point of interest
P = [4, 54];
% Radius of Earth
RE = 6371;
% Convert the array of lat/lon coordinates to Cartesian vectors
% NOTE: sph2cart expects radians
% NOTE: use radius 1, so we don't have to normalize the vectors
[X,Y,Z] = sph2cart( lon*pi/180, lat*pi/180, 1);
% Same for your point of interest
[xP,yP,zP] = sph2cart(P(1)*pi/180, P(2)*pi/180, 1);
% The minimum distance, and the linear index where that distance was found
% NOTE: force the dot product into the interval [-1 +1]. This prevents
% slight overshoots due to numerical artifacts
dotProd = xP*X(:) + yP*Y(:) + zP*Z(:);
[minDist, index] = min( RE*acos( min(max(-1,dotProd),1) ) );
% Convert that linear index to 2D subscripts
[ii,jj] = ind2sub(size(lon), index)
If you insist on skipping the conversion to Cartesian and use lat/lon directly, you'll have to use the Haversine formula, as outlined on this website for example, which is also the method used by distance() from the mapping toolbox.
Now, all of this is valid for the whole Earth, provided you find the smooth spherical Earth accurate enough an approximation. If you want to include the Earth's oblateness or some higher order shape model (or God forbid, distances including terrain), you need to do far more complicated stuff. But I don't think that is your goal here :)
PS - I wouldn't be surprised that if you would write everything out that I did, you'll probably re-discover the Haversine formula. I just prefer to be able to calculate something as simple as distances along the sphere from first principles alone, rather than from some black box formula you had implanted in your head sometime long ago :)
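For readers not working in MATLAB, the same dot-product approach can be sketched in plain Python. This is an illustration under the same spherical-Earth assumption (R = 6371 km); the names unit_vector and nearest_on_sphere are hypothetical, and points are given as (lat, lon) in degrees, which is the opposite order from P in the MATLAB code above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: smooth spherical Earth


def unit_vector(lat_deg, lon_deg):
    """Latitude/longitude in degrees -> Cartesian unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def nearest_on_sphere(query, candidates):
    """Return (index, great-circle distance in km) of the candidate
    closest to `query`; all points are (lat, lon) in degrees."""
    q = unit_vector(*query)
    best_index, best_dot = -1, -2.0
    for i, (lat, lon) in enumerate(candidates):
        v = unit_vector(lat, lon)
        dot = sum(qc * vc for qc, vc in zip(q, v))
        if dot > best_dot:  # largest dot product = smallest angle
            best_index, best_dot = i, dot
    # Clamp into [-1, 1] to avoid acos domain errors from rounding.
    angle = math.acos(max(-1.0, min(1.0, best_dot)))
    return best_index, EARTH_RADIUS_KM * angle


# The nearest neighbour lies across the 180 degree meridian.
print(nearest_on_sphere((0.0, 179.9), [(0.0, 170.0), (0.0, -179.9), (10.0, 179.9)]))
```

Because the arccosine is monotonic, the search only compares dot products and evaluates acos once at the end.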
Let Lat and Long denote latitude and longitude matrices, then
dist2=sum(bsxfun(@minus, cat(3,A,B), cat(3,Lat,Long)).^2,3);
[I,J]=find(dist2==min(dist2(:)));
I and J contain the row and column indices into Lat and Long that correspond to the nearest point. Note that if there are multiple answers, I and J will not be scalar values, but vectors.

Selecting data based on the distance from a query point in Matlab

I have a data-set that has four columns [X Y Z C]. I would like to find all the C values that are in a given sphere centered at [X, Y, Z] with a radius r. What is the best approach to address this problem? Should I use the clusterdata command?
Here is one solution that uses naively euclidean distance:
say V = [X Y Z C] is your dataset, Center = [x,y,z] is the center of the sphere, then
dist = bsxfun(@minus,V(:,1:3),Center); % // finds the distance vectors
% // between the points and the center
dist = sum(dist.^2,2); % // evaluate the squares of the euclidean distances (scalars)
idx = (dist < r^2); % // Find the indexes of the matching points
The good C values are
good = V(idx,4); % // here I kept just the C column
This is not "cluster analysis": You do not attempt to discover structure in your data.
Instead, what you are doing, is commonly called a "range query" or "radius query". In classic database terms, a SELECT, with a distance selector.
You probably want to define your sphere using Euclidean distance. For computational purposes, it is actually beneficial to use squared Euclidean distance instead, by simply taking the square of your radius.
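A minimal Python sketch of such a radius query, comparing squared distances against the squared radius so that no square root is needed. The function name select_in_sphere and the row layout (x, y, z, c) are assumptions made for illustration.

```python
def select_in_sphere(rows, center, radius):
    """Return the C values of all rows (x, y, z, c) whose (x, y, z)
    lies strictly inside the sphere around `center`."""
    cx, cy, cz = center
    r2 = radius * radius  # square the radius once instead of taking roots
    return [c for x, y, z, c in rows
            if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 < r2]


rows = [(0.0, 0.0, 0.0, "a"), (1.0, 0.0, 0.0, "b"), (2.0, 2.0, 2.0, "c")]
print(select_in_sphere(rows, (0.0, 0.0, 0.0), 1.5))
```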
I don't use matlab, but there must be tons of examples of how to compute the distance of each instance in a data set from a query point, and then selecting those objects where the distance is small enough.
I don't know if there is a good index-structure package for Matlab. But in general, in 3D, this can be accelerated well with index structures. Computing all distances is O(n) per query, whereas with an index structure finding the relevant region typically takes only about O(log n), plus the time to report the matching points.
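As one simple example of an index structure, here is a hedged Python sketch of a uniform grid (cell hashing). It is not the tree-based kind of index the answer has in mind and does not give the logarithmic behaviour of a tree, but it shows the idea of only examining points near the query. The names build_grid and radius_query, and the choice of cell size, are hypothetical.

```python
import math
from collections import defaultdict


def build_grid(points, cell):
    """Hash each 3D point index into the grid cell that contains it."""
    grid = defaultdict(list)
    for i, (x, y, z) in enumerate(points):
        key = (math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
        grid[key].append(i)
    return grid


def radius_query(points, grid, cell, center, r):
    """Indices of points strictly within distance r of center.
    Only cells overlapping the sphere's bounding box are visited."""
    lo = [math.floor((c - r) / cell) for c in center]
    hi = [math.floor((c + r) / cell) for c in center]
    r2 = r * r
    hits = []
    for i in range(lo[0], hi[0] + 1):
        for j in range(lo[1], hi[1] + 1):
            for k in range(lo[2], hi[2] + 1):
                for idx in grid.get((i, j, k), ()):
                    d2 = sum((p - c) ** 2 for p, c in zip(points[idx], center))
                    if d2 < r2:
                        hits.append(idx)
    return sorted(hits)


pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (0.5, 0.5, 0.5)]
grid = build_grid(pts, 1.0)
print(radius_query(pts, grid, 1.0, (0.0, 0.0, 0.0), 1.1))
```

A cell size close to the typical query radius keeps the number of visited cells small; for strongly clustered data a tree-based index is usually the better fit.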