What metric does pgrouting use to calculate distances?

I was comparing distances between points computed with pgRouting against the Vincenty and great-circle distances computed with geopy, and I'm finding that the pgRouting distances are sometimes smaller than the geodesic ones. That makes no sense, because the straight-line ("as the crow flies") distance between two points should be the minimum possible distance.
How is pgrouting computing the "cost" between two points?
I'm sure I can find this by digging through the source code, but I'm not sure how to find my way around, and it could be faster to ask...

pgRouting doesn't calculate distances itself. It takes a 'cost' field for each edge and sums the costs of the edges along a path. You need to figure out what cost value you're passing to pgRouting and get that straight; maybe it was measured in a different unit or projection.
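A quick way to check the unit/projection theory (a sketch, not pgRouting itself; the coordinates below are made up): compare what geopy reports in metres against what a length computed on unprojected lat/lon geometry, i.e. in degrees, looks like. If the cost column was built from unprojected geometry, the summed path "distance" will be a much smaller number than the geodesic distance.

```python
import math
from geopy.distance import geodesic, great_circle

a = (52.5200, 13.4050)  # (lat, lon) pairs, made up for illustration
b = (52.5205, 13.4194)

print(geodesic(a, b).meters)      # Vincenty-style geodesic, roughly 1 km here
print(great_circle(a, b).meters)  # great-circle, very close to the geodesic value

# A "length" computed directly on unprojected lat/lon coordinates is in
# degrees, not metres -- a tiny number that will undercut any real distance.
print(math.hypot(a[0] - b[0], a[1] - b[1]))  # ~0.014 "degrees"
```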

Related

Comparing 2 sets of coordinates and finding points that are apart by less than a certain distance

I have two large tables (500k to 3M rows each) in PostgreSQL, each containing a set of GPS latitude/longitude points. I need to compare every coordinate in one table with those in the other and find the pairs of points that are within 300 m of each other.
I started using PostgreSQL because I had heard that its spatial indexing speeds up geometry-related tasks a lot. The idea is to use a spatial index, e.g. an R-tree, to check only points that have already been determined to be close to each other, instead of comparing every row against every other row, which is O(n^2).
However, I couldn't find anything related to this.
*Edit: I am not looking for a distance calculation algorithm; I am looking for optimizations that speed up the comparison of locations across my 2 tables, so this is not a duplicate question.
Just giving a tip here.
Distance is the result of a deterministic calculation whose inputs are the two pairs of latitude and longitude.
Using the formulas you can find on the internet, work out the deviation (difference in latitude and longitude) beyond which the distance is certainly greater than 300 m.
The longitude deviation will be bigger the closer a point is to the poles and smaller the closer it is to the equator. So divide the coordinates into 3-4 bands based on how close to the equator they are, calculate how different the other coordinates must be for the distance to certainly exceed 300 m, and then do the exact comparison only on the remaining coordinate pairs.
That would drastically improve the performance of any kind of search you plan to perform.
Good luck!
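As a concrete sketch of that prefilter (Python used for illustration; the constants and function names are made up, and the thresholds are deliberately conservative approximations):

```python
import math

M_PER_DEG_LAT = 111_320.0  # approximate metres per degree of latitude

def degree_thresholds(lat_deg, radius_m=300.0):
    """Degree deviations beyond which two points are certainly farther apart
    than radius_m; the longitude threshold grows towards the poles."""
    dlat = radius_m / M_PER_DEG_LAT
    dlon = radius_m / (M_PER_DEG_LAT * math.cos(math.radians(lat_deg)))
    return dlat, dlon

def might_be_within(p, q, radius_m=300.0):
    """Cheap bounding-box test on (lat, lon) pairs. True means the pair
    should go on to the exact (haversine/Vincenty) distance check."""
    dlat, dlon = degree_thresholds(max(abs(p[0]), abs(q[0])), radius_m)
    return abs(p[0] - q[0]) <= dlat and abs(p[1] - q[1]) <= dlon
```

Only the pairs that pass might_be_within need the exact distance calculation, which is where the drastic reduction in work comes from.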

Matlab calculate geographical distance to lat/lng polyline

In Matlab I would like to calculate the (shortest) distances between a set of independent points (an m-by-2 matrix of lat/lng) and a set of polylines (each an n-by-2 matrix of lat/lng). The result should be an n-by-m matrix of distances in km.
I have rewritten this JavaScript implementation (http://www.bdcc.co.uk/Gmaps/BdccGeo.js) in Matlab, but it does not seem to perform well.
I am currently working on a project with a relatively large data set and running into performance issues. I have roughly 40,000 points and 150 polylines, where the polylines are subsets of the original set of 40,000 points. At about 15 seconds per polyline, calculating all these distances can take up to an hour. Also, the intermediate 40000x150x3 matrices cause out-of-memory errors on my lesser machines.
Instead of optimizing or revising this implementation, I am wondering whether Matlab already has some (smarter) functions built in for this. As far as I can see, though, the documentation mainly covers how to display geodata rather than how to do calculations on it.
Does anyone know of, or have experience with, this kind of calculation in Matlab? Has anything like this already been written that I can reuse, so I don't have to reinvent the wheel? And finally, is this the expected performance given these numbers, or should my function be able to perform much better?
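For what it's worth, the core computation, the minimum distance from many points to each segment of a polyline, vectorizes well over the points. Below is a rough sketch of the idea in Python/NumPy (an equirectangular planar approximation; all names are made up), which maps almost line-for-line onto Matlab matrix operations:

```python
import numpy as np

EARTH_R = 6371.0  # km

def to_local_xy(latlon, lat0, lon0):
    """Equirectangular projection to a local plane in km (adequate when
    distances are small relative to the Earth's radius)."""
    lat, lon = np.radians(latlon[:, 0]), np.radians(latlon[:, 1])
    x = (lon - np.radians(lon0)) * np.cos(np.radians(lat0)) * EARTH_R
    y = (lat - np.radians(lat0)) * EARTH_R
    return np.column_stack([x, y])

def points_to_polyline_km(points, polyline):
    """Min distance (km) from each point (m-by-2 lat/lon) to one polyline (n-by-2 lat/lon)."""
    lat0, lon0 = polyline.mean(axis=0)
    P = to_local_xy(points, lat0, lon0)    # (m, 2)
    V = to_local_xy(polyline, lat0, lon0)  # (n, 2)
    best = np.full(len(P), np.inf)
    for a, b in zip(V[:-1], V[1:]):        # loop over segments, vectorize over points
        ab = b - a
        t = np.clip(((P - a) @ ab) / (ab @ ab + 1e-12), 0.0, 1.0)
        proj = a + t[:, None] * ab         # closest point on the segment
        best = np.minimum(best, np.linalg.norm(P - proj, axis=1))
    return best
```

Memory use stays O(m) per segment rather than materializing an m-by-n-by-3 array, which also addresses the out-of-memory issue.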

Mongodb 3D geometry - query by distance

I have a collection of (X, Y, Z) points in a MongoDB database - is there any way to precisely query for points within a given distance of some coordinate? For example, within 50 units of (0, 0, 0).
The most prevalent answer seems to be to strip the least significant coordinate and use 2D $near query, but this approach would lead to unacceptable loss of precision with my data set.
Are there any plugins available for 3D queries? Or some clever query tricks? Failing that, could you suggest some external service like elasticsearch that works well with 3D geometry?
If it changes anything: I'm dealing with ~10 pre-set distances, a couple thousand points that change relatively rarely, I'm always querying for points within a distance of another point, and at the largest distance any given point reaches at most ~20% of the other points. So pre-computing is an option, but I'd like to avoid it if possible.
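For reference, a minimal pymongo sketch of the 2D workaround the question mentions, but with an exact 3D post-filter in the application so precision is not lost (the 2D stage can only over-select, never miss; collection and field names are made up):

```python
from pymongo import MongoClient, GEO2D

client = MongoClient()
coll = client.mydb.points  # documents like {"xy": [x, y], "z": z}

# Legacy 2d index on the (x, y) projection; 2d indexes default to the
# [-180, 180) range, so set bounds that cover your coordinate space.
coll.create_index([("xy", GEO2D)], min=-10000, max=10000)

def within_3d(center, radius):
    cx, cy, cz = center
    # 2D prefilter: anything within radius in 3D is also within radius in
    # the (x, y) projection, so this returns a superset of the true result.
    candidates = coll.find({"xy": {"$near": [cx, cy], "$maxDistance": radius}})
    # Exact 3D post-filter in application code.
    r2 = radius * radius
    return [d for d in candidates
            if (d["xy"][0] - cx) ** 2 + (d["xy"][1] - cy) ** 2 + (d["z"] - cz) ** 2 <= r2]

nearby = within_3d((0, 0, 0), 50)
```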

Shape Context - Rotation Invariance

I am trying to implement Shape Context (in Matlab), and in particular to achieve rotation invariance.
The general approach for shape context is to compute the distances and angles between each pair of interest points in a given image, and then bin these values into a histogram based on the ranges they fall into. You do this for both the standard (reference) image and the test image. To match the two images, you use a chi-square function to estimate a "cost" between each possible pair of points in the two sets of histograms. Finally, you use an optimization technique such as the Hungarian algorithm to find the optimal assignment of points and sum up the total cost, which will be lower for good matches.
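As a minimal sketch of that matching step (Python/SciPy here rather than Matlab; the histogram array names are made up), the chi-square cost matrix plus the Hungarian assignment can be written as:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def chi2_cost_matrix(H1, H2, eps=1e-10):
    """H1: (n1, b) and H2: (n2, b) shape-context histograms (rows sum to 1).
    Returns the (n1, n2) matrix of chi-square distances between histograms."""
    diff = H1[:, None, :] - H2[None, :, :]
    summ = H1[:, None, :] + H2[None, :, :]
    return 0.5 * np.sum(diff ** 2 / (summ + eps), axis=2)

def match_cost(H1, H2):
    """Total cost of the optimal one-to-one assignment (Hungarian algorithm)."""
    C = chi2_cost_matrix(H1, H2)
    rows, cols = linear_sum_assignment(C)
    return C[rows, cols].sum()
```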
I've checked several websites and papers, and they say that to make the above approach rotation invariant, you need to calculate the angle between each pair of points using the tangent vector as the x-axis (e.g. http://www.cs.berkeley.edu/~malik/papers/BMP-shape.pdf, page 513).
What exactly does this mean? No one seems to explain it clearly. Also, from which of each pair of points would you get the tangent vector - would you average the two?
A couple of other people suggested I could use gradients (which are easy to compute in Matlab) as a substitute for the tangent vectors, though I don't seem to get reasonable cost scores with this. Is it feasible to do this with gradients?
Should gradient work for this dominant orientation?
What do you mean by ordering the bins with respect to that orientation? I was originally going to have a square matrix of bins - with the radius between two given points determining the column in the matrix and the calculated angle between two given points determining the row.
Thank you for your insight.
One way of achieving (somewhat) rotation invariance is to make sure that, wherever you compute your image descriptors, their orientation (that is, the ordering of the bins) is (roughly) the same. To achieve that, you pick the dominant orientation at the point where you extract each descriptor and order the bins with respect to that orientation. This way you can compare the descriptors bin-to-bin, knowing that their ordering is the same: relative to their local dominant orientation.
From my personal experience (which is not extensive), these methods look better on paper than they work in practice.
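A rough illustration of what "ordering the bins with respect to that orientation" can look like in practice (a sketch in Python/NumPy, not the paper's exact recipe; the reference angle, i.e. the tangent or dominant orientation at the point, is assumed to be given):

```python
import numpy as np

def relative_log_polar_histogram(points, i, ref_angle, n_r=5, n_theta=12):
    """Shape-context histogram for points[i], with angles measured relative
    to ref_angle (e.g. the tangent or dominant orientation at that point),
    so the descriptor no longer depends on the image's global rotation."""
    d = np.delete(points - points[i], i, axis=0)          # offsets to the other points
    r = np.log(np.linalg.norm(d, axis=1) + 1e-10)         # log-distance
    theta = (np.arctan2(d[:, 1], d[:, 0]) - ref_angle) % (2 * np.pi)
    r_edges = np.linspace(r.min(), r.max() + 1e-10, n_r + 1)
    t_edges = np.linspace(0, 2 * np.pi, n_theta + 1)
    H, _, _ = np.histogram2d(r, theta, bins=[r_edges, t_edges])
    return H / H.sum()
```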

Measuring density for three dimensional data (in Matlab)

I have a dataset consisting of a large collection of points in three-dimensional Euclidean space. In this collection of points, I am trying to find the point that is nearest to the area with the highest density of points.
So my problem consists of two steps:
1: Determine where the density of the distribution of points is at its highest
2: Determine which point is nearest to the point found in 1
Point 2 I can manage, but I'm not sure how to solve point 1. I know there are a lot of functions for density estimation in Matlab, but I'm not sure which one would be the most suitable or the most straightforward to use.
Does anyone know?
My command of statistics is a little rusty, but as far as I can tell, this type of problem calls for multivariate analysis. Someone suggested I use multivariate kernel density estimation, but I'm not really sure whether that's the best solution.
Density is a measure of mass per unit volume. On the assumption that your points all have the same mass, you are, I suppose, trying to measure the number of points per unit volume. So one approach is to divide your subset of Euclidean space into lots of little unit volumes (let's call them voxels, like everyone does) and count how many points fall in each one. The voxel with the most points is where the density of points is highest. This is, of course, numerical integration of a sort; if your points were distributed according to some analytic function (and I guess they are not), you could solve the problem with pencil and paper.
You might make this approach as sophisticated as you like, perhaps initially dividing your space into 2 x 2 x 2 voxels, then choosing the voxel with most points and sub-dividing that in turn until your criteria are satisfied.
I hope this will get you started on your point 1; you seem to be OK with point 2 so I'll stop now.
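For concreteness, here is a sketch of that voxel-counting idea in Python/NumPy (the same thing is a binning/accumarray exercise in Matlab; the bin count is an arbitrary choice):

```python
import numpy as np

def point_nearest_densest_voxel(points, bins=20):
    """points: (n, 3) array. Count points per voxel, take the voxel with the
    most points, and return the data point nearest to that voxel's centre."""
    counts, edges = np.histogramdd(points, bins=bins)
    idx = np.unravel_index(np.argmax(counts), counts.shape)
    centre = np.array([(e[i] + e[i + 1]) / 2 for e, i in zip(edges, idx)])
    return points[np.argmin(np.linalg.norm(points - centre, axis=1))]
```

Refining the grid only inside the winning voxel, as suggested above, is a straightforward extension.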
EDIT
It looks as if triplequad might be what you are looking for.