Most efficient way to find points within a certain radius from a given point - postgresql

I've read several questions + answers here on SO about this theme, but I can't understand which is the common way (if there is one...) to find all the points whithin a "circle" having a certain radius, centered on a given point.
In particular I found two ways that seem the most convincing:
select id, point
from my_table
where st_Distance(point, st_PointFromText('POINT(-116.768347 33.911404)', 4326)) < 10000;
and:
select id, point
from my_table
where st_Within(point, st_Buffer(st_PointFromText('POINT(-116.768347 33.911404)', 4326), 10000));
Which is the most efficient way to query my database? Is there some other option to consider?

Creating a buffer to find the points is a definite no-no because of (1) the overhead of creating the geometry that represents the buffer, and (2) the point-in-polygon calculation is much less efficient than a simple distance calculation.
You are obviously working with (longitude, latitude) data so you should convert that to an appropriate Cartesian coordinate system which has the same unit of measure as your distance of 10,000. If that distance is in meter, then you could also cast the point from the table to geography and calculate directly on the (long, lat) coordinates. Since you only want to identify the points that are within the specified distance, you could use the ST_DWithin() function with calculation on the sphere for added speed (don't do this when at very high latitudes or with very long distances):
SELECT id, point
FROM my_table
WHERE ST_DWithin(point::geography,
ST_GeogFromText('POINT(-116.768347 33.911404)'),
10000, false);

I have used following query
SELECT *, ACOS(SIN(latitude) * SIN(Lat)) + COS(latitude) * COS(Lat) * COS(longitude) - (Long)) ) * 6380 AS distance FROM Table_tab WHERE ACOS( SIN(latitude) * SIN(Lat) + COS(latitude) * COS(Lat) * COS(longitude) - Long )) * 6380 < 10
In above query latitude and longitude are from database and lat, long are the points from we want to search.
WORKING : it will calculate the distance(In KM) between all the points in database from search points and check if the distance is less then 10 KM. It will return all the co-ordinates within 10 KM.

I do not know how postgis does it best, but in general:
Depending on your data it might be best to first search in a square bounding box (which contains the search area circle) in order to eliminate a lot of candidates, this should be extremely fast as you can use simple range operators on lon/lat which are ideally indexed properly for this.
In a second step search using the radius.
Also if your limit max points is relatively low and you know you have a lot of candidates, you may simply do a first 'optimistic' attempt with a box inside your circle, if you find enough points you are done !

Related

how to find ou lat long with 2KM radius using Postgres

I have a postgres table with lat-long values stored as point (Geometric data type). I am interested to query all rows with 2km radius for the given lat-long values. Also, I am expecting for a suitable datatype for this, currently I stored these values as POINT. But on some investigation, I found to use POLYGON here. But even though I couldn't able to achieve the results what expected.
Can any one point me the exact query with suitable GTS functions to achieve this
https://postgis.net/workshops/postgis-intro/spatial_relationships.html
example explanation:
SELECT name
FROM nyc_streets
WHERE ST_DWithin(
geom,
ST_GeomFromText('POINT(583571 4506714)',26918),
10
);
The first "geom" refer to the column name in nyc_streets.
ST_GeomFromText: transform text to geom. 26918 is srid.
10 : 10 meters.
To query geometries within a certain radius you might wanna use ST_DWithin. In order to use it with metres you have to either use a SRS that has metre as unit, such as EPSG:26918, or use geography instead of geometry:
SELECT *
FROM mytable
WHERE ST_DWithin(ST_MakePoint(1,2)::geography, -- 1=long / 2=lat
geom::geography, -- casting the 'geometry' to 'geography'
2000); -- 2km radius
In case you're dealing with different geometry types, such as polygon or linestring, you might wanna use ST_GeogFromText instead of ST_MakePoint:
SELECT *
FROM mytable
WHERE ST_DWithin(ST_GeogFromText('POINT(1 2)'),
geom::geography,2000);
SELECT *
FROM mytable
WHERE ST_DWithin(ST_GeogFromText('POLYGON((1 1,2 2,3 3,1 1))'),
geom::geography,2000);
Keep in mind that transforming a geometry is much more than just change its SRID - check ST_Transform.
Further reading
Getting all Buildings in range of 5 miles from specified coordinates
How is postgis treating coordinates sent with different SRID

Use ST_DWithin to query pairs of geometry points within 400 miles in a PostgreSQL table

I have a table called h2combines in Postgres that has two points-geometry fields: geom1 and geom2, and some other fields. I want to select all records that geom1 and gemo2 are with 400 miles. I tried this:
SELECT * from h2combines WHERE ST_DWithin(geom1, geom2, 643738);
However, it returned all rows. Seems I misunderstand something here.
Thanks!
Welcome to SO.
To get the distance in meters/miles you have to cast your geometries to geography, e.g.
SELECT * FROM h2combines
WHERE ST_DWithin(geom1::geography, geom2::geography, 643737.6);
Keep in mind that calculations using GEOMETRY and GEOGRAPHY are made differently, and so are their results. GEOGRAPHY calculates the coordinates over an spherical surface (which can be much slower than GEOMETRY) and uses meters as unit of measurement, while GEOMETRY uses a planar projection and uses the SRS unit.

Efficient way to select all points inside radius

I'm working with meanstack application. so I have mongodb collection that contain worldwide locations. Schema loooks like follows.
[{
address : "example address",
langitude : 79.8816,
latitude : 6.773
},
{...}]
What is the efficient way to select all points inside Circle( pre defined point and pre defined radius) ..??
Using SQL Query also we can do this. But I want a efficient way without iterating one by one and check weather it is inside that radius or not.
Distance d between two points with coordinates {lat1,lon1} and {lat2,lon2} is given by
d = 2*asin(sqrt((sin((lat1-lat2)/2))^2 +
cos(lat1)*cos(lat2)*(sin((lon1-lon2)/2))^2))
Above formula gives d in radians. To get the answer in km, use the formula below.
distance_km ≈ radius_km * distance_radians ≈ 6371 * d
Here, the magic number 6371 is approx. the radius of planet earth. This is the minimum computation that you will have to do in your case. You can compute this distance and select all the records having a distance less than your radius value.
Better approach
You can use Elasticsearch which supports geolocation queries for better performance.

How to configure PostgreSQL with Postgis to calculate distances

I know that it might be dumb question, but I'm searching for some time and can't find proper answer.
I have PostgreSQL database with PostGIS installed. In one table I have entries with lon lat (let's assume that columns are place, lon, lat).
What should I add to this table or/and what procedure I can use, to be able to count distance between those places in meters.
I've read that it is necessary to know SRID of a place to be able to count distance. Is it possible to not know/use it and still be able to count distance in meters basing only on lon lat?
Short answer:
Just convert your x,y values on the fly using ST_MakePoint (mind the overhead!) and calculate the distance from a given point, the default SRS will be WGS84:
SELECT ST_Distance(ST_MakePoint(lon,lat)::GEOGRAPHY,
ST_MakePoint(23.73,37.99)::GEOGRAPHY) FROM places;
Using GEOGRAPHY you will get the result in meters, while using GEOMETRY will give it in degrees. Of course, knowing the SRS of coordinate pairs is imperative for calculating distances, but if you have control of the data quality and the coordinates are consistent (in this case, omitting the SRS), there is not much to worry about. It will start getting tricky if you're planing to perform operations using external data, from which you're also unaware of the SRS and it might differ from yours.
Long answer:
Well, if you're using PostGIS you shouldn't be using x,y in separated columns in the first place. You can easily add a geometry / geography column doing something like this.
This is your table ...
CREATE TABLE places (place TEXT, lon NUMERIC, lat NUMERIC);
Containing the following data ..
INSERT INTO places VALUES ('Budva',18.84,42.92),
('Ohrid',20.80,41.14);
Here is how you add a geography type column:
ALTER TABLE places ADD COLUMN geo GEOGRAPHY;
Once your column is added, this is how you convert your coordinates to geography / geometry and update your table:
UPDATE places SET geo = ST_MakePoint(lon,lat);
To compute the distance you just need to use the function ST_Distance, as follows (distance in meters):
SELECT ST_Distance(geo,ST_MakePoint(23.73,37.99)) FROM places;
st_distance
-----------------
686560.16822422
430876.07368955
(2 Zeilen)
If you have your location parameter in WKT, you can also use:
SELECT ST_Distance(geo,'POINT(23.73 37.99)') FROM places;
st_distance
-----------------
686560.16822422
430876.07368955
(2 Zeilen)

Which GEO implementation to use for millions of points

I am trying to figure out which GEO implementation to use to find the nearest points based on long/lat to a certain point. I will have millions if not billions of different latitude/longitude points that will need to be compared. I have been looking at many different implementations to do the job I need to be done. I have looked into Postgis (looks like it is very popular and performs well), Neo4J (Graph databases are a new concept to me and I am unsure how they perfrom), AWS dynamodb geohash (Scales very well, but only library is written in Java, I am hoping to write a library in node.js), etc but can't figure out which would perform best. I am purely looking into performance opposed to number of features. All I need to be able is to compare one point to all points and find the closest (read operation), and as well, be able to change a point in the database quickly (write operation). Could anyone suggest based on these requirements a good implementation
PostGIS has several function for geohashing. If you make your strings long enough the search becomes quicker (fewer collisions per box + its 8 neighbours) but the geohash generation slower on inserting new points.
The question is also how accurate you want to be. At increasing latitude, lat/long "distance" deteriorates because a degree of longitude shrinks from about 110km at the Equator to 0 at the poles, while a degree of latitude is always about 110km. At the mid-latitude of 45 degrees a degree of longitude is nearly 79km, giving an error in distance of a factor of 2 (sqr(110/79)). Spherical distance to give you true distance between lat/long pairs is very expensive to calculate (lots of trigonometry going on) and then you geohashing won't work (unless you convert all points to planar coordinates).
A solution that might work is the following:
CREATE INDEX hash8 ON tablename(substring(hash_column FROM 1 FOR 8)). This gives you an index on a box twice as large as your resolution, which helps finding points and reducing the need to search neighbouring hash boxes.
On INSERT of a point, compute its geohash of length 9 (10m resolution approx.) into hash_column, using PostGIS. You could use a BEFORE INSERT TRIGGER here.
In a function:
Given a point, find the nearest point by looking for all points with a geohash value shortened to 8 chars equal to the given points 8-char geohash (hence the index above).
Compute the distance to each of the encountered points using spherical coordinates, keeping the closest point. But since you are only looking for the nearest point (at least initially), do not search on distance using spherical coordinates, but use the below optimization, which should make the search much much faster.
Compute if the given point is closer to the edge of the box determined by the 8-char geohash than the closest computed point. If so, repeat the procedure with the 7-char geohash on all points in its 8 neighbours. This can be highly optimized by computing distances to individual box sides and corners and evaluating only the relevant neighbour hash boxes; I leave this to you to tinker with.
In any case, this will not be particularly speedy. If you are indeed going towards billions of points you might want to think about clustering which has a rather "natural" solution for geohashing (break up your table on substring(hash_column FROM 1 FOR 2) for instance, giving you four quadrants). Just make sure that you account for cross-boundary searches.
Two optimizations can be made fairly quickly:
First, "normalize" your spherical coordinates (meaning: compensate for the reduced length of a degree of longitude with increasing latitude) so that you can search for nearest points using a "pseudo-cartesian" approach. This only works if points are close together, but since you are working with lots of points this should not be a problem. More specifically, this should work for all points in geohash boxes of length 6 or more.
Assuming the WGS84 ellipsoid (which is used in all GPS devices), the Earth's major axis (a) is 6,378,137 meter, with an ellipticity (e2) of 0.00669438. A second of longitude has a length of
longSec := Pi * a * cos(lat) / sqrt(1 - e2 * sqr(sin(lat))) / 180 / 3600
or
longSec := 30.92208078 * cos(lat) / sqrt(1 - 0.00669438 * sqr(sin(lat)))
For a second of latitude:
latSec := 30.870265 - 155.506 * cos(2 * lat) + 0.0003264 + cos(4 * lat)
The correction factor to make your local coordinate system "square" is by multiplying your longitude values by longSec/latSec.
Secondly, since you are looking for the nearest point, do not search on distance because of the computationally expensive square root. Instead, search on the term within the square root, the squared distance if you will, because this has the same property of selecting for the nearest point.
In pseudo-code:
CREATE FUNCTION nearest_point(pt geometry, ptHash8 char(8)) RETURNS integer AS $$
DECLARE
corrFactor double precision;
ptLat double precision;
ptLong double precision;
currPt record;
minDist double precision;
diffLat double precision;
diffLong double precision;
minId integer;
BEGIN
minDist := 100000000.; -- a large value, 10km (squared)
ptLat := ST_Y(pt);
ptLong := ST_X(pt);
corrFactor := 30.92208078 * cos(radians(ptLat)) / (sqrt(1 - 0.00669438 * power(sin(radians(ptLat)), 2)) *
(30.870265 - 155.506 * cos(2 * radians(ptLat)) + 0.0003264 + cos(4 * radians(ptLat))));
FOR currPt IN SELECT * FROM all_points WHERE hash8 = ptHash8
LOOP
diffLat := ST_Y(currPt.pt) - ptLat;
diffLong := (ST_X(currPt.pt) - ptLong) * corrFactor; -- "square" things out
IF (diffLat * diffLat) < (minDist * diffLong * diffLong) THEN -- no divisions here to speed thing up a little further
minDist := (diffLat * diffLat) / (diffLong * diffLong); -- this does not happen so often
minId := currPt.id;
END IF;
END LOOP;
IF minDist < 100000000. THEN
RETURN minId;
ELSE
RETURN NULL;
END IF;
END; $$ LANGUAGE PLPGSQL STRICT;
Needless to say, this would be a lot faster in a C language function. Also, do not forget to do boundary checking to see if neighbouring geohash boxes need to be searched.
Incidentally, "spatial purists" would not index on the 8-char geohash and search from there; instead they would start from the 9-char hash and work outwards from there. However, a "miss" in your initial hash box (because there are no other points or you are close to a hash box side) is expensive because you have to start computing distances to neighbouring hash boxes and pull in more data. In practice you should work from a hash box which is about twice the size of the typical nearest point; what that distance is depends on your point set.