Writing a query for getting all nodes within multiple Bounding Boxes - openstreetmap

nwr(51.477,-0.001,51.478,0.001);
out;
This is the most standard query, but I am trying to include multiple of these bboxes in one query. I have no idea how to achieve this and am not sure whether it is possible.

You can do a union of two queries as described in the Overpass wiki:
(
nwr(51.477,-0.001,51.478,0.001);
nwr(51.477,0.001,51.478,0.002);
);
out;
Or you could try to combine them into a polygon and run a query with a polygon. You just have to be a bit careful about how the boxes overlap, and you have to use five points per box to make sure each ring closes (the last point is the same as the first), so you don't pick up the areas between your boxes. So it might not be easier than the union above.
nwr(poly:"latitude_1 longitude_1 latitude_2 longitude_2 latitude_3 longitude_3 …");
for example:
node(poly:"51.477 -0.001 51.477 0.01 51.48 0.01 51.48 -0.001 51.477 -0.001 51.470 -0.01 51.470 0.001 51.472 0.001 51.472 -0.01 51.470 -0.01");
out geom;
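The five-points-per-box rule is easy to get wrong by hand. Here is a sketch of how the poly string could be assembled from (south, west, north, east) boxes; the boxes_to_poly helper is hypothetical, not part of Overpass:

```python
def boxes_to_poly(boxes):
    """Turn (south, west, north, east) bounding boxes into one Overpass
    poly string, emitting five points per box so each ring closes."""
    points = []
    for south, west, north, east in boxes:
        ring = [(south, west), (south, east), (north, east),
                (north, west), (south, west)]  # last point repeats the first
        points.extend(ring)
    return " ".join(f"{lat} {lon}" for lat, lon in points)

poly = boxes_to_poly([(51.477, -0.001, 51.48, 0.01)])
query = f'node(poly:"{poly}");\nout geom;'
```

Passing several boxes to the helper produces one string covering all of them, which is the situation where overlapping rings need care.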

"Sparse" Geospatial Queries with ST_Contains are Much Slower than dense ones

I am hitting a wall trying to explain what is happening with a query I have. For simplicity, I have stripped it down to the minimum. Simply put, I am trying to find all points within a simple envelope, like so:
SELECT
lines.start_point AS point
FROM
"objects"
INNER JOIN "lines" ON "lines"."id" = "objects"."line_id"
WHERE
(ST_Contains
(
ST_MakeEnvelope (142.055256,-10.798657,142.385532,-10.485534, 4326),lines.start_point::geometry
)
)
LIMIT 50 OFFSET 0;
I have indexes set up on lines.start_point, and everything is very fast in areas with lots of data; those areas return results in under 500 ms.
What I did not expect is that areas with very little data would be super slow - sometimes > 90,000ms. Is there something I am totally missing here with ST_Contains that would explain this?
With the example bounding box above (and a screenshot of the returned points), my data has only 63 start points inside this box, yet the query took 2 min 54 s to find them. My only thought is that maybe ST_Contains is quite fast when it can pick up a lot of points in a single pass, but really slow when it has to scan the entire area.
Additionally, I have tried other ways to look for points, like the && operator. In that case the roles reverse: dense areas take a really long time and sparse areas are lightning fast. An example of that query is here:
SELECT
lines.start_point AS point
FROM
"objects"
INNER JOIN "lines" ON "lines"."id" = "objects"."line_id"
WHERE
lines.start_point && ST_MakeEnvelope (142.055256,-10.798657,142.385532,-10.485534, 4326)
LIMIT 50 OFFSET 0;
Any information would help. Thanks
EDIT: Add && query example
Because of the LIMIT clause, the planner may think it is faster not to use the spatial index. You can try to query all rows first and then apply the limit. Make sure to keep the OFFSET 0 to prevent the subquery from being inlined.
SELECT * FROM (
SELECT
lines.start_point AS point
FROM
"objects"
INNER JOIN "lines" ON "lines"."id" = "objects"."line_id"
WHERE
(ST_Contains
(
ST_MakeEnvelope (142.055256,-10.798657,142.385532,-10.485534, 4326),lines.start_point::geometry
)
)
OFFSET 0) AS sub
LIMIT 50;
Also, I see you cast to geometry, so make sure the index is on the geometry too! In PostgreSQL the cast can only use an expression index, i.e. one built on (start_point::geometry).

How to get a road path which can be travelled in 10 minutes from a location

I have a PostGIS road network table with speed limits based on the type of road. I can get the shortest path/route between two points using Dijkstra or another algorithm. Now I want to get the possible paths that can be travelled from a location (point) in 10 minutes. Because I have speed limits based on road type, the resulting paths may not all be the same length. A single-source all-destinations algorithm may be helpful here, but since my cost is time, the destination points may or may not be available as nodes in the network. Please help me.
pgr_drivingDistance uses the cost value you provide, in the units you implicitly specify. This means that if you add a column <traveling_time> (note that I use seconds in my example) holding the time needed to traverse an edge (given its length and speed limit), and select that as the cost, the function's result will represent equal driving-time limits.
As for the parts where the algorithm couldn't fully traverse the next edge 'in time', you will need to add those yourself. The general idea is to identify all edges connected to the end vertices in the result set of pgr_drivingDistance, but not equal to any of the involved edges, and interpolate a new end point along those lines.
- Updated -
The following query is tested and returns all full and partial edges representing a 600-second trip along your network:
WITH
dd AS (
SELECT pg.id1 AS node,
pg.id2 AS edge,
pg.cost
FROM pgr_drivingDistance('SELECT id,
source,
target,
<travel_time_in_sec> AS cost
FROM <edge_table>',
<start_id>,
600,
false,
false
) AS pg
),
dd_edgs AS (
SELECT edg.id,
edg.geom
FROM <edge_table> AS edg
JOIN dd AS d1
ON edg.source = d1.node
JOIN dd AS d2
ON edg.target = d2.node
),
dd_ext AS (
SELECT edg.id,
CASE
WHEN dd.node = edg.source
THEN ST_LineSubstring(edg.geom, 0, (600 - dd.cost) / edg.<travel_time>)
ELSE ST_LineSubstring(edg.geom, 1 - ((600 - dd.cost) / edg.<travel_time>), 1)
END AS geom
FROM dd
JOIN <edge_table> AS edg
ON dd.node IN (edg.source, edg.target) AND edg.id NOT IN (SELECT id FROM dd_edgs)
)
SELECT id,
geom
FROM dd_ext
UNION ALL
SELECT id,
geom
FROM dd_edgs;
The CASE statement decides whether, for any follow-up edge, the fraction of the line's length should be measured from its start point or its end point.
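The fraction passed to ST_LineSubstring can be sketched in Python for a straight two-point edge; the function names here are illustrative, not pgRouting API:

```python
def partial_edge_fraction(limit, cost_at_node, edge_travel_time):
    """Fraction of the edge still traversable within the time limit,
    given the cost already spent reaching the node."""
    return (limit - cost_at_node) / edge_travel_time

def interpolate(start, end, fraction):
    """Point at `fraction` of the way along a straight segment,
    mimicking what ST_LineSubstring's end point would be."""
    return tuple(s + fraction * (e - s) for s, e in zip(start, end))

# 500 s spent reaching the node, 200 s needed for the whole next edge:
f = partial_edge_fraction(600, 500, 200)       # half the edge remains
p = interpolate((0.0, 0.0), (10.0, 4.0), f)    # new end point on the edge
```

When the node sits at the edge's target rather than its source, the fraction is measured from the other end, which is exactly what the ELSE branch of the CASE handles.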
As a side note: the current version of pgRouting provides a set of functions where inter-edge points are taken into account; if updating your (rather outdated) PostGIS/pgRouting versions is an option, consider those functions instead.

Optimize query for intersection of ST_Buffer layer in PostGIS

I have two tables stored in PostGIS:
1. a multipolygon vector with about 590000 rows (layerA) and
2. a single multipart (1 row) vector layer (layerB)
and I want to find the area of the intersection between each polygon's buffer in layerA and layerB. My query so far is
SELECT ST_Area(ST_Intersection(a.geom, b.geom)) AS myarea, a.gid AS mygid FROM
(SELECT ST_Buffer(geom, 500) AS geom, gid FROM layerA) AS a,
layerB AS b
So far I can see my query working, but I estimate it needs 17 hours to complete (on my PC). Is there another way to execute this query more efficiently and faster?
What if you check for intersection with ST_Intersects before the intersection and area calculation? It might lower the time:
SELECT ST_Area(ST_Intersection(a.geom, b.geom)) AS myarea, a.gid AS mygid FROM
(SELECT ST_Buffer(geom, 500) AS geom, gid FROM layerA) AS a,
layerB AS b WHERE ST_intersects(a.geom, b.geom)
You would probably get more answers to this at gis.stackexchange.com.
There are several things you can do.
You should make sure that the first filtering of actually intersecting polygons is done with the help of an index.
Put a GiST index on the table with many geometries and use ST_DWithin(a.geom, b.geom, 500) instead of ST_Intersects on the buffered geometries. That is because the buffered geometries cannot use the index built on the unbuffered geometries.
Also, you say you have multipolygons. If there actually is more than one polygon in each multipolygon, you might get a lot more speed if you first split them into single polygons before building the index. That will let the index do a much bigger part of the job.
There is also a function in PostGIS, ST_SubDivide, that splits even single polygons into smaller pieces for the same reason.
So first use ST_Dump to get single polygons:
CREATE table a_singles AS
SELECT id, (ST_Dump(geom)).geom geom FROM a;
Then create index:
CREATE INDEX idx_a_s_geom
ON a_singles
USING gist(geom);
Finally, the query, something like:
SELECT ST_Area(ST_Intersection(ST_Buffer(a_s.geom,500), b.geom))
FROM a_singles AS a_s
INNER JOIN b
on ST_DWithin(a_s.geom,b.geom,500);
If that still is slow you can start playing with ST_SubDivide.
One more thing: if the single multipolygon in table b contains many geometries, split them too and put an index there as well.
It might still be slow after all those things. That depends on how many vertex points there are in the split polygons that actually intersect (and, for ST_DWithin, also on how many vertex points there are in polygons with overlapping bounding boxes).
But right now you don't have any index helping you at all, so this should make it quite a lot faster.
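Why splitting multipolygons helps the index can be seen with a toy bounding-box check, with plain tuples standing in for the boxes a GiST index compares:

```python
def bbox(points):
    """Axis-aligned bounding box (minx, miny, maxx, maxy) of a point list."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))

def boxes_overlap(a, b):
    """True if two (minx, miny, maxx, maxy) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# A multipolygon made of two parts that are far apart:
part1 = [(0, 0), (1, 0), (1, 1)]
part2 = [(100, 100), (101, 100), (101, 101)]
multi_box = bbox(part1 + part2)   # one huge box covering both parts
query_box = (50, 50, 60, 60)      # query area between the two parts

# The multipolygon's single box matches the query even though neither part
# is anywhere near it, forcing the exact geometry test to run; the per-part
# boxes are rejected by the index alone.
hit_multi = boxes_overlap(multi_box, query_box)
hit_parts = [boxes_overlap(bbox(p), query_box) for p in (part1, part2)]
```

This is the false-positive problem that ST_Dump (and, for large single polygons, ST_SubDivide) removes.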

Simple PostGIS XYZ set up?

I am new to PostGIS. I am looking to have a simple bounded (-200 < x, y, z < 200) data set of 1,000,000 points on a plain XYZ graph. The only query I need is a fast K nearest neighbors and all neighbors such that the distance is less than < N. It seems that PostGIS has a LOT of extra features that I do not need.
What SRID do I need? One that is not concerned with feet or meters.
Am I right that I need to use the function
ST_3DDistance to query for the K nearest neighbors with LIMIT K, or with a maximum distance of N?
To add a column, I need to use SELECT AddGeometryColumn ('my_schema','my_spatial_table','geom_c',4326,'POINT',3, false);. Is that correct?
What is the difference between a 3D point and a PointZ?
Will AddGeometryColumn ensure that my distance query is fast?
Is PostGIS the right choice for my use case? The rest of my DB is already integrated with PostgreSQL
Thanks!
What SRID do I need? One that is not concerned with feet or meters.
You don't "need" an SRID. If your data is in a known coordinate system, find the right SRID; otherwise, use 0.
Am I right that I need to use the function ST_3DDistance to query for the K nearest neighbors with LIMIT K, or with a maximum distance of N?
Yes, you're right.
To add a column, I need to use SELECT AddGeometryColumn ('my_schema','my_spatial_table','geom_c',4326,'POINT',3, false);. Is that correct?
Yes, but I'd use 0 for the SRID instead of 4326 (which is for degrees).
What is the difference between a 3D point and a PointZ?
PointZ is a 3d Point.
Will AddGeometryColumn ensure that my distance query is fast?
AddGeometryColumn will just add some constraints to the table, ensuring that the geometries you insert are coherent with the column definition.
I don't think you need it, but you could try adding an index to your geometry column using CREATE INDEX index_name ON schema.table USING gist (geom_col);
Is PostGIS the right choice for my use case? The rest of my DB is already integrated with PostgreSQL
I think it is the easiest way, not necessarily the "right" one.
You could also implement a distance function without postgis, storing the three coordinates in three numeric fields.
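That last suggestion, a hand-rolled distance function over three plain numeric columns, amounts to something like this Python sketch (all names hypothetical):

```python
import math

def dist3d(p, q):
    """Euclidean distance between two XYZ points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn(points, center, k):
    """The k nearest neighbours of `center`, by 3D distance."""
    return sorted(points, key=lambda p: dist3d(p, center))[:k]

def within(points, center, n):
    """All points closer to `center` than n."""
    return [p for p in points if dist3d(p, center) < n]

pts = [(0, 0, 0), (1, 1, 1), (10, 10, 10)]
nearest = knn(pts, (0, 0, 0), 2)
close = within(pts, (0, 0, 0), 5)
```

In SQL this corresponds to ORDER BY the squared-distance expression with LIMIT K, and a WHERE on the same expression for the radius query; the trade-off versus PostGIS is that without a spatial index both queries scan all rows.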

Most efficient way to find points within a certain radius from a given point

I've read several questions and answers here on SO about this theme, but I can't work out the common way (if there is one...) to find all the points within a "circle" of a certain radius, centered on a given point.
In particular I found two ways that seem the most convincing:
select id, point
from my_table
where st_Distance(point, st_PointFromText('POINT(-116.768347 33.911404)', 4326)) < 10000;
and:
select id, point
from my_table
where st_Within(point, st_Buffer(st_PointFromText('POINT(-116.768347 33.911404)', 4326), 10000));
Which is the most efficient way to query my database? Is there some other option to consider?
Creating a buffer to find the points is a definite no-no because of (1) the overhead of creating the geometry that represents the buffer, and (2) the point-in-polygon calculation is much less efficient than a simple distance calculation.
You are obviously working with (longitude, latitude) data, so you should convert it to an appropriate Cartesian coordinate system with the same unit of measure as your distance of 10,000. If that distance is in meters, you could also cast the point from the table to geography and calculate directly on the (long, lat) coordinates. Since you only want to identify the points that are within the specified distance, you can use the ST_DWithin() function with calculation on the sphere for added speed (don't do this at very high latitudes or with very long distances):
SELECT id, point
FROM my_table
WHERE ST_DWithin(point::geography,
ST_GeogFromText('POINT(-116.768347 33.911404)'),
10000, false);
I have used following query
SELECT *, ACOS(SIN(latitude) * SIN(Lat) + COS(latitude) * COS(Lat) * COS(longitude - Long)) * 6380 AS distance FROM Table_tab WHERE ACOS(SIN(latitude) * SIN(Lat) + COS(latitude) * COS(Lat) * COS(longitude - Long)) * 6380 < 10
In the above query, latitude and longitude are columns from the database, and Lat, Long are the coordinates of the point we want to search from (all values in radians).
How it works: it calculates the distance (in km) between every point in the database and the search point, and checks whether that distance is less than 10 km. It returns all the coordinates within 10 km.
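For reference, the same great-circle arithmetic can be written in Python. Note that the trigonometry needs radians, so coordinates stored in degrees must be converted first (the SQL above assumes they are already in radians):

```python
import math

EARTH_RADIUS_KM = 6380  # the radius used in the SQL above

def great_circle_km(lat1, lon1, lat2, lon2):
    """Spherical law of cosines; inputs in degrees, result in km."""
    la1, lo1, la2, lo2 = map(math.radians, (lat1, lon1, lat2, lon2))
    central_angle = math.acos(
        math.sin(la1) * math.sin(la2)
        + math.cos(la1) * math.cos(la2) * math.cos(lo1 - lo2))
    return EARTH_RADIUS_KM * central_angle

# Two points roughly 1 km apart near the question's example coordinates:
d = great_circle_km(33.911404, -116.768347, 33.92, -116.77)
within_10km = d < 10
```

Like the SQL version, this evaluates the full formula for every row, so it only makes sense without a spatial index; with PostGIS available, ST_DWithin on geography is the simpler and faster route.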
I do not know how postgis does it best, but in general:
Depending on your data, it might be best to first search in a square bounding box (one that contains the search circle) in order to eliminate most candidates; this should be extremely fast, since you can use simple range operators on lon/lat, which are ideally indexed properly for this.
In a second step, filter the remaining candidates using the actual radius.
Also, if your limit on max points is relatively low and you know you have a lot of candidates, you may simply do a first 'optimistic' attempt with a box inscribed inside your circle; if you find enough points, you are done!
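The two-step idea (a cheap box pass first, an exact radius check on the survivors) can be sketched as follows; the degree-based distances are purely illustrative:

```python
import math

def points_within_radius(points, center, radius):
    """Two-step filter: a cheap bounding-box pass (plain range
    comparisons, the kind an index on lon/lat can serve), then the
    exact circular-distance check on the few survivors."""
    clon, clat = center
    # Step 1: square bounding box containing the circle.
    candidates = [(lon, lat) for lon, lat in points
                  if abs(lon - clon) <= radius and abs(lat - clat) <= radius]
    # Step 2: exact radius test, only on the candidates.
    return [(lon, lat) for lon, lat in candidates
            if math.hypot(lon - clon, lat - clat) <= radius]

pts = [(0.0, 0.0), (0.9, 0.9), (0.5, 0.0), (3.0, 0.0)]
hits = points_within_radius(pts, (0.0, 0.0), 1.0)
```

The point (0.9, 0.9) survives the box pass but fails the radius test, which is exactly the corner-of-the-square case the second step exists for. ST_DWithin with a spatial index performs this same box-then-exact dance internally.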