How to spatial join 2 point layers in postgres/postgis

I was trying to join pole (point) and customer (point) layers using a spatial join. I created a 20 meter buffer around each pole, then wrote the query below. It takes too much time.
SELECT distinct h.gid, d.polecode FROM
buildings as h left join
pole_buffer_20 as d
on
ST_Intersects(d.the_geom,h.the_geom)
Help me with a query without using buffering

To check whether points are contained in polygons, you should use the function ST_DWithin, which can be faster:
SELECT distinct h.gid, d.polecode FROM
buildings as h left join
pole_buffer_20 as d
on
ST_DWithin(d.the_geom,h.the_geom, 0);
ST_DWithin also accepts a distance parameter, so, depending on your needs, you could skip the buffer table and query your original table directly:
SELECT distinct h.gid, d.polecode FROM
buildings as h left join
pole as d
on
ST_DWithin(d.the_geom,h.the_geom, 20);
The distance parameter is in the units of the geometry's spatial reference system. If you want a search radius of 20 meters, you need geometries expressed in meters (or use the geography type).
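As a minimal sketch of the geography option: the question does not say which SRID the layers use, so this assumes both tables store the_geom as lon/lat geometry (EPSG:4326). The index names are hypothetical. Casting to geography makes ST_DWithin interpret the distance in meters, and the expression indexes give the planner something to use for the cast columns.

```sql
-- Assumption: the_geom is geometry in EPSG:4326 (lon/lat) in both tables.
-- Expression indexes on the geography cast (index names are made up).
CREATE INDEX pole_geog_idx      ON pole      USING gist ((the_geom::geography));
CREATE INDEX buildings_geog_idx ON buildings USING gist ((the_geom::geography));

-- Same join as above, but the third argument is now 20 meters.
SELECT DISTINCT h.gid, d.polecode
FROM buildings AS h
LEFT JOIN pole AS d
  ON ST_DWithin(d.the_geom::geography, h.the_geom::geography, 20);
```

If the data is already in a projected, meter-based SRID, the plain geometry version with a GiST index on the_geom should be enough.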

Related

Why does st_intersection return non-polygons?

I have two polygon layers. I want to run st_intersection on them, to give the result of the areas where they overlap as a new layer. The new layer should contain the attributes from both input layers. I found this image which seems to illustrate my desired end results.
My two input layers are both polygons:
SELECT st_geometrytype(geom),
COUNT(*)
FROM a
GROUP BY st_geometrytype(geom)
-- Result is 1368 st_polygons
SELECT st_geometrytype(geom),
COUNT(*)
FROM b
GROUP BY st_geometrytype(geom)
-- Result is 539548 st_polygons
The query I run is as below:
SELECT a.*,
b.*,
st_intersection(a.geom, b.geom) as geom
FROM a,b
WHERE st_intersects(a.geom, b.geom)
However in the result I get not just polygons (which I expect), but lines, points, multipolygons and geometry collections. I guess because some of my input polygons share points but not true intersections perhaps?
Grateful for some advice please on how to deal with this, whether my query is correct, anything I can do to improve performance etc. Thanks.
ST_Intersection returns several geometry types, depending on the relative topology.
For example, running ST_Intersection on two adjacent polygons returns the common part of the shared boundary.
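A small illustration of that case, using two made-up unit squares that share only the edge x = 1 (the coordinates are purely illustrative): the intersection is the shared edge, so it comes back as a line with topological dimension 1 rather than as a polygon.

```sql
-- Two hypothetical adjacent squares; they touch along the edge x = 1.
SELECT ST_GeometryType(ST_Intersection(a.geom, b.geom)) AS geom_type,
       ST_Dimension(ST_Intersection(a.geom, b.geom))    AS dimension
FROM (SELECT 'POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))'::geometry AS geom) AS a,
     (SELECT 'POLYGON((1 0, 2 0, 2 1, 1 1, 1 0))'::geometry AS geom) AS b;
```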
While it outputs a single table (as you can verify in pgAdmin, for example), in the Browser panel of QGIS it will be shown as multiple tables of different geometry types (for example: POLYGON, MULTIPOLYGON, LINE, and POINT) but (somewhat confusingly) with the same name.
Visually, you can tell them apart by the accompanying icons on the left:
You can however select which type of geometry you want, for example by adding a WHERE filter with ST_Dimension:
SELECT a.*,
b.*,
st_intersection(a.geom, b.geom) as geom
FROM a,b
WHERE st_intersects(a.geom, b.geom)
AND ST_Dimension(st_intersection(a.geom, b.geom)) = 2;
or, for performance's sake, rewrite it along these lines:
SELECT clipped.*
FROM (
SELECT a.id, b."fieldName",
(ST_Dump(ST_Intersection(a.geom, b.geom))).geom AS geom
FROM "public"."table_A_name" AS a INNER JOIN "public"."table_B_name" AS b
ON ST_Intersects(a.geom, b.geom)
) AS clipped
WHERE ST_Dimension("clipped"."geom") = 2;
The latter solution uses a subquery (a derived table), so ST_Intersection is written only once, and ST_Dump splits multi-part results into single geometries.
You might have noticed that the trick is in ST_Dimension("clipped"."geom") = 2.
ST_Dimension filters the output of ST_Intersection so as to keep only polygons (which have a topological dimension of 2).

How to find the shortest distance from the point to the polygon?

Actually the question in the title.
There is a table (osm_buildings) in which the addresses of buildings and their polygons are located. And there is a point, and you need to find the nearest polygons to this point.
Finding the distances between points is very simple and predictable, but how to correctly and most importantly quickly find the distance from the point to the polygon?
The distance operator <-> works well between points and polygons.
You can query like this:
SELECT b.*
FROM osm_buildings AS b
ORDER BY b.polygon <-> 'POINT(3.14 2.78)'::geometry
LIMIT 10;
This will get the 10 buildings closest to that point.
That query can use an index on the polygon column.
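For reference, such an index could be created as follows. The index name is hypothetical; the table and column names are the ones used in the query above.

```sql
-- GiST index that the <-> ordering can use (index name is made up).
CREATE INDEX osm_buildings_polygon_idx
    ON osm_buildings
    USING gist (polygon);

-- EXPLAIN shows whether the planner picks an index scan for the ordering.
EXPLAIN
SELECT b.*
FROM osm_buildings AS b
ORDER BY b.polygon <-> 'POINT(3.14 2.78)'::geometry
LIMIT 10;
```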
You can use ST_Distance between a point and a polygon; it returns the shortest distance.
SELECT ST_Distance(
'SRID=4326;POINT(-70 42)'::geometry,
'SRID=4326;POLYGON((-72 42, -73 42, -73 43, -72 43, -72 42))'::geometry
);
--> 2
When you want the result for just one point at a time, Laurenz Albe's answer is perfect. If you want to return results for more than one point at once, you can use a lateral join, as below.
I assume you stored the buildings in a geometry/geography type field, not as text.
select t.*, a.*
from target t
cross join lateral (
select o.*
from osm_buildings o
order by st_distance(o.geom::geography, t.geom::geography)
limit 1) a
Also, if your data set is big and you accept that for some points there is no polygon within some acceptable range (for example 1 km), you can add st_dwithin(o.geom, t.geom, your_max_distance) to the where clause of the lateral subquery.
If you want to return more than one "closest polygon", just increase the limit.
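A sketch of the lateral query with that cutoff added. The 1000 meter radius is only the example value from the text, and the geography casts assume lon/lat data. LEFT JOIN LATERAL ... ON true is used here so that points with no building in range still appear, with NULLs in the building columns.

```sql
SELECT t.*, a.*
FROM target AS t
LEFT JOIN LATERAL (
    SELECT o.*
    FROM osm_buildings AS o
    -- Only consider buildings within the cutoff (1 km, illustrative).
    WHERE ST_DWithin(o.geom::geography, t.geom::geography, 1000)
    ORDER BY ST_Distance(o.geom::geography, t.geom::geography)
    LIMIT 1          -- raise this to get more than one closest polygon
) AS a ON true;
```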

Optimize query for intersection of ST_Buffer layer in PostGIS

I have two tables stored in PostGIS:
1. a multipolygon vector with about 590000 rows (layerA) and
2. a single multipart (1 row) vector layer (layerB)
and I want to find the area of the intersection between each polygon's buffer in layerA and layerB. My query so far is
SELECT ST_Area(ST_Intersection(a.geom, b.geom)) AS myarea, a.gid AS mygid FROM
(SELECT ST_Buffer(geom, 500) AS geom, gid FROM layerA) AS a,
layerB AS b
So far, I can see my query working but I calculate that it needs 17 hours to be completed (with my PC). Is there another way to execute this query more efficiently and faster?
What if you check that the geometries intersect before calculating the intersection and its area? It might lower the time.
SELECT ST_Area(ST_Intersection(a.geom, b.geom)) AS myarea, a.gid AS mygid FROM
(SELECT ST_Buffer(geom, 500) AS geom, gid FROM layerA) AS a,
layerB AS b WHERE ST_intersects(a.geom, b.geom)
You would probably get more answers to this at gis.stackexchange.com.
There are several things you can do.
You should make sure that the first filtering step, finding the polygons that actually intersect, is done with the help of an index.
Put a GiST index on the table with many geometries and use ST_DWithin(a.geom, b.geom, 500) on the unbuffered geometries instead of ST_Intersects on the buffered geometries. That is because the buffered geometries cannot use the index calculated on the unbuffered geometries.
Also, you say you have multipolygons. If there actually is more than one polygon in each multipolygon, you might get a lot more speed if you first split the multipolygons into single polygons before building the index. That will let the index do a much bigger part of the job.
There is actually a function in postgis to split even single polygons into smaller pieces for the same reason.
ST_SubDivide
So first use ST_Dump to get single polygons:
CREATE table a_singles AS
SELECT id, (ST_Dump(geom)).geom geom FROM a;
Then create index:
CREATE INDEX idx_a_s_geom
ON a_singles
USING gist(geom);
Finally the query, something like
SELECT ST_Area(ST_Intersection(ST_Buffer(a_s.geom,500), b.geom))
FROM a_singles AS a_s
INNER JOIN b
on ST_DWithin(a_s.geom,b.geom,500);
If that still is slow you can start playing with ST_SubDivide.
One more thing. If the single multipolygon in table b contains many geometries, split them as well and put an index on them too.
It might still be slow after all those things. That depends on how many vertices there are in the split polygons that actually intersect (and, for ST_DWithin, also on how many vertices there are in polygons with overlapping bounding boxes).
But right now you don't have any index helping you, so this should make it quite a lot faster.
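A rough sketch of the ST_SubDivide idea applied to the single-row table b, using the answer's table names (a with columns id and geom, b with column geom). The 256-vertex cap, the new table name and the index names are assumptions. Because the subdivided pieces of b do not overlap, summing the per-piece intersection areas gives the area of the intersection with the whole of b.

```sql
-- Split the one big geometry in b into pieces of at most 256 vertices.
CREATE TABLE b_subdivided AS
SELECT ST_SubDivide(geom, 256) AS geom
FROM b;

CREATE INDEX idx_b_sub_geom ON b_subdivided USING gist (geom);
CREATE INDEX idx_a_geom     ON a            USING gist (geom);

-- Area of each 500-unit buffer that falls inside b, summed over the pieces.
SELECT a.id,
       SUM(ST_Area(ST_Intersection(ST_Buffer(a.geom, 500), b_sub.geom))) AS myarea
FROM a
JOIN b_subdivided AS b_sub
  ON ST_DWithin(a.geom, b_sub.geom, 500)
GROUP BY a.id;
```

Note that this keeps each row of a whole: buffering the individual parts of a multipolygon separately and summing their areas would count overlapping buffers more than once.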

Calculating total area of polygons that intersects with other polygons in Postgis

I want to calculate in PostGIS the total area of the 'a' polygons that intersect with 'b' polygons.
SELECT DISTINCT a.fk_sites,
SUM(ST_Area(a.the_geom)/100) as area
FROM parcelles a, sites b
WHERE st_intersects(a.the_geom,b.the_geom)
GROUP BY a.fk_sites
I need to do a SELECT DISTINCT because 'a' polygons may intersect with several 'b' polygons, so that the returned 'a' polygons appear a few times.
This works fine, except that not all areas are calculated correctly. A few seem to ignore the DISTINCT, so that the calculated area reflects the SUM of all records, even the duplicated 'a' records (which should have been eliminated).
When I do a query without the SUM function, I get the correct number of 'a' polygons and while adding their area I get the right value.
SELECT DISTINCT a.fk_sites,
ST_Area(a.the_geom)/100 as area
FROM parcelles a, sites b
WHERE st_intersects(a.the_geom,b.the_geom)
ORDER BY a.fk_sites
Is the combination of SELECT DISTINCT and the SUM / GROUP BY not correct?
This may have something to do with your fk_sites column because the query itself should be ok, although doing a DISTINCT on a double precision value is never a good thing.
You can solve this by identifying the distinct rows from a in a sub-query, then sum() in the main query:
SELECT fk_sites, sum(ST_Area(the_geom)/100) AS area
FROM (
-- DISTINCT removes the duplicates created when one parcel intersects several sites
SELECT DISTINCT a.fk_sites, a.the_geom
FROM parcelles a
JOIN sites b ON ST_Intersects(a.the_geom, b.the_geom)
) sub
GROUP BY fk_sites
ORDER BY fk_sites;
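An alternative sketch that avoids de-duplication altogether is a semi-join with EXISTS: each parcel is tested once for "intersects at least one site", so it can never be counted twice. This is a different technique from the sub-query above, written with the same table and column names from the question.

```sql
SELECT a.fk_sites,
       SUM(ST_Area(a.the_geom) / 100) AS area
FROM parcelles AS a
WHERE EXISTS (
    -- True as soon as one intersecting site is found for this parcel.
    SELECT 1
    FROM sites AS b
    WHERE ST_Intersects(a.the_geom, b.the_geom)
)
GROUP BY a.fk_sites
ORDER BY a.fk_sites;
```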

How to count up a value until all geometry features from one table are selected

For example, I have this query to find the minimum distance between two geometries (stored in 2 tables) with a PostGIS function called ST_Distance.
With thousands of geometries (in both tables) it takes too much time without using ST_DWithin. ST_DWithin returns true if the geometries are within the specified distance of one another (here 2000 m).
SELECT DISTINCT ON
(a.id)
a.id,
b.id,
min(ST_Distance(a.geom, b.geom)) AS distance
FROM table1 a, table2 b
WHERE ST_DWithin(a.geom, b.geom, 2000.0)
GROUP BY a.id, b.id
ORDER BY a.id, distance
But you have to estimate the distance value needed to fetch all geometries (e.g. those stored in table1). Therefore you have to inspect your data in a GIS, or you have to calculate the maximum distance for all of them (and that takes a lot of time).
At the moment I approximate the distance value by trial and error until all features from table1 are returned.
Would it be efficient for my query to automatically increase the distance value (by a reasonable step) until the count of all geometries (e.g. for table1) is reached? How can I implement this?
Would it slow everything down, because the query may need many attempts to find the distance value?
Do I have to use a recursive query for this purpose?
See this post here: K-Nearest Neighbor Query in PostGIS
Basically, the <-> operator is a bit unusual in that it works in the ORDER BY clause, but it avoids having to guess how far you want to search with ST_DWithin. There is a major gotcha with this operator though: the geometry in the ORDER BY clause must be a constant, that is, you CANNOT write:
select a.id, b.id from table a, table b order by a.geom <-> b.geom limit 1;
Instead you would have to create a loop, substituting a constant value for b.geom above.
More information can be found here: http://boundlessgeo.com/2011/09/indexed-nearest-neighbour-search-in-postgis/
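One way to express such a loop in plain SQL, on PostgreSQL 9.3 or later, is a LATERAL subquery: for every row of table1 the subquery runs once, and within that run a.geom behaves like a constant. This is a sketch using the table names from the question. It assumes a GiST index on table2.geom, and the alias nearest is made up.

```sql
SELECT a.id              AS id_a,
       nearest.id        AS id_b,
       nearest.distance
FROM table1 AS a
CROSS JOIN LATERAL (
    SELECT b.id,
           ST_Distance(a.geom, b.geom) AS distance
    FROM table2 AS b
    -- a.geom is fixed for each outer row, so <-> can drive an index scan.
    ORDER BY b.geom <-> a.geom
    LIMIT 1
) AS nearest;
```

No search radius has to be guessed, which is the point of using <-> instead of ST_DWithin here.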