Select Intersect Polygon From Single Feature Layer - postgresql

I have hundreds of polygon (circles) where some of the polygon intersected with each others. This polygon is come from single feature layer. What I am trying to do is to delete the intersected circles.
It is similar to this question: link, but those were using two different layer. In my case the intersection is from single feature layers.

If I understood your question right, you just need to either create a CTE or simple subquery.
This might give you a good idea of how to solve your issue:
CREATE TABLE t (id INTEGER, geom GEOMETRY);
INSERT INTO t VALUES
(1,'POLYGON((-4.54 54.30,-4.46 54.30,-4.46 54.29,-4.54 54.29,-4.54 54.30))'),
(2,'POLYGON((-4.66 54.16,-4.56 54.16,-4.56 54.14,-4.66 54.14,-4.66 54.16))'),
(3,'POLYGON((-4.60 54.19,-4.57 54.19,-4.57 54.15,-4.60 54.15,-4.60 54.19))'),
(4,'POLYGON((-4.40 54.40,-4.36 54.40,-4.36 54.38,-4.40 54.38,-4.40 54.40))');
This data set contains 4 polygons in total and two of them overlap, as seen in the following picture:
Applying a CTE with a subquery might give you what you want, which is the non-overlapping polygons from the same table:
SELECT id, ST_AsText(geom) FROM t
WHERE id NOT IN (
WITH j AS (SELECT * FROM t)
SELECT j.id
FROM j
JOIN t ON t.id <> j.id
WHERE ST_Intersects(j.geom,t.geom)
);
id | st_astext
----+---------------------------------------------------------------------
1 | POLYGON((-4.54 54.3,-4.46 54.3,-4.46 54.29,-4.54 54.29,-4.54 54.3))
4 | POLYGON((-4.4 54.4,-4.36 54.4,-4.36 54.38,-4.4 54.38,-4.4 54.4))
(2 rows)

You can write quite clear delete statement using EXISTS clause. You literally want to delete the rows, for which there exists other rows which geometry intersects:
DELETE
FROM myTable t1
WHERE EXISTS (SELECT 1 FROM myTable t2 WHERE t2.id <> t1.id AND ST_Intersects(t1.geom, t2.geom))

Related

how to count how many buffers intersects in PostGIS

How can I count how many buffers intersects and select only those which have between 2 AND 6 intersections with other buffers? I can create a buffer with following query, but I don't have an idea on how to count intersections
SELECT ST_Buffer(geom::geography, 400)
FROM mytable;
I appreciate any help. Thanks.
It is wrong to use a buffer in such case, as the buffer is only an approximation. Instead, use the index compatible st_dwithin() function.
The idea is to select all points (polygons or else) that are within twice the distance, to group the result and keep the ones having at least 6 nearby features.
The example below use 2 tables, but you can use the same table twice.
SELECT myTable.ID, count(*), array_agg(myOtherTable.ID) as nearby_ids
FROM mytable
JOIN myOtherTable ON st_Dwithin(mytable.geom::geography, myOtherTable.geom::geography, 800)
GROUP BY myTable.ID
HAVING count(*) >= 6;
To use the same table twice, you can alias them:
SELECT a.ID, count(*), array_agg(b.ID) as nearby_ids
FROM mytable a
JOIN mytable b ON st_Dwithin(a.geom::geography, b.geom::geography, 800)
GROUP BY a.ID
HAVING count(*) >= 6;

Delete duplicate rows based on columns

I have a table called Aircraft and there are many records. The problem is that some are duplicates. I know how to select the duplicates and their counts:
SELECT flight_id, latitude, longitude, altitude, call_sign, measurement_time, COUNT(*)
FROM Aircraft
GROUP BY flight_id, latitude, longitude, altitude, call_sign, measurement_time
HAVING COUNT(*) > 1;
This returns something like:
Now, what I need to do is remove the duplicates, leaving just one each so that when I run the query again, all counts become 1.
I know that I can use the DELETE keyword, but I'm not sure how to delete it from the SELECT.
I'm sure I am missing an easy step, but I do not want to ruin my DB being a newbie.
How do I do this?
SELECT
flight_id, latitude, longitude, altitude, call_sign, measurement_time
FROM Aircraft a
WHERE EXISTS (
SELECT * FROM Aircraft x
WHERE x.flight_id = a.flight_id
AND x.latitude = a.latitude
AND x.longitude = a.longitude
AND x.altitude = a.altitude
AND x.call_sign = a.call_sign
AND x.measurement_time = a.measurement_time
AND x.id < a.id
)
;
If the query above returns thecorrect rows (to be deleted)
you can change it into a delete statement:
DELETE
FROM Aircraft a
WHERE EXISTS (
SELECT * FROM Aircraft x
WHERE x.flight_id = a.flight_id
AND x.latitude = a.latitude
AND x.longitude = a.longitude
AND x.altitude = a.altitude
AND x.call_sign = a.call_sign
AND x.measurement_time = a.measurement_time
AND x.id < a.id
)
;
I have always used the CTE method in SQL SERVER. This allows you to define columns that you want to compare, once you have established what columns make up a duplicate, then you can assign a CTE value to it and then go back and cleanup the CTE values that are greater than 1. This is an example of duplicate checking that I do.
WITH CTE AS
(select d.UID
,d.LotKey
,d.SerialNo
,d.HotWeight
,d.MarketValue
,RN = ROW_NUMBER()OVER(PARTITION BY d.HotWeight, d.serialNo, d.MarketValue order by d.SerialNo)
from LotDetail d
where d.LotKey = ('1~20161019~305')
)
DELETE FROM CTE WHERE RN <> 1
In my example I am looking at the LotDetail table where the d.hotweight and d.serial no are matching. if there is a match then the original gets CTE 1 and any duplicates get CTE 2 or greater depending on the amount of duplicates. Then you use the last DELETE statement to clear the entries that come up as duplicate. THis is really flexible so you should be able to adapt it to your issue.
Here is an example tailored to your situation.
WITH CTE AS
(select d.Flight_ID
,d.Latitude
,d.Longitude
,d.Altitude
,d.Call_sign
,d.Measurement*
,RN = ROW_NUMBER()OVER(PARTITION BY d.Flight_ID, d.Latitude, d.Longitude, d.Altitude, d.Call_Sign, d.Measurement* order by d.SerialNo)
from Aircraft d
where d.flight_id = ('**INSERT VALUE HERE')
)
DELETE FROM CTE WHERE RN <> 1
If it's a one-time operation you can create a temp table with the same schema and then copy unique rows over like so:
insert into Aircraft_temp
select distinct on (flight_id, measurement_time) Aircraft.* from Aircraft
Then swap them out by renaming, or truncate Aircraft and copy the temp contents back (truncate Aircraft; insert into Aircraft select * from Aircraft_temp;).
Safer to rename Aircraft to Aircraft_old and Aircraft_temp to Aircraft so you keep your original data until you are sure things are correct. Or at least check that the number of rows in your count query above match the count of rows in the temp table before doing the truncate.
Update2: With a separate valid primary key (assuming it is called id) you can do a DELETE based on a self join like this:
delete from Aircraft using (
select a1.id
from Aircraft a1
left join (select flight_id, measurement_time, min(id) as id from Aircraft group by 1,2) a2
on a1.id = a2.id
where a2.id is null
) as d
where Aircraft.id=d.id
This finds the minimum id (could do max too for the "latest") for each flight and identifies all the records from the full set having an id that is not the minimum (no match in the join). The unmatched ids are deleted.

Identifying rows with multiple IDs linked to a unique value

Using ms-sql 2008 r2; am sure this is very straightforward. I am trying to identify where a unique value {ISIN} has been linked to more than 1 Identifier. An example output would be:
isin entity_id
XS0276697439 000BYT-E
XS0276697439 000BYV-E
This is actually an error and I want to look for other instances where there may be more than one entity_id linked to a unique ISIN.
This is my current working but it's obviously not correct:
select isin, entity_id from edm_security_entity_map
where isin is not null
--and isin = ('XS0276697439')
group by isin, entity_id
having COUNT(entity_id) > 1
order by isin asc
Thanks for your help.
Elliot,
I don't have a copy of SQL in front of me right now, so apologies if my syntax isn't spot on.
I'd start by finding the duplicates:
select
x.isin
,count(*)
from edm_security_entity_map as x
group by x.isin
having count(*) > 1
Then join that back to the full table to find where those duplicates come from:
;with DuplicateList as
(
select
x.isin
--,count(*) -- not used elsewhere
from edm_security_entity_map as x
group by x.isin
having count(*) > 1
)
select
map.isin
,map.entity_id
from edm_security_entity_map as map
inner join DuplicateList as dup
on dup.isin = map.isin;
HTH,
Michael
So you're saying that if isin-1 has a row for both entity-1 and entity-2 that's an error but isin-3, say, linked to entity-3 in two separe rows is OK? The ugly-but-readable solution to that is to pre-pend another CTE on the previous solution
;with UniqueValues as
(select distinct
y.isin
,y.entity_id
from edm_security_entity_map as y
)
,DuplicateList as
(
select
x.isin
--,count(*) -- not used elsewhere
from UniqueValues as x
group by x.isin
having count(*) > 1
)
select
map.isin
,map.entity_id
from edm_security_entity_map as map -- or from UniqueValues, depening on your objective.
inner join DuplicateList as dup
on dup.isin = map.isin;
There are better solutions with additional GROUP BY clauses in the final query. If this is going into production I'd be recommending that. Or if your table has a bajillion rows. If you just need to do some analysis the above should suffice, I hope.

Simple SELECT, but adding JOIN returns too many rows

The query below returns 9,817 records. Now, I want to SELECT one more field from another table. See the 2 lines that are commented out, where I've simply selected this additional field and added a JOIN statement to bind this new columns. With these lines added, the query now returns 649,200 records and I can't figure out why! I guess something is wrong with my WHERE criteria in conjunction with the JOIN statement. Please help, thanks.
SELECT DISTINCT dbo.IMPORT_DOCUMENTS.ITEMID, BEGDOC, BATCHID
--, dbo.CATEGORY_COLLECTION_CATEGORY_RESULTS.CATEGORY_ID
FROM IMPORT_DOCUMENTS
--JOIN dbo.CATEGORY_COLLECTION_CATEGORY_RESULTS ON
dbo.CATEGORY_COLLECTION_CATEGORY_RESULTS.ITEMID = dbo.IMPORT_DOCUMENTS.ITEMID
WHERE (BATCHID LIKE 'IC0%' OR BATCHID LIKE 'LP0%')
AND dbo.IMPORT_DOCUMENTS.ITEMID IN
(SELECT dbo.CATEGORY_COLLECTION_CATEGORY_RESULTS.ITEMID FROM
CATEGORY_COLLECTION_CATEGORY_RESULTS
WHERE SCORE >= .7 AND SCORE <= .75 AND CATEGORY_ID IN(
SELECT CATEGORY_ID FROM CATEGORY_COLLECTION_CATS WHERE COLLECTION_ID IN (11,16))
AND Sample_Id > 0)
AND dbo.IMPORT_DOCUMENTS.ITEMID NOT IN
(SELECT ASSIGNMENT_FOLDER_DOCUMENTS.Item_Id FROM ASSIGNMENT_FOLDER_DOCUMENTS)
One possible reason is because one of your tables contains data at lower level, lower than your join key. For example, there may be multiple records per item id. The same item id is repeated X number of times. I would fix the query like the below. Without data knowledge, Try running the below modified query.... If output is not what you're looking for, convert it into SELECT Within a Select...
Hope this helps....
Try this SQL: SELECT DISTINCT a.ITEMID, a.BEGDOC, a.BATCHID, b.CATEGORY_ID FROM IMPORT_DOCUMENTS a JOIN (SELECT DISTINCT ITEMID FROM CATEGORY_COLLECTION_CATEGORY_RESULTS WHERE SCORE >= .7 AND SCORE <= .75 AND CATEGORY_ID IN (SELECT DISTINCT CATEGORY_ID FROM CATEGORY_COLLECTION_CATS WHERE COLLECTION_ID IN (11,16)) AND Sample_Id > 0) B ON a.ITEMID =b.ITEMID WHERE a.(a.BATCHID LIKE 'IC0%' OR a.BATCHID LIKE 'LP0%') AND a.ITEMID NOT IN (SELECT DIDTINCT Item_Id FROM ASSIGNMENT_FOLDER_DOCUMENTS)

postgis advanced (?) selection query

The problem: I need to select, for each building in my table that has say at least 2 pharmacies and 2 education centers within a radius of 1km, all POIs (pharmacies, comercial centres, medical centers, education centers, police stations, fire stations) which are within 1km of the respective building. table structure->
building (id serial, name varchar )
poi_category(id serial, cname varchar) --cname being the category name of course
poi(id serial, name varchar, c_id integer)-- c_id is the FK referencing poi_category(id)
all coordinate columns are of type geometry not geography (let's call them geom)
here's the way i thought it should be done but i'm not sure it's even correct let alone the optimal solution to this problem
SELECT r.id_b, r.id_p
FROM (
SELECT b.id AS id_b, p.id AS id_p, pc.id AS id_pc,pc.cname
FROM building AS b, poi AS p, poi_category AS pc
WHERE ST_DWithin(b.geom,p.geom, 1000) AND p.c_id=pc.id
) AS r,
(
SELECT * FROM r GROUP BY id_b
) AS r1
HAVING count (
SELECT *
FROM r, r1
WHERE r1.id_b=r.id_b AND r.id_pc='pharmacy'
)>1
AND
count (
SELECT *
FROM r, r1
WHERE r1.id_b=r.id_b AND r.id_pc='ed. centre'
)>1
Is this the way to go for what i need ? What solution would be better from a performance point of view? What about the most elegant solution?
I've also posted here :http://gis.stackexchange.com/questions/11445/postgis-advanced-selection-query
This is a solution I elaborated. It's the fastest one I could find but it's still slow. Given the nature of the task I doubt it can be made faster...
WITH
building AS (
SELECT way, osm_id
FROM osm_polygon
WHERE tags #> hstore('building','yes')
--ORDER BY 1
LIMIT 1000
),
pharmacy AS (
SELECT way
FROM osm_poi
WHERE tags #> hstore('amenity','pharmacy')
),
school AS (
SELECT way
FROM osm_poi
WHERE tags #> hstore('amenity','school')
)
SELECT ST_AsText(building.way) AS geom, building.osm_id AS label
FROM building
WHERE
(SELECT count(*) > 1
FROM pharmacy
WHERE ST_DWithin(building.way,pharmacy.way,1000))
AND
(SELECT count(*) > 1
FROM school
WHERE ST_DWithin(building.way,school.way,1000))
Yours. S.