I have thought of following query but it seems syntactically wrong.
delete from table where geom_line in(select distinct(a.geom_line) from table
a ,table b where a.source=b.target and b.source=a.target);
my idea is to delete on of the edges of there exists edges where same source to target is target to source in another edge that is two edges for nodes 1 and 2, from 1->2 and 2->1 and i will delete one of them.
When running only the subquery i get distinct edges with there "geom_lines" as 2000 rows while when i run the whole query 4000 rows are deleted what could be wrong here?
That should be as simple as
DELETE FROM "table" a
USING "table" b
WHERE a.source = b.target
AND a.target = b.source
AND a.source > a.target;
Related
I have seen a strange behavior in PostgreSQL (PostGIS).
I have two tables in PostGIS with geometry columns. one table is a grid and the other one is lines. I want to delete all grid cells that no line passes through them.
In other words, I want to delete the rows from a table when that row has no spatial intersection with any rows of second table.
First, in a subquery, I find the ids of the rows that have any intersection. Then, I delete any row that its id is not in that returned list of ids.
DELETE FROM base_grid_916453354
WHERE id NOT IN
(
SELECT DISTINCT bg.id
FROM base_grid_916453354 bg, (SELECT * FROM tracks_heatmap_1000
LIMIT 100000) tr
WHERE bg.geom && tr.buffer
);
The following subquery returns in only 12 seconds
SELECT DISTINCT bg.id
FROM base_grid_916453354 bg, (SELECT * FROM tracks_heatmap_1000 LIMIT
100000) tr
WHERE bg.geom && tr.buffer
, while the whole query did not return even in 1 hour!!
I ran explain command and it is the result of it, but I cannot interpret it:
How can I improve this query and why deleting from the returned list takes so much of time?
It is very strange because the subquery is a spatial query between 2 tables of 9 million and 100k rows, while the delete part is just checking a list and deleting!! In my mind, the delete part is much much easier.
Don't post text as images of text!
Increase work_mem until the subplan becomes a hashed subplan.
Or rewrite it to use 'NOT EXISTS' rather than NOT IN
I found a fast way to do this query. As #jjanes said, I used EXISTS() function:
DELETE FROM base_grid_916453354 bg
WHERE NOT EXISTS
(
SELECT 1
FROM tracks_heatmap_1000 tr
WHERE bg.geom && tr.buffer
);
This query takes around 1 minute and it is acceptable for the size of my tables.
Several of the same geometries appears in the database. I an output with only one object_id for every distinct geometry.
I have managed to get a result with two overlapping geometries.
SELECT
a.objekt_id,
b.objekt_id,
a.geometri
b.geometri
from
plandk.theme_pdk_tilslutningspligtomraade_vedtaget_v a,
plandk.theme_pdk_tilslutningspligtomraade_vedtaget_v b
WHERE
ST_EQUALS(a.geometri, b.geometri)
AND
a.objekt_id != b.objekt_id;
The result is a table showing a row for two overlapping geometries. Though sometimes there are three rows with six overlapping geometries. I want the result to have all these in one row.
DELETE FROM plandk.theme_pdk_tilslutningspligtomraade_vedtaget_v t
WHERE EXISTS (SELECT FROM plandk.theme_pdk_tilslutningspligtomraade_vedtaget_v t2
WHERE t.objekt_id > t2.objekt_id
AND ST_EQUALS(a.geometri, b.geometri))
It will delete all doubled geometries except this with lowest id for each equal geometry group.
The query is like so:
CREATE TABLE Edge_Table AS
SELECT a.*gid,nextval('ty') AS edge_gid,
ST_SetSRID(ST_MakeLine(a.geom, getcentroids(a.gid)),4326) AS geom_line
FROM Points_table a;
where my getcentroids Function returns 8 nearest points to each point creating an edge with each one of them, the problem arises with duplicates as same edge is created from 1->2 and 2->1, how do I optimise this query itself as large data has to be processed, could index or UNIQUE constraint help?
I am currently trying to join two tables, where both of the tables have very many different in the columns I am joining.
Here's the tsql
from AVG(Position) as Position from MonitoringGsc_Keywords as sk
Join GSC_RankingData on sk.Id = GSC_RankingData.KeywordId
groupy by sk.Id
The execution plan shows me, that it takes very much time to perform the join. I think it is because a huge group from the first table has to be compared with a huge group of values in the second table.
MonitoringGsc_Keywords.Id has about 60.000 different values
GSC_RankingData hat about 100.000.000 Values
MonitoringGsc_Keywords.Id is Primary-Key of MonitoringGsc_Keywords GSC_RankingData.KeywordId is indexed.
So, what can i do to increase performance?
Is Position column from GSC_RankingData table? If yes then JOIN is redundant and query should looks like this:
SELECT AVG(rd.Position) as Position
FROM GSC_RankingData rd
GROUP BY rd.KeywordId;
If Position column is in GSC_RankingData table then index on GSC_RankingData should include this column and looks like this:
CREATE INDEX IX_GSC_RankingData_KeywordId_Position ON GSC_RankingData(KeywordId) INCLUDE(Position);
You should check indexes fragmentation for this tables, to do this you could use this query:
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('MonitoringGsc_Keywords'), null, null, 'DETAILED')
if avg_fragmentation_in_percent > 5% and < 30% then
ALTER INDEX [index name] on [table name] REORGANIZE;
if avg_fragmentation_in_percent >= 30% then
ALTER INDEX [index name] on [table name] REBUILD;
It could be problem with statistics, you could check it with query:
SELECT
sp.stats_id, name, filter_definition, last_updated, rows, rows_sampled,
steps, unfiltered_rows, modification_counter
FROM sys.stats AS stat
CROSS APPLY sys.dm_db_stats_properties(stat.object_id, stat.stats_id) AS sp
WHERE stat.object_id = object_id('GSC_RankingData');
check last update date, rows count, if it not be current then update statistics. Also it could be possible that statistics not exist, then you must create it.
I have two tables, table1 and table2, both of which contain columns that store postgis geometries. What I want to do is see where the geometry stored in any row of table2 geometrically intersects with the geometry stored in any row of table1 and update a count column in table1 with the number of intersections. Therefore, if I have a geometry in row 1 of table1 that intersects with the geometries stored in 5 rows in table2, I want to store a count of 5 in a separate column in table one. The tricky part for me is that I want to do this for every row of column 1 at the same time.
I have the following:
UPDATE circles SET intersectCount = intersectCount + 1 FROM rectangles
WHERE ST_INTERSECTS(cirlces.geom, rectangles.geom);
...which doesn't seem to be working. I'm not too familiar with postgres (or sql in general) and I'm wondering if I can do this all in one statement or if I need a few. I have some ideas for how I would do this with multiple statements (or using for loop) but I'm really looking for a concise solution. Any help would be much appreciated.
Thanks!
something like:
update t1 set ctr=helper.ctr
from (
select t1.id, count(*) as cnt
from t1, t2
where st_intersects(t1.col, t2.col)
group by t1.id
) helper
where helper.id=t1.id
?
btw: Your version does not work, because a row can get updated only once in a single update statement.