Postgres version 9.1.9
The following query produces different plans when run in two different databases.
explain (analyze,buffers) SELECT group_.groupid AS groupId,
group_.name AS groupName,
group_.type_ AS groupType,
group_.friendlyurl AS groupFriendlyURL
FROM group_
inner join groups_orgs
ON ( groups_orgs.groupid = group_.groupid )
inner join users_orgs
ON ( users_orgs.organizationid = groups_orgs.organizationid )
WHERE ( group_.livegroupid = 0 )
AND ( users_orgs.userid = '27091470' )
AND ( group_.companyid = '20002' )
AND ( group_.classnameid = 10001
OR group_.classnameid = 10003 )
AND ( group_.name != 'Control Panel' )
AND ( group_.type_ != 4 )
;
Plan from the production database:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=6.36..16.60 rows=1 width=37) (actual time=0.133..95.323 rows=3 loops=1)
Merge Cond: (group_.groupid = groups_orgs.groupid)
Buffers: shared hit=30829
-> Index Scan using group__pkey on group_ (cost=0.00..87997.62 rows=17244 width=37) (actual time=0.030..85.166 rows=13906 loops=1)
Filter: (((name)::text <> 'Control Panel'::text) AND (type_ <> 4) AND (livegroupid = 0) AND (companyid = 20002::bigint) AND ((classnameid = 10001) OR (classnameid = 10003)))
Buffers: shared hit=30824
-> Sort (cost=6.36..6.37 rows=3 width=8) (actual time=0.076..0.079 rows=3 loops=1)
Sort Key: groups_orgs.groupid
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=5
-> Merge Join (cost=1.05..6.34 rows=3 width=8) (actual time=0.045..0.054 rows=3 loops=1)
Merge Cond: (users_orgs.organizationid = groups_orgs.organizationid)
Buffers: shared hit=5
-> Index Scan using users_orgs_pkey on users_orgs (cost=0.00..10.47 rows=2 width=8) (actual time=0.012..0.014 rows=2 loops=1)
Index Cond: (userid = 27091470::bigint)
Buffers: shared hit=4
-> Sort (cost=1.05..1.06 rows=3 width=16) (actual time=0.028..0.030 rows=3 loops=1)
Sort Key: groups_orgs.organizationid
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1
-> Seq Scan on groups_orgs (cost=0.00..1.03 rows=3 width=16) (actual time=0.003..0.005 rows=3 loops=1)
Buffers: shared hit=1
Plan from the database that was created by exporting/importing data from production:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..18.19 rows=1 width=36) (actual time=0.053..0.104 rows=3 loops=1)
Buffers: shared hit=18
-> Nested Loop (cost=0.00..9.77 rows=1 width=8) (actual time=0.036..0.065 rows=3 loops=1)
Join Filter: (groups_orgs.organizationid = users_orgs.organizationid)
Buffers: shared hit=6
-> Seq Scan on groups_orgs (cost=0.00..1.03 rows=3 width=16) (actual time=0.007..0.010 rows=3 loops=1)
Buffers: shared hit=1
-> Materialize (cost=0.00..8.66 rows=2 width=8) (actual time=0.008..0.012 rows=3 loops=3)
Buffers: shared hit=5
-> Index Scan using ix_fb646ca6 on users_orgs (cost=0.00..8.65 rows=2 width=8) (actual time=0.016..0.021 rows=3 loops=1)
Index Cond: (userid = 27091470::bigint)
Buffers: shared hit=5
-> Index Scan using group__pkey on group_ (cost=0.00..8.41 rows=1 width=36) (actual time=0.008..0.010 rows=1 loops=3)
Index Cond: (groupid = groups_orgs.groupid)
Filter: (((name)::text <> 'Control Panel'::text) AND (type_ <> 4) AND (livegroupid = 0) AND (companyid = 20002::bigint) AND ((classnameid = 10001) OR (classnameid = 10003)))
Buffers: shared hit=12
The query takes around 100 ms in production and about 0.1 ms in the other DB.
The difference seems to be the slow index scan on the group_ table (Index Scan using group__pkey on group_).
Can anyone explain the difference in execution time?
Tables in production are regularly vacuumed and analyzed.
The production DB is busier than the other DB.
Thanks,
Sameer
I was facing the same issue: in the "backup" database a query was blazing fast, while the same query run in the production database was horribly slow. It turned out I only needed to perform Routine Database Maintenance Tasks more often to solve the problem:
The PostgreSQL query planner relies on statistical information about the contents of tables in order to generate good plans for queries. These statistics are gathered by the ANALYZE command, which can be invoked by itself or as an optional step in VACUUM. It is important to have reasonably accurate statistics, otherwise poor choices of plans may degrade database performance.
So if you're experiencing a similar issue, try running the following command on your "slow" database:
$ vacuumdb --host localhost --port 5432 --username "MyUser" -d "MyDatabase" --analyze --verbose
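The same maintenance can also be run from an SQL session. This is a minimal sketch, assuming you are connected to the slow database with sufficient privileges; the table name group_ is borrowed from the question above purely as an example:

```sql
-- Reclaim dead rows and refresh planner statistics for one table.
VACUUM (VERBOSE, ANALYZE) group_;

-- See when each table was last vacuumed/analyzed (manually or by autovacuum)
-- and how many dead rows it is currently carrying.
SELECT relname,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```

If the timestamps are old or empty for the tables in the slow query, stale statistics are a plausible suspect.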
Related
I have a PostgreSQL database that I cloned.
Database 1 has varchar(36) as primary keys
Database 2 (the clone) has UUID as primary keys.
Both contain the same data. What I don't understand is why queries on Database 1 will use the index but Database 2 will not. Here's the query:
EXPLAIN (ANALYZE, BUFFERS)
select * from table1
INNER JOIN table2 on table1.id = table2.table1_id
where table1.id in (
'541edffc-7179-42db-8c99-727be8c9ffec',
'eaac06d3-e44e-4e4a-8e11-1cdc6e562996'
);
Database 1
Nested Loop (cost=16.13..7234.96 rows=14 width=803) (actual time=0.072..0.112 rows=8 loops=1)
Buffers: shared hit=23
-> Index Scan using table1_pk on table1 (cost=0.56..17.15 rows=2 width=540) (actual time=0.042..0.054 rows=2 loops=1)
Index Cond: ((id)::text = ANY ('{541edffc-7179-42db-8c99-727be8c9ffec,eaac06d3-e44e-4e4a-8e11-1cdc6e562996}'::text[]))
Buffers: shared hit=12
-> Bitmap Heap Scan on table2 (cost=15.57..3599.86 rows=904 width=263) (actual time=0.022..0.023 rows=4 loops=2)
Recheck Cond: ((table1_id)::text = (table1.id)::text)
Heap Blocks: exact=3
Buffers: shared hit=11
-> Bitmap Index Scan on table2_table1_id_fk (cost=0.00..15.34 rows=904 width=0) (actual time=0.019..0.019 rows=4 loops=2)
Index Cond: ((table1_id)::text = (table1.id)::text)
Buffers: shared hit=8
Planning:
Buffers: shared hit=416
Planning Time: 1.869 ms
Execution Time: 0.330 ms
Database 2
Gather (cost=1000.57..1801008.91 rows=14 width=740) (actual time=11.580..42863.893 rows=8 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=863 read=631539 dirtied=631979 written=2523
-> Nested Loop (cost=0.56..1800007.51 rows=6 width=740) (actual time=28573.119..42856.696 rows=3 loops=3)
Buffers: shared hit=863 read=631539 dirtied=631979 written=2523
-> Parallel Seq Scan on table1 (cost=0.00..678896.46 rows=1 width=519) (actual time=28573.112..42855.524 rows=1 loops=3)
Filter: (id = ANY ('{541edffc-7179-42db-8c99-727be8c9ffec,eaac06d3-e44e-4e4a-8e11-1cdc6e562996}'::uuid[]))
Rows Removed by Filter: 2976413
Buffers: shared hit=854 read=631536 dirtied=631979 written=2523
-> Index Scan using table2_table1_id_fk on table2 (cost=0.56..1117908.70 rows=320236 width=221) (actual time=1.736..1.745 rows=4 loops=2)
Index Cond: (table1_id = table1.id)
Buffers: shared hit=9 read=3
Planning:
Buffers: shared hit=376 read=15
Planning Time: 43.594 ms
Execution Time: 42864.044 ms
Some notes:
The query is orders of magnitude faster in Database 1
Having only one ID in the WHERE clause activates the index in both databases
Casting to ::uuid has no impact
I understand that these results are because the query planner calculates that the cost of the index in the UUID (Database 2) case is too high. But I'm trying to understand why it thinks that and if there's something I can do.
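One way to investigate is sketched below. It is a diagnostic, not a confirmed fix, and it rests on an assumption: that the clone's planner statistics may be missing or stale. What prompts it is the per-row estimate on table2, which is 320236 rows in Database 2 against 904 in Database 1, while the actual count is 4. Refreshing the statistics and then temporarily discouraging sequential scans shows the cost the planner assigns to the index-based plan:

```sql
-- Refresh planner statistics on the cloned tables (names from the question).
ANALYZE table1;
ANALYZE table2;

-- Diagnostic only: discourage sequential scans for this session so that
-- EXPLAIN reveals the cost of the index-based alternative.
SET enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM table1
INNER JOIN table2 ON table1.id = table2.table1_id
WHERE table1.id IN (
    '541edffc-7179-42db-8c99-727be8c9ffec',
    'eaac06d3-e44e-4e4a-8e11-1cdc6e562996'
);

-- Restore the default planner behaviour.
RESET enable_seqscan;
```

If the index plan runs fast but is costed far higher than the sequential scan, the row estimates are the place to look. enable_seqscan should not be left off permanently.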
I have this very slow query:
SELECT DISTINCT et.id
FROM elementtype et
where et.id = any
(SELECT elementtypeid
FROM
(SELECT ic.elementtypeid
FROM
(SELECT categoryid
FROM issue
WHERE clientid = '833e1f2f-ff44-4aca-bd12-0e4f67969a11'
AND deleteddate IS NULL
GROUP BY categoryid) i
JOIN issuecategory ic ON ic.id = i.categoryid
UNION SELECT tc.elementtypeid
FROM
(SELECT categoryid
FROM task
WHERE clientid = '833e1f2f-ff44-4aca-bd12-0e4f67969a11'
AND deleteddate IS NULL
GROUP BY categoryid) t
JOIN taskcategory tc ON tc.id = t.categoryid) icc)
I have tried replacing the ANY operator with IN, and using a join instead of IN (in line 3 of the query), but it is still very slow when the result is not cached.
I think the nested loop might be causing the problem, but I don't know if I can get rid of it - and why et only
As you can see, I use a couple of indexes (_idx) and of course primary keys on every table.
the elementtype table has ~6000 rows
the issue sub-query with these conditions (not group by) returns ~33000 rows
the task sub-query with these conditions (not group by) returns ~148000 rows
Is there any way to optimize the query?
EDIT:
As requested by @a_horse_with_no_name, I am adding a query plan produced with the command he/she suggested. I think the best way to post it here is as an image:
QUERY PLAN
Unique (cost=473976.82..474453.63 rows=4453 width=16) (actual time=69897.728..69897.737 rows=1 loops=1)
Buffers: shared hit=61346 read=19651
-> Merge Join (cost=473976.82..474442.49 rows=4453 width=16) (actual time=69897.724..69897.731 rows=1 loops=1)
Merge Cond: (et.id = ic.elementtypeid)
Buffers: shared hit=61346 read=19651
-> Index Only Scan using elementtype_pkey on elementtype et (cost=0.28..384.47 rows=5879 width=16) (actual time=0.021..32.618 rows=1784 loops=1)
Heap Fetches: 1784
Buffers: shared hit=1699 read=54
-> Sort (cost=473976.54..473987.67 rows=4453 width=16) (actual time=69863.461..69863.464 rows=1 loops=1)
Sort Key: ic.elementtypeid
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=59647 read=19597
-> HashAggregate (cost=473617.61..473662.14 rows=4453 width=16) (actual time=69863.432..69863.436 rows=1 loops=1)
Group Key: ic.elementtypeid
Buffers: shared hit=59647 read=19597
-> Append (cost=107927.43..473606.48 rows=4453 width=16) (actual time=114.259..69863.317 rows=55 loops=1)
Buffers: shared hit=59647 read=19597
-> Hash Join (cost=107927.43..109170.43 rows=3625 width=16) (actual time=114.257..208.716 rows=46 loops=1)
Hash Cond: (ic.id = issue.categoryid)
Buffers: shared hit=15431
-> Seq Scan on issuecategory ic (cost=0.00..1100.36 rows=54336 width=32) (actual time=0.011..47.327 rows=54336 loops=1)
Buffers: shared hit=557
-> Hash (cost=107882.12..107882.12 rows=3625 width=16) (actual time=113.850..113.850 rows=46 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 35kB
Buffers: shared hit=14874
-> HashAggregate (cost=107809.62..107845.87 rows=3625 width=16) (actual time=113.738..113.795 rows=46 loops=1)
Group Key: issue.categoryid
Buffers: shared hit=14874
-> Bitmap Heap Scan on issue (cost=1801.41..107730.88 rows=31493 width=16) (actual time=7.279..81.266 rows=33670 loops=1)
Recheck Cond: (clientid = '833e1f2f-ff44-4aca-bd12-0e4f67969a11'::uuid)
Filter: (deleteddate IS NULL)
Rows Removed by Filter: 1362
Heap Blocks: exact=14636
Buffers: shared hit=14874
-> Bitmap Index Scan on issue_clientid_ix (cost=0.00..1793.54 rows=32681 width=0) (actual time=5.165..5.166 rows=35064 loops=1)
Index Cond: (clientid = '833e1f2f-ff44-4aca-bd12-0e4f67969a11'::uuid)
Buffers: shared hit=238
-> Nested Loop (cost=360635.19..364391.52 rows=828 width=16) (actual time=69603.779..69654.505 rows=9 loops=1)
Buffers: shared hit=44216 read=19597
-> HashAggregate (cost=360634.78..360643.06 rows=828 width=16) (actual time=69592.635..69592.657 rows=9 loops=1)
Group Key: task.categoryid
Buffers: shared hit=44198 read=19579
-> Bitmap Heap Scan on task (cost=3438.67..360280.46 rows=141728 width=16) (actual time=33.283..69416.182 rows=147931 loops=1)
Recheck Cond: (clientid = '833e1f2f-ff44-4aca-bd12-0e4f67969a11'::uuid)
Filter: (deleteddate IS NULL)
Rows Removed by Filter: 2329
Heap Blocks: exact=63193
Buffers: shared hit=44198 read=19579
-> Bitmap Index Scan on task_clientid_ix (cost=0.00..3403.24 rows=148091 width=0) (actual time=20.865..20.866 rows=150975 loops=1)
Index Cond: (clientid = '833e1f2f-ff44-4aca-bd12-0e4f67969a11'::uuid)
Buffers: shared hit=584
-> Index Scan using taskcategory_pkey on taskcategory tc (cost=0.42..4.52 rows=1 width=32) (actual time=6.865..6.865 rows=1 loops=9)
Index Cond: (id = task.categoryid)
Buffers: shared hit=18 read=18
Planning time: 1.173 ms
Execution time: 69899.380 ms
EDIT2:
issuecategory has indexes on id, clientid, elementtypeid
issue has indexes on clientid, deleteddate and categoryid
taskcategory has indexes on id, clientid, elementtypeid
task has indexes on clientid, id, deleteddate, categoryid
The problem is the bitmap heap scans. They seem to be jumping to a lot of different parts of the disk to fetch the data they need.
The best solution is probably to create indexes on (clientid, categoryid, deleteddate) on each table, or maybe (clientid, categoryid) where deleteddate is null. This will allow those bitmap heap scans to be replaced with index-only scans (assuming your tables are vacuumed well enough).
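The two index variants suggested above could look roughly like this. The index names are hypothetical; the table and column names come from the question. Pick one option per table rather than creating both:

```sql
-- Option 1: three-column indexes covering every column the subqueries touch.
CREATE INDEX task_client_cat_del_ix
    ON task (clientid, categoryid, deleteddate);
CREATE INDEX issue_client_cat_del_ix
    ON issue (clientid, categoryid, deleteddate);

-- Option 2: smaller partial indexes that only contain non-deleted rows.
CREATE INDEX task_client_cat_live_ix
    ON task (clientid, categoryid)
    WHERE deleteddate IS NULL;
CREATE INDEX issue_client_cat_live_ix
    ON issue (clientid, categoryid)
    WHERE deleteddate IS NULL;

-- Index-only scans rely on an up-to-date visibility map, so vacuum afterwards.
VACUUM (ANALYZE) task;
VACUUM (ANALYZE) issue;
```

Re-running EXPLAIN (ANALYZE, BUFFERS) afterwards shows whether the planner actually switches to index-only scans.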
Other approaches would be to CLUSTER the tables so that rows with the same clientid are physically grouped together, or increase effective_io_concurrency so more IO can be done at the same time (assuming your storage system has multiple spindles in RAID/JBOD, or whatever the SSD equivalent to that is).
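A sketch of those two alternatives follows, using the index names visible in the posted plan (task_clientid_ix, issue_clientid_ix). The effective_io_concurrency value is illustrative only and has to be tuned to the storage in use:

```sql
-- Physically reorder each table by clientid. CLUSTER rewrites the table under
-- an ACCESS EXCLUSIVE lock, and the ordering is not maintained for later
-- inserts and updates, so it has to be repeated periodically.
CLUSTER task USING task_clientid_ix;
CLUSTER issue USING issue_clientid_ix;

-- Refresh statistics after the rewrite.
ANALYZE task;
ANALYZE issue;

-- Let bitmap heap scans issue more prefetch requests in parallel
-- (session-level setting; the value 16 is an assumption, not a recommendation).
SET effective_io_concurrency = 16;
```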
I have the following two tables.
person_addresses
address_normalization
The person_addresses table has a field named address_id as the primary key and address_normalization has the corresponding field address_id which has an index on it.
Now, when I explain the following query, I see a sequential scan.
SELECT
count(*)
FROM
mp_member2.person_addresses pa
JOIN mp_member2.address_normalization an ON
an.address_id = pa.address_id
WHERE
an.sr_modification_time >= 1550692189468;
-- Result: 2654
Please refer to the following screenshot.
You can see that there is a sequential scan after the hash join. I'm not sure I understand this part: why would a sequential scan follow a hash join?
And as seen in the query above, the set of records returned is also low.
Is this expected behaviour or am I doing something wrong?
Update #1: I also have indices on the sr_modification_time fields of both the tables
Update #2: Full execution plan
Aggregate (cost=206944.74..206944.75 rows=1 width=0) (actual time=2807.844..2807.844 rows=1 loops=1)
Buffers: shared hit=4629 read=82217
-> Hash Join (cost=2881.95..206825.15 rows=47836 width=0) (actual time=0.775..2807.160 rows=2654 loops=1)
Hash Cond: (pa.address_id = an.address_id)
Buffers: shared hit=4629 read=82217
-> Seq Scan on person_addresses pa (cost=0.00..135924.93 rows=4911993 width=8) (actual time=0.005..1374.610 rows=4911993 loops=1)
Buffers: shared hit=4588 read=82217
-> Hash (cost=2432.05..2432.05 rows=35992 width=18) (actual time=0.756..0.756 rows=1005 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 41kB
Buffers: shared hit=41
-> Index Scan using mp_member2_address_normalization_mod_time on address_normalization an (cost=0.43..2432.05 rows=35992 width=18) (actual time=0.012..0.424 rows=1005 loops=1)
Index Cond: (sr_modification_time >= 1550692189468::bigint)
Buffers: shared hit=41
Planning time: 0.244 ms
Execution time: 2807.885 ms
Update #3: I tried with a newer timestamp and it used an index scan.
EXPLAIN (
ANALYZE
, buffers
, format TEXT
) SELECT
COUNT(*)
FROM
mp_member2.person_addresses pa
JOIN mp_member2.address_normalization an ON
an.address_id = pa.address_id
WHERE
an.sr_modification_time >= 1557507300342;
-- count: 1364
Query Plan:
Aggregate (cost=295.48..295.49 rows=1 width=0) (actual time=2.770..2.770 rows=1 loops=1)
Buffers: shared hit=1404
-> Nested Loop (cost=4.89..295.43 rows=19 width=0) (actual time=0.038..2.491 rows=1364 loops=1)
Buffers: shared hit=1404
-> Index Scan using mp_member2_address_normalization_mod_time on address_normalization an (cost=0.43..8.82 rows=14 width=18) (actual time=0.009..0.142 rows=341 loops=1)
Index Cond: (sr_modification_time >= 1557507300342::bigint)
Buffers: shared hit=14
-> Bitmap Heap Scan on person_addresses pa (cost=4.46..20.43 rows=4 width=8) (actual time=0.004..0.005 rows=4 loops=341)
Recheck Cond: (address_id = an.address_id)
Heap Blocks: exact=360
Buffers: shared hit=1390
-> Bitmap Index Scan on idx_mp_member2_person_addresses_address_id (cost=0.00..4.46 rows=4 width=0) (actual time=0.003..0.003 rows=4 loops=341)
Index Cond: (address_id = an.address_id)
Buffers: shared hit=1030
Planning time: 0.214 ms
Execution time: 2.816 ms
That is the expected behavior when there is no usable index on sr_modification_time: after building the hash for the join, the database has to scan the whole set and check the sr_modification_time value of each row.
You should create:
an index on (sr_modification_time)
or a composite index on (address_id, sr_modification_time)
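As a sketch, the two suggested indexes could be created as follows. The index names are hypothetical. Note that the plans posted in the question already show an index scan using mp_member2_address_normalization_mod_time, so the first index may already exist under that name:

```sql
-- Single-column index on the filter column.
CREATE INDEX address_normalization_mod_time_ix
    ON mp_member2.address_normalization (sr_modification_time);

-- Composite index on the join column plus the filter column.
CREATE INDEX address_normalization_addr_mod_time_ix
    ON mp_member2.address_normalization (address_id, sr_modification_time);
```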
I have a PostgreSQL DB containing various materialized views. Running a query, query_a, takes 2800 ms the first time and 53 ms when the same query is re-run. This can be explained by PostgreSQL's caching. Now comes the tricky part: I create a dump of this DB and restore it into NewDB. When I run query_a there, it takes 2253 ms, and re-running it takes about the same time, i.e., it seems that caching is not working on NewDB.
I ran various experiments to fix this and noticed that explicitly refreshing the views brings no improvement, but if I drop the views and re-create them in NewDB, I get the original performance back.
Note that the data is constant in DB and NewDB and I have used the commands mentioned here for dump creation and restore.
The result for re running the query on DB is ->
The results for running the same query on NewDB for 1st and 2nd time are as follows ->
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=113790614477.61..113790614477.62 rows=1 width=8) (actual time=2284.605..2284.605 rows=1 loops=1)
Buffers: shared hit=3540872
CTE t
-> Merge Join (cost=40600.92..11846650.56 rows=763041594 width=425) (actual time=3.693..1909.916 rows=6005 loops=1)
Merge Cond: (n.node_id = nd.node_id)
Buffers: shared hit=3524063
-> Index Scan using nodes_node_id on nodes n (cost=0.43..350865.91 rows=3824099 width=389) (actual time=0.014..1651.025 rows=3598491 loops=1)
Buffers: shared hit=3523372
-> Sort (cost=40600.49..40700.26 rows=39907 width=40) (actual time=3.668..4.227 rows=6005 loops=1)
Sort Key: nd.node_id
Sort Method: quicksort Memory: 623kB
Buffers: shared hit=691
-> Bitmap Heap Scan on nodes_depths nd (cost=1153.11..37550.73 rows=39907 width=40) (actual time=0.627..2.846 rows=6005 loops=1)
Recheck Cond: ((ancestor_1 = 1) OR (ancestor_2 = 1))
Heap Blocks: exact=658
Buffers: shared hit=691
-> BitmapOr (cost=1153.11..1153.11 rows=40007 width=0) (actual time=0.547..0.547 rows=0 loops=1)
Buffers: shared hit=33
-> Bitmap Index Scan on nodes_depths_1 (cost=0.00..566.58 rows=20003 width=0) (actual time=0.032..0.032 rows=156 loops=1)
Index Cond: (ancestor_1 = 1)
Buffers: shared hit=4
-> Bitmap Index Scan on nodes_depths_2 (cost=0.00..566.58 rows=20003 width=0) (actual time=0.515..0.515 rows=5849 loops=1)
Index Cond: (ancestor_2 = 1)
Buffers: shared hit=29
-> Merge Right Join (cost=169565733.26..97549168801.28 rows=6491839610305 width=0) (actual time=1915.721..2284.175 rows=6005 loops=1)
Merge Cond: (nodes_fts.node_id = t.node_id)
Buffers: shared hit=3540872
-> Index Only Scan using nodes_fts_idx on nodes_fts (cost=0.43..97055.96 rows=1701569 width=4) (actual time=0.041..277.890 rows=1598712 loops=1)
Heap Fetches: 1598712
Buffers: shared hit=16805
-> Materialize (cost=169565732.84..173380940.81 rows=763041594 width=4) (actual time=1915.675..1916.583 rows=6005 loops=1)
Buffers: shared hit=3524067
-> Sort (cost=169565732.84..171473336.82 rows=763041594 width=4) (actual time=1915.672..1916.057 rows=6005 loops=1)
Sort Key: t.node_id
Sort Method: quicksort Memory: 474kB
Buffers: shared hit=3524067
-> CTE Scan on t (cost=0.00..15260831.88 rows=763041594 width=4) (actual time=3.698..1914.771 rows=6005 loops=1)
Buffers: shared hit=3524063
Planning time: 68.064 ms
Execution time: 2285.084 ms
(40 rows)
and for the second run ->
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=113790614477.61..113790614477.62 rows=1 width=8) (actual time=2295.319..2295.319 rows=1 loops=1)
Buffers: shared hit=3540868
CTE t
-> Merge Join (cost=40600.92..11846650.56 rows=763041594 width=425) (actual time=15.324..1926.744 rows=6005 loops=1)
Merge Cond: (n.node_id = nd.node_id)
Buffers: shared hit=3524063
-> Index Scan using nodes_node_id on nodes n (cost=0.43..350865.91 rows=3824099 width=389) (actual time=0.027..1648.277 rows=3598491 loops=1)
Buffers: shared hit=3523372
-> Sort (cost=40600.49..40700.26 rows=39907 width=40) (actual time=15.254..15.903 rows=6005 loops=1)
Sort Key: nd.node_id
Sort Method: quicksort Memory: 623kB
Buffers: shared hit=691
-> Bitmap Heap Scan on nodes_depths nd (cost=1153.11..37550.73 rows=39907 width=40) (actual time=3.076..10.752 rows=6005 loops=1)
Recheck Cond: ((ancestor_1 = 1) OR (ancestor_2 = 1))
Heap Blocks: exact=658
Buffers: shared hit=691
-> BitmapOr (cost=1153.11..1153.11 rows=40007 width=0) (actual time=2.524..2.525 rows=0 loops=1)
Buffers: shared hit=33
-> Bitmap Index Scan on nodes_depths_1 (cost=0.00..566.58 rows=20003 width=0) (actual time=0.088..0.088 rows=156 loops=1)
Index Cond: (ancestor_1 = 1)
Buffers: shared hit=4
-> Bitmap Index Scan on nodes_depths_2 (cost=0.00..566.58 rows=20003 width=0) (actual time=2.434..2.435 rows=5849 loops=1)
Index Cond: (ancestor_2 = 1)
Buffers: shared hit=29
-> Merge Right Join (cost=169565733.26..97549168801.28 rows=6491839610305 width=0) (actual time=1933.113..2294.894 rows=6005 loops=1)
Merge Cond: (nodes_fts.node_id = t.node_id)
Buffers: shared hit=3540868
-> Index Only Scan using nodes_fts_idx on nodes_fts (cost=0.43..97055.96 rows=1701569 width=4) (actual time=0.077..271.313 rows=1598712 loops=1)
Heap Fetches: 1598712
Buffers: shared hit=16805
-> Materialize (cost=169565732.84..173380940.81 rows=763041594 width=4) (actual time=1933.030..1933.903 rows=6005 loops=1)
Buffers: shared hit=3524063
-> Sort (cost=169565732.84..171473336.82 rows=763041594 width=4) (actual time=1933.026..1933.375 rows=6005 loops=1)
Sort Key: t.node_id
Sort Method: quicksort Memory: 474kB
Buffers: shared hit=3524063
-> CTE Scan on t (cost=0.00..15260831.88 rows=763041594 width=4) (actual time=15.336..1932.145 rows=6005 loops=1)
Buffers: shared hit=3524063
Planning time: 1.154 ms
Execution time: 2295.801 ms
(40 rows)
The estimated number of rows is off from the actual numbers by orders of magnitude:
CTE Scan on t (cost=0.00..15260831.88 rows=763041594 width=4)
(actual time=15.336..1932.145 rows=6005 loops=1)
When Postgres can't accurately estimate how much work one way of executing your query takes compared to another, it will generate inefficient query plans, and that is why the same query can be slow even if all the data is in RAM.
When you back up a table, the dump does not contain the statistics used by the optimizer, so after restoring from the dump you need to either wait for the autovacuum daemon or run ANALYZE manually.
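A minimal sketch of the manual step, to be run in the restored database (NewDB in the question). The relation names are taken from the posted plan:

```sql
-- Collect statistics for every table and materialized view in the database.
ANALYZE VERBOSE;

-- Or limit it to the relations that appear in the slow plan.
ANALYZE nodes;
ANALYZE nodes_depths;
ANALYZE nodes_fts;
```

After this, the row estimates in EXPLAIN (such as rows=763041594 for the CTE scan) should move much closer to the actual counts if missing statistics were the cause.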
I'm using postgres 10, and have the following query
select
count(task.id) over() as _total_ ,
json_agg(u.*) as users,
task.*
from task
left outer join taskuserlink_history tu on (task.id = tu.taskid)
left outer join "user" u on (tu.userId = u.id)
group by task.id offset 10 limit 10;
this query takes approx 800ms to execute
if I remove the count(task.id) over() as _total_ , line, then it executes in 250ms
I have to confess being a complete sql noob, so the query itself may be completely borked
I was wondering if anyone could point to the flaws in the query, and make suggestions on how to speed it up.
The number of tasks is approx 15k, with an average of 5 users per task, linked through taskuserlink
I have looked at the pgadmin "explain" diagram
but to be honest can't really figure it out yet ;)
the table definitions are
task , with id (int) as primary column
taskuserlink_history, with taskId (int) and userId (int) (both as foreign key constraints, indexed)
user, with id (int) as primary column
the query plan is as follows
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=4.74..12.49 rows=10 width=44) (actual time=1178.016..1178.043 rows=10 loops=1)
Buffers: shared hit=3731, temp read=6655 written=6914
-> WindowAgg (cost=4.74..10248.90 rows=13231 width=44) (actual time=1178.014..1178.040 rows=10 loops=1)
Buffers: shared hit=3731, temp read=6655 written=6914
-> GroupAggregate (cost=4.74..10083.51 rows=13231 width=36) (actual time=0.417..1049.294 rows=13255 loops=1)
Group Key: task.id
Buffers: shared hit=3731
-> Nested Loop Left Join (cost=4.74..9586.77 rows=66271 width=36) (actual time=0.103..309.372 rows=66162 loops=1)
Join Filter: (taskuserlink_history.userid = user_archive.id)
Rows Removed by Join Filter: 1182904
Buffers: shared hit=3731
-> Merge Left Join (cost=0.58..5563.22 rows=66271 width=8) (actual time=0.044..73.598 rows=66162 loops=1)
Merge Cond: (task.id = taskuserlink_history.taskid)
Buffers: shared hit=3629
-> Index Only Scan using task_pkey on task (cost=0.29..1938.30 rows=13231 width=4) (actual time=0.026..7.683 rows=13255 loops=1)
Heap Fetches: 13255
Buffers: shared hit=1810
-> Index Scan using taskuserlink_history_task_fk_idx on taskuserlink_history (cost=0.29..2764.46 rows=66271 width=8) (actual time=0.015..40.109 rows=66162 loops=1)
Filter: (timeend IS NULL)
Rows Removed by Filter: 13368
Buffers: shared hit=1819
-> Materialize (cost=4.17..50.46 rows=4 width=36) (actual time=0.000..0.001 rows=19 loops=66162)
Buffers: shared hit=102
-> Bitmap Heap Scan on user_archive (cost=4.17..50.44 rows=4 width=36) (actual time=0.050..0.305 rows=45 loops=1)
Recheck Cond: (archived_at IS NULL)
Heap Blocks: exact=11
Buffers: shared hit=102
-> Bitmap Index Scan on user_unique_username (cost=0.00..4.16 rows=4 width=0) (actual time=0.014..0.014 rows=46 loops=1)
Buffers: shared hit=1
SubPlan 1
-> Aggregate (cost=8.30..8.31 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=45)
Buffers: shared hit=90
-> Index Scan using task_assignedto_idx on task task_1 (cost=0.29..8.30 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=45)
Index Cond: (assignedtoid = user_archive.id)
Buffers: shared hit=90
Planning time: 0.989 ms
Execution time: 1191.451 ms
(37 rows)
without the window function it is
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=4.74..12.36 rows=10 width=36) (actual time=0.510..1.763 rows=10 loops=1)
Buffers: shared hit=91
-> GroupAggregate (cost=4.74..10083.51 rows=13231 width=36) (actual time=0.509..1.759 rows=10 loops=1)
Group Key: task.id
Buffers: shared hit=91
-> Nested Loop Left Join (cost=4.74..9586.77 rows=66271 width=36) (actual time=0.073..0.744 rows=50 loops=1)
Join Filter: (taskuserlink_history.userid = user_archive.id)
Rows Removed by Join Filter: 361
Buffers: shared hit=91
-> Merge Left Join (cost=0.58..5563.22 rows=66271 width=8) (actual time=0.029..0.161 rows=50 loops=1)
Merge Cond: (task.id = taskuserlink_history.taskid)
Buffers: shared hit=7
-> Index Only Scan using task_pkey on task (cost=0.29..1938.30 rows=13231 width=4) (actual time=0.016..0.031 rows=11 loops=1)
Heap Fetches: 11
Buffers: shared hit=4
-> Index Scan using taskuserlink_history_task_fk_idx on taskuserlink_history (cost=0.29..2764.46 rows=66271 width=8) (actual time=0.009..0.081 rows=50 loops=1)
Filter: (timeend IS NULL)
Rows Removed by Filter: 11
Buffers: shared hit=3
-> Materialize (cost=4.17..50.46 rows=4 width=36) (actual time=0.001..0.009 rows=8 loops=50)
Buffers: shared hit=84
-> Bitmap Heap Scan on user_archive (cost=4.17..50.44 rows=4 width=36) (actual time=0.040..0.382 rows=38 loops=1)
Recheck Cond: (archived_at IS NULL)
Heap Blocks: exact=7
Buffers: shared hit=84
-> Bitmap Index Scan on user_unique_username (cost=0.00..4.16 rows=4 width=0) (actual time=0.012..0.012 rows=46 loops=1)
Buffers: shared hit=1
SubPlan 1
-> Aggregate (cost=8.30..8.31 rows=1 width=8) (actual time=0.005..0.005 rows=1 loops=38)
Buffers: shared hit=76
-> Index Scan using task_assignedto_idx on task task_1 (cost=0.29..8.30 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=38)
Index Cond: (assignedtoid = user_archive.id)
Buffers: shared hit=76
Planning time: 0.895 ms
Execution time: 1.890 ms
(35 rows)
I believe the LIMIT clause is making the difference. LIMIT limits the number of rows returned, not necessarily the work involved:
Your second query can be aborted early after 20 rows have been constructed (10 for OFFSET and 10 for LIMIT).
However, your first query needs to go through the whole set to calculate the count(task.id).
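One way to keep the page query cheap is to compute the total separately. This is a sketch under an assumption: because the LEFT JOINs keep every task and the query groups by task.id (the primary key), the window count equals the number of rows in task, so a plain count gives the same number:

```sql
-- Total number of tasks, computed once and independently of the page.
SELECT count(*) AS _total_
FROM task;

-- Page query without the window function; it can stop after 20 groups.
SELECT json_agg(u.*) AS users,
       task.*
FROM task
LEFT OUTER JOIN taskuserlink_history tu ON (task.id = tu.taskid)
LEFT OUTER JOIN "user" u ON (tu.userid = u.id)
GROUP BY task.id
OFFSET 10 LIMIT 10;
```

If filters are added to the page query later, the same filters have to be applied to the count query for the totals to match.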
Not what you were asking, but I say it anyway:
"user" is not a table, but a view. That is where both queries actually get slower than they should be (the "Materialize" in the plan).
Using OFFSET for paging is asking for trouble, because it gets slower as the OFFSET increases.
Using OFFSET and LIMIT without an ORDER BY is most likely not what you want. The result sets might not be identical on consecutive calls.
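The last two points can be addressed together with keyset pagination: order by the primary key and continue from the last id of the previous page instead of using OFFSET. This is a sketch; the literal 1234 is a hypothetical placeholder for that last id:

```sql
SELECT json_agg(u.*) AS users,
       task.*
FROM task
LEFT OUTER JOIN taskuserlink_history tu ON (task.id = tu.taskid)
LEFT OUTER JOIN "user" u ON (tu.userid = u.id)
WHERE task.id > 1234      -- hypothetical: last task.id of the previous page
GROUP BY task.id
ORDER BY task.id          -- deterministic page order
LIMIT 10;
```

For the first page, the WHERE clause is simply omitted. The ORDER BY makes consecutive calls return consistent pages, and the cost no longer grows with the page number.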