PostgreSQL stable functions in query - postgresql
I have a data model that consists of a couple of tables, and I need to filter them.
There are two functions, funcFast and funcList. funcFast quickly returns whether the table needs to be filtered by funcList or not. funcList returns the list of allowed ids. I marked the functions as STABLE, but they do not run as fast as I expected :)
I created a couple of example functions:
CREATE OR REPLACE FUNCTION funcFastPlPgSql(res boolean)
returns boolean as $$
begin return res; end
$$ language plpgsql stable;
CREATE OR REPLACE FUNCTION funcList(cnt int)
returns setof integer as $$
select generate_series(1, cnt)
$$ language sql stable;
And the tests.
Case 1. Filtering only by the fast function works OK:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true)
Query plan is:
Aggregate (cost=27.76..27.77 rows=1 width=8) (actual time=573.258..573.259 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..114.327 rows=1000000 loops=1)
-> Result (cost=0.25..20.25 rows=1000 width=0) (actual time=0.038..489.942 rows=1000000 loops=1)
One-Time Filter: funcfastplpgsql(true)
-> CTE Scan on obs (cost=0.25..20.25 rows=1000 width=0) (actual time=0.012..392.504 rows=1000000 loops=1)
Planning time: 0.184 ms
Execution time: 576.177 ms
Case 2. Filtering only by the slow function works OK too:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where id in (select funcList(1000))
Query plan is:
Aggregate (cost=62.26..62.27 rows=1 width=8) (actual time=469.344..469.344 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..106.144 rows=1000000 loops=1)
-> Hash Join (cost=22.25..56.00 rows=500 width=0) (actual time=1.566..469.202 rows=1000 loops=1)
Hash Cond: (obs.id = (funclist(1000)))
-> CTE Scan on obs (cost=0.00..20.00 rows=1000 width=4) (actual time=0.009..359.580 rows=1000000 loops=1)
-> Hash (cost=19.75..19.75 rows=200 width=4) (actual time=1.548..1.548 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 44kB
-> HashAggregate (cost=17.75..19.75 rows=200 width=4) (actual time=1.101..1.312 rows=1000 loops=1)
Group Key: funclist(1000)
-> Result (cost=0.00..5.25 rows=1000 width=4) (actual time=0.058..0.706 rows=1000 loops=1)
Planning time: 0.141 ms
Execution time: 472.183 ms
Case 3. But when the two functions are combined, I expect the best case to be close to [case 1] and the worst case to be close to [case 2], but:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true) or id in (select funcList(1000))
Query plan is:
Aggregate (cost=286.93..286.94 rows=1 width=8) (actual time=1575.775..1575.775 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.008..131.372 rows=1000000 loops=1)
-> CTE Scan on obs (cost=7.75..280.25 rows=667 width=0) (actual time=0.035..1468.007 rows=1000000 loops=1)
Filter: (funcfastplpgsql(true) OR (hashed SubPlan 2))
SubPlan 2
-> Result (cost=0.00..5.25 rows=1000 width=4) (never executed)
Planning time: 0.100 ms
Execution time: 1578.624 ms
What am I missing here? Why does the query with the two functions together run so much longer, and how can I fix it?
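A possible workaround, offered as an untested sketch rather than a confirmed fix: in the case 3 plan, funcfastplpgsql(true) appears in the per-row Filter instead of a One-Time Filter, so it is presumably called once for each of the 1,000,000 rows. Wrapping the call in an uncorrelated scalar subquery normally lets PostgreSQL evaluate it a single time as an InitPlan. No timings are given here because this variant has not been measured.

```sql
-- Sketch: same query as case 3, but the stable function is wrapped in a
-- scalar subquery so the planner can evaluate it once (as an InitPlan)
-- instead of once per row of obs.
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where (select funcFastPlPgSql(true)) or id in (select funcList(1000));
```

If the rewrite takes effect, the Filter line of the plan should reference a parameter such as $0 in place of the function call; it is worth checking the actual plan on your version.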
Related
How to get "memory usage" on postgres CTE query plan?
I'm comparing different queries options in order to improve perfomance and I cannot understand why my CTE query plan does not show the "memory usage" option when I run it with explain analyze. Here's my query: EXPLAIN ANALYZE WITH CTE AS ( SELECT id, jsonb_agg(CASE WHEN each_result->>'another_key' = 'A' THEN jsonb_set(each_result, '{another_key}', '"B"', false) ELSE each_result END ) AS result_updated FROM my_table, jsonb_array_elements(column->'key') as result GROUP BY id ) UPDATE my_table SET column = jsonb_set(column, '{each_result}', (SELECT result_updated FROM CTE WHERE CTE.id = my_table.id), false) ; And here's the query plan result: QUERY PLAN ------------------------------------------------------------------------------------------------------- Update on custom_sequence (cost=3305.32..25976.06 rows=1003 width=83) (actual time=4071.870..4071.870 rows=0 loops=1) CTE cte -> HashAggregate (cost=3292.78..3305.32 rows=1003 width=48) (actual time=536.656..618.699 rows=1003 loops=1) Group Key: my_table_1.id -> Nested Loop (cost=0.01..2039.04 rows=100300 width=48) (actual time=0.323..59.200 rows=78234 loops=1) -> Seq Scan on my_table my_table_1 (cost=0.00..33.03 rows=1003 width=34) (actual time=0.023..0.289 rows=1003 loops=1) -> Function Scan on jsonb_array_elements step (cost=0.01..1.00 rows=100 width=32) (actual time=0.044..0.049 rows=78 loops=1003) -> Seq Scan on my_table (cost=0.00..22670.74 rows=1003 width=83) (actual time=629.743..3701.285 rows=1003 loops=1) SubPlan 2 -> CTE Scan on cte (cost=0.00..22.57 rows=5 width=32) (actual time=1.992..3.482 rows=1 loops=1003) Filter: (id = my_table.id) Rows Removed by Filter: 1002 Planning time: 0.458 ms Execution time: 4081.778 ms (14 rows) My another option is an UPDATE FROM: EXPLAIN ANALYZE UPDATE my_table SET column = jsonb_set(column, '{each_result}', q2.result_updated, false) FROM ( SELECT id, jsonb_agg(CASE WHEN each_result->>'anoter_key' = 'A' THEN jsonb_set(each_result, '{another_key}', '"B"', false) ELSE 
each_result END ) AS result_updated FROM my_table, jsonb_array_elements(column->'key') AS result GROUP BY id ) q2 WHERE q2.id = my_table.id ; When I run query plan for this query, I got the "memory usage" info : QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Update on my_table (cost=3748.30..4223.44 rows=1001 width=153) (actual time=992.003..992.003 rows=0 loops=1) -> Hash Join (cost=3748.30..4223.44 rows=1001 width=153) (actual time=633.204..758.817 rows=1004 loops=1) Hash Cond: (my_table.id = q2.id) -> Seq Scan on my_table (cost=0.00..460.01 rows=1001 width=67) (actual time=0.031..0.550 rows=1004 loops=1) -> Hash (cost=3735.79..3735.79 rows=1001 width=120) (actual time=632.951..632.951 rows=1004 loops=1) Buckets: 1024 (originally 1024) Batches: 2 (originally 1) Memory Usage: 32791kB -> Subquery Scan on q2 (cost=3713.26..3735.79 rows=1001 width=120) (actual time=537.310..605.349 rows=1004 loops=1) -> HashAggregate (cost=3713.26..3725.78 rows=1001 width=48) (actual time=537.306..604.269 rows=1004 loops=1) Group Key: my_table_1.id -> Nested Loop (cost=0.01..2462.01 rows=100100 width=48) (actual time=0.373..47.725 rows=78312 loops=1) -> Seq Scan on my_table my_table_1 (cost=0.00..460.01 rows=1001 width=34) (actual time=0.014..0.483 rows=1004 loops=1) -> Function Scan on jsonb_array_elements step (cost=0.01..1.00 rows=100 width=32) (actual time=0.036..0.039 rows=78 loops=1004) Planning time: 0.924 ms Execution time: 998.818 ms (14 rows) How can I get the memory usage information on my CTE query too?
Can PostgreSQL 12 do partition pruning at execution time with subquery returning a list?
I'm trying to take advantages of partitioning in one case: I have table "events" which partitioned by list by field "dt_pk" which is foreign key to table "dates". -- Schema drop schema if exists test cascade; create schema test; -- Tables create table if not exists test.dates ( id bigint primary key, dt date not null ); create sequence test.seq_events_id; create table if not exists test.events ( id bigint not null, dt_pk bigint not null, content_int bigint, foreign key (dt_pk) references test.dates(id) on delete cascade, primary key (dt_pk, id) ) partition by list (dt_pk); -- Partitions create table test.events_1 partition of test.events for values in (1); create table test.events_2 partition of test.events for values in (2); create table test.events_3 partition of test.events for values in (3); -- Fill tables insert into test.dates (id, dt) select id, dt from ( select 1 id, '2020-01-01'::date as dt union all select 2 id, '2020-01-02'::date as dt union all select 3 id, '2020-01-03'::date as dt ) t; do $$ declare dts record; begin for dts in ( select id from test.dates ) loop for k in 1..10000 loop insert into test.events (id, dt_pk, content_int) values (nextval('test.seq_events_id'), dts.id, random_between(1, 1000000)); end loop; commit; end loop; end; $$; vacuum analyze test.dates, test.events; I want to run select like this: select * from test.events e join test.dates d on e.dt_pk = d.id where d.dt between '2020-01-02'::date and '2020-01-03'::date; But in this case partition pruning doesn't work. It's clear, I don't have constant for partition key. But from documentation I know that there is partition pruning at execution time, which works with value obtained from a subquery: Partition pruning can be performed not only during the planning of a given query, but also during its execution. 
This is useful as it can allow more partitions to be pruned when clauses contain expressions whose values are not known at query planning time, for example, parameters defined in a PREPARE statement, using a value obtained from a subquery, or using a parameterized value on the inner side of a nested loop join. So I rewrite my query like this and I expected partitionin pruning: select * from test.events e where e.dt_pk in ( select d.id from test.dates d where d.dt between '2020-01-02'::date and '2020-01-03'::date ); But explain for this select says: Hash Join (cost=1.07..833.07 rows=20000 width=24) (actual time=3.581..15.989 rows=20000 loops=1) Hash Cond: (e.dt_pk = d.id) -> Append (cost=0.00..642.00 rows=30000 width=24) (actual time=0.005..6.361 rows=30000 loops=1) -> Seq Scan on events_1 e (cost=0.00..164.00 rows=10000 width=24) (actual time=0.005..1.104 rows=10000 loops=1) -> Seq Scan on events_2 e_1 (cost=0.00..164.00 rows=10000 width=24) (actual time=0.005..1.127 rows=10000 loops=1) -> Seq Scan on events_3 e_2 (cost=0.00..164.00 rows=10000 width=24) (actual time=0.008..1.097 rows=10000 loops=1) -> Hash (cost=1.04..1.04 rows=2 width=8) (actual time=0.006..0.006 rows=2 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on dates d (cost=0.00..1.04 rows=2 width=8) (actual time=0.004..0.004 rows=2 loops=1) Filter: ((dt >= '2020-01-02'::date) AND (dt <= '2020-01-03'::date)) Rows Removed by Filter: 1 Planning Time: 0.206 ms Execution Time: 17.237 ms So, we read all partitions. 
I even tried to the planner to use nested loop join, because I read in documentation "parameterized value on the inner side of a nested loop join", but it didn't work: set enable_hashjoin to off; set enable_mergejoin to off; And again: Nested Loop (cost=0.00..1443.05 rows=20000 width=24) (actual time=9.160..25.252 rows=20000 loops=1) Join Filter: (e.dt_pk = d.id) Rows Removed by Join Filter: 30000 -> Append (cost=0.00..642.00 rows=30000 width=24) (actual time=0.008..6.280 rows=30000 loops=1) -> Seq Scan on events_1 e (cost=0.00..164.00 rows=10000 width=24) (actual time=0.008..1.105 rows=10000 loops=1) -> Seq Scan on events_2 e_1 (cost=0.00..164.00 rows=10000 width=24) (actual time=0.008..1.047 rows=10000 loops=1) -> Seq Scan on events_3 e_2 (cost=0.00..164.00 rows=10000 width=24) (actual time=0.007..1.082 rows=10000 loops=1) -> Materialize (cost=0.00..1.05 rows=2 width=8) (actual time=0.000..0.000 rows=2 loops=30000) -> Seq Scan on dates d (cost=0.00..1.04 rows=2 width=8) (actual time=0.004..0.004 rows=2 loops=1) Filter: ((dt >= '2020-01-02'::date) AND (dt <= '2020-01-03'::date)) Rows Removed by Filter: 1 Planning Time: 0.202 ms Execution Time: 26.516 ms Then I noticed that in every example of "partition pruning at execution time" I see only = condition, not in. 
And it really works that way: explain (analyze) select * from test.events e where e.dt_pk = (select id from test.dates where id = 2); Append (cost=1.04..718.04 rows=30000 width=24) (actual time=0.014..3.018 rows=10000 loops=1) InitPlan 1 (returns $0) -> Seq Scan on dates (cost=0.00..1.04 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1) Filter: (id = 2) Rows Removed by Filter: 2 -> Seq Scan on events_1 e (cost=0.00..189.00 rows=10000 width=24) (never executed) Filter: (dt_pk = $0) -> Seq Scan on events_2 e_1 (cost=0.00..189.00 rows=10000 width=24) (actual time=0.004..2.009 rows=10000 loops=1) Filter: (dt_pk = $0) -> Seq Scan on events_3 e_2 (cost=0.00..189.00 rows=10000 width=24) (never executed) Filter: (dt_pk = $0) Planning Time: 0.135 ms Execution Time: 3.639 ms And here is my final question: does partition pruning at execution time work only with subquery returning one item, or there is a way to get advantages of partition pruning with subquery returning a list? And why doesn't it work with nested loop join, did I understand something wrong in words: This includes values from subqueries and values from execution-time parameters such as those from parameterized nested loop joins. Or "parameterized nested loop joins" is something different from regular nested loop joins?
There is no partition pruning in your nested loop join because the partitioned table is on the outer side, which is always scanned completely. The inner side is scanned with the join key from the outer side as parameter (hence parameterized scan), so if the partitioned table were on the inner side of the nested loop join, partition pruning could happen.
Partition pruning with IN lists can take place if the list values are known at plan time:
EXPLAIN (COSTS OFF) SELECT * FROM test.events WHERE dt_pk IN (1, 2);
QUERY PLAN
---------------------------------------------------
Append
  -> Seq Scan on events_1
       Filter: (dt_pk = ANY ('{1,2}'::bigint[]))
  -> Seq Scan on events_2
       Filter: (dt_pk = ANY ('{1,2}'::bigint[]))
(5 rows)
But no attempts are made to flatten a subquery, and PostgreSQL doesn't use partition pruning, even if you force the partitioned table to be on the inner side (enable_material = off, enable_hashjoin = off, enable_mergejoin = off):
EXPLAIN (ANALYZE) SELECT * FROM test.events WHERE dt_pk IN (SELECT 1 UNION SELECT 2);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.06..2034.09 rows=20000 width=24) (actual time=0.057..15.523 rows=20000 loops=1)
  Join Filter: (events_1.dt_pk = (1))
  Rows Removed by Join Filter: 40000
  -> Unique (cost=0.06..0.07 rows=2 width=4) (actual time=0.026..0.029 rows=2 loops=1)
       -> Sort (cost=0.06..0.07 rows=2 width=4) (actual time=0.024..0.025 rows=2 loops=1)
            Sort Key: (1)
            Sort Method: quicksort Memory: 25kB
            -> Append (cost=0.00..0.05 rows=2 width=4) (actual time=0.006..0.009 rows=2 loops=1)
                 -> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=1)
                 -> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
  -> Append (cost=0.00..642.00 rows=30000 width=24) (actual time=0.012..4.334 rows=30000 loops=2)
       -> Seq Scan on events_1 (cost=0.00..164.00 rows=10000 width=24) (actual time=0.011..1.057 rows=10000 loops=2)
       -> Seq Scan on events_2 (cost=0.00..164.00 rows=10000 width=24) (actual time=0.004..0.641 rows=10000 loops=2)
       -> Seq Scan on events_3 (cost=0.00..164.00 rows=10000 width=24) (actual time=0.002..0.594 rows=10000 loops=2)
Planning Time: 0.531 ms
Execution Time: 16.567 ms
(16 rows)
I am not certain, but it may be because the tables are so small. You might want to try with bigger tables.
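Since pruning does happen when the list values are known at plan time, one way to get there from a date range is to resolve the keys first and then plan the main query with them embedded as a literal. The following is only a sketch under that assumption, using the test.dates and test.events tables from the question; the variable names ids and line are made up for illustration.

```sql
DO $$
DECLARE
    ids  bigint[];  -- hypothetical variable: partition keys for the date range
    line text;      -- one line of EXPLAIN output
BEGIN
    -- Step 1: resolve the date range to the matching partition keys.
    SELECT array_agg(id) INTO ids
    FROM test.dates
    WHERE dt BETWEEN DATE '2020-01-02' AND DATE '2020-01-03';

    -- Step 2: embed the keys as an array literal, so the planner sees
    -- constants, and print the resulting plan line by line.
    FOR line IN EXECUTE format(
        'EXPLAIN (COSTS OFF) SELECT * FROM test.events WHERE dt_pk = ANY (%L::bigint[])',
        ids)
    LOOP
        RAISE NOTICE '%', line;
    END LOOP;
END
$$;
```

The same two-step idea can be done from application code: run the lookup on test.dates, then send the second query with the ids written into the statement text.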
If you care more about getting it working than about the fine details, and you haven't tried this yet, you can rewrite the query to something like:
explain analyze
select * from test.dates d
join test.events e on e.dt_pk = d.id
where d.dt between '2020-01-02'::date and '2020-01-03'::date
and e.dt_pk in (extract(day from '2020-01-02'::date)::int, extract(day from '2020-01-03'::date)::int);
which will give the expected pruning. Note that this relies on the ids in test.dates matching the day of the month, as they do in the sample data.
Planner not using index order to sort the records using CTE
I am trying to pass some ids into an in-clause on a sorted index with the same order by condition but the query planner is explicitly sorting the data after performing index search. below are my queries. Generate a temporary table. SELECT a.n/20 as n, md5(a.n::TEXT) as b INTO temp_table From generate_series(1, 100000) as a(n); create an index CREATE INDEX idx_temp_table ON temp_table(n ASC, b ASC); In below query, planner uses index ordering and doesn't explicitly sorts the data.(expected) EXPLAIN ANALYSE SELECT * from temp_table WHERE n = 10 ORDER BY n, b limit 5; Query Plan QUERY PLAN Limit (cost=0.42..16.07 rows=5 width=36) (actual time=0.098..0.101 rows=5 loops=1) -> Index Only Scan using idx_temp_table on temp_table (cost=0.42..1565.17 rows=500 width=36) (actual time=0.095..0.098 rows=5 loops=1) Index Cond: (n = 10) Heap Fetches: 5 Planning time: 0.551 ms Execution time: 0.128 ms but when i use one or more ids from a cte and pass them in clause then planner only uses index to fetch the values but explicitly sorts them afterwards (not expected). 
EXPLAIN ANALYSE WITH cte(x) AS (VALUES (10)) SELECT * from temp_table WHERE n IN ( SELECT x from cte) ORDER BY n, b limit 5; then planner uses below query plan QUERY PLAN QUERY PLAN Limit (cost=85.18..85.20 rows=5 width=37) (actual time=0.073..0.075 rows=5 loops=1) CTE cte -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1) -> Sort (cost=85.16..85.26 rows=40 width=37) (actual time=0.072..0.073 rows=5 loops=1) Sort Key: temp_table.n, temp_table.b Sort Method: top-N heapsort Memory: 25kB -> Nested Loop (cost=0.47..84.50 rows=40 width=37) (actual time=0.037..0.056 rows=40 loops=1) -> Unique (cost=0.05..0.06 rows=2 width=4) (actual time=0.009..0.010 rows=2 loops=1) -> Sort (cost=0.05..0.06 rows=2 width=4) (actual time=0.009..0.010 rows=2 loops=1) Sort Key: cte.x Sort Method: quicksort Memory: 25kB -> CTE Scan on cte (cost=0.00..0.04 rows=2 width=4) (actual time=0.004..0.005 rows=2 loops=1) -> Index Only Scan using idx_temp_table on temp_table (cost=0.42..42.02 rows=20 width=37) (actual time=0.012..0.018 rows=20 loops=2) Index Cond: (n = cte.x) Heap Fetches: 40 Planning time: 0.166 ms Execution time: 0.101 ms I tried putting an explicit sorting while passing the ids in where clause so that sorted order in ids is maintained but still planner sorted explicitly EXPLAIN ANALYSE WITH cte(x) AS (VALUES (10)) SELECT * from temp_table WHERE n IN ( SELECT x from cte) ORDER BY n, b limit 5; Query plan QUERY PLAN Limit (cost=42.62..42.63 rows=5 width=37) (actual time=0.042..0.044 rows=5 loops=1) CTE cte -> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1) -> Sort (cost=42.61..42.66 rows=20 width=37) (actual time=0.042..0.042 rows=5 loops=1) Sort Key: temp_table.n, temp_table.b Sort Method: top-N heapsort Memory: 25kB -> Nested Loop (cost=0.46..42.28 rows=20 width=37) (actual time=0.025..0.033 rows=20 loops=1) -> HashAggregate (cost=0.05..0.06 rows=1 width=4) (actual time=0.009..0.009 rows=1 
loops=1) Group Key: cte.x -> Sort (cost=0.03..0.04 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=1) Sort Key: cte.x Sort Method: quicksort Memory: 25kB -> CTE Scan on cte (cost=0.00..0.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1) -> Index Only Scan using idx_temp_table on temp_table (cost=0.42..42.02 rows=20 width=37) (actual time=0.014..0.020 rows=20 loops=1) Index Cond: (n = cte.x) Heap Fetches: 20 Planning time: 0.167 ms Execution time: 0.074 ms Can anyone explain why planner is using an explicit sort on the data? Is there a way to by pass this and make planner use the index sorting order so additional sorting on the records can be saved. In production, we have similar case but size of our selection is too big but only a handful of records needs to fetched with pagination. Thanks in anticipation!
It is actually a decision made by the planner, with a larger set of values(), Postgres will switch to a smarter plan, with the sort done before the merge. select version(); \echo +++++ Original EXPLAIN ANALYSE WITH cte(x) AS (VALUES (10)) SELECT * from temp_table WHERE n IN ( SELECT x from cte) ORDER BY n, b limit 5; \echo +++++ TEN Values EXPLAIN ANALYSE WITH cte(x) AS (VALUES (10),(11),(12),(13),(14),(15),(16),(17),(18),(19) ) SELECT * from temp_table WHERE n IN ( SELECT x from cte) ORDER BY n, b limit 5; \echo ++++++++ one row from table EXPLAIN ANALYSE WITH cte(x) AS (SELECT n FROM temp_table WHERE n = 10) SELECT * from temp_table WHERE n IN ( SELECT x from cte) ORDER BY n, b limit 5; \echo ++++++++ one row from table TWO ctes EXPLAIN ANALYSE WITH val(x) AS (VALUES (10)) , cte(x) AS ( SELECT n FROM temp_table WHERE n IN (select x from val) ) SELECT * from temp_table WHERE n IN ( SELECT x from cte) ORDER BY n, b limit 5; Resulting plans: version ------------------------------------------------------------------------------------------------------- PostgreSQL 11.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit (1 row) +++++ Original QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=13.72..13.73 rows=5 width=37) (actual time=0.197..0.200 rows=5 loops=1) CTE cte -> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1) -> Sort (cost=13.71..13.76 rows=20 width=37) (actual time=0.194..0.194 rows=5 loops=1) Sort Key: temp_table.n, temp_table.b Sort Method: top-N heapsort Memory: 25kB -> Nested Loop (cost=0.44..13.37 rows=20 width=37) (actual time=0.083..0.097 rows=20 loops=1) -> HashAggregate (cost=0.02..0.03 rows=1 width=4) (actual time=0.018..0.018 rows=1 loops=1) Group Key: cte.x -> CTE Scan on cte (cost=0.00..0.02 rows=1 width=4) (actual time=0.007..0.008 rows=1 loops=1) -> 
Index Only Scan using idx_temp_table on temp_table (cost=0.42..13.14 rows=20 width=37) (actual time=0.058..0.068 rows=20 loops=1) Index Cond: (n = cte.x) Heap Fetches: 20 Planning Time: 1.328 ms Execution Time: 0.360 ms (15 rows) +++++ TEN Values QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.91..89.11 rows=5 width=37) (actual time=0.179..0.183 rows=5 loops=1) CTE cte -> Values Scan on "*VALUES*" (cost=0.00..0.12 rows=10 width=4) (actual time=0.001..0.007 rows=10 loops=1) -> Merge Semi Join (cost=0.78..3528.72 rows=200 width=37) (actual time=0.178..0.181 rows=5 loops=1) Merge Cond: (temp_table.n = cte.x) -> Index Only Scan using idx_temp_table on temp_table (cost=0.42..3276.30 rows=100000 width=37) (actual time=0.030..0.123 rows=204 loops=1) Heap Fetches: 204 -> Sort (cost=0.37..0.39 rows=10 width=4) (actual time=0.023..0.023 rows=1 loops=1) Sort Key: cte.x Sort Method: quicksort Memory: 25kB -> CTE Scan on cte (cost=0.00..0.20 rows=10 width=4) (actual time=0.003..0.013 rows=10 loops=1) Planning Time: 0.197 ms Execution Time: 0.226 ms (13 rows) ++++++++ one row from table QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=14.39..58.52 rows=5 width=37) (actual time=0.168..0.173 rows=5 loops=1) CTE cte -> Index Only Scan using idx_temp_table on temp_table temp_table_1 (cost=0.42..13.14 rows=20 width=4) (actual time=0.010..0.020 rows=20 loops=1) Index Cond: (n = 10) Heap Fetches: 20 -> Merge Semi Join (cost=1.25..3531.24 rows=400 width=37) (actual time=0.167..0.170 rows=5 loops=1) Merge Cond: (temp_table.n = cte.x) -> Index Only Scan using idx_temp_table on temp_table (cost=0.42..3276.30 rows=100000 width=37) (actual time=0.025..0.101 rows=204 loops=1) Heap Fetches: 204 -> Sort (cost=0.83..0.88 rows=20 width=4) 
(actual time=0.039..0.039 rows=1 loops=1) Sort Key: cte.x Sort Method: quicksort Memory: 25kB -> CTE Scan on cte (cost=0.00..0.40 rows=20 width=4) (actual time=0.012..0.031 rows=20 loops=1) Planning Time: 0.243 ms Execution Time: 0.211 ms (15 rows) ++++++++ one row from table TWO ctes QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=14.63..58.76 rows=5 width=37) (actual time=0.224..0.229 rows=5 loops=1) CTE val -> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1) CTE cte -> Nested Loop (cost=0.44..13.37 rows=20 width=4) (actual time=0.038..0.052 rows=20 loops=1) -> HashAggregate (cost=0.02..0.03 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=1) Group Key: val.x -> CTE Scan on val (cost=0.00..0.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1) -> Index Only Scan using idx_temp_table on temp_table temp_table_1 (cost=0.42..13.14 rows=20 width=4) (actual time=0.029..0.038 rows=20 loops=1) Index Cond: (n = val.x) Heap Fetches: 20 -> Merge Semi Join (cost=1.25..3531.24 rows=400 width=37) (actual time=0.223..0.226 rows=5 loops=1) Merge Cond: (temp_table.n = cte.x) -> Index Only Scan using idx_temp_table on temp_table (cost=0.42..3276.30 rows=100000 width=37) (actual time=0.038..0.114 rows=204 loops=1) Heap Fetches: 204 -> Sort (cost=0.83..0.88 rows=20 width=4) (actual time=0.082..0.082 rows=1 loops=1) Sort Key: cte.x Sort Method: quicksort Memory: 25kB -> CTE Scan on cte (cost=0.00..0.40 rows=20 width=4) (actual time=0.040..0.062 rows=20 loops=1) Planning Time: 0.362 ms Execution Time: 0.313 ms (21 rows) Beware of CTEs!. For the planner, CTEs are more or less black boxes, and very little is known about expected number of rows, statistics distribution, or ordering inside. 
In cases where CTEs result in a bad plan (the original question is not such a case), a CTE can often be replaced by a (temp) view, which is seen by the planner in its full naked glory.
Update: Starting with version 12, CTEs are handled differently by the planner: if they have no side effects, are not recursive, and are referenced only once, they are candidates for being merged into the main query. (But it is still a good idea to check your query plans.)
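On PostgreSQL 12 or later, the merging behaviour can also be requested or suppressed explicitly with the MATERIALIZED and NOT MATERIALIZED keywords. A minimal sketch using the temp_table and idx_temp_table objects from the question; it is not taken from the original answer, and the resulting plans are not shown because they depend on version and statistics:

```sql
-- Requires PostgreSQL 12+. Ask the planner to fold the CTE into the outer query.
EXPLAIN ANALYZE
WITH cte(x) AS NOT MATERIALIZED (VALUES (10))
SELECT * FROM temp_table
WHERE n IN (SELECT x FROM cte)
ORDER BY n, b
LIMIT 5;

-- For comparison: keep the CTE as a separate, materialized step.
EXPLAIN ANALYZE
WITH cte(x) AS MATERIALIZED (VALUES (10))
SELECT * FROM temp_table
WHERE n IN (SELECT x FROM cte)
ORDER BY n, b
LIMIT 5;
```

Comparing the two plans shows whether the CTE boundary is what forces the extra sort in a given case.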
The optimizer isn't aware that the CTE is sorted. If you scan an index for multiple values and have an ORDER BY, PostgreSQL will always sort. The only thing that comes to my mind is to create a temporary table with the values from the IN list and put an index on that temporary table. Then, when you join with that table, PostgreSQL will be aware of the ordering and might, for example, choose a merge join that can use the indexes. Of course that means a lot of overhead, and it could easily be that the original sort wins out.
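The temporary-table idea could look roughly like this. It is a sketch only: the table name in_list and the sample values are made up for illustration, and whether the planner actually picks a merge join depends on the data and settings.

```sql
-- Hypothetical helper table holding the values that were in the IN list.
-- The primary key provides both the index and the uniqueness that makes
-- the join equivalent to the original IN (...) condition.
CREATE TEMPORARY TABLE in_list (x integer PRIMARY KEY);
INSERT INTO in_list (x) VALUES (10), (11), (12);

-- Give the planner row counts and statistics for the new table.
ANALYZE in_list;

EXPLAIN ANALYZE
SELECT t.*
FROM temp_table AS t
JOIN in_list AS l ON t.n = l.x
ORDER BY t.n, t.b
LIMIT 5;
```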
Postgresql IN operator Performance: List vs Subquery
For a list of ~700 ids, query performance is over 20x slower than passing a subquery that returns those 700 ids. It should be the opposite. For example (the first query takes under 400 ms, the latter 9600 ms):
select date_trunc('month', day) as month, sum(total)
from table_x
where y_id in (select id from table_y where prop = 'xyz')
and day between '2015-11-05' and '2016-11-04'
group by month
is 20x faster on my machine than passing the array directly:
select date_trunc('month', day) as month, sum(total)
from table_x
where y_id in (1625, 1871, ..., 1640, 1643, 13291, 1458, 13304, 1407, 1765)
and day between '2015-11-05' and '2016-11-04'
group by month
Any idea what the problem could be, or how to optimize this to obtain the same performance?
The difference is a simple filter vs a hash join: explain analyze select i from t where i in (500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Seq Scan on t (cost=0.00..140675.00 rows=101 width=4) (actual time=0.648..1074.567 rows=101 loops=1) Filter: (i = ANY ('{500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600}'::integer[])) Rows Removed by Filter: 999899 Planning time: 0.170 ms Execution time: 1074.624 ms explain analyze select i from t where i in (select i from r); QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Hash Semi Join (cost=3.27..17054.40 rows=101 width=4) (actual time=0.382..240.389 rows=101 loops=1) Hash Cond: (t.i = r.i) -> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.030..117.193 rows=1000000 loops=1) 
-> Hash (cost=2.01..2.01 rows=101 width=4) (actual time=0.074..0.074 rows=101 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 12kB -> Seq Scan on r (cost=0.00..2.01 rows=101 width=4) (actual time=0.010..0.035 rows=101 loops=1) Planning time: 0.245 ms Execution time: 240.448 ms To have the same performance join the array: explain analyze select i from t inner join unnest( array[500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600]::int[] ) u (i) using (i) ; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Hash Join (cost=2.25..18178.25 rows=100 width=4) (actual time=0.267..243.768 rows=101 loops=1) Hash Cond: (t.i = u.i) -> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.022..118.709 rows=1000000 loops=1) -> Hash (cost=1.00..1.00 rows=100 width=4) (actual time=0.063..0.063 rows=101 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 12kB -> Function Scan on unnest u (cost=0.00..1.00 rows=100 width=4) (actual time=0.028..0.041 rows=101 loops=1) Planning time: 0.172 ms Execution time: 243.816 ms Or use the values syntax: explain analyze select i from t where i = any (values 
(500),(501),(502),(503),(504),(505),(506),(507),(508),(509),(510),(511),(512),(513),(514),(515),(516),(517),(518),(519),(520),(521),(522),(523),(524),(525),(526),(527),(528),(529),(530),(531),(532),(533),(534),(535),(536),(537),(538),(539),(540),(541),(542),(543),(544),(545),(546),(547),(548),(549),(550),(551),(552),(553),(554),(555),(556),(557),(558),(559),(560),(561),(562),(563),(564),(565),(566),(567),(568),(569),(570),(571),(572),(573),(574),(575),(576),(577),(578),(579),(580),(581),(582),(583),(584),(585),(586),(587),(588),(589),(590),(591),(592),(593),(594),(595),(596),(597),(598),(599),(600)) ; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Hash Semi Join (cost=2.53..17053.65 rows=101 width=4) (actual time=0.279..239.888 rows=101 loops=1) Hash Cond: (t.i = "*VALUES*".column1) -> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.022..117.199 rows=1000000 loops=1) -> Hash (cost=1.26..1.26 rows=101 width=4) (actual time=0.059..0.059 rows=101 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 12kB -> Values Scan on "*VALUES*" (cost=0.00..1.26 rows=101 width=4) (actual time=0.002..0.027 rows=101 loops=1) Planning time: 0.242 ms Execution time: 239.933 ms
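Try changing the critical line to this:
where y_id = any (values (1625), (1871), ..., (1640), (1643), (13291), (1458), (13304), (1407), (1765))
Each id needs its own parentheses so that values produces one row per id rather than a single row with many columns.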
Postgres Slow group by query with max
I am using Postgres 9.1 and I have a table with about 3.5M rows of eventtype (varchar) and eventtime (timestamp), plus some other fields. There are only about 20 different eventtypes, and the event times span about 4 years.
I want to get the last timestamp of each event type. If I run a query like:
select eventtype, max(eventtime)
from allevents
group by eventtype
it takes around 20 seconds. Selecting the distinct eventtypes is equally slow. The query plan shows a full sequential scan of the table, so it is not surprising that it is slow.
Explain analyse for the above query gives:
HashAggregate (cost=84591.47..84591.68 rows=21 width=21) (actual time=20918.131..20918.141 rows=21 loops=1)
-> Seq Scan on allevents (cost=0.00..66117.98 rows=3694698 width=21) (actual time=0.021..4831.793 rows=3694392 loops=1)
Total runtime: 20918.204 ms
If I add a where clause to select a specific eventtype, it takes anywhere from 40ms to 150ms, which is at least decent.
Query plan when selecting a specific eventtype:
GroupAggregate (cost=343.87..24942.71 rows=1 width=21) (actual time=98.397..98.397 rows=1 loops=1)
-> Bitmap Heap Scan on allevents (cost=343.87..24871.07 rows=14325 width=21) (actual time=6.820..89.610 rows=19736 loops=1)
Recheck Cond: ((eventtype)::text = 'TEST_EVENT'::text)
-> Bitmap Index Scan on allevents_idx2 (cost=0.00..340.28 rows=14325 width=0) (actual time=6.121..6.121 rows=19736 loops=1)
Index Cond: ((eventtype)::text = 'TEST_EVENT'::text)
Total runtime: 98.482 ms
Primary key is (eventtype, eventtime). I also have the following indexes:
allevents_idx (eventtime desc, eventtype)
allevents_idx2 (eventtype)
How can I speed up the query?
The query plan for the correlated subquery suggested by @denis below, with 14 manually entered values, is:
Function Scan on unnest val (cost=0.00..185.40 rows=100 width=32) (actual time=0.121..8983.134 rows=14 loops=1)
SubPlan 2
-> Result (cost=1.83..1.84 rows=1 width=0) (actual time=641.644..641.645 rows=1 loops=14)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..1.83 rows=1 width=8) (actual time=641.640..641.641 rows=1 loops=14)
-> Index Scan using allevents_idx on allevents (cost=0.00..322672.36 rows=175938 width=8) (actual time=641.638..641.638 rows=1 loops=14)
Index Cond: ((eventtime IS NOT NULL) AND ((eventtype)::text = val.val))
Total runtime: 8983.203 ms
Using the recursive query suggested by @jjanes, the query runs between 4 and 5 seconds with the following plan:
CTE Scan on t (cost=260.32..448.63 rows=101 width=32) (actual time=0.146..4325.598 rows=22 loops=1)
CTE t
-> Recursive Union (cost=2.52..260.32 rows=101 width=32) (actual time=0.075..1.449 rows=22 loops=1)
-> Result (cost=2.52..2.53 rows=1 width=0) (actual time=0.074..0.074 rows=1 loops=1)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..2.52 rows=1 width=13) (actual time=0.070..0.071 rows=1 loops=1)
-> Index Scan using allevents_idx2 on allevents (cost=0.00..9315751.37 rows=3696851 width=13) (actual time=0.070..0.070 rows=1 loops=1)
Index Cond: ((eventtype)::text IS NOT NULL)
-> WorkTable Scan on t (cost=0.00..25.58 rows=10 width=32) (actual time=0.059..0.060 rows=1 loops=22)
Filter: (eventtype IS NOT NULL)
SubPlan 3
-> Result (cost=2.53..2.54 rows=1 width=0) (actual time=0.059..0.059 rows=1 loops=21)
InitPlan 2 (returns $3)
-> Limit (cost=0.00..2.53 rows=1 width=13) (actual time=0.057..0.057 rows=1 loops=21)
-> Index Scan using allevents_idx2 on allevents (cost=0.00..3114852.66 rows=1232284 width=13) (actual time=0.055..0.055 rows=1 loops=21)
Index Cond: (((eventtype)::text IS NOT NULL) AND ((eventtype)::text > t.eventtype))
SubPlan 6
-> Result (cost=1.83..1.84 rows=1 width=0) (actual time=196.549..196.549 rows=1 loops=22)
InitPlan 5 (returns $6)
-> Limit (cost=0.00..1.83 rows=1 width=8) (actual time=196.546..196.546 rows=1 loops=22)
-> Index Scan using allevents_idx on allevents (cost=0.00..322946.21 rows=176041 width=8) (actual time=196.544..196.544 rows=1 loops=22)
Index Cond: ((eventtime IS NOT NULL) AND ((eventtype)::text = t.eventtype))
Total runtime: 4325.694 ms
What you need is a "skip scan" or "loose index scan". PostgreSQL's planner does not yet implement those automatically, but you can trick it into using one by using a recursive query.
WITH RECURSIVE t AS (
SELECT min(eventtype) AS eventtype FROM allevents
UNION ALL
SELECT (SELECT min(eventtype) as eventtype FROM allevents WHERE eventtype > t.eventtype)
FROM t where t.eventtype is not null
)
select eventtype, (select max(eventtime) from allevents where eventtype=t.eventtype) from t;
There may be a way to collapse the max(eventtime) into the recursive query rather than doing it outside that query, but if so I have not hit upon it.
This needs an index on (eventtype, eventtime) in order to be efficient. You can have it be DESC on the eventtime, but that is not necessary.
This is efficient only if eventtype has just a few distinct values (21 of them, in your case).
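One way the max(eventtime) might be folded into the recursion itself is to walk the (eventtype, eventtime) index from the top downwards, so that every step lands on the last entry of the next lower eventtype. This is an untested sketch, not something from the answer above: it relies on LATERAL, which needs PostgreSQL 9.3 or later (so it will not run on the 9.1 from the question), and it assumes neither column is NULL, which holds here because both are in the primary key.

```sql
-- Sketch only: needs PostgreSQL 9.3+ (LATERAL) and an index on (eventtype, eventtime).
with recursive t as (
    -- Start at the last index entry: the highest eventtype with its latest eventtime.
    (select eventtype, eventtime
     from allevents
     order by eventtype desc, eventtime desc
     limit 1)
    union all
    -- Step to the next lower eventtype; its last index entry carries its max(eventtime).
    select n.eventtype, n.eventtime
    from t
    cross join lateral (
        select a.eventtype, a.eventtime
        from allevents a
        where a.eventtype < t.eventtype
        order by a.eventtype desc, a.eventtime desc
        limit 1
    ) n
)
select eventtype, eventtime from t;
```

The recursion stops on its own once the lateral subquery finds no lower eventtype, so no explicit NULL check is needed in this form.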
Based on the question you already have the relevant index. If upgrading to Postgres 9.3 or an index on (eventtype, eventtime desc) doesn't make a difference, this is a case where rewriting the query so it uses a correlated subquery works very well if you can enumerate all of the event types manually:
select val as eventtype,
(select max(eventtime)
from allevents
where allevents.eventtype = val
) as eventtime
from unnest('{type1,type2,…}'::text[]) as val;
Here are the plans I get when running similar queries:
denis=# select version();
version
-----------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.3.1 on x86_64-apple-darwin11.4.2, compiled by Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn), 64-bit
(1 row)
Test data:
denis=# create table test (evttype int, evttime timestamp, primary key (evttype, evttime));
CREATE TABLE
denis=# insert into test (evttype, evttime) select i, now() + (i % 3) * interval '1 min' - j * interval '1 sec' from generate_series(1,10) i, generate_series(1,10000) j;
INSERT 0 100000
denis=# create index on test (evttime, evttype);
CREATE INDEX
denis=# vacuum analyze test;
VACUUM
First query:
denis=# explain analyze select evttype, max(evttime) from test group by evttype;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=2041.00..2041.10 rows=10 width=12) (actual time=54.983..54.987 rows=10 loops=1)
-> Seq Scan on test (cost=0.00..1541.00 rows=100000 width=12) (actual time=0.009..15.954 rows=100000 loops=1)
Total runtime: 55.045 ms
(3 rows)
Second query:
denis=# explain analyze select val as evttype, (select max(evttime) from test where test.evttype = val) as evttime from unnest('{1,2,3,4,5,6,7,8,9,10}'::int[]) val;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Function Scan on unnest val (cost=0.00..48.39 rows=100 width=4) (actual time=0.086..0.292 rows=10 loops=1)
SubPlan 2
-> Result (cost=0.46..0.47 rows=1 width=0) (actual time=0.024..0.024 rows=1 loops=10)
InitPlan 1 (returns $1)
-> Limit (cost=0.42..0.46 rows=1 width=8) (actual time=0.021..0.021 rows=1 loops=10)
-> Index Only Scan Backward using test_pkey on test (cost=0.42..464.42 rows=10000 width=8) (actual time=0.019..0.019 rows=1 loops=10)
Index Cond: ((evttype = val.val) AND (evttime IS NOT NULL))
Heap Fetches: 0
Total runtime: 0.370 ms
(9 rows)
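If typing the list of types by hand is not attractive, the same correlated subquery could read them from a small lookup table instead. A sketch, assuming a hypothetical table eventtypes that you keep in sync with allevents yourself (from the application, or with a trigger); neither the table nor its name comes from the question.

```sql
-- Hypothetical lookup table with one row per event type.
create table eventtypes (eventtype varchar primary key);

-- One-time fill; note that this still scans allevents once.
insert into eventtypes (eventtype)
select distinct eventtype from allevents;

-- Same correlated subquery as above, driven by the lookup table instead of unnest().
select et.eventtype,
       (select max(a.eventtime)
        from allevents a
        where a.eventtype = et.eventtype) as eventtime
from eventtypes et;
```

With only about 20 rows in the lookup table, the outer scan is negligible and each subquery can be answered from the (eventtype, eventtime) index, as in the second plan above.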
An index on (eventtype, eventtime desc) should help. Alternatively, reindex the primary key index. I would also recommend changing the type of eventtype to an enum (if the number of types is fixed) or to int/smallint. This will reduce the size of the data and the indexes, so queries will run faster.
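Roughly what that could look like, as a sketch: the index name and the eventtype_ids table are invented for illustration, and the step of actually replacing the varchar column in allevents with the smallint id is left out.

```sql
-- Index suggested above; the index name is made up.
create index allevents_type_time_desc_idx
    on allevents (eventtype, eventtime desc);

-- On-disk size of the table including its indexes, for a before/after comparison.
select pg_size_pretty(pg_total_relation_size('allevents'));

-- Hypothetical mapping from each event type to a smallint id.
create table eventtype_ids (
    id        smallint primary key,
    eventtype varchar not null unique
);

-- Number the distinct types in alphabetical order.
insert into eventtype_ids (id, eventtype)
select (row_number() over (order by eventtype))::smallint, eventtype
from (select distinct eventtype from allevents) d;
```

Running the size query again after the column has been migrated would show how much the narrower type actually saves on this data.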