Generic execution plan in a query language (SQL) function - PostgreSQL

I am trying to figure out an issue related to poor performance in a parameterized SQL function. The documentation is relatively clear about execution plans in PL/pgSQL functions: the planner generates a custom plan for each of the first five executions and may then switch to a cached generic plan for the rest of the session.
However, I have been unable to find similar documentation for SQL-language functions, and from some investigation it appears that the planner, surprisingly, always uses a generic plan for them.
Here's a simple example:
-- Postgres 11.1
-- Test table and some data
create table test (a int);
insert into test select 1 from generate_series(1,1000000);
insert into test values (2);
create index on test(a);
analyze test;
-- A SQL Function
create function test_f(val int, cnt out bigint) AS $$
SELECT count(*) FROM test where a = val;
$$ LANGUAGE SQL;
-- A similar PL/PGSQL Function
create function test_f_plpgsql (val int, cnt out bigint) AS $$
BEGIN
SELECT count(*) FROM test where a = val INTO cnt;
END
$$ LANGUAGE PLPGSQL;
-- Show plans
LOAD 'auto_explain';
SET auto_explain.log_analyze = on;
SET client_min_messages = log;
SET auto_explain.log_nested_statements = on;
SET auto_explain.log_min_duration = 0;
The PL/pgSQL function gets a custom (non-generic) plan and correctly chooses an index-only scan:
select * from test_f_plpgsql(2);
LOG: duration: 0.326 ms plan:
Query Text: SELECT count(*) FROM test where a = val
Aggregate (cost=4.45..4.46 rows=1 width=8) (actual time=0.180..0.202 rows=1 loops=1)
-> Index Only Scan using test_a_idx on test (cost=0.42..4.44 rows=1 width=0) (actual time=0.096..0.135 rows=1 loops=1)
Index Cond: (a = 2)
Heap Fetches: 1
LOG: duration: 1.250 ms plan:
Query Text: select * from test_f_plpgsql(2);
Function Scan on test_f_plpgsql (cost=0.25..0.26 rows=1 width=8) (actual time=1.116..1.152 rows=1 loops=1)
cnt
-----
1
(1 row)
The SQL function, on the other hand, gets a generic plan, which makes the poor choice of a full (sequential) table scan:
select * from test_f(2);
LOG: duration: 18.716 ms plan:
Query Text:
SELECT count(*) FROM test where a = val;
Partial Aggregate (cost=10675.01..10675.02 rows=1 width=8) (actual time=18.639..18.665 rows=1 loops=1)
-> Parallel Seq Scan on test (cost=0.00..9633.34 rows=416667 width=0) (actual time=18.621..18.628 rows=0 loops=1)
Filter: (a = $1)
Rows Removed by Filter: 273008
LOG: duration: 28.304 ms plan:
Query Text:
SELECT count(*) FROM test where a = val;
Partial Aggregate (cost=10675.01..10675.02 rows=1 width=8) (actual time=28.234..28.248 rows=1 loops=1)
-> Parallel Seq Scan on test (cost=0.00..9633.34 rows=416667 width=0) (actual time=28.199..28.208 rows=1 loops=1)
Filter: (a = $1)
Rows Removed by Filter: 129222
LOG: duration: 45.913 ms plan:
Query Text:
SELECT count(*) FROM test where a = val;
Finalize Aggregate (cost=11675.22..11675.23 rows=1 width=8) (actual time=42.370..42.377 rows=1 loops=1)
-> Gather (cost=11675.01..11675.22 rows=2 width=8) (actual time=42.288..45.787 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=10675.01..10675.02 rows=1 width=8) (actual time=29.579..29.597 rows=1 loops=3)
-> Parallel Seq Scan on test (cost=0.00..9633.34 rows=416667 width=0) (actual time=29.553..29.560 rows=0 loops=3)
Filter: (a = $1)
Rows Removed by Filter: 333333
LOG: duration: 47.128 ms plan:
Query Text: select * from test_f(2);
Function Scan on test_f (cost=0.25..0.26 rows=1 width=8) (actual time=47.058..47.073 rows=1 loops=1)
cnt
-----
1
(1 row)
Is there a way to force the SQL function to use a non-generic plan?
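For comparison, the custom-versus-generic switch that the documentation describes for prepared statements can be reproduced directly against the same test table. This is just a minimal sketch (the statement name count_a is arbitrary); the filter may show a = $1 instead of a = 2 once a generic plan is cached:
PREPARE count_a (int) AS SELECT count(*) FROM test WHERE a = $1;
-- the first few executions are planned with the actual parameter value
EXECUTE count_a(2);
EXECUTE count_a(2);
EXECUTE count_a(2);
EXECUTE count_a(2);
EXECUTE count_a(2);
-- after about five executions the planner may switch to a cached generic plan
EXPLAIN EXECUTE count_a(2);
DEALLOCATE count_a;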

Related

MADLib + PostgreSQL Query Plan

When running EXPLAIN (VERBOSE, ANALYZE) on an ML query that uses MADlib in PostgreSQL, I get output like the one below.
query & query plan
[Query]
EXPLAIN (VERBOSE, ANALYZE)
SELECT COUNT(linregr.linregr_predict) FROM(
SELECT madlib.linregr_predict(ARRAY[c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13,c14,c15,c16,c17,c18,c19,c20,c21,c22,c23,c24,c25,c26,c27,c28], ARRAY[f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,f17,f18,f19,f20,f21,f22,f23,f24,f25,f26,f27,f28])
FROM higgs_50_linregr_model_coef, higgs_1k_test
WHERE higgs_1k_test.f1 >0.7) AS linregr;
[Query Plan]
Aggregate (cost=19158.81..19158.82 rows=1 width=8) (actual time=4.607..4.610 rows=1 loops=1)
-> Nested Loop (cost=0.00..1497.81 rows=117740 width=224) (actual time=0.056..1.827 rows=204 loops=1)
-> Seq Scan on higgs_50_linregr_model_coef (cost=0.00..15.80 rows=580 width=112) (actual time=0.017..0.019 rows=1 loops=1)
-> Materialize (cost=0.00..10.77 rows=203 width=112) (actual time=0.028..1.388 rows=204 loops=1)
-> Seq Scan on higgs_1k_test (cost=0.00..9.75 rows=203 width=112) (actual time=0.018..0.531 rows=204 loops=1)
Filter: (f1 > '0.7'::double precision)
Rows Removed by Filter: 96
Planning Time: 0.624 ms
Execution Time: 4.826 ms
It seems that the ML operation of the query is not included in the overall query plan.
I wonder whether that is correct and, if so, how PostgreSQL executes the MADlib function while running the query.
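My guess is that the MADlib call is evaluated as an ordinary expression in the target list rather than as a plan node of its own, so with VERBOSE it would only show up in the Output: lines of the node that computes it. A small sketch of what I mean, using a throw-away table and a built-in function instead of MADlib:
-- throw-away table, only to illustrate EXPLAIN VERBOSE output
CREATE TABLE demo (name text);
EXPLAIN (VERBOSE) SELECT upper(name) FROM demo;
-- the upper() call appears in the scan node's "Output:" line,
-- not as a separate plan node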

Postgres optimization failing to filter window function partitions early

In some cases, PostgreSQL does not filter out window function partitions until after they are calculated, while in a very similar scenario it filters the rows before performing the window function calculation.
Tables used for the minimal reproduction: log is the main data table; each row contains either an incremental or an absolute value. An absolute value resets the running counter to a new base value. The window function needs to process all log rows for a given account_id to calculate the correct running total. The view uses a subquery to ensure that the underlying log rows are not filtered by ts, since that would break the window function.
CREATE TABLE account(
id serial,
name VARCHAR(100),
PRIMARY KEY(id)
);
CREATE TABLE log(
id serial,
absolute int,
incremental int,
account_id int,
ts timestamp,
PRIMARY KEY(id),
CONSTRAINT fk_account
FOREIGN KEY(account_id)
REFERENCES account(id)
);
CREATE FUNCTION get_running_total_func(
aggregated_total int,
absolute int,
incremental int
) RETURNS int
LANGUAGE sql IMMUTABLE CALLED ON NULL INPUT AS
$$
SELECT
CASE
WHEN absolute IS NOT NULL THEN absolute
ELSE COALESCE(aggregated_total, 0) + incremental
END
$$;
CREATE AGGREGATE get_running_total(integer, integer) (
sfunc = get_running_total_func,
stype = integer
);
Slow view:
CREATE VIEW test_view
(
log_id,
running_value,
account_id,
ts
)
AS
SELECT log_running.* FROM
(SELECT
log.id,
get_running_total(
log.absolute,
log.incremental
)
OVER(
PARTITION BY log.account_id
ORDER BY log.ts RANGE UNBOUNDED PRECEDING
),
account.id,
ts
FROM log log JOIN account account ON log.account_id=account.id
) AS log_running;
CREATE VIEW
postgres=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE account_id=1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on log_running (cost=12734.02..15981.48 rows=1 width=20) (actual time=7510.851..16122.404 rows=20 loops=1)
Filter: (log_running.id_1 = 1)
Rows Removed by Filter: 99902
-> WindowAgg (cost=12734.02..14732.46 rows=99922 width=32) (actual time=7510.830..14438.783 rows=99922 loops=1)
-> Sort (cost=12734.02..12983.82 rows=99922 width=28) (actual time=7510.628..9312.399 rows=99922 loops=1)
Sort Key: log.account_id, log.ts
Sort Method: external merge Disk: 3328kB
-> Hash Join (cost=143.50..2042.24 rows=99922 width=28) (actual time=169.941..5431.650 rows=99922 loops=1)
Hash Cond: (log.account_id = account.id)
-> Seq Scan on log (cost=0.00..1636.22 rows=99922 width=24) (actual time=0.063..1697.802 rows=99922 loops=1)
-> Hash (cost=81.00..81.00 rows=5000 width=4) (actual time=169.837..169.865 rows=5000 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 240kB
-> Seq Scan on account (cost=0.00..81.00 rows=5000 width=4) (actual time=0.017..84.639 rows=5000 loops=1)
Planning Time: 0.199 ms
Execution Time: 16127.275 ms
(15 rows)
Fast view - only change is account.id -> log.account_id (!):
CREATE OR REPLACE VIEW test_view
(
log_id,
running_value,
account_id,
ts
)
AS
SELECT log_running.* FROM
(SELECT
log.id,
get_running_total(
log.absolute,
log.incremental
)
OVER(
PARTITION BY log.account_id
ORDER BY log.ts RANGE UNBOUNDED PRECEDING
),
log.account_id,
ts
FROM log log JOIN account account ON log.account_id=account.id
) AS log_running;
CREATE VIEW
postgres=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE account_id=1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on log_running (cost=1894.96..1895.56 rows=20 width=20) (actual time=34.718..45.958 rows=20 loops=1)
-> WindowAgg (cost=1894.96..1895.36 rows=20 width=28) (actual time=34.691..45.307 rows=20 loops=1)
-> Sort (cost=1894.96..1895.01 rows=20 width=24) (actual time=34.367..35.925 rows=20 loops=1)
Sort Key: log.ts
Sort Method: quicksort Memory: 26kB
-> Nested Loop (cost=0.28..1894.53 rows=20 width=24) (actual time=0.542..34.066 rows=20 loops=1)
-> Index Only Scan using account_pkey on account (cost=0.28..8.30 rows=1 width=4) (actual time=0.025..0.054 rows=1 loops=1)
Index Cond: (id = 1)
Heap Fetches: 1
-> Seq Scan on log (cost=0.00..1886.03 rows=20 width=24) (actual time=0.195..32.937 rows=20 loops=1)
Filter: (account_id = 1)
Rows Removed by Filter: 99902
Planning Time: 0.297 ms
Execution Time: 47.300 ms
(14 rows)
Is this a bug in the PostgreSQL implementation? It seems that this change in the view definition shouldn't affect performance at all; PostgreSQL should be able to filter the data before applying the window function to the whole data set.
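My current understanding (which may be wrong) is that the planner only pushes an outer WHERE clause down into a subquery containing window functions when the filter is on a column that appears in every PARTITION BY, and in the slow view the exposed account_id comes from account.id rather than from the partitioning column log.account_id. A minimal sketch of the same pattern on a throw-away table t(grp, val), just to illustrate the rule:
CREATE TABLE t (grp int, val int);
-- filter on the PARTITION BY column: the qual can be pushed below the WindowAgg
EXPLAIN SELECT * FROM (
    SELECT grp, sum(val) OVER (PARTITION BY grp) AS s FROM t
) sub WHERE grp = 1;
-- filter on a non-partitioning column: the qual has to stay above the WindowAgg
EXPLAIN SELECT * FROM (
    SELECT grp, val, sum(val) OVER (PARTITION BY grp) AS s FROM t
) sub WHERE val = 1;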

PostgreSQL stable functions in query

I have a data model which consists of a couple of tables, and I need to filter them.
There are two functions, funcFast and funcList. funcFast quickly tells whether the table needs to be filtered by funcList at all; funcList returns the list of allowed ids. I marked the functions as STABLE, but they don't run as fast as I expect :)
I created a couple of example functions:
CREATE OR REPLACE FUNCTION funcFastPlPgSql(res boolean)
returns boolean as $$
begin return res; end
$$ language plpgsql stable;
CREATE OR REPLACE FUNCTION funcList(cnt int)
returns setof integer as $$
select generate_series(1, cnt)
$$ language sql stable;
And tests.
Case 1. Filtering only by the fast function works OK:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true)
Query plan is:
Aggregate (cost=27.76..27.77 rows=1 width=8) (actual time=573.258..573.259 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..114.327 rows=1000000 loops=1)
-> Result (cost=0.25..20.25 rows=1000 width=0) (actual time=0.038..489.942 rows=1000000 loops=1)
One-Time Filter: funcfastplpgsql(true)
-> CTE Scan on obs (cost=0.25..20.25 rows=1000 width=0) (actual time=0.012..392.504 rows=1000000 loops=1)
Planning time: 0.184 ms
Execution time: 576.177 ms
Case 2. Filtering only by the slow function works OK too:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where id in (select funcList(1000))
Query plan is:
Aggregate (cost=62.26..62.27 rows=1 width=8) (actual time=469.344..469.344 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..106.144 rows=1000000 loops=1)
-> Hash Join (cost=22.25..56.00 rows=500 width=0) (actual time=1.566..469.202 rows=1000 loops=1)
Hash Cond: (obs.id = (funclist(1000)))
-> CTE Scan on obs (cost=0.00..20.00 rows=1000 width=4) (actual time=0.009..359.580 rows=1000000 loops=1)
-> Hash (cost=19.75..19.75 rows=200 width=4) (actual time=1.548..1.548 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 44kB
-> HashAggregate (cost=17.75..19.75 rows=200 width=4) (actual time=1.101..1.312 rows=1000 loops=1)
Group Key: funclist(1000)
-> Result (cost=0.00..5.25 rows=1000 width=4) (actual time=0.058..0.706 rows=1000 loops=1)
Planning time: 0.141 ms
Execution time: 472.183 ms
Case 3. With the two functions combined, I expect the best case to be close to [case 1] and the worst case to be close to [case 2], but:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true) or id in (select funcList(1000))
Query plan is:
Aggregate (cost=286.93..286.94 rows=1 width=8) (actual time=1575.775..1575.775 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.008..131.372 rows=1000000 loops=1)
-> CTE Scan on obs (cost=7.75..280.25 rows=667 width=0) (actual time=0.035..1468.007 rows=1000000 loops=1)
Filter: (funcfastplpgsql(true) OR (hashed SubPlan 2))
SubPlan 2
-> Result (cost=0.00..5.25 rows=1000 width=4) (never executed)
Planning time: 0.100 ms
Execution time: 1578.624 ms
What am I missing here? Why does the query with the two functions combined run so much longer, and how can I fix it?
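One workaround I am experimenting with (untested, and it may or may not help) is to wrap the cheap call in an uncorrelated scalar subquery, so it is evaluated once as an InitPlan instead of once per row:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where (select funcFastPlPgSql(true)) or id in (select funcList(1000));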

Postgresql very slow query on indexed columns

I have a very simple query which uses JSON data for joining on the primary table:
WITH
timecode_range AS
(
SELECT
(t->>'table_id')::integer AS table_id,
(t->>'timecode_from')::bigint AS timecode_from,
(t->>'timecode_to')::bigint AS timecode_to
FROM (SELECT '{"table_id":1,"timecode_from":19890328,"timecode_to":119899328}'::jsonb t) rowset
)
SELECT n.*
FROM partition.json_notification n
INNER JOIN timecode_range r ON n.table_id = r.table_id AND n.timecode > r.timecode_from AND n.timecode <= r.timecode_to
It works perfectly when "timecode_range" returns only 1 record:
Nested Loop (cost=0.43..4668.80 rows=1416 width=97) (actual time=0.352..0.352 rows=0 loops=1)
CTE timecode_range
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
-> CTE Scan on timecode_range r (cost=0.00..0.02 rows=1 width=20) (actual time=0.007..0.007 rows=1 loops=1)
-> Index Scan using json_notification_pkey on json_notification n (cost=0.42..4654.61 rows=1416 width=97) (actual time=0.322..0.322 rows=0 loops=1)
Index Cond: ((timecode > r.timecode_from) AND (timecode <= r.timecode_to))
Filter: (r.table_id = table_id)
Planning time: 2.292 ms
Execution time: 0.665 ms
But when I need to return several records:
WITH
timecode_range AS
(
SELECT
(t->>'table_id')::integer AS table_id,
(t->>'timecode_from')::bigint AS timecode_from,
(t->>'timecode_to')::bigint AS timecode_to
FROM (SELECT json_array_elements('[{"table_id":1,"timecode_from":19890328,"timecode_to":119899328}]') t) rowset
)
SELECT n.*
FROM partition.json_notification n
INNER JOIN timecode_range r ON n.table_id = r.table_id AND n.timecode > r.timecode_from AND n.timecode <= r.timecode_to
It starts using a sequential scan and the execution time grows dramatically :(
Hash Join (cost=7.01..37289.68 rows=92068 width=97) (actual time=418.563..418.563 rows=0 loops=1)
Hash Cond: (n.table_id = r.table_id)
Join Filter: ((n.timecode > r.timecode_from) AND (n.timecode <= r.timecode_to))
Rows Removed by Join Filter: 14444
CTE timecode_range
-> Subquery Scan on rowset (cost=0.00..3.76 rows=100 width=32) (actual time=0.233..0.234 rows=1 loops=1)
-> Result (cost=0.00..0.51 rows=100 width=0) (actual time=0.218..0.218 rows=1 loops=1)
-> Seq Scan on json_notification n (cost=0.00..21703.36 rows=840036 width=97) (actual time=0.205..312.991 rows=840036 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=20) (actual time=0.239..0.239 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> CTE Scan on timecode_range r (cost=0.00..2.00 rows=100 width=20) (actual time=0.235..0.236 rows=1 loops=1)
Planning time: 4.729 ms
Execution time: 418.937 ms
What am I doing wrong?
PostgreSQL has no way to estimate the number of rows returned from a table function, so it uses the ROWS value specified in CREATE FUNCTION (default 1000).
For json_array_elements this value is set to 100:
SELECT prorows FROM pg_proc WHERE proname = 'json_array_elements';
┌─────────┐
│ prorows │
├─────────┤
│ 100 │
└─────────┘
(1 row)
But in your case the function returns only 1 row.
This misestimate makes PostgreSQL choose another join strategy (hash join instead of nested loop), which causes the longer execution time.
If you can choose some other construct than such a table function (e.g. a VALUES statement) that PostgreSQL can estimate, you'll get a better plan.
An alternative is to use a LIMIT clause on the CTE definition if you can safely specify an upper limit.
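For example, a VALUES-based rewrite of the CTE (sketched here with the literal values from your first query; untested) gives the planner an exact row count to work with:
WITH timecode_range (table_id, timecode_from, timecode_to) AS (
    VALUES (1, 19890328::bigint, 119899328::bigint)
)
SELECT n.*
FROM partition.json_notification n
JOIN timecode_range r
  ON n.table_id = r.table_id
 AND n.timecode > r.timecode_from
 AND n.timecode <= r.timecode_to;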
If you think that PostgreSQL is wrong when it switches to a hash join beyond a certain row count, you can test as follows:
Run the query (using a sequential scan and a hash join) and measure the duration (psql's \timing command will help).
Force a nested loop join:
SET enable_hashjoin=off;
SET enable_mergejoin=off;
Run the query again (with a nested loop join) and measure the duration.
If PostgreSQL is indeed wrong, you could adjust the optimizer parameters by lowering random_page_cost to a value closer to seq_page_cost.
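For example (the appropriate value depends on your storage; values close to seq_page_cost are often used for SSD-backed systems):
SET random_page_cost = 1.1;  -- session only; set it in postgresql.conf to make it permanent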

Postgres Slow group by query with max

I am using Postgres 9.1 and I have a table with about 3.5M rows of eventtype (varchar) and eventtime (timestamp), plus some other fields. There are only about 20 different eventtypes, and the event times span about 4 years.
I want to get the last timestamp of each event type. If I run a query like:
select eventtype, max(eventtime)
from allevents
group by eventtype
it takes around 20 seconds. Selecting distinct eventtypes is equally slow. The query plan shows a full sequential scan of the table, so it is not surprising that it is slow.
Explain analyse for the above query gives:
HashAggregate (cost=84591.47..84591.68 rows=21 width=21) (actual time=20918.131..20918.141 rows=21 loops=1)
-> Seq Scan on allevents (cost=0.00..66117.98 rows=3694698 width=21) (actual time=0.021..4831.793 rows=3694392 loops=1)
Total runtime: 20918.204 ms
If I add a where clause to select a specific eventtype, it takes anywhere from 40ms to 150ms which is at least decent.
Query plan when selecting specific eventtype:
GroupAggregate (cost=343.87..24942.71 rows=1 width=21) (actual time=98.397..98.397 rows=1 loops=1)
-> Bitmap Heap Scan on allevents (cost=343.87..24871.07 rows=14325 width=21) (actual time=6.820..89.610 rows=19736 loops=1)
Recheck Cond: ((eventtype)::text = 'TEST_EVENT'::text)
-> Bitmap Index Scan on allevents_idx2 (cost=0.00..340.28 rows=14325 width=0) (actual time=6.121..6.121 rows=19736 loops=1)
Index Cond: ((eventtype)::text = 'TEST_EVENT'::text)
Total runtime: 98.482 ms
Primary key is (eventtype, eventtime). I also have the following indexes:
allevents_idx (eventtime desc, eventtype)
allevents_idx2 (eventtype).
How can I speed up the query?
Results of the query plan for the correlated subquery suggested by @denis below, with 14 manually entered values:
Function Scan on unnest val (cost=0.00..185.40 rows=100 width=32) (actual time=0.121..8983.134 rows=14 loops=1)
SubPlan 2
-> Result (cost=1.83..1.84 rows=1 width=0) (actual time=641.644..641.645 rows=1 loops=14)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..1.83 rows=1 width=8) (actual time=641.640..641.641 rows=1 loops=14)
-> Index Scan using allevents_idx on allevents (cost=0.00..322672.36 rows=175938 width=8) (actual time=641.638..641.638 rows=1 loops=14)
Index Cond: ((eventtime IS NOT NULL) AND ((eventtype)::text = val.val))
Total runtime: 8983.203 ms
Using the recursive query suggested by @jjanes, the query runs between 4 and 5 seconds with the following plan:
CTE Scan on t (cost=260.32..448.63 rows=101 width=32) (actual time=0.146..4325.598 rows=22 loops=1)
CTE t
-> Recursive Union (cost=2.52..260.32 rows=101 width=32) (actual time=0.075..1.449 rows=22 loops=1)
-> Result (cost=2.52..2.53 rows=1 width=0) (actual time=0.074..0.074 rows=1 loops=1)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..2.52 rows=1 width=13) (actual time=0.070..0.071 rows=1 loops=1)
-> Index Scan using allevents_idx2 on allevents (cost=0.00..9315751.37 rows=3696851 width=13) (actual time=0.070..0.070 rows=1 loops=1)
Index Cond: ((eventtype)::text IS NOT NULL)
-> WorkTable Scan on t (cost=0.00..25.58 rows=10 width=32) (actual time=0.059..0.060 rows=1 loops=22)
Filter: (eventtype IS NOT NULL)
SubPlan 3
-> Result (cost=2.53..2.54 rows=1 width=0) (actual time=0.059..0.059 rows=1 loops=21)
InitPlan 2 (returns $3)
-> Limit (cost=0.00..2.53 rows=1 width=13) (actual time=0.057..0.057 rows=1 loops=21)
-> Index Scan using allevents_idx2 on allevents (cost=0.00..3114852.66 rows=1232284 width=13) (actual time=0.055..0.055 rows=1 loops=21)
Index Cond: (((eventtype)::text IS NOT NULL) AND ((eventtype)::text > t.eventtype))
SubPlan 6
-> Result (cost=1.83..1.84 rows=1 width=0) (actual time=196.549..196.549 rows=1 loops=22)
InitPlan 5 (returns $6)
-> Limit (cost=0.00..1.83 rows=1 width=8) (actual time=196.546..196.546 rows=1 loops=22)
-> Index Scan using allevents_idx on allevents (cost=0.00..322946.21 rows=176041 width=8) (actual time=196.544..196.544 rows=1 loops=22)
Index Cond: ((eventtime IS NOT NULL) AND ((eventtype)::text = t.eventtype))
Total runtime: 4325.694 ms
What you need is a "skip scan" or "loose index scan". PostgreSQL's planner does not yet implement those automatically, but you can trick it into using one by using a recursive query.
WITH RECURSIVE t AS (
SELECT min(eventtype) AS eventtype FROM allevents
UNION ALL
SELECT (SELECT min(eventtype) as eventtype FROM allevents WHERE eventtype > t.eventtype)
FROM t where t.eventtype is not null
)
select eventtype, (select max(eventtime) from allevents where eventtype=t.eventtype) from t;
There may be a way to collapse the max(eventtime) into the recursive query rather than doing it outside that query, but if so I have not hit upon it.
This needs an index on (eventtype, eventtime) in order to be efficient. You can have it be DESC on the eventtime, but that is not necessary. This is efficient only if eventtype has only a few distinct values (21 of them, in your case).
Based on the question you already have the relevant index.
If upgrading to Postgres 9.3 or adding an index on (eventtype, eventtime desc) doesn't make a difference, then this is a case where rewriting the query to use a correlated subquery works very well, provided you can enumerate all of the event types manually:
select val as eventtype,
(select max(eventtime)
from allevents
where allevents.eventtype = val
) as eventtime
from unnest('{type1,type2,…}'::text[]) as val;
Here's the plans I get when running similar queries:
denis=# select version();
version
-----------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.3.1 on x86_64-apple-darwin11.4.2, compiled by Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn), 64-bit
(1 row)
Test data:
denis=# create table test (evttype int, evttime timestamp, primary key (evttype, evttime));
CREATE TABLE
denis=# insert into test (evttype, evttime) select i, now() + (i % 3) * interval '1 min' - j * interval '1 sec' from generate_series(1,10) i, generate_series(1,10000) j;
INSERT 0 100000
denis=# create index on test (evttime, evttype);
CREATE INDEX
denis=# vacuum analyze test;
VACUUM
First query:
denis=# explain analyze select evttype, max(evttime) from test group by evttype; QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=2041.00..2041.10 rows=10 width=12) (actual time=54.983..54.987 rows=10 loops=1)
-> Seq Scan on test (cost=0.00..1541.00 rows=100000 width=12) (actual time=0.009..15.954 rows=100000 loops=1)
Total runtime: 55.045 ms
(3 rows)
Second query:
denis=# explain analyze select val as evttype, (select max(evttime) from test where test.evttype = val) as evttime from unnest('{1,2,3,4,5,6,7,8,9,10}'::int[]) val;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Function Scan on unnest val (cost=0.00..48.39 rows=100 width=4) (actual time=0.086..0.292 rows=10 loops=1)
SubPlan 2
-> Result (cost=0.46..0.47 rows=1 width=0) (actual time=0.024..0.024 rows=1 loops=10)
InitPlan 1 (returns $1)
-> Limit (cost=0.42..0.46 rows=1 width=8) (actual time=0.021..0.021 rows=1 loops=10)
-> Index Only Scan Backward using test_pkey on test (cost=0.42..464.42 rows=10000 width=8) (actual time=0.019..0.019 rows=1 loops=10)
Index Cond: ((evttype = val.val) AND (evttime IS NOT NULL))
Heap Fetches: 0
Total runtime: 0.370 ms
(9 rows)
An index on (eventtype, eventtime desc) should help, or a reindex of the primary key index. I would also recommend changing the type of eventtype to an enum (if the number of types is fixed) or to int/smallint. This will decrease the size of the data and the indexes, so queries will run faster.
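A rough sketch of those suggestions (the index and type names here are made up; adjust them to your schema and test the enum conversion on a copy first):
CREATE INDEX allevents_type_time_idx ON allevents (eventtype, eventtime DESC);
REINDEX TABLE allevents;  -- rebuilds all indexes on the table, including the primary key index
-- if the set of event types is fixed, a narrower type shrinks the table and its indexes
CREATE TYPE eventtype_enum AS ENUM ('TEST_EVENT');  -- list all real event type values here
ALTER TABLE allevents
    ALTER COLUMN eventtype TYPE eventtype_enum
    USING eventtype::text::eventtype_enum;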