How to get "memory usage" on postgres CTE query plan? - postgresql

I'm comparing different queries options in order to improve perfomance and I cannot understand why my CTE query plan does not show the "memory usage" option when I run it with explain analyze.
Here's my query:
EXPLAIN ANALYZE WITH
CTE AS (
SELECT
id,
jsonb_agg(CASE WHEN each_result->>'another_key' = 'A'
THEN jsonb_set(each_result, '{another_key}', '"B"', false)
ELSE each_result
END
) AS result_updated
FROM my_table, jsonb_array_elements(column->'key') as result
GROUP BY id
)
UPDATE my_table
SET column = jsonb_set(column, '{each_result}', (SELECT result_updated FROM CTE WHERE CTE.id = my_table.id), false)
;
And here's the query plan result:
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Update on custom_sequence (cost=3305.32..25976.06 rows=1003 width=83) (actual time=4071.870..4071.870 rows=0 loops=1)
CTE cte
-> HashAggregate (cost=3292.78..3305.32 rows=1003 width=48) (actual time=536.656..618.699 rows=1003 loops=1)
Group Key: my_table_1.id
-> Nested Loop (cost=0.01..2039.04 rows=100300 width=48) (actual time=0.323..59.200 rows=78234 loops=1)
-> Seq Scan on my_table my_table_1 (cost=0.00..33.03 rows=1003 width=34) (actual time=0.023..0.289 rows=1003 loops=1)
-> Function Scan on jsonb_array_elements step (cost=0.01..1.00 rows=100 width=32) (actual time=0.044..0.049 rows=78 loops=1003)
-> Seq Scan on my_table (cost=0.00..22670.74 rows=1003 width=83) (actual time=629.743..3701.285 rows=1003 loops=1)
SubPlan 2
-> CTE Scan on cte (cost=0.00..22.57 rows=5 width=32) (actual time=1.992..3.482 rows=1 loops=1003)
Filter: (id = my_table.id)
Rows Removed by Filter: 1002
Planning time: 0.458 ms
Execution time: 4081.778 ms
(14 rows)
My another option is an UPDATE FROM:
EXPLAIN ANALYZE UPDATE my_table
SET column = jsonb_set(column, '{each_result}', q2.result_updated, false)
FROM (
SELECT
id,
jsonb_agg(CASE WHEN each_result->>'anoter_key' = 'A'
THEN jsonb_set(each_result, '{another_key}', '"B"', false)
ELSE each_result
END
) AS result_updated
FROM my_table, jsonb_array_elements(column->'key') AS result
GROUP BY id
) q2
WHERE q2.id = my_table.id
;
When I run query plan for this query, I got the "memory usage" info :
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Update on my_table (cost=3748.30..4223.44 rows=1001 width=153) (actual time=992.003..992.003 rows=0 loops=1)
-> Hash Join (cost=3748.30..4223.44 rows=1001 width=153) (actual time=633.204..758.817 rows=1004 loops=1)
Hash Cond: (my_table.id = q2.id)
-> Seq Scan on my_table (cost=0.00..460.01 rows=1001 width=67) (actual time=0.031..0.550 rows=1004 loops=1)
-> Hash (cost=3735.79..3735.79 rows=1001 width=120) (actual time=632.951..632.951 rows=1004 loops=1)
Buckets: 1024 (originally 1024) Batches: 2 (originally 1) Memory Usage: 32791kB
-> Subquery Scan on q2 (cost=3713.26..3735.79 rows=1001 width=120) (actual time=537.310..605.349 rows=1004 loops=1)
-> HashAggregate (cost=3713.26..3725.78 rows=1001 width=48) (actual time=537.306..604.269 rows=1004 loops=1)
Group Key: my_table_1.id
-> Nested Loop (cost=0.01..2462.01 rows=100100 width=48) (actual time=0.373..47.725 rows=78312 loops=1)
-> Seq Scan on my_table my_table_1 (cost=0.00..460.01 rows=1001 width=34) (actual time=0.014..0.483 rows=1004 loops=1)
-> Function Scan on jsonb_array_elements step (cost=0.01..1.00 rows=100 width=32) (actual time=0.036..0.039 rows=78 loops=1004)
Planning time: 0.924 ms
Execution time: 998.818 ms
(14 rows)
How can I get the memory usage information on my CTE query too?

Related

How does a string operation on a column in a filter condition of a Postgresql query have on the plan it chooses

I was working on optimising a query, with dumb luck I tried something and it improved the query but I am unable to explain why.
Below is the query with poor performance
with ctedata1 as(
select
sum(total_visit_count) as total_visit_count,
sum(sh_visit_count) as sh_visit_count,
sum(ec_visit_count) as ec_visit_count,
sum(total_like_count) as total_like_count,
sum(sh_like_count) as sh_like_count,
sum(ec_like_count) as ec_like_count,
sum(total_order_count) as total_order_count,
sum(sh_order_count) as sh_order_count,
sum(ec_order_count) as ec_order_count,
sum(total_sales_amount) as total_sales_amount,
sum(sh_sales_amount) as sh_sales_amount,
sum(ec_sales_amount) as ec_sales_amount,
sum(ec_order_online_count) as ec_order_online_count,
sum(ec_sales_online_amount) as ec_sales_online_amount,
sum(ec_order_in_store_count) as ec_order_in_store_count,
sum(ec_sales_in_store_amount) as ec_sales_in_store_amount,
table2.im_name,
table2.brand as kpibrand,
table2.id_region as kpiregion
from
table2
where
deleted_at is null
and id_region = any('{1}')
group by
im_name,
kpiregion,
kpibrand ),
ctedata2 as (
select
ctedata1.*,
rank() over (partition by (kpiregion,
kpibrand)
order by
coalesce(ctedata1.total_sales_amount, 0) desc) rank,
count(*) over (partition by (kpiregion,
kpibrand)) as total_count
from
ctedata1 )
select
table1.id_pf_item,
table1.product_id,
table1.color_code,
table1.l1_code,
table1.local_title as product_name,
table1.id_region,
table1.gender,
case
when table1.created_at is null then '1970/01/01 00:00:00'
else table1.created_at
end as created_at,
(
select
count(distinct id_outfit)
from
table3
left join table4 on
table3.id_item = table4.id_item
and table4.deleted_at is null
where
table3.deleted_at is null
and table3.id_pf_item = table1.id_pf_item) as outfit_count,
count(*) over() as total_matched,
case
when table1.v8_im_name = '' then table1.im_name
else table1.v8_im_name
end as im_name,
case
when table1.id_region != 1 then null
else
case
when table1.sales_start_at is null then '1970/01/01 00:00:00'
else table1.sales_start_at
end
end as sales_start_date,
table1.category_ids,
array_to_string(table1.intermediate_category_ids, ','),
table1.image_url,
table1.brand,
table1.pdp_url,
coalesce(ctedata2.total_visit_count, 0) as total_visit_count,
coalesce(ctedata2.sh_visit_count, 0) as sh_visit_count,
coalesce(ctedata2.ec_visit_count, 0) as ec_visit_count,
coalesce(ctedata2.total_like_count, 0) as total_like_count,
coalesce(ctedata2.sh_like_count, 0) as sh_like_count,
coalesce(ctedata2.ec_like_count, 0) as ec_like_count,
coalesce(ctedata2.total_order_count, 0) as total_order_count,
coalesce(ctedata2.sh_order_count, 0) as sh_order_count,
coalesce(ctedata2.ec_order_count, 0) as ec_order_count,
coalesce(ctedata2.total_sales_amount, 0) as total_sales_amount,
coalesce(ctedata2.sh_sales_amount, 0) as sh_sales_amount,
coalesce(ctedata2.ec_sales_amount, 0) as ec_sales_amount,
coalesce(ctedata2.ec_order_online_count, 0) as ec_order_online_count,
coalesce(ctedata2.ec_sales_online_amount, 0) as ec_sales_online_amount,
coalesce(ctedata2.ec_order_in_store_count, 0) as ec_order_in_store_count,
coalesce(ctedata2.ec_sales_in_store_amount, 0) as ec_sales_in_store_amount,
ctedata2.rank,
ctedata2.total_count,
table1.department,
table1.seasons
from
table1
left join ctedata2 on
table1.im_name = ctedata2.im_name
and table1.brand = ctedata2.kpibrand
where
table1.deleted_at is null
and table1.id_region = any('{1}')
and lower(table1.brand) = any('{"brand1","brand2"}')
and 'season1' = any(lower(seasons::text)::text[])
and table1.department = 'Department1'
order by
total_sales_amount desc offset 0
limit 100
The explain output for above query is
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=172326.55..173435.38 rows=1 width=952) (actual time=85664.201..85665.970 rows=100 loops=1)
CTE ctedata1
-> GroupAggregate (cost=0.42..80478.71 rows=43468 width=530) (actual time=0.063..708.069 rows=73121 loops=1)
Group Key: table2.im_name, table2.id_region, table2.brand
-> Index Scan using udx_table2_im_name_id_region_brand_target_date_key on table2 (cost=0.42..59699.18 rows=391708 width=146) (actual time=0.029..308.582 rows=391779 loops=1)
Filter: ((deleted_at IS NULL) AND (id_region = ANY ('{1}'::integer[])))
Rows Removed by Filter: 20415
CTE ctedata2
-> WindowAgg (cost=16104.06..17842.78 rows=43468 width=628) (actual time=1012.994..1082.057 rows=73121 loops=1)
-> WindowAgg (cost=16104.06..17082.09 rows=43468 width=620) (actual time=945.755..1014.656 rows=73121 loops=1)
-> Sort (cost=16104.06..16212.73 rows=43468 width=612) (actual time=945.747..963.254 rows=73121 loops=1)
Sort Key: ctedata1.kpiregion, ctedata1.kpibrand, (COALESCE(ctedata1.total_sales_amount, '0'::numeric)) DESC
Sort Method: external merge Disk: 6536kB
-> CTE Scan on ctedata1 (cost=0.00..869.36 rows=43468 width=612) (actual time=0.069..824.841 rows=73121 loops=1)
-> Result (cost=74005.05..75113.88 rows=1 width=952) (actual time=85664.199..85665.950 rows=100 loops=1)
-> Sort (cost=74005.05..74005.05 rows=1 width=944) (actual time=85664.072..85664.089 rows=100 loops=1)
Sort Key: (COALESCE(ctedata2.total_sales_amount, '0'::numeric)) DESC
Sort Method: top-N heapsort Memory: 76kB
-> WindowAgg (cost=10960.95..74005.04 rows=1 width=944) (actual time=85658.049..85661.393 rows=3151 loops=1)
-> Nested Loop Left Join (cost=10960.95..74005.02 rows=1 width=927) (actual time=1075.219..85643.595 rows=3151 loops=1)
Join Filter: (((table1.im_name)::text = ctedata2.im_name) AND ((table1.brand)::text = ctedata2.kpibrand))
Rows Removed by Join Filter: 230402986
-> Bitmap Heap Scan on table1 (cost=10960.95..72483.64 rows=1 width=399) (actual time=45.466..278.376 rows=3151 loops=1)
Recheck Cond: (id_region = ANY ('{1}'::integer[]))
Filter: ((deleted_at IS NULL) AND (department = 'Department1'::text) AND (lower((brand)::text) = ANY ('{brand1, brand2}'::text[])) AND ('season1'::text = ANY ((lower((seasons)::text))::text[])))
Rows Removed by Filter: 106335
Heap Blocks: exact=42899
-> Bitmap Index Scan on table1_im_name_id_region_key (cost=0.00..10960.94 rows=110619 width=0) (actual time=38.307..38.307 rows=109486 loops=1)
Index Cond: (id_region = ANY ('{1}'::integer[]))
-> CTE Scan on ctedata2 (cost=0.00..869.36 rows=43468 width=592) (actual time=0.325..21.721 rows=73121 loops=3151)
SubPlan 3
-> Aggregate (cost=1108.80..1108.81 rows=1 width=8) (actual time=0.018..0.018 rows=1 loops=100)
-> Nested Loop Left Join (cost=5.57..1108.57 rows=93 width=4) (actual time=0.007..0.016 rows=3 loops=100)
-> Bitmap Heap Scan on table3 (cost=5.15..350.95 rows=93 width=4) (actual time=0.005..0.008 rows=3 loops=100)
Recheck Cond: (id_pf_item = table1.id_pf_item)
Filter: (deleted_at IS NULL)
Heap Blocks: exact=107
-> Bitmap Index Scan on idx_id_pf_item (cost=0.00..5.12 rows=93 width=0) (actual time=0.003..0.003 rows=3 loops=100)
Index Cond: (id_pf_item = table1.id_pf_item)
-> Index Scan using index_table4_id_item on table4 (cost=0.42..8.14 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=303)
Index Cond: (table3.id_item = id_item)
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 0
Planning time: 1.023 ms
Execution time: 85669.512 ms
I changed
and lower(table1.brand) = any('{"brand1","brand2"}')
in the query to
and table1.brand = any('{"Brand1","Brand2"}')
and the plan changed to
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=173137.44..188661.06 rows=14 width=952) (actual time=1444.123..1445.653 rows=100 loops=1)
CTE ctedata1
-> GroupAggregate (cost=0.42..80478.71 rows=43468 width=530) (actual time=0.040..769.982 rows=73121 loops=1)
Group Key: table2.im_name, table2.id_region, table2.brand
-> Index Scan using udx_table2_item_im_name_id_region_brand_target_date_key on table2 (cost=0.42..59699.18 rows=391708 width=146) (actual time=0.021..350.774 rows=391779 loops=1)
Filter: ((deleted_at IS NULL) AND (id_region = ANY ('{1}'::integer[])))
Rows Removed by Filter: 20415
CTE ctedata2
-> WindowAgg (cost=16104.06..17842.78 rows=43468 width=628) (actual time=1088.905..1153.749 rows=73121 loops=1)
-> WindowAgg (cost=16104.06..17082.09 rows=43468 width=620) (actual time=1020.017..1089.117 rows=73121 loops=1)
-> Sort (cost=16104.06..16212.73 rows=43468 width=612) (actual time=1020.011..1037.170 rows=73121 loops=1)
Sort Key: ctedata1.kpiregion, ctedata1.kpibrand, (COALESCE(ctedata1.total_sales_amount, '0'::numeric)) DESC
Sort Method: external merge Disk: 6536kB
-> CTE Scan on ctedata1 (cost=0.00..869.36 rows=43468 width=612) (actual time=0.044..891.653 rows=73121 loops=1)
-> Result (cost=74815.94..90339.56 rows=14 width=952) (actual time=1444.121..1445.635 rows=100 loops=1)
-> Sort (cost=74815.94..74815.98 rows=14 width=944) (actual time=1444.053..1444.065 rows=100 loops=1)
Sort Key: (COALESCE(ctedata2.total_sales_amount, '0'::numeric)) DESC
Sort Method: top-N heapsort Memory: 76kB
-> WindowAgg (cost=72207.31..74815.68 rows=14 width=944) (actual time=1439.128..1441.885 rows=3151 loops=1)
-> Hash Right Join (cost=72207.31..74815.40 rows=14 width=927) (actual time=1307.531..1437.246 rows=3151 loops=1)
Hash Cond: ((ctedata2.im_name = (table1.im_name)::text) AND (ctedata2.kpibrand = (table1.brand)::text))
-> CTE Scan on ctedata2 (cost=0.00..869.36 rows=43468 width=592) (actual time=1088.911..1209.646 rows=73121 loops=1)
-> Hash (cost=72207.10..72207.10 rows=14 width=399) (actual time=216.850..216.850 rows=3151 loops=1)
Buckets: 4096 (originally 1024) Batches: 1 (originally 1) Memory Usage: 1249kB
-> Bitmap Heap Scan on table1 (cost=10960.95..72207.10 rows=14 width=399) (actual time=46.434..214.246 rows=3151 loops=1)
Recheck Cond: (id_region = ANY ('{1}'::integer[]))
Filter: ((deleted_at IS NULL) AND (department = 'Department1'::text) AND ((brand)::text = ANY ('{Brand1, Brand2}'::text[])) AND ('season1'::text = ANY ((lower((seasons)::text))::text[])))
Rows Removed by Filter: 106335
Heap Blocks: exact=42899
-> Bitmap Index Scan on table1_im_name_id_region_key (cost=0.00..10960.94 rows=110619 width=0) (actual time=34.849..34.849 rows=109486 loops=1)
Index Cond: (id_region = ANY ('{1}'::integer[]))
SubPlan 3
-> Aggregate (cost=1108.80..1108.81 rows=1 width=8) (actual time=0.015..0.015 rows=1 loops=100)
-> Nested Loop Left Join (cost=5.57..1108.57 rows=93 width=4) (actual time=0.006..0.014 rows=3 loops=100)
-> Bitmap Heap Scan on table3 (cost=5.15..350.95 rows=93 width=4) (actual time=0.004..0.006 rows=3 loops=100)
Recheck Cond: (id_pf_item = table1.id_pf_item)
Filter: (deleted_at IS NULL)
Heap Blocks: exact=107
-> Bitmap Index Scan on idx_id_pf_item (cost=0.00..5.12 rows=93 width=0) (actual time=0.003..0.003 rows=3 loops=100)
Index Cond: (id_pf_item = table1.id_pf_item)
-> Index Scan using index_table4_id_item on table4 (cost=0.42..8.14 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=303)
Index Cond: (table3.id_item = id_item)
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 0
Planning time: 0.760 ms
Execution time: 1448.848 ms
My Observation
The join strategy for table1 left join ctedata2 changes after the lower() function is avoided. The strategy changes from nested loop left join to hash right join.
The CTE Scan node on ctedata2 is executed only once in the better performing query.
Postgres Version
9.6
Please help me to understand this behaviour. I will supply additional info if required.
It is almost not worthwhile taking a deep dive into the inner workings of a nearly-obsolete version. That time and energy is probably better spent jollying along an upgrade.
But the problem is pretty plain. Your scan on table1 is estimated dreadfully, although 14 times less dreadful in the better plan.
-> Bitmap Heap Scan on table1 (cost=10960.95..72483.64 rows=1 width=399) (actual time=45.466..278.376 rows=3151 loops=1)
-> Bitmap Heap Scan on table1 (cost=10960.95..72207.10 rows=14 width=399) (actual time=46.434..214.246 rows=3151 loops=1)
Your use of lower(), apparently without reason, surely contributes to the poor estimation. And dynamically converting a string into an array certainly doesn't help either. If it were stored as a real array in the first place, the statistics system could get its hands on it and generate more reasonable estimates.

Planner not using index order to sort the records using CTE

I am trying to pass some ids into an in-clause on a sorted index with the same order by condition but the query planner is explicitly sorting the data after performing index search. below are my queries.
Generate a temporary table.
SELECT a.n/20 as n, md5(a.n::TEXT) as b INTO temp_table
From generate_series(1, 100000) as a(n);
create an index
CREATE INDEX idx_temp_table ON temp_table(n ASC, b ASC);
In below query, planner uses index ordering and doesn't explicitly sorts the data.(expected)
EXPLAIN ANALYSE
SELECT * from
temp_table WHERE n = 10
ORDER BY n, b
limit 5;
Query Plan
QUERY PLAN Limit (cost=0.42..16.07 rows=5 width=36) (actual time=0.098..0.101 rows=5 loops=1)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..1565.17 rows=500 width=36) (actual time=0.095..0.098 rows=5 loops=1)
Index Cond: (n = 10)
Heap Fetches: 5 Planning time: 0.551 ms Execution time: 0.128 ms
but when i use one or more ids from a cte and pass them in clause then planner only uses index to fetch the values but explicitly sorts them afterwards (not expected).
EXPLAIN ANALYSE
WITH cte(x) AS (VALUES (10))
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
then planner uses below query plan
QUERY PLAN
QUERY PLAN
Limit (cost=85.18..85.20 rows=5 width=37) (actual time=0.073..0.075 rows=5 loops=1)
CTE cte
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)
-> Sort (cost=85.16..85.26 rows=40 width=37) (actual time=0.072..0.073 rows=5 loops=1)
Sort Key: temp_table.n, temp_table.b
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.47..84.50 rows=40 width=37) (actual time=0.037..0.056 rows=40 loops=1)
-> Unique (cost=0.05..0.06 rows=2 width=4) (actual time=0.009..0.010 rows=2 loops=1)
-> Sort (cost=0.05..0.06 rows=2 width=4) (actual time=0.009..0.010 rows=2 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.04 rows=2 width=4) (actual time=0.004..0.005 rows=2 loops=1)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..42.02 rows=20 width=37) (actual time=0.012..0.018 rows=20 loops=2)
Index Cond: (n = cte.x)
Heap Fetches: 40
Planning time: 0.166 ms
Execution time: 0.101 ms
I tried putting an explicit sorting while passing the ids in where clause so that sorted order in ids is maintained but still planner sorted explicitly
EXPLAIN ANALYSE
WITH cte(x) AS (VALUES (10))
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
Query plan
QUERY PLAN
Limit (cost=42.62..42.63 rows=5 width=37) (actual time=0.042..0.044 rows=5 loops=1)
CTE cte
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
-> Sort (cost=42.61..42.66 rows=20 width=37) (actual time=0.042..0.042 rows=5 loops=1)
Sort Key: temp_table.n, temp_table.b
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.46..42.28 rows=20 width=37) (actual time=0.025..0.033 rows=20 loops=1)
-> HashAggregate (cost=0.05..0.06 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1)
Group Key: cte.x
-> Sort (cost=0.03..0.04 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..42.02 rows=20 width=37) (actual time=0.014..0.020 rows=20 loops=1)
Index Cond: (n = cte.x)
Heap Fetches: 20
Planning time: 0.167 ms
Execution time: 0.074 ms
Can anyone explain why planner is using an explicit sort on the data? Is there a way to by pass this and make planner use the index sorting order so additional sorting on the records can be saved. In production, we have similar case but size of our selection is too big but only a handful of records needs to fetched with pagination. Thanks in anticipation!
It is actually a decision made by the planner, with a larger set of values(), Postgres will switch to a smarter plan, with the sort done before the merge.
select version();
\echo +++++ Original
EXPLAIN ANALYSE
WITH cte(x) AS (VALUES (10))
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
\echo +++++ TEN Values
EXPLAIN ANALYSE
WITH cte(x) AS (VALUES (10),(11),(12),(13),(14),(15),(16),(17),(18),(19)
)
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
\echo ++++++++ one row from table
EXPLAIN ANALYSE
WITH cte(x) AS (SELECT n FROM temp_table WHERE n = 10)
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
\echo ++++++++ one row from table TWO ctes
EXPLAIN ANALYSE
WITH val(x) AS (VALUES (10))
, cte(x) AS (
SELECT n FROM temp_table WHERE n IN (select x from val)
)
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
Resulting plans:
version
-------------------------------------------------------------------------------------------------------
PostgreSQL 11.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit
(1 row)
+++++ Original
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=13.72..13.73 rows=5 width=37) (actual time=0.197..0.200 rows=5 loops=1)
CTE cte
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
-> Sort (cost=13.71..13.76 rows=20 width=37) (actual time=0.194..0.194 rows=5 loops=1)
Sort Key: temp_table.n, temp_table.b
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.44..13.37 rows=20 width=37) (actual time=0.083..0.097 rows=20 loops=1)
-> HashAggregate (cost=0.02..0.03 rows=1 width=4) (actual time=0.018..0.018 rows=1 loops=1)
Group Key: cte.x
-> CTE Scan on cte (cost=0.00..0.02 rows=1 width=4) (actual time=0.007..0.008 rows=1 loops=1)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..13.14 rows=20 width=37) (actual time=0.058..0.068 rows=20 loops=1)
Index Cond: (n = cte.x)
Heap Fetches: 20
Planning Time: 1.328 ms
Execution Time: 0.360 ms
(15 rows)
+++++ TEN Values
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.91..89.11 rows=5 width=37) (actual time=0.179..0.183 rows=5 loops=1)
CTE cte
-> Values Scan on "*VALUES*" (cost=0.00..0.12 rows=10 width=4) (actual time=0.001..0.007 rows=10 loops=1)
-> Merge Semi Join (cost=0.78..3528.72 rows=200 width=37) (actual time=0.178..0.181 rows=5 loops=1)
Merge Cond: (temp_table.n = cte.x)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..3276.30 rows=100000 width=37) (actual time=0.030..0.123 rows=204 loops=1)
Heap Fetches: 204
-> Sort (cost=0.37..0.39 rows=10 width=4) (actual time=0.023..0.023 rows=1 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.20 rows=10 width=4) (actual time=0.003..0.013 rows=10 loops=1)
Planning Time: 0.197 ms
Execution Time: 0.226 ms
(13 rows)
++++++++ one row from table
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=14.39..58.52 rows=5 width=37) (actual time=0.168..0.173 rows=5 loops=1)
CTE cte
-> Index Only Scan using idx_temp_table on temp_table temp_table_1 (cost=0.42..13.14 rows=20 width=4) (actual time=0.010..0.020 rows=20 loops=1)
Index Cond: (n = 10)
Heap Fetches: 20
-> Merge Semi Join (cost=1.25..3531.24 rows=400 width=37) (actual time=0.167..0.170 rows=5 loops=1)
Merge Cond: (temp_table.n = cte.x)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..3276.30 rows=100000 width=37) (actual time=0.025..0.101 rows=204 loops=1)
Heap Fetches: 204
-> Sort (cost=0.83..0.88 rows=20 width=4) (actual time=0.039..0.039 rows=1 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.40 rows=20 width=4) (actual time=0.012..0.031 rows=20 loops=1)
Planning Time: 0.243 ms
Execution Time: 0.211 ms
(15 rows)
++++++++ one row from table TWO ctes
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=14.63..58.76 rows=5 width=37) (actual time=0.224..0.229 rows=5 loops=1)
CTE val
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
CTE cte
-> Nested Loop (cost=0.44..13.37 rows=20 width=4) (actual time=0.038..0.052 rows=20 loops=1)
-> HashAggregate (cost=0.02..0.03 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=1)
Group Key: val.x
-> CTE Scan on val (cost=0.00..0.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Index Only Scan using idx_temp_table on temp_table temp_table_1 (cost=0.42..13.14 rows=20 width=4) (actual time=0.029..0.038 rows=20 loops=1)
Index Cond: (n = val.x)
Heap Fetches: 20
-> Merge Semi Join (cost=1.25..3531.24 rows=400 width=37) (actual time=0.223..0.226 rows=5 loops=1)
Merge Cond: (temp_table.n = cte.x)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..3276.30 rows=100000 width=37) (actual time=0.038..0.114 rows=204 loops=1)
Heap Fetches: 204
-> Sort (cost=0.83..0.88 rows=20 width=4) (actual time=0.082..0.082 rows=1 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.40 rows=20 width=4) (actual time=0.040..0.062 rows=20 loops=1)
Planning Time: 0.362 ms
Execution Time: 0.313 ms
(21 rows)
Beware of CTEs!.
For the planner, CTEs are more or less black boxes, and very little is known about expected number of rows, statistics distribution, or ordering inside.
In cases where CTEs result in a bad plan (the original question is not such a case), a CTE can often be replaced by a (temp) view, which is seen by the planner in its full naked glory.
Update
Starting with version 11, CTEs are handled differently by the planner: if they do not have side effects, they are candidates for being merged with the main query. (but is still a good idea to check your query plans)
The optimizet isn't aware that the CTE is sorted. If you scan an index for multiple values and have an ORDER BY, PostgreSQL will always sort.
The only thing that comes to my mind is to create a temporary table with the values from the IN list and put an index on that temporary table. Then when you join with that table, PostgreSQL will be aware of the ordering and might for example choose a merge join that can use the indexes.
Of course that means a lot of overhead, and it could easily be that the original sort wins out.

PostgreSQL stable functions in query

I'm have some data model which consists of couple tables and I need to filter them.
It is two functions funcFast and funcList. funcFast can return fast result is table need to be filtered by funcList or not. funcList return list of allowed ids. I marked functions as STABLE but they run not as fast as I expect:)
I create couple of example functions:
CREATE OR REPLACE FUNCTION funcFastPlPgSql(res boolean)
returns boolean as $$
begin return res; end
$$ language plpgsql stable;
CREATE OR REPLACE FUNCTION funcList(cnt int)
returns setof integer as $$
select generate_series(1, cnt)
$$ language sql stable;
And tests.
Case 1. Filter only by fast function work OK:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true)
Query plan is:
Aggregate (cost=27.76..27.77 rows=1 width=8) (actual time=573.258..573.259 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..114.327 rows=1000000 loops=1)
-> Result (cost=0.25..20.25 rows=1000 width=0) (actual time=0.038..489.942 rows=1000000 loops=1)
One-Time Filter: funcfastplpgsql(true)
-> CTE Scan on obs (cost=0.25..20.25 rows=1000 width=0) (actual time=0.012..392.504 rows=1000000 loops=1)
Planning time: 0.184 ms
Execution time: 576.177 ms
Case 2. Filter only by slow function work OK too:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where id in (select funcList(1000))
Query plan is:
Aggregate (cost=62.26..62.27 rows=1 width=8) (actual time=469.344..469.344 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..106.144 rows=1000000 loops=1)
-> Hash Join (cost=22.25..56.00 rows=500 width=0) (actual time=1.566..469.202 rows=1000 loops=1)
Hash Cond: (obs.id = (funclist(1000)))
-> CTE Scan on obs (cost=0.00..20.00 rows=1000 width=4) (actual time=0.009..359.580 rows=1000000 loops=1)
-> Hash (cost=19.75..19.75 rows=200 width=4) (actual time=1.548..1.548 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 44kB
-> HashAggregate (cost=17.75..19.75 rows=200 width=4) (actual time=1.101..1.312 rows=1000 loops=1)
Group Key: funclist(1000)
-> Result (cost=0.00..5.25 rows=1000 width=4) (actual time=0.058..0.706 rows=1000 loops=1)
Planning time: 0.141 ms
Execution time: 472.183 ms
Case 3. But then two function combined I expect what the best case should be close to [case 1] and worst case should be close to [case 2], but:
explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true) or id in (select funcList(1000))
Query plan is:
Aggregate (cost=286.93..286.94 rows=1 width=8) (actual time=1575.775..1575.775 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.008..131.372 rows=1000000 loops=1)
-> CTE Scan on obs (cost=7.75..280.25 rows=667 width=0) (actual time=0.035..1468.007 rows=1000000 loops=1)
Filter: (funcfastplpgsql(true) OR (hashed SubPlan 2))
SubPlan 2
-> Result (cost=0.00..5.25 rows=1000 width=4) (never executed)
Planning time: 0.100 ms
Execution time: 1578.624 ms
What I am missing here? Why query with two together functions runs much longer and how to fix it?

Postgresql IN operator Performance: List vs Subquery

For a list of ~700 ids the query performance is over 20x slower than passing a subquery that returns those 700 ids. It should be the opposite.
e.g. (first query takes under 400ms, the later 9600 ms)
select date_trunc('month', day) as month, sum(total)
from table_x
where y_id in (select id from table_y where prop = 'xyz')
and day between '2015-11-05' and '2016-11-04'
group by month
is 20x faster on my machine than passing the array directly:
select date_trunc('month', day) as month, sum(total)
from table_x
where y_id in (1625, 1871, ..., 1640, 1643, 13291, 1458, 13304, 1407, 1765)
and day between '2015-11-05' and '2016-11-04'
group by month
Any idea what could be the problem or how to optimize and obtain the same performance?
The difference is a simple filter vs a hash join:
explain analyze
select i
from t
where i in (500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..140675.00 rows=101 width=4) (actual time=0.648..1074.567 rows=101 loops=1)
Filter: (i = ANY ('{500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600}'::integer[]))
Rows Removed by Filter: 999899
Planning time: 0.170 ms
Execution time: 1074.624 ms
explain analyze
select i
from t
where i in (select i from r);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Hash Semi Join (cost=3.27..17054.40 rows=101 width=4) (actual time=0.382..240.389 rows=101 loops=1)
Hash Cond: (t.i = r.i)
-> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.030..117.193 rows=1000000 loops=1)
-> Hash (cost=2.01..2.01 rows=101 width=4) (actual time=0.074..0.074 rows=101 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on r (cost=0.00..2.01 rows=101 width=4) (actual time=0.010..0.035 rows=101 loops=1)
Planning time: 0.245 ms
Execution time: 240.448 ms
To have the same performance join the array:
explain analyze
select i
from
t
inner join
unnest(
array[500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600]::int[]
) u (i) using (i)
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Hash Join (cost=2.25..18178.25 rows=100 width=4) (actual time=0.267..243.768 rows=101 loops=1)
Hash Cond: (t.i = u.i)
-> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.022..118.709 rows=1000000 loops=1)
-> Hash (cost=1.00..1.00 rows=100 width=4) (actual time=0.063..0.063 rows=101 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Function Scan on unnest u (cost=0.00..1.00 rows=100 width=4) (actual time=0.028..0.041 rows=101 loops=1)
Planning time: 0.172 ms
Execution time: 243.816 ms
Or use the values syntax:
explain analyze
select i
from t
where i = any (values (500),(501),(502),(503),(504),(505),(506),(507),(508),(509),(510),(511),(512),(513),(514),(515),(516),(517),(518),(519),(520),(521),(522),(523),(524),(525),(526),(527),(528),(529),(530),(531),(532),(533),(534),(535),(536),(537),(538),(539),(540),(541),(542),(543),(544),(545),(546),(547),(548),(549),(550),(551),(552),(553),(554),(555),(556),(557),(558),(559),(560),(561),(562),(563),(564),(565),(566),(567),(568),(569),(570),(571),(572),(573),(574),(575),(576),(577),(578),(579),(580),(581),(582),(583),(584),(585),(586),(587),(588),(589),(590),(591),(592),(593),(594),(595),(596),(597),(598),(599),(600))
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Hash Semi Join (cost=2.53..17053.65 rows=101 width=4) (actual time=0.279..239.888 rows=101 loops=1)
Hash Cond: (t.i = "*VALUES*".column1)
-> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.022..117.199 rows=1000000 loops=1)
-> Hash (cost=1.26..1.26 rows=101 width=4) (actual time=0.059..0.059 rows=101 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Values Scan on "*VALUES*" (cost=0.00..1.26 rows=101 width=4) (actual time=0.002..0.027 rows=101 loops=1)
Planning time: 0.242 ms
Execution time: 239.933 ms
Try to change the critical line to this:
where y_id = any (values (1625, 1871, ..., 1640, 1643, 13291, 1458, 13304, 1407, 1765) )

Postgres Slow group by query with max

I am using postgres 9.1 and I have a table with about 3.5M rows of eventtype (varchar) and eventtime (timestamp) - and some other fields. There are only about 20 different eventtype's and the event time spans about 4 years.
I want to get the last timestamp of each event type. If I run a query like:
select eventtype, max(eventtime)
from allevents
group by eventtype
it takes around 20 seconds. Selecting distinct eventtype's is equally slow. The query plan shows a full sequential scan of the table - not surprising it is slow.
Explain analyse for the above query gives:
HashAggregate (cost=84591.47..84591.68 rows=21 width=21) (actual time=20918.131..20918.141 rows=21 loops=1)
-> Seq Scan on allevents (cost=0.00..66117.98 rows=3694698 width=21) (actual time=0.021..4831.793 rows=3694392 loops=1)
Total runtime: 20918.204 ms
If I add a where clause to select a specific eventtype, it takes anywhere from 40ms to 150ms which is at least decent.
Query plan when selecting specific eventtype:
GroupAggregate (cost=343.87..24942.71 rows=1 width=21) (actual time=98.397..98.397 rows=1 loops=1)
-> Bitmap Heap Scan on allevents (cost=343.87..24871.07 rows=14325 width=21) (actual time=6.820..89.610 rows=19736 loops=1)
Recheck Cond: ((eventtype)::text = 'TEST_EVENT'::text)
-> Bitmap Index Scan on allevents_idx2 (cost=0.00..340.28 rows=14325 width=0) (actual time=6.121..6.121 rows=19736 loops=1)
Index Cond: ((eventtype)::text = 'TEST_EVENT'::text)
Total runtime: 98.482 ms
Primary key is (eventtype, eventtime). I also have the following indexes:
allevents_idx (event time desc, eventtype)
allevents_idx2 (eventtype).
How can I speed up the query?
Results of query play for correlated subquery suggested by #denis below with 14 manually entered values gives:
Function Scan on unnest val (cost=0.00..185.40 rows=100 width=32) (actual time=0.121..8983.134 rows=14 loops=1)
SubPlan 2
-> Result (cost=1.83..1.84 rows=1 width=0) (actual time=641.644..641.645 rows=1 loops=14)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..1.83 rows=1 width=8) (actual time=641.640..641.641 rows=1 loops=14)
-> Index Scan using allevents_idx on allevents (cost=0.00..322672.36 rows=175938 width=8) (actual time=641.638..641.638 rows=1 loops=14)
Index Cond: ((eventtime IS NOT NULL) AND ((eventtype)::text = val.val))
Total runtime: 8983.203 ms
Using the recursive query suggested by #jjanes, the query runs between 4 and 5 seconds with the following plan:
CTE Scan on t (cost=260.32..448.63 rows=101 width=32) (actual time=0.146..4325.598 rows=22 loops=1)
CTE t
-> Recursive Union (cost=2.52..260.32 rows=101 width=32) (actual time=0.075..1.449 rows=22 loops=1)
-> Result (cost=2.52..2.53 rows=1 width=0) (actual time=0.074..0.074 rows=1 loops=1)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..2.52 rows=1 width=13) (actual time=0.070..0.071 rows=1 loops=1)
-> Index Scan using allevents_idx2 on allevents (cost=0.00..9315751.37 rows=3696851 width=13) (actual time=0.070..0.070 rows=1 loops=1)
Index Cond: ((eventtype)::text IS NOT NULL)
-> WorkTable Scan on t (cost=0.00..25.58 rows=10 width=32) (actual time=0.059..0.060 rows=1 loops=22)
Filter: (eventtype IS NOT NULL)
SubPlan 3
-> Result (cost=2.53..2.54 rows=1 width=0) (actual time=0.059..0.059 rows=1 loops=21)
InitPlan 2 (returns $3)
-> Limit (cost=0.00..2.53 rows=1 width=13) (actual time=0.057..0.057 rows=1 loops=21)
-> Index Scan using allevents_idx2 on allevents (cost=0.00..3114852.66 rows=1232284 width=13) (actual time=0.055..0.055 rows=1 loops=21)
Index Cond: (((eventtype)::text IS NOT NULL) AND ((eventtype)::text > t.eventtype))
SubPlan 6
-> Result (cost=1.83..1.84 rows=1 width=0) (actual time=196.549..196.549 rows=1 loops=22)
InitPlan 5 (returns $6)
-> Limit (cost=0.00..1.83 rows=1 width=8) (actual time=196.546..196.546 rows=1 loops=22)
-> Index Scan using allevents_idx on allevents (cost=0.00..322946.21 rows=176041 width=8) (actual time=196.544..196.544 rows=1 loops=22)
Index Cond: ((eventtime IS NOT NULL) AND ((eventtype)::text = t.eventtype))
Total runtime: 4325.694 ms
What you need is a "skip scan" or "loose index scan". PostgreSQL's planner does not yet implement those automatically, but you can trick it into using one by using a recursive query.
WITH RECURSIVE t AS (
SELECT min(eventtype) AS eventtype FROM allevents
UNION ALL
SELECT (SELECT min(eventtype) as eventtype FROM allevents WHERE eventtype > t.eventtype)
FROM t where t.eventtype is not null
)
select eventtype, (select max(eventtime) from allevents where eventtype=t.eventtype) from t;
There may be a way to collapse the max(eventtime) into the recursive query rather than doing it outside that query, but if so I have not hit upon it.
This needs an index on (eventtype, eventtime) in order to be efficient. You can have it be DESC on the eventtime, but that is not necessary. This is efficiently only if eventtype has only a few distinct values (21 of them, in your case).
Based on the question you already have the relevant index.
If upgrading to Postgres 9.3 or an index on (eventtype, eventtime desc) doesn't make a difference, this is a case where rewriting the query so it uses a correlated subquery works very well if you can enumerate all of the event types manually:
select val as eventtype,
(select max(eventtime)
from allevents
where allevents.eventtype = val
) as eventtime
from unnest('{type1,type2,…}'::text[]) as val;
Here's the plans I get when running similar queries:
denis=# select version();
version
-----------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.3.1 on x86_64-apple-darwin11.4.2, compiled by Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn), 64-bit
(1 row)
Test data:
denis=# create table test (evttype int, evttime timestamp, primary key (evttype, evttime));
CREATE TABLE
denis=# insert into test (evttype, evttime) select i, now() + (i % 3) * interval '1 min' - j * interval '1 sec' from generate_series(1,10) i, generate_series(1,10000) j;
INSERT 0 100000
denis=# create index on test (evttime, evttype);
CREATE INDEX
denis=# vacuum analyze test;
VACUUM
First query:
denis=# explain analyze select evttype, max(evttime) from test group by evttype; QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=2041.00..2041.10 rows=10 width=12) (actual time=54.983..54.987 rows=10 loops=1)
-> Seq Scan on test (cost=0.00..1541.00 rows=100000 width=12) (actual time=0.009..15.954 rows=100000 loops=1)
Total runtime: 55.045 ms
(3 rows)
Second query:
denis=# explain analyze select val as evttype, (select max(evttime) from test where test.evttype = val) as evttime from unnest('{1,2,3,4,5,6,7,8,9,10}'::int[]) val;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Function Scan on unnest val (cost=0.00..48.39 rows=100 width=4) (actual time=0.086..0.292 rows=10 loops=1)
SubPlan 2
-> Result (cost=0.46..0.47 rows=1 width=0) (actual time=0.024..0.024 rows=1 loops=10)
InitPlan 1 (returns $1)
-> Limit (cost=0.42..0.46 rows=1 width=8) (actual time=0.021..0.021 rows=1 loops=10)
-> Index Only Scan Backward using test_pkey on test (cost=0.42..464.42 rows=10000 width=8) (actual time=0.019..0.019 rows=1 loops=10)
Index Cond: ((evttype = val.val) AND (evttime IS NOT NULL))
Heap Fetches: 0
Total runtime: 0.370 ms
(9 rows)
index on (eventtype, eventtime desc) should help. or reindex on primary key index. I would also recommend replace type of eventtype to enum (if number of types is fixed) or int/smallint. This will decrease size of data and indexes so queries will run faster.