I have had several cases where a Postgres function that returns a table result from a query is much slower than running the actual query. Why is that?
This is one example, but I've found that the function is slower than the bare query in many cases.
create function trending_names(date_start timestamp with time zone, date_end timestamp with time zone, gender_filter character, country_filter text)
returns TABLE(name_id integer, gender character, country text, score bigint, rank bigint)
language sql
as
$$
select u.name_id,
n.gender,
u.country,
count(u.rank) as score,
row_number() over (order by count(u.rank) desc) as rank
from babynames.user_scores u
inner join babynames.names n on u.name_id = n.id
where u.created_at between date_start and date_end
and u.rank > 0
and n.gender = gender_filter
and u.country = country_filter
group by u.name_id, n.gender, u.country
$$;
This is the query plan for a select from the function:
Function Scan on trending_names (cost=0.25..10.25 rows=1000 width=84) (actual time=1118.673..1118.861 rows=2238 loops=1)
Buffers: shared hit=216509 read=29837
Planning Time: 0.078 ms
Execution Time: 1119.083 ms
This is the query plan from just running the query; it takes less than half the time:
WindowAgg  (cost=44834.98..45593.32 rows=43334 width=25) (actual time=383.387..385.223 rows=2238 loops=1)
  Buffers: shared hit=100446 read=50220
  ->  Sort  (cost=44834.98..44943.31 rows=43334 width=17) (actual time=383.375..383.546 rows=2238 loops=1)
        Sort Key: (count(u.rank)) DESC
        Sort Method: quicksort Memory: 271kB
        Buffers: shared hit=100446 read=50220
        ->  HashAggregate  (cost=41064.22..41497.56 rows=43334 width=17) (actual time=381.088..381.906 rows=2238 loops=1)
              Group Key: u.name_id, u.country, n.gender
              Buffers: shared hit=100446 read=50220
              ->  Hash Join  (cost=5352.15..40630.88 rows=43334 width=13) (actual time=60.710..352.646 rows=36271 loops=1)
                    Hash Cond: (u.name_id = n.id)
                    Buffers: shared hit=100446 read=50220
                    ->  Index Scan using user_scores_rank_ix on user_scores u  (cost=0.43..35077.55 rows=76796 width=11) (actual time=24.193..287.393 rows=69770 loops=1)
                          Index Cond: (rank > 0)
                          Filter: ((created_at >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (country = 'sv'::text) AND (created_at <= now()))
                          Rows Removed by Filter: 106521
                          Buffers: shared hit=99417 read=46856
                    ->  Hash  (cost=5005.89..5005.89 rows=27667 width=6) (actual time=36.420..36.420 rows=27472 loops=1)
                          Buckets: 32768 Batches: 1 Memory Usage: 1330kB
                          Buffers: shared hit=1029 read=3364
                          ->  Seq Scan on names n  (cost=0.00..5005.89 rows=27667 width=6) (actual time=0.022..24.447 rows=27472 loops=1)
                                Filter: (gender = 'f'::bpchar)
                                Rows Removed by Filter: 21559
                                Buffers: shared hit=1029 read=3364
Planning Time: 2.512 ms
Execution Time: 387.403 ms
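For reference, the two plans above come from something along these lines; the literal values ('f', 'sv', and the range 2021-01-01 up to now()) can be read off the filter in the second plan, so treat the exact call as an assumption:
-- Plan for the function call (first plan above)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM trending_names('2021-01-01', now(), 'f', 'sv');

-- Plan for the bare query (second plan above): the function body with
-- the parameters replaced by the same literal values
EXPLAIN (ANALYZE, BUFFERS)
SELECT u.name_id, n.gender, u.country,
       count(u.rank) AS score,
       row_number() OVER (ORDER BY count(u.rank) DESC) AS rank
FROM babynames.user_scores u
JOIN babynames.names n ON u.name_id = n.id
WHERE u.created_at BETWEEN '2021-01-01' AND now()
  AND u.rank > 0
  AND n.gender = 'f'
  AND u.country = 'sv'
GROUP BY u.name_id, n.gender, u.country;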
I'm also confused about why it does a Seq Scan on names n in the last step, since names.id is the primary key and gender is indexed.
I have a query that generates a trend response, giving me counts of devices for various dates. The query goes over almost 500k rows. The table has almost 17.5 million records. I have partitioned the table on id so that only a specific partition has to be scanned, but it is still quite slow. Each partition has almost 200k records. Any idea how to improve the performance of this?
Query
select start_date, end_date, average, fail_count, warning_count, pass_count
from (
    select generate_series(timestamp '2021-01-18 00:00:00', timestamp '2021-02-12 00:00:00', interval '1 day')::date
) t(start_date)
LEFT JOIN (
SELECT start_date, end_date, avg(score) as average
, count(*) FILTER (WHERE status = 'Fail') AS fail_count
, count(*) FILTER (WHERE status = 'Warning') AS warning_count
, count(*) FILTER (WHERE status = 'Pass') AS pass_count
FROM performance.tenant_based scd join performance.hierarchy dh on dh.id = scd.id and dh.tag = scd.tag
where dh.parent_id in (0,1,2,3,4,5,6,7,8,9,10) and dh.child_id in (0,1,2,3,4,5,6,7,8,9,10) and dh.desc in ('test')
and dh.id ='ita68f0c03880e4c6694859dfa74f1cdf6' AND start_date >= '2021-01-18 00:00:00' -- same date range as above
AND start_date <= '2021-02-12 00:00:00'
GROUP BY 1,2
) s USING (start_date)
ORDER BY 1;
The Query plan is below
Sort (cost=241350.02..241850.02 rows=200000 width=104) (actual time=3453.888..3453.890 rows=26 loops=1)
Sort Key: (((((generate_series('2021-01-18 00:00:00'::timestamp without time zone, '2021-02-12 00:00:00'::timestamp without time zone, '1 day'::interval)))::date))::timestamp without time zone)
Sort Method: quicksort Memory: 28kB
-> Merge Left Join (cost=201014.95..212802.88 rows=200000 width=104) (actual time=2901.012..3453.867 rows=26 loops=1)
Merge Cond: ((((generate_series('2021-01-18 00:00:00'::timestamp without time zone, '2021-02-12 00:00:00'::timestamp without time zone, '1 day'::interval)))::date) = scd.start_date)
-> Sort (cost=79.85..82.35 rows=1000 width=4) (actual time=0.015..0.024 rows=26 loops=1)
Sort Key: (((generate_series('2021-01-18 00:00:00'::timestamp without time zone, '2021-02-12 00:00:00'::timestamp without time zone, '1 day'::interval)))::date)
Sort Method: quicksort Memory: 26kB
-> Result (cost=0.00..20.02 rows=1000 width=4) (actual time=0.003..0.009 rows=26 loops=1)
-> ProjectSet (cost=0.00..5.02 rows=1000 width=8) (actual time=0.002..0.006 rows=26 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)
-> Materialize (cost=200935.11..209318.03 rows=40000 width=72) (actual time=2900.992..3453.789 rows=25 loops=1)
-> Finalize GroupAggregate (cost=200935.11..208818.03 rows=40000 width=72) (actual time=2900.990..3453.771 rows=25 loops=1)
Group Key: scd.start_date, scd.end_date
-> Gather Merge (cost=200935.11..207569.38 rows=49910 width=72) (actual time=2879.365..3453.827 rows=75 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial GroupAggregate (cost=199935.08..200808.51 rows=24955 width=72) (actual time=2686.465..3228.313 rows=25 loops=3)
Group Key: scd.start_date, scd.end_date
-> Sort (cost=199935.08..199997.47 rows=24955 width=25) (actual time=2664.518..2860.477 rows=1666667 loops=3)
Sort Key: scd.start_date, scd.end_date
Sort Method: external merge Disk: 59840kB
-> Hash Join (cost=44891.11..198112.49 rows=24955 width=25) (actual time=111.653..1817.228 rows=1666667 loops=3)
Hash Cond: (scd.tag = (dh.tag)::text)
-> Append (cost=0.00..145159.33 rows=2083333 width=68) (actual time=0.006..591.818 rows=1666667 loops=3)
-> Parallel Seq Scan on ita68f0c03880e4c6694859dfa74f1cdf6 scd (cost=0.00..145159.33 rows=2083333 width=68) (actual time=0.006..455.525 rows=1666667 loops=3)
Filter: ((start_date >= '2021-01-18 00:00:00'::timestamp without time zone) AND (start_date <= '2021-02-12 00:00:00'::timestamp without time zone) AND ((id)::text = 'ita68f0c03880e4c6694859dfa74f1cdf6'::text))
-> Hash (cost=44638.71..44638.71 rows=20192 width=45) (actual time=111.502..111.502 rows=200000 loops=3)
Buckets: 65536 (originally 32768) Batches: 8 (originally 1) Memory Usage: 3585kB
-> Bitmap Heap Scan on hierarchy dh (cost=1339.01..44638.71 rows=20192 width=45) (actual time=26.542..62.078 rows=200000 loops=3)
Recheck Cond: (((id)::text = 'ita68f0c03880e4c6694859dfa74f1cdf6'::text) AND (parent_id = ANY ('{0,1,2,3,4,5,6,7,8,9,10}'::integer[])) AND (child_id = ANY ('{0,1,2,3,4,5,6,7,8,9,10}'::integer[])) AND ((desc)::text = 'test'::text))
Heap Blocks: exact=5717
-> Bitmap Index Scan on hierarchy_id_region_idx (cost=0.00..1333.96 rows=20192 width=0) (actual time=25.792..25.792 rows=200000 loops=3)
Index Cond: (((id)::text = 'ita68f0c03880e4c6694859dfa74f1cdf6'::text) AND (parent_id = ANY ('{0,1,2,3,4,5,6,7,8,9,10}'::integer[])) AND (child_id = ANY ('{0,1,2,3,4,5,6,7,8,9,10}'::integer[])) AND ((desc)::text = 'test'::text))
Planning time: 0.602 ms
Execution time: 3463.440 ms
After several rounds of trial and error we landed on a materialized view for this query. The query was scanning 500k+ rows and neither indexes nor partitioning was helping, so we tweaked the query above into a materialized view and now select from that instead. We are now at 96 ms. The generalized materialized view for the query in this question is shown below.
CREATE MATERIALIZED VIEW performance.daily_trends
TABLESPACE pg_default
AS SELECT s.id,
d.parent_id,
d.child_id,
d.desc,
s.start_date,
s.end_date,
count(*) FILTER (WHERE s.overall_status::text = 'Fail'::text) AS fail_count,
count(*) FILTER (WHERE s.overall_status::text = 'Warning'::text) AS warning_count,
count(*) FILTER (WHERE s.overall_status::text = 'Pass'::text) AS pass_count,
avg(s.score) AS average_score
FROM performance.tenant_based s
JOIN performance.hierarchy d ON s.id::text = d.id::text AND s.tag = d.tag::text
WHERE s.start_date >= (CURRENT_DATE - 45) AND s.start_date <= CURRENT_DATE
GROUP BY s.id, d.parent_id, d.child_id, d.desc, s.start_date, s.end_date
WITH DATA;
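For completeness, a rough sketch of how such a view can be queried and kept up to date; the filters mirror the original query, the unique index name is made up, and the refresh strategy is an assumption rather than our exact setup:
-- Serve the trend query from the pre-aggregated rows
SELECT start_date, end_date, average_score, fail_count, warning_count, pass_count
FROM performance.daily_trends
WHERE id = 'ita68f0c03880e4c6694859dfa74f1cdf6'
  AND parent_id IN (0,1,2,3,4,5,6,7,8,9,10)
  AND child_id IN (0,1,2,3,4,5,6,7,8,9,10)
  AND "desc" = 'test'
  AND start_date >= '2021-01-18' AND start_date <= '2021-02-12';

-- REFRESH ... CONCURRENTLY needs a unique index over the grouping columns
CREATE UNIQUE INDEX daily_trends_uq
    ON performance.daily_trends (id, parent_id, child_id, "desc", start_date, end_date);

REFRESH MATERIALIZED VIEW CONCURRENTLY performance.daily_trends;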
Thanks to all who tried to help with this.
I have a query that is very fast with a wide date filter:
EXPLAIN ANALYZE
SELECT "advertisings"."id",
"advertisings"."page_id",
"advertisings"."page_name",
"advertisings"."created_at",
"posts"."image_url",
"posts"."thumbnail_url",
"posts"."post_content",
"posts"."like_count"
FROM "advertisings"
INNER JOIN "posts" ON "advertisings"."post_id" = "posts"."id"
WHERE "advertisings"."created_at" >= '2020-01-01T00:00:00Z'
AND "advertisings"."created_at" < '2020-12-02T23:59:59Z'
ORDER BY "like_count" DESC LIMIT 20
And the query plan is:
Limit (cost=0.85..20.13 rows=20 width=552) (actual time=0.026..0.173 rows=20 loops=1)
-> Nested Loop (cost=0.85..951662.55 rows=987279 width=552) (actual time=0.025..0.169 rows=20 loops=1)
-> Index Scan using posts_like_count_idx on posts (cost=0.43..378991.65 rows=1053015 width=504) (actual time=0.013..0.039 rows=20 loops=1)
-> Index Scan using advertisings_post_id_index on advertisings (cost=0.43..0.53 rows=1 width=52) (actual time=0.005..0.006 rows=1 loops=20)
Index Cond: (post_id = posts.id)
Filter: ((created_at >= '2020-01-01 00:00:00'::timestamp without time zone) AND (created_at < '2020-12-02 23:59:59'::timestamp without time zone))
Planning Time: 0.365 ms
Execution Time: 0.199 ms
However, when I narrow the filter (changing it to "created_at" >= '2020-11-25T00:00:00Z'), which returns 9 records (fewer than the limit of 20), the query is very slow:
EXPLAIN ANALYZE
SELECT "advertisings"."id",
"advertisings"."page_id",
"advertisings"."page_name",
"advertisings"."created_at",
"posts"."image_url",
"posts"."thumbnail_url",
"posts"."post_content",
"posts"."like_count"
FROM "advertisings"
INNER JOIN "posts" ON "advertisings"."post_id" = "posts"."id"
WHERE "advertisings"."created_at" >= '2020-11-25T00:00:00Z'
AND "advertisings"."created_at" < '2020-12-02T23:59:59Z'
ORDER BY "like_count" DESC LIMIT 20
Query plan:
Limit (cost=1000.88..8051.73 rows=20 width=552) (actual time=218.485..4155.336 rows=9 loops=1)
-> Gather Merge (cost=1000.88..612662.09 rows=1735 width=552) (actual time=218.483..4155.328 rows=9 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=0.85..611461.80 rows=723 width=552) (actual time=118.170..3786.176 rows=3 loops=3)
-> Parallel Index Scan using posts_like_count_idx on posts (cost=0.43..372849.07 rows=438756 width=504) (actual time=0.024..1542.094 rows=351005 loops=3)
-> Index Scan using advertisings_post_id_index on advertisings (cost=0.43..0.53 rows=1 width=52) (actual time=0.006..0.006 rows=0 loops=1053015)
Index Cond: (post_id = posts.id)
Filter: ((created_at >= '2020-11-25 00:00:00'::timestamp without time zone) AND (created_at < '2020-12-02 23:59:59'::timestamp without time zone))
Rows Removed by Filter: 1
Planning Time: 0.394 ms
Execution Time: 4155.379 ms
I spent hours googling but couldn't find the right solution. Any help would be greatly appreciated.
Updated
When I continue narrowing the filter to
WHERE "advertisings"."created_at" >= '2020-11-27T00:00:00Z'
AND "advertisings"."created_at" < '2020-12-02T23:59:59Z'
which also returns the same 9 records as the slow query above. However, this time, the query is really fast again:
Limit (cost=8082.99..8083.04 rows=20 width=552) (actual time=0.062..0.065 rows=9 loops=1)
-> Sort (cost=8082.99..8085.40 rows=962 width=552) (actual time=0.061..0.062 rows=9 loops=1)
Sort Key: posts.like_count DESC
Sort Method: quicksort Memory: 32kB
-> Nested Loop (cost=0.85..8057.39 rows=962 width=552) (actual time=0.019..0.047 rows=9 loops=1)
-> Index Scan using advertisings_created_at_index on advertisings (cost=0.43..501.30 rows=962 width=52) (actual time=0.008..0.012 rows=9 loops=1)
Index Cond: ((created_at >= '2020-11-27 00:00:00'::timestamp without time zone) AND (created_at < '2020-12-02 23:59:59'::timestamp without time zone))
-> Index Scan using posts_pkey on posts (cost=0.43..7.85 rows=1 width=504) (actual time=0.003..0.003 rows=1 loops=9)
Index Cond: (id = advertisings.post_id)
Planning Time: 0.540 ms
Execution Time: 0.096 ms
I have no idea what is happening.
PostgreSQL follows two different strategies: one in the first two queries, another in the last one:
If there are many matching advertisings rows, it uses a nested loop join to fetch the rows in the order of the ORDER BY clause and discards rows that don't match the condition until it has found 20.
If there are few matching advertisings rows, it fetches those few rows, then the matching rows in posts, then sorts and takes the first 20 rows.
The second execution is slow because PostgreSQL overestimates the rows in advertisings that match the condition. See how it estimates 962 instead of 9 in the third query?
The solution is to improve PostgreSQL's estimate:
if running
ANALYZE advertisings;
is enough to make the slow query fast, tell PostgreSQL to collect statistics more often:
ALTER TABLE advertisings SET (autovacuum_analyze_scale_factor = 0.05);
if that is not enough, try collecting more detailed statistics:
SET default_statistics_target = 1000;
ANALYZE advertisings;
You can experiment with values up to 10000. Once you have found the value that works, persist it:
ALTER TABLE advertisings ALTER created_at SET STATISTICS 1000;
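To check whether the estimate has actually improved, you can re-run EXPLAIN on the narrowed date range from the question and compare the planner's row estimate with the 9 rows that actually match; something along these lines:
-- After ANALYZE, the estimated row count for this range should drop
-- from the hundreds toward the actual 9 rows
EXPLAIN
SELECT id
FROM advertisings
WHERE created_at >= '2020-11-25T00:00:00Z'
  AND created_at <  '2020-12-02T23:59:59Z';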
I'm using the Django ORM with select_related, and it generates a query of the form:
SELECT *
FROM "coupons_coupon"
LEFT OUTER JOIN "coupons_merchant"
ON ("coupons_coupon"."merchant_id" = "coupons_merchant"."slug")
WHERE ("coupons_coupon"."end_date" > '2020-07-10T09:10:28.101980+00:00'::timestamptz AND "coupons_coupon"."published" = true)
ORDER BY "coupons_coupon"."end_date" ASC, "coupons_coupon"."id"
LIMIT 5;
Which is then executed using the following plan:
Limit (cost=4363.28..4363.30 rows=5 width=604) (actual time=21.864..21.865 rows=5 loops=1)
-> Sort (cost=4363.28..4373.34 rows=4022 width=604) (actual time=21.863..21.863 rows=5 loops=1)
Sort Key: coupons_coupon.end_date, coupons_coupon.id
Sort Method: top-N heapsort Memory: 32kB
-> Hash Left Join (cost=2613.51..4296.48 rows=4022 width=604) (actual time=13.918..20.209 rows=4022 loops=1)
Hash Cond: ((coupons_coupon.merchant_id)::text = (coupons_merchant.slug)::text)
-> Seq Scan on coupons_coupon (cost=0.00..291.41 rows=4022 width=261) (actual time=0.007..1.110 rows=4022 loops=1)
Filter: (published AND (end_date > '2020-07-10 09:10:28.10198+00'::timestamp with time zone))
Rows Removed by Filter: 1691
-> Hash (cost=1204.56..1204.56 rows=24956 width=331) (actual time=13.894..13.894 rows=23911 loops=1)
Buckets: 16384 Batches: 4 Memory Usage: 1948kB
-> Seq Scan on coupons_merchant (cost=0.00..1204.56 rows=24956 width=331) (actual time=0.003..4.681 rows=23911 loops=1)
This is a bad execution plan, as the join could be done after the left table has been filtered, ordered, and limited. When I remove id from the ORDER BY, it generates an efficient plan, which basically could have been used for the previous query as well.
Limit (cost=0.57..8.84 rows=5 width=600) (actual time=0.013..0.029 rows=5 loops=1)
-> Nested Loop Left Join (cost=0.57..6650.48 rows=4022 width=600) (actual time=0.012..0.028 rows=5 loops=1)
-> Index Scan using coupons_cou_end_dat_a8d5b7_btree on coupons_coupon (cost=0.28..1015.77 rows=4022 width=261) (actual time=0.007..0.010 rows=5 loops=1)
Index Cond: (end_date > '2020-07-10 09:10:28.10198+00'::timestamp with time zone)
Filter: published
-> Index Scan using coupons_merchant_pkey on coupons_merchant (cost=0.29..1.40 rows=1 width=331) (actual time=0.003..0.003 rows=1 loops=5)
Index Cond: ((slug)::text = (coupons_coupon.merchant_id)::text)
Why is this happening? Can the optimizer be nudged to use a similar plan for the former query?
I'm using postgres 12.
v13 of PostgreSQL, which should be released in the next few months, implements incremental sorting: it can read rows in a pre-sorted order based on prefix columns, then sort just the ties on those prefix column(s) by the remaining column(s), giving a complete sort based on more columns than the index provides. I think that will do more or less what you want.
Limit (cost=2.46..2.99 rows=5 width=21)
-> Incremental Sort (cost=2.46..405.58 rows=3850 width=21)
Sort Key: coupons_coupon.end_date, coupons_coupon.id
Presorted Key: coupons_coupon.end_date
-> Nested Loop Left Join (cost=0.31..253.48 rows=3850 width=21)
-> Index Scan using coupons_coupon_end_date_idx on coupons_coupon (cost=0.15..54.71 rows=302 width=17)
Index Cond: (end_date > '2020-07-10 05:10:28.10198-04'::timestamp with time zone)
Filter: published
-> Index Only Scan using coupons_merchant_slug_idx on coupons_merchant (cost=0.15..0.53 rows=13 width=4)
Index Cond: (slug = coupons_coupon.merchant_id)
Of course, just adding "id" to the current index will work under currently released versions, and even under version 13 it should be more efficient to have the index fully order the rows the way you need them.
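A minimal sketch of that two-column index, assuming the existing coupons_cou_end_dat_a8d5b7_btree index covers only end_date (the new index name is made up):
-- Matches the ORDER BY (end_date, id), so the LIMIT can stop early
CREATE INDEX coupons_coupon_end_date_id_idx
    ON coupons_coupon (end_date, id);
With both end_date and id in the index, the planner can return rows already in ORDER BY order and stop as soon as the limit of 5 is satisfied.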
I have a table with about 50 million records in PostgreSQL. I'm trying to select the posts with the most "likes", filtering by a "tag". Both fields have a b-tree index. For the "love" tag I get:
EXPLAIN analyse select user_id from posts where tags #> array['love'] order by likes desc nulls last limit 12
Limit (cost=0.57..218.52 rows=12 width=12) (actual time=2.658..14.243 rows=12 loops=1)
-> Index Scan using idx_likes on posts (cost=0.57..55759782.55 rows=3070010 width=12) (actual time=2.657..14.239 rows=12 loops=1)
Filter: (tags #> '{love}'::text[])
Rows Removed by Filter: 10584
Planning time: 0.297 ms
Execution time: 14.276 ms
14 ms is great, but if I try to get it for "tamir", it suddenly takes over 22 seconds!! Obviously the query planner is doing something wrong.
EXPLAIN analyse select user_id from posts where tags #> array['tamir'] order by likes desc nulls last limit 12
Limit (cost=0.57..25747.73 rows=12 width=12) (actual time=17552.406..22839.503 rows=12 loops=1)
-> Index Scan using idx_likes on posts (cost=0.57..55759782.55 rows=25988 width=12) (actual time=17552.405..22839.484 rows=12 loops=1)
Filter: (tags #> '{tamir}'::text[])
Rows Removed by Filter: 11785083
Planning time: 0.253 ms
Execution time: 22839.569 ms
After reading the article I added "user_id" to the ORDER BY, and "tamir" is blazingly fast: 0.2 ms! Now it does a sort and a Bitmap Heap Scan instead of an Index Scan.
EXPLAIN analyse select user_id from posts where tags #> array['tamir'] order by likes desc nulls last, user_id limit 12
Limit (cost=101566.17..101566.20 rows=12 width=12) (actual time=0.237..0.238 rows=12 loops=1)
-> Sort (cost=101566.17..101631.14 rows=25988 width=12) (actual time=0.237..0.237 rows=12 loops=1)
Sort Key: likes DESC NULLS LAST, user_id
Sort Method: top-N heapsort Memory: 25kB
-> Bitmap Heap Scan on posts (cost=265.40..100970.40 rows=25988 width=12) (actual time=0.074..0.214 rows=126 loops=1)
Recheck Cond: (tags #> '{tamir}'::text[])
Heap Blocks: exact=44
-> Bitmap Index Scan on idx_tags (cost=0.00..258.91 rows=25988 width=0) (actual time=0.056..0.056 rows=126 loops=1)
Index Cond: (tags #> '{tamir}'::text[])
Planning time: 0.287 ms
Execution time: 0.277 ms
But what happens to "love"? Now it goes from 14 ms to 2.3 seconds...
EXPLAIN analyse select user_id from posts where tags #> array['love'] order by likes desc nulls last, user_id limit 12
Limit (cost=7347142.18..7347142.21 rows=12 width=12) (actual time=2360.784..2360.786 rows=12 loops=1)
-> Sort (cost=7347142.18..7354817.20 rows=3070010 width=12) (actual time=2360.783..2360.784 rows=12 loops=1)
Sort Key: likes DESC NULLS LAST, user_id
Sort Method: top-N heapsort Memory: 25kB
-> Bitmap Heap Scan on posts (cost=28316.58..7276762.77 rows=3070010 width=12) (actual time=595.274..2171.571 rows=1517679 loops=1)
Recheck Cond: (tags #> '{love}'::text[])
Heap Blocks: exact=642705
-> Bitmap Index Scan on idx_tags (cost=0.00..27549.08 rows=3070010 width=0) (actual time=367.080..367.080 rows=1517679 loops=1)
Index Cond: (tags #> '{love}'::text[])
Planning time: 0.226 ms
Execution time: 2360.863 ms
Can somebody shed some light on why this is happening and what the fix would be?
Update
"tag" field had gin index, not b-tree, just typo.
B-tree indexes are not very useful for searching for an element in an array field. You should remove the b-tree index from the tags field and use a GIN index instead:
drop index idx_tags;
create index idx_tags on posts using gin(tags);
And don't add order by user_id: it prevents the planner from using your idx_likes for ordering when there are a lot of rows with the tag you search for.
Also, the likes field should probably be not null default 0.
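If you go that route, a possible sequence, assuming existing NULLs should simply be backfilled to 0, is:
-- Backfill NULLs so the NOT NULL constraint can be added
UPDATE posts SET likes = 0 WHERE likes IS NULL;

ALTER TABLE posts
    ALTER COLUMN likes SET DEFAULT 0,
    ALTER COLUMN likes SET NOT NULL;
Once likes can no longer be NULL, the NULLS LAST in your ORDER BY is no longer needed.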