Postgres function: large table update and deletion performance - postgresql
postgres version: 9.3
postgresql.conf: all default settings
I have two tables, A and B; both have 1 million rows.
A Postgres function runs every 2 seconds. It updates the rows of table A whose ids are in an array (array size = 20), and then deletes the matching rows from table B.
The function is defined as follows:
CREATE OR REPLACE FUNCTION test_function (ids NUMERIC[])
RETURNS void AS $$
BEGIN
UPDATE A a
SET status = 'begin', end_time = (NOW() AT TIME ZONE 'UTC')
WHERE a.id = ANY (ids);
DELETE FROM B b
WHERE b.aid = ANY (ids)
AND b.status = 'end';
END;
$$ LANGUAGE plpgsql;
EXPLAIN ANALYZE output:
explain(ANALYZE,BUFFERS,VERBOSE) select test_function('{2,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}');
QUERY PLAN
Result (cost=0.00..0.26 rows=1 width=0) (actual time=14030.435..14030.436 rows=1 loops=1)
Output: test_function('{2,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}'::numeric[])
Buffers: shared hit=24297 read=26137 dirtied=20
Total runtime: 14030.444 ms
(4 rows)
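Note that EXPLAIN on the function call only shows the outer Result node; the statements inside the PL/pgSQL body are opaque to it. To see the nested plans, one option (a sketch; auto_explain is a contrib module shipped with 9.3) is to enable auto_explain with nested-statement logging:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;        -- log every statement
SET auto_explain.log_analyze = on;            -- include actual run times
SET auto_explain.log_nested_statements = on;  -- include statements inside functions
SET client_min_messages = log;                -- show the log lines in this session
SELECT test_function('{2,3,4,5}');            -- the nested UPDATE/DELETE plans are now logged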
My questions are:
In the production environment, why does this function take up to 7 seconds to complete?
While this function is executing, the process eats up to 60% CPU. This is the key problem.
EDIT:
EXPLAIN ANALYZE for each individual statement:
explain(ANALYZE,VERBOSE,BUFFERS) UPDATE A a SET status = 'begin',
end_time = (now()) WHERE a.id = ANY
('{2,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}');
QUERY PLAN
Update on public.A a (cost=0.45..99.31 rows=20 width=143) (actual time=1.206..1.206 rows=0 loops=1)
Buffers: shared hit=206 read=27 dirtied=30
-> Index Scan using A_pkey on public.a a (cost=0.45..99.31 rows=20 width=143) (actual time=0.019..0.116 rows=19 loops=1)
Output: id, start_time, now(), 'begin'::character varying(255), xxxx... ctid
Index Cond: (a.id = ANY('{2,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}'::integer[]))
Buffers: shared hit=75 read=11
Trigger test_trigger: time=5227.111 calls=1
Total runtime: 5228.357 ms
(8 rows)
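Nearly all of the runtime (5227 ms of the 5228 ms total) is spent in test_trigger, not in the UPDATE itself, whose index scan takes about a millisecond. As a first step (a sketch; it assumes the table is named a in lowercase), the triggers on table A and the functions they call can be listed from the catalogs:

-- list user triggers firing on table A and their trigger functions
SELECT t.tgname, p.proname, pg_get_triggerdef(t.oid) AS definition
FROM pg_trigger t
JOIN pg_proc p ON p.oid = t.tgfoid
WHERE t.tgrelid = 'a'::regclass
  AND NOT t.tgisinternal;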
explain(ANALYZE,BUFFERS,VERBOSE) DELETE FROM
B b WHERE b.aid = ANY
('{2,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}');
QUERY PLAN
Delete on B b (cost=0.00..1239.11 rows=20 width=6) (actual time=6.013..6.013 rows=0 loops=1)
Buffers: shared hit=448
-> Seq Scan on B b (cost=0.00..1239.11 rows=20 width=6) (actual time=6.011..6.011 rows=0 loops=1)
Output: ctid
Filter: (b.aid = ANY ('{2,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}'::bigint[]))
Rows Removed by Filter: 21743
Buffers: shared hit=448
Total runtime: 6.029 ms
(8 rows)
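The DELETE falls back to a sequential scan and discards 21743 rows, which suggests B.aid is not indexed. Under that assumption (a sketch; the index names are illustrative, and note the function's DELETE also tests status = 'end'), an index would turn this into an index scan:

CREATE INDEX b_aid_idx ON B (aid);
-- or, matching the function's predicate more closely:
CREATE INDEX b_aid_status_idx ON B (aid, status);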
CPU usage (screenshots omitted): one taken before calling the function, one after frequent executions.
Related
Update statement on PostgreSQL not using primary key index to update
I have a stored procedure on PostgreSQL like this:

create or replace procedure new_emp_sp (f_name varchar, l_name varchar, age integer, threshold integer, dept varchar)
language plpgsql
as $$
declare
    new_emp_count integer;
begin
    INSERT INTO employees (id, first_name, last_name, age)
    VALUES (nextval('emp_id_seq'), random_string(10), random_string(20), age);

    select count(*) into new_emp_count from employees where age > threshold;

    update dept_employees set emp_count = new_emp_count where id = dept;
end;
$$

I have enabled the auto_explain module and set log_min_duration to 0 so that it logs everything. I have an issue with the update statement in the procedure. From the auto_explain logs I see that it is not using the primary key index to update the table:

-> Seq Scan on dept_employees (cost=0.00..1.05 rows=1 width=14) (actual time=0.005..0.006 rows=1 loops=1)
     Filter: ((id)::text = 'ABC'::text)
     Rows Removed by Filter: 3

This worked as expected until a couple of hours ago, when I used to get a log like this:

-> Index Scan using dept_employees_pkey on dept_employees (cost=0.15..8.17 rows=1 width=48) (actual time=0.010..0.011 rows=1 loops=1)
     Index Cond: ((id)::text = 'ABC'::text)

Without the procedure, if I run the statement standalone like this:

explain analyze update dept_employees set emp_count = 123 where id = 'ABC';

the statement correctly uses the primary key index:

Update on dept_employees (cost=0.15..8.17 rows=1 width=128) (actual time=0.049..0.049 rows=0 loops=1)
  -> Index Scan using dept_employees_pkey on dept_employees (cost=0.15..8.17 rows=1 width=128) (actual time=0.035..0.036 rows=1 loops=1)
       Index Cond: ((id)::text = 'ABC'::text)

I can't figure out what has gone wrong, especially because it worked perfectly just a couple of hours ago.
It is faster to scan N rows sequentially than to scan N rows using an index, so for small tables Postgres may decide that a sequential scan is faster than an index scan. PL/pgSQL can cache prepared statements and execution plans, so you are probably getting a cached execution plan from when the table was smaller.
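A quick way to test the cached-plan theory (an editorial sketch, not part of the original answer) is to invalidate the cache and re-run the procedure: reconnecting starts with an empty plan cache, ANALYZE refreshes the statistics and invalidates plans that depend on the table, and DISCARD PLANS drops every cached plan in the session. Alternatively, dynamic SQL in PL/pgSQL is re-planned on each execution:

ANALYZE dept_employees;   -- refresh statistics; also invalidates dependent cached plans
DISCARD PLANS;            -- or drop all cached plans in the current session

-- inside the procedure: dynamic SQL gets a fresh plan on every call
EXECUTE 'update dept_employees set emp_count = $1 where id = $2'
    USING new_emp_count, dept;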
Postgres function slower than same ad hoc query
I have had several cases where a Postgres function that returns a table result from a query is much slower than running the actual query. Why is that? This is one example; I've found that the function is slower than just the query in many cases.

create function trending_names(date_start timestamp with time zone, date_end timestamp with time zone, gender_filter character, country_filter text)
    returns TABLE(name_id integer, gender character, country text, score bigint, rank bigint)
    language sql
as $$
select u.name_id, n.gender, u.country, count(u.rank) as score,
       row_number() over (order by count(u.rank) desc) as rank
from babynames.user_scores u
inner join babynames.names n on u.name_id = n.id
where u.created_at between date_start and date_end
  and u.rank > 0
  and n.gender = gender_filter
  and u.country = country_filter
group by u.name_id, n.gender, u.country
$$;

This is the query plan for a select from the function:

Function Scan on trending_names (cost=0.25..10.25 rows=1000 width=84) (actual time=1118.673..1118.861 rows=2238 loops=1)
  Buffers: shared hit=216509 read=29837
Planning Time: 0.078 ms
Execution Time: 1119.083 ms

Query plan from just running the query, which takes less than half the time:

WindowAgg (cost=44834.98..45593.32 rows=43334 width=25) (actual time=383.387..385.223 rows=2238 loops=1)
  Buffers: shared hit=100446 read=50220
  -> Sort (cost=44834.98..44943.31 rows=43334 width=17) (actual time=383.375..383.546 rows=2238 loops=1)
       Sort Key: (count(u.rank)) DESC
       Sort Method: quicksort Memory: 271kB
       Buffers: shared hit=100446 read=50220
       -> HashAggregate (cost=41064.22..41497.56 rows=43334 width=17) (actual time=381.088..381.906 rows=2238 loops=1)
            Group Key: u.name_id, u.country, n.gender
            Buffers: shared hit=100446 read=50220
            -> Hash Join (cost=5352.15..40630.88 rows=43334 width=13) (actual time=60.710..352.646 rows=36271 loops=1)
                 Hash Cond: (u.name_id = n.id)
                 Buffers: shared hit=100446 read=50220
                 -> Index Scan using user_scores_rank_ix on user_scores u (cost=0.43..35077.55 rows=76796 width=11) (actual time=24.193..287.393 rows=69770 loops=1)
                      Index Cond: (rank > 0)
                      Filter: ((created_at >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (country = 'sv'::text) AND (created_at <= now()))
                      Rows Removed by Filter: 106521
                      Buffers: shared hit=99417 read=46856
                 -> Hash (cost=5005.89..5005.89 rows=27667 width=6) (actual time=36.420..36.420 rows=27472 loops=1)
                      Buckets: 32768 Batches: 1 Memory Usage: 1330kB
                      Buffers: shared hit=1029 read=3364
                      -> Seq Scan on names n (cost=0.00..5005.89 rows=27667 width=6) (actual time=0.022..24.447 rows=27472 loops=1)
                           Filter: (gender = 'f'::bpchar)
                           Rows Removed by Filter: 21559
                           Buffers: shared hit=1029 read=3364
Planning Time: 2.512 ms
Execution Time: 387.403 ms

I'm also confused about why it does a Seq Scan on names n in the last step, since names.id is the primary key and gender is indexed.
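To reproduce what the function's planner sees (an editorial sketch using the schema from the question), the query can be prepared with parameters and the generic plan forced for comparison; plan_cache_mode requires PostgreSQL 12 or later:

PREPARE trending(timestamp with time zone, timestamp with time zone, character, text) AS
select u.name_id, n.gender, u.country, count(u.rank) as score,
       row_number() over (order by count(u.rank) desc) as rank
from babynames.user_scores u
inner join babynames.names n on u.name_id = n.id
where u.created_at between $1 and $2
  and u.rank > 0
  and n.gender = $3
  and u.country = $4
group by u.name_id, n.gender, u.country;

SET plan_cache_mode = force_generic_plan;  -- mimic a cached generic plan
EXPLAIN (ANALYZE, BUFFERS) EXECUTE trending('2021-01-01', now(), 'f', 'sv');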
PostgreSQL slow order
I have a table (over 100 million records) on PostgreSQL 13.1:

CREATE TABLE report
(
    id serial primary key,
    license_plate_id integer,
    datetime timestamp
);

Indexes (for the test I created both of them):

create index report_lp_datetime_index on report (license_plate_id, datetime);
create index report_lp_datetime_desc_index on report (license_plate_id desc, datetime desc);

So, my question is why a query like

select * from report r
where r.license_plate_id in (1,2,4,5,6,7,8,10,15,22,34,75)
order by datetime desc
limit 100

is very slow (~10 sec), while the query without the ORDER BY is fast (milliseconds). Explain:

explain (analyze, buffers, format text)
select * from report r
where r.license_plate_id in (1,2,4,5,6,7,8,10,15,22,34,75,374,57123)
limit 100

Limit (cost=0.57..400.38 rows=100 width=316) (actual time=0.037..0.216 rows=100 loops=1)
  Buffers: shared hit=103
  -> Index Scan using report_lp_id_idx on report r (cost=0.57..44986.97 rows=11252 width=316) (actual time=0.035..0.202 rows=100 loops=1)
       Index Cond: (license_plate_id = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75,374,57123}'::integer[]))
       Buffers: shared hit=103
Planning Time: 0.228 ms
Execution Time: 0.251 ms

explain (analyze, buffers, format text)
select * from report r
where r.license_plate_id in (1,2,4,5,6,7,8,10,15,22,34,75,374,57123)
order by datetime desc
limit 100

Limit (cost=44193.63..44193.88 rows=100 width=316) (actual time=4921.030..4921.047 rows=100 loops=1)
  Buffers: shared hit=11455 read=671
  -> Sort (cost=44193.63..44221.76 rows=11252 width=316) (actual time=4921.028..4921.035 rows=100 loops=1)
       Sort Key: datetime DESC
       Sort Method: top-N heapsort Memory: 128kB
       Buffers: shared hit=11455 read=671
       -> Bitmap Heap Scan on report r (cost=151.18..43763.59 rows=11252 width=316) (actual time=54.422..4911.927 rows=12148 loops=1)
            Recheck Cond: (license_plate_id = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75,374,57123}'::integer[]))
            Heap Blocks: exact=12063
            Buffers: shared hit=11455 read=671
            -> Bitmap Index Scan on report_lp_id_idx (cost=0.00..148.37 rows=11252 width=0) (actual time=52.631..52.632 rows=12148 loops=1)
                 Index Cond: (license_plate_id = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75,374,57123}'::integer[]))
                 Buffers: shared hit=59 read=4
Planning Time: 0.427 ms
Execution Time: 4921.128 ms
You seem to have rather slow storage, if reading 671 8kB blocks from disk takes a couple of seconds. The way to speed this up is to reorder the table in the same way as the index, so that you can find the required rows in the same or adjacent table blocks:

CLUSTER report USING report_lp_id_idx;

Be warned that rewriting the table in this way causes downtime: the table will not be available while it is being rewritten. Moreover, PostgreSQL does not maintain the table order, so subsequent data modifications will cause performance to gradually deteriorate, and after a while you will have to run CLUSTER again. But if you need this query to be fast no matter what, CLUSTER is the way to go.
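Before committing to CLUSTER, it may be worth checking how far the physical row order already diverges from the index order (an editorial aside; pg_stats is a standard system view). A correlation near 1 or -1 means the heap is already ordered like the column; near 0 means the rows are scattered and CLUSTER would help this access pattern:

SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'report' AND attname = 'license_plate_id';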
Your two indexes do exactly the same thing, so you can remove the second one; it's useless. To optimize your query, the order of the columns inside the index must be reversed:

create index report_lp_datetime_index on report (datetime, license_plate_id);

Test setup:

BEGIN;
CREATE TABLE foo (d INTEGER, i INTEGER);
INSERT INTO foo SELECT random()*100000, random()*1000 FROM generate_series(1,1000000) s;
CREATE INDEX foo_d_i ON foo(d DESC, i);
COMMIT;
VACUUM ANALYZE foo;

EXPLAIN ANALYZE SELECT * FROM foo WHERE i IN (1,2,4,5,6,7,8,10,15,22,34,75) ORDER BY d DESC LIMIT 100;

Limit (cost=0.42..343.92 rows=100 width=8) (actual time=0.076..9.359 rows=100 loops=1)
  -> Index Only Scan Backward using foo_d_i on foo (cost=0.42..40976.43 rows=11929 width=8) (actual time=0.075..9.339 rows=100 loops=1)
       Filter: (i = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75}'::integer[]))
       Rows Removed by Filter: 9016
       Heap Fetches: 0
Planning Time: 0.339 ms
Execution Time: 9.387 ms

Note the index is not used to optimize the WHERE clause. It is used here as a compact and fast way to store references to the rows ordered by date DESC, so the ORDER BY can do an index-only scan and avoid sorting. By adding column i to the index, an index-only scan can be performed to test the condition on i without hitting the table for every row. Since there is a low LIMIT value, it does not need to scan the whole index; it only scans it in date DESC order until it finds enough rows satisfying the WHERE condition to return the result. It will be faster if you create the index in date DESC order, which could be useful if you use ORDER BY date DESC + LIMIT in other queries too.

(Comment) You forget that the OP's table has a third column, and he is using SELECT *. So that wouldn't be an index-only scan.

(Reply) Easy to work around. The optimum way to do this query would be an index-only scan to filter on the WHERE conditions, then LIMIT, then hit the table to get the rows. For some reason, if SELECT * is used, Postgres takes the id column from the table instead of taking it from the index, which results in lots of unnecessary heap fetches for rows whose id is rejected by the WHERE condition. Easy to work around by doing it manually. I've also added another bogus column to make sure the SELECT * hits the table.

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM foo
JOIN (SELECT d, i FROM foo WHERE i IN (1,2,4,5,6,7,8,10,15,22,34,75) ORDER BY d DESC LIMIT 100) f USING (d, i)
ORDER BY d DESC LIMIT 100;

Limit (cost=0.85..1281.94 rows=1 width=17) (actual time=0.052..3.618 rows=100 loops=1)
  Buffers: shared hit=453
  -> Nested Loop (cost=0.85..1281.94 rows=1 width=17) (actual time=0.050..3.594 rows=100 loops=1)
       Buffers: shared hit=453
       -> Limit (cost=0.42..435.44 rows=100 width=8) (actual time=0.037..2.953 rows=100 loops=1)
            Buffers: shared hit=53
            -> Index Only Scan using foo_d_i on foo foo_1 (cost=0.42..51936.43 rows=11939 width=8) (actual time=0.037..2.935 rows=100 loops=1)
                 Filter: (i = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75}'::integer[]))
                 Rows Removed by Filter: 9010
                 Heap Fetches: 0
                 Buffers: shared hit=53
       -> Index Scan using foo_d_i on foo (cost=0.42..8.45 rows=1 width=17) (actual time=0.005..0.005 rows=1 loops=100)
            Index Cond: ((d = foo_1.d) AND (i = foo_1.i))
            Buffers: shared hit=400
Execution Time: 3.663 ms

Another option is to just add the primary key to the (date, license_plate_id) index.
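The follow-up below references an index foo_d_i_id that is never defined in the answer; presumably (an assumption inferred from the Index Only Scan node in the next plan) the test table gained a primary key and the index was extended with it:

ALTER TABLE foo ADD COLUMN id serial PRIMARY KEY;  -- foo had no id column in the original setup
CREATE INDEX foo_d_i_id ON foo (d DESC, i, id);    -- assumed definition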
SELECT * FROM foo
JOIN (SELECT id FROM foo WHERE i IN (1,2,4,5,6,7,8,10,15,22,34,75) ORDER BY d DESC LIMIT 100) f USING (id)
ORDER BY d DESC LIMIT 100;

Limit (cost=1357.98..1358.23 rows=100 width=17) (actual time=3.920..3.947 rows=100 loops=1)
  Buffers: shared hit=473
  -> Sort (cost=1357.98..1358.23 rows=100 width=17) (actual time=3.919..3.931 rows=100 loops=1)
       Sort Key: foo.d DESC
       Sort Method: quicksort Memory: 32kB
       Buffers: shared hit=473
       -> Nested Loop (cost=0.85..1354.66 rows=100 width=17) (actual time=0.055..3.858 rows=100 loops=1)
            Buffers: shared hit=473
            -> Limit (cost=0.42..509.41 rows=100 width=8) (actual time=0.039..3.116 rows=100 loops=1)
                 Buffers: shared hit=73
                 -> Index Only Scan using foo_d_i_id on foo foo_1 (cost=0.42..60768.43 rows=11939 width=8) (actual time=0.039..3.093 rows=100 loops=1)
                      Filter: (i = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75}'::integer[]))
                      Rows Removed by Filter: 9010
                      Heap Fetches: 0
                      Buffers: shared hit=73
            -> Index Scan using foo_pkey on foo (cost=0.42..8.44 rows=1 width=17) (actual time=0.006..0.006 rows=1 loops=100)
                 Index Cond: (id = foo_1.id)
                 Buffers: shared hit=400
Execution Time: 3.972 ms

Edit: After thinking about it... since the LIMIT restricts the output to 100 rows ordered by date desc, wouldn't it be nice if we could get the 100 most recent rows for each license_plate_id, put all that into a top-N sort, and only keep the best 100 across all license_plate_ids? That would avoid reading and throwing away a lot of rows from the index. Even if that's much faster than hitting the table, it will still load up these index pages in RAM and clog up your buffers with stuff you don't actually need to keep in cache. Let's use a LATERAL JOIN:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM foo
JOIN (
    SELECT d, i
    FROM (VALUES (1),(2),(4),(5),(6),(7),(8),(10),(15),(22),(34),(75)) idlist
    CROSS JOIN LATERAL (
        SELECT d, i FROM foo WHERE i = idlist.column1 ORDER BY d DESC LIMIT 100
    ) f2
    ORDER BY d DESC LIMIT 100
) f3 USING (d, i)
ORDER BY d DESC LIMIT 100;

It's even faster: 2 ms, and it uses the index on (license_plate_id, date) instead of the other way around. Also, and this is important, each subquery in the lateral join hits only the index pages that contain rows that will actually be selected, while the previous queries hit many more index pages, so you save on RAM buffers. If you don't need the index on (date, license_plate_id) and don't want to keep a useless index, that could be interesting, since this query doesn't use it. On the other hand, if you need the index on (date, license_plate_id) for something else and want to keep it, then... maybe not.

Please post results for the winning query 🔥
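Translated back to the table from the question (an editorial sketch; it reuses the OP's existing (license_plate_id, datetime) index, and like the USING (d, i) join above it assumes the join columns identify the rows uniquely enough), the lateral-join approach would look like:

SELECT r.*
FROM report r
JOIN (
    SELECT ids.license_plate_id, t.datetime
    FROM (VALUES (1),(2),(4),(5),(6),(7),(8),(10),(15),(22),(34),(75)) ids(license_plate_id)
    CROSS JOIN LATERAL (
        -- per plate: walk the (license_plate_id, datetime) index backward
        SELECT datetime
        FROM report
        WHERE license_plate_id = ids.license_plate_id
        ORDER BY datetime DESC
        LIMIT 100
    ) t
    ORDER BY t.datetime DESC
    LIMIT 100
) top USING (license_plate_id, datetime)
ORDER BY datetime DESC
LIMIT 100;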
Postgres optimization failing to filter window function partitions early
In some cases, PostgreSQL does not filter out window function partitions until they are calculated, while in a very similar scenario it filters rows before performing the window function calculation.

Tables used for a minimal reproduction: log is the main data table; each row contains either an increment or an absolute value, and an absolute value resets the current counter with a new base value. The window function needs to process all logs for a given account_id to calculate the correct running total. The view uses a subquery to ensure that the underlying log rows are not filtered by ts; otherwise this would break the window function.

CREATE TABLE account(
    id serial,
    name VARCHAR(100)
);

CREATE TABLE log(
    id serial,
    absolute int,
    incremental int,
    account_id int,
    ts timestamp,
    PRIMARY KEY(id),
    CONSTRAINT fk_account FOREIGN KEY(account_id) REFERENCES account(id)
);

CREATE FUNCTION get_running_total_func(
    aggregated_total int,
    absolute int,
    incremental int
) RETURNS int
LANGUAGE sql IMMUTABLE CALLED ON NULL INPUT
AS $$
    SELECT CASE
        WHEN absolute IS NOT NULL THEN absolute
        ELSE COALESCE(aggregated_total, 0) + incremental
    END
$$;

CREATE AGGREGATE get_running_total(integer, integer) (
    sfunc = get_running_total_func,
    stype = integer
);

Slow view:

CREATE VIEW test_view (log_id, running_value, account_id, ts) AS
SELECT log_running.*
FROM (SELECT log.id,
             get_running_total(log.absolute, log.incremental) OVER(
                 PARTITION BY log.account_id
                 ORDER BY log.ts
                 RANGE UNBOUNDED PRECEDING
             ),
             account.id,
             ts
      FROM log log
      JOIN account account ON log.account_id = account.id
     ) AS log_running;

postgres=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE account_id=1;
QUERY PLAN

Subquery Scan on log_running (cost=12734.02..15981.48 rows=1 width=20) (actual time=7510.851..16122.404 rows=20 loops=1)
  Filter: (log_running.id_1 = 1)
  Rows Removed by Filter: 99902
  -> WindowAgg (cost=12734.02..14732.46 rows=99922 width=32) (actual time=7510.830..14438.783 rows=99922 loops=1)
       -> Sort (cost=12734.02..12983.82 rows=99922 width=28) (actual time=7510.628..9312.399 rows=99922 loops=1)
            Sort Key: log.account_id, log.ts
            Sort Method: external merge Disk: 3328kB
            -> Hash Join (cost=143.50..2042.24 rows=99922 width=28) (actual time=169.941..5431.650 rows=99922 loops=1)
                 Hash Cond: (log.account_id = account.id)
                 -> Seq Scan on log (cost=0.00..1636.22 rows=99922 width=24) (actual time=0.063..1697.802 rows=99922 loops=1)
                 -> Hash (cost=81.00..81.00 rows=5000 width=4) (actual time=169.837..169.865 rows=5000 loops=1)
                      Buckets: 8192 Batches: 1 Memory Usage: 240kB
                      -> Seq Scan on account (cost=0.00..81.00 rows=5000 width=4) (actual time=0.017..84.639 rows=5000 loops=1)
Planning Time: 0.199 ms
Execution Time: 16127.275 ms
(15 rows)

Fast view; the only change is account.id -> log.account_id (!):

CREATE VIEW test_view (log_id, running_value, account_id, ts) AS
SELECT log_running.*
FROM (SELECT log.id,
             get_running_total(log.absolute, log.incremental) OVER(
                 PARTITION BY log.account_id
                 ORDER BY log.ts
                 RANGE UNBOUNDED PRECEDING
             ),
             log.account_id,
             ts
      FROM log log
      JOIN account account ON log.account_id = account.id
     ) AS log_running;

postgres=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE account_id=1;
QUERY PLAN

Subquery Scan on log_running (cost=1894.96..1895.56 rows=20 width=20) (actual time=34.718..45.958 rows=20 loops=1)
  -> WindowAgg (cost=1894.96..1895.36 rows=20 width=28) (actual time=34.691..45.307 rows=20 loops=1)
       -> Sort (cost=1894.96..1895.01 rows=20 width=24) (actual time=34.367..35.925 rows=20 loops=1)
            Sort Key: log.ts
            Sort Method: quicksort Memory: 26kB
            -> Nested Loop (cost=0.28..1894.53 rows=20 width=24) (actual time=0.542..34.066 rows=20 loops=1)
                 -> Index Only Scan using account_pkey on account (cost=0.28..8.30 rows=1 width=4) (actual time=0.025..0.054 rows=1 loops=1)
                      Index Cond: (id = 1)
                      Heap Fetches: 1
                 -> Seq Scan on log (cost=0.00..1886.03 rows=20 width=24) (actual time=0.195..32.937 rows=20 loops=1)
                      Filter: (account_id = 1)
                      Rows Removed by Filter: 99902
Planning Time: 0.297 ms
Execution Time: 47.300 ms
(14 rows)

Is this a bug in the PostgreSQL implementation? It seems that this change in the view definition shouldn't affect performance at all; PostgreSQL should be able to filter the data before applying the window function to the whole data set.
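An editorial note on what the two plans illustrate: for a subquery containing window functions, PostgreSQL only pushes an outer WHERE clause down into the subquery when it filters on columns that appear in the window's PARTITION BY clause, and it tracks this per output column. The slow view exposes account.id, which the planner does not recognize as the partition key log.account_id, so the filter can only run after the WindowAgg. The rule can be seen in isolation (a sketch against the log table above):

-- filter on the PARTITION BY column: pushed down, rows are filtered before the window
EXPLAIN SELECT * FROM (
    SELECT account_id, sum(incremental) OVER (PARTITION BY account_id ORDER BY ts) AS total
    FROM log
) s WHERE account_id = 1;

-- filter on a non-partition column: applied only after the WindowAgg over all rows
EXPLAIN SELECT * FROM (
    SELECT id, account_id, sum(incremental) OVER (PARTITION BY account_id ORDER BY ts) AS total
    FROM log
) s WHERE id = 1;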
Generic execution plan in query language (SQL) function
I am trying to figure out an issue related to poor performance in a parameterized SQL function. The documentation is relatively clear about execution plans in PL/pgSQL functions: the query planner plans the execution for the first 5 calls, and may cache the plan after that within the same session. However, I have been unable to find similar documentation for SQL functions, and from some investigating it appears that the query planner surprisingly always uses a generic plan. Here's a simple example:

-- Postgres 11.1
-- Test table and some data
create table test (a int);
insert into test select 1 from generate_series(1,1000000);
insert into test values (2);
create index on test(a);
analyze test;

-- A SQL function
create function test_f(val int, cnt out bigint) AS $$
    SELECT count(*) FROM test where a = val;
$$ LANGUAGE SQL;

-- A similar PL/pgSQL function
create function test_f_plpgsql (val int, cnt out bigint) AS $$
BEGIN
    SELECT count(*) FROM test where a = val INTO cnt;
END
$$ LANGUAGE PLPGSQL;

-- Show plans
LOAD 'auto_explain';
SET auto_explain.log_analyze = on;
SET client_min_messages = log;
SET auto_explain.log_nested_statements = on;
SET auto_explain.log_min_duration = 0;

The PL/pgSQL function uses a non-generic plan and knows to use an index-only scan:

select * from test_f_plpgsql(2);

LOG: duration: 0.326 ms plan:
Query Text: SELECT count(*) FROM test where a = val
Aggregate (cost=4.45..4.46 rows=1 width=8) (actual time=0.180..0.202 rows=1 loops=1)
  -> Index Only Scan using test_a_idx on test (cost=0.42..4.44 rows=1 width=0) (actual time=0.096..0.135 rows=1 loops=1)
       Index Cond: (a = 2)
       Heap Fetches: 1
LOG: duration: 1.250 ms plan:
Query Text: select * from test_f_plpgsql(2);
Function Scan on test_f_plpgsql (cost=0.25..0.26 rows=1 width=8) (actual time=1.116..1.152 rows=1 loops=1)

 cnt
-----
   1
(1 row)

The SQL function, on the other hand, uses a generic plan, which poorly chooses a full table scan:

select * from test_f(2);

LOG: duration: 18.716 ms plan:
Query Text: SELECT count(*) FROM test where a = val;
Partial Aggregate (cost=10675.01..10675.02 rows=1 width=8) (actual time=18.639..18.665 rows=1 loops=1)
  -> Parallel Seq Scan on test (cost=0.00..9633.34 rows=416667 width=0) (actual time=18.621..18.628 rows=0 loops=1)
       Filter: (a = $1)
       Rows Removed by Filter: 273008
LOG: duration: 28.304 ms plan:
Query Text: SELECT count(*) FROM test where a = val;
Partial Aggregate (cost=10675.01..10675.02 rows=1 width=8) (actual time=28.234..28.248 rows=1 loops=1)
  -> Parallel Seq Scan on test (cost=0.00..9633.34 rows=416667 width=0) (actual time=28.199..28.208 rows=1 loops=1)
       Filter: (a = $1)
       Rows Removed by Filter: 129222
LOG: duration: 45.913 ms plan:
Query Text: SELECT count(*) FROM test where a = val;
Finalize Aggregate (cost=11675.22..11675.23 rows=1 width=8) (actual time=42.370..42.377 rows=1 loops=1)
  -> Gather (cost=11675.01..11675.22 rows=2 width=8) (actual time=42.288..45.787 rows=3 loops=1)
       Workers Planned: 2
       Workers Launched: 2
       -> Partial Aggregate (cost=10675.01..10675.02 rows=1 width=8) (actual time=29.579..29.597 rows=1 loops=3)
            -> Parallel Seq Scan on test (cost=0.00..9633.34 rows=416667 width=0) (actual time=29.553..29.560 rows=0 loops=3)
                 Filter: (a = $1)
                 Rows Removed by Filter: 333333
LOG: duration: 47.128 ms plan:
Query Text: select * from test_f(2);
Function Scan on test_f (cost=0.25..0.26 rows=1 width=8) (actual time=47.058..47.073 rows=1 loops=1)

 cnt
-----
   1
(1 row)

Is there a way to force the SQL function to use a non-generic plan?
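An editorial aside: the PL/pgSQL variant above already shows one practical workaround, since PL/pgSQL runs custom plans for the first few executions and only switches to a generic plan if it does not look worse. If a custom plan is wanted on every call (a hedged sketch; the function name is illustrative), dynamic SQL in PL/pgSQL is re-planned at each execution with the actual parameter value visible to the planner:

-- re-planned on every call, so the planner always sees the current value of val
create function test_f_dynamic(val int, cnt out bigint) AS $$
BEGIN
    EXECUTE 'SELECT count(*) FROM test WHERE a = $1' INTO cnt USING val;
END
$$ LANGUAGE plpgsql;

select * from test_f_dynamic(2);  -- should log an index-only scan, like the static PL/pgSQL version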
I am trying to figure out an issue related to poor performance in a parameterized SQL function. The documentation is relatively clear about execution plans in PL/PGSQL functions - the query planner plans the executions the first 5 times and then may cache the plan after that within the same session. However, I have been unable to find similar documentation related to SQL functions and, from some investigating, it appears that the query planner surprisingly always uses a generic plan. Here's a simple example: -- Postgres 11.1 -- Test table and some data create table test (a int); insert into test select 1 from generate_series(1,1000000); insert into test values (2); create index on test(a); analyze test; -- A SQL Function create function test_f(val int, cnt out bigint) AS $$ SELECT count(*) FROM test where a = val; $$ LANGUAGE SQL; -- A similar PL/PGSQL Function create function test_f_plpgsql (val int, cnt out bigint) AS $$ BEGIN SELECT count(*) FROM test where a = val INTO cnt; END $$ LANGUAGE PLPGSQL; -- Show plans LOAD 'auto_explain'; SET auto_explain.log_analyze = on; SET client_min_messages = log; SET auto_explain.log_nested_statements = on; SET auto_explain.log_min_duration = 0; The PL/PGSQL function uses a non-generic plan, and knows to use an index-only scan. select * from test_f_plpgsql(2); LOG: duration: 0.326 ms plan: Query Text: SELECT count(*) FROM test where a = val Aggregate (cost=4.45..4.46 rows=1 width=8) (actual time=0.180..0.202 rows=1 loops=1) -> Index Only Scan using test_a_idx on test (cost=0.42..4.44 rows=1 width=0) (actual time=0.096..0.135 rows=1 loops=1) Index Cond: (a = 2) Heap Fetches: 1 LOG: duration: 1.250 ms plan: Query Text: select * from test_f_plpgsql(2); Function Scan on test_f_plpgsql (cost=0.25..0.26 rows=1 width=8) (actual time=1.116..1.152 rows=1 loops=1) cnt ----- 1 (1 row) The SQL function, on the other hand, uses a generic plan, which poorly chooses a full table scan. select * from test_f(2); LOG: duration: 18.716 ms plan: Query Text: SELECT count(*) FROM test where a = val; Partial Aggregate (cost=10675.01..10675.02 rows=1 width=8) (actual time=18.639..18.665 rows=1 loops=1) -> Parallel Seq Scan on test (cost=0.00..9633.34 rows=416667 width=0) (actual time=18.621..18.628 rows=0 loops=1) Filter: (a = $1) Rows Removed by Filter: 273008 LOG: duration: 28.304 ms plan: Query Text: SELECT count(*) FROM test where a = val; Partial Aggregate (cost=10675.01..10675.02 rows=1 width=8) (actual time=28.234..28.248 rows=1 loops=1) -> Parallel Seq Scan on test (cost=0.00..9633.34 rows=416667 width=0) (actual time=28.199..28.208 rows=1 loops=1) Filter: (a = $1) Rows Removed by Filter: 129222 LOG: duration: 45.913 ms plan: Query Text: SELECT count(*) FROM test where a = val; Finalize Aggregate (cost=11675.22..11675.23 rows=1 width=8) (actual time=42.370..42.377 rows=1 loops=1) -> Gather (cost=11675.01..11675.22 rows=2 width=8) (actual time=42.288..45.787 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (cost=10675.01..10675.02 rows=1 width=8) (actual time=29.579..29.597 rows=1 loops=3) -> Parallel Seq Scan on test (cost=0.00..9633.34 rows=416667 width=0) (actual time=29.553..29.560 rows=0 loops=3) Filter: (a = $1) Rows Removed by Filter: 333333 LOG: duration: 47.128 ms plan: Query Text: select * from test_f(2); Function Scan on test_f (cost=0.25..0.26 rows=1 width=8) (actual time=47.058..47.073 rows=1 loops=1) cnt ----- 1 (1 row) Is there a way to force the SQL function to use a non-generic plan?