PostgreSQL: Drop in performance with multiple running functions without hardware bottleneck

I have a simple function; if I run it, it takes around 40 seconds to finish:
select * from f_cyklus1(100000000)
but if I run this function 8 times in 8 separate sessions, meaning all 8 functions run in parallel, each instance takes around 210 to 260 seconds to finish, which is a massive drop in performance. I tried compiling it as 8 individual functions and running those instead, but there was no change in performance.
select * from f_cyklus1(100000000);
select * from f_cyklus2(100000000);
select * from f_cyklus3(100000000);
select * from f_cyklus4(100000000);
select * from f_cyklus5(100000000);
select * from f_cyklus6(100000000);
select * from f_cyklus7(100000000);
select * from f_cyklus8(100000000);
So why does it take 210-260s instead of 40s to finish? Our virtual machine has 16 CPUs and the physical hardware was at low usage. I was also the only one using the PostgreSQL database at the time of testing.
create or replace function f_cyklus1 (p_rozsah int) returns bigint as -- drop function f_cyklus1(int)
$body$
declare
    v_exc_context TEXT;
    v_result INTEGER;
    p_soucet bigint := 0;
begin
    for i in 0..p_rozsah
    loop
        p_soucet = p_soucet + i;
    end loop;
    return p_soucet;
EXCEPTION
    WHEN OTHERS THEN
        GET STACKED DIAGNOSTICS v_exc_context = PG_EXCEPTION_CONTEXT;
        PERFORM main.ut_log('ERR', SQLERRM || ' [SQL State: ' || SQLSTATE || '] Context: ' || v_exc_context);
        RAISE;
END;
$body$ LANGUAGE plpgsql;
PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-39), 64-bit
Virtual machine: CentOS 7 + KVM
HW: 2x AMD EPYC 7351 + 256 GB RAM
Note: I already asked a similar question where I thought the slowdown was due to asynchronous processing, but this test shows the problem is actually in raw Postgres performance, so I deleted my former question and asked this new one.

p_soucet = p_soucet + i;
Each time you do this, it has to acquire a "snapshot" in which to run the statement, because PL/pgSQL uses the regular SQL engine behind the scenes, and that engine always needs to run within a snapshot. Acquiring a snapshot requires a system-wide lock. The more processes you have running simultaneously, the more time they spend fighting over snapshot acquisition rather than doing useful work.
If you run the function in a transaction which is set to "repeatable read", you will find they scale better because they keep the same snapshot for the duration and keep re-using it. Of course that might interfere with your real use case.
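For example, a minimal sketch of the repeatable-read approach (the function call itself is unchanged):
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM f_cyklus1(100000000);
COMMIT;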
plpgsql is not really well suited for this kind of work, scaling aside. You can use one of the other pl languages, like plperl or plpythonu.
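For instance, a rough plpythonu equivalent of f_cyklus1 might look like this (a sketch only; it assumes the plpythonu extension is installed, and the function name is just illustrative):
CREATE OR REPLACE FUNCTION f_cyklus1_py (p_rozsah int) RETURNS bigint AS
$body$
    # The whole loop runs inside the Python interpreter,
    # so no per-statement snapshot has to be taken.
    s = 0
    i = 0
    while i <= p_rozsah:
        s += i
        i += 1
    return s
$body$ LANGUAGE plpythonu;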
How expressions are evaluated by the main SQL engine is described at https://www.postgresql.org/docs/current/plpgsql-expressions.html
Snapshots are discussed in general in the docs starting at https://www.postgresql.org/docs/current/mvcc.html
I am not aware that the interaction between the two is documented anywhere for end users.

Related

How do I chain a VACUUM off of a purge routine running with pg_cron?

Postgres 13.4
I've got some pg_cron jobs set up to periodically delete older records out of log-like tables. What I'd like to do is run VACUUM ANALYZE after performing a purge. Unfortunately, I can't work out how to do this in a stored function. Am I missing a trick? Is a stored procedure more appropriate?
As an example, here's one of my purge routines
CREATE OR REPLACE FUNCTION dba.purge_event_log (
    retain_days_in integer_positive default 14)
RETURNS int4
AS $BODY$
    WITH -- Use a CTE so that we've got a way of returning the count easily.
    deleted AS (
        -- Normal-looking code for this requires a literal:
        --     where your_dts < now() - INTERVAL '14 days'
        -- Don't want to use a literal, SQL injection, etc.
        -- Instead, using an interval constructor to achieve the same result:
        DELETE
        FROM dba.event_log
        WHERE dts < now() - make_interval (days => $1)
        RETURNING *
    ),
    ----------------------------------------
    -- Save details to a custom log table
    ----------------------------------------
    logit AS (
        insert into dba.event_log (name, details)
        values ('purge_event_log(' || retain_days_in::text || ')',
                'count = ' || (select count(*)::text from deleted)
               )
    )
    ----------------------------------------
    -- Return result count
    ----------------------------------------
    select count(*) from deleted;
$BODY$
LANGUAGE sql;
COMMENT ON FUNCTION dba.purge_event_log (integer_positive) IS
'Delete dba.event_log records older than the day count passed in, with a default retention period of 14 days.';
The truth is, I don't really care about the count(*) result from this routine in this case, but I might want a result and an additional action in some other, similar context. As you can see, the routine deletes records, uses a CTE to insert a report into another table, and then returns a result. No matter what, I figure this example is a good way to get my head around the alternatives and options in stored functions. The main thing I want to achieve here is to delete records and then run maintenance. If this is an awkward fit for a stored function or procedure, I could write out an entry to a vacuum_list table with the table name, and have another job run through that list.
If there's a smarter way to approach vacuum without the extra step, I'm of course interested in that. But I'm also interested in understanding the limits on what operations you can combine in PL/pgSQL routines.
Pavel Stehule's answer is correct and complete. I decided to follow up a bit here, as I like to dig in on bugs in my code, behaviors in Postgres, etc. to get a better sense of what I'm dealing with. I'm including some notes below for anyone who finds them of use.
COMMAND cannot be executed...
The reference to "VACUUM cannot be executed inside a transaction block" gave me a better way to search the docs for similarly restricted commands. The information below probably doesn't cover everything, but it's a start.
Command                  Limitation
CREATE DATABASE
ALTER DATABASE           If creating a new tablespace.
DROP DATABASE
CLUSTER                  Without any parameters.
CREATE TABLESPACE
DROP TABLESPACE
REINDEX                  All in system catalogs, database, or schema.
CREATE SUBSCRIPTION      When creating a replication slot (the default behavior).
ALTER SUBSCRIPTION       With refresh option as true.
DROP SUBSCRIPTION        If the subscription is associated with a replication slot.
COMMIT PREPARED
ROLLBACK PREPARED
DISCARD ALL
VACUUM
The accepted answer indicates that the limitation has nothing to do with the specific server-side language used. I've just come across an older thread that has some excellent explanations and links for stored functions and transactions:
Do stored procedures run in database transaction in Postgres?
Sample Code
I also wondered about stored procedures, as they're allowed to control transactions. I tried them out in PG 13 and, no, the code is treated like a stored function, down to the error messages.
For anyone who goes in for this sort of thing, here are "hello world" samples of SQL and PL/PgSQL stored functions and procedures to test out how VACUUM behaves in these cases. Spoiler: it doesn't work, as advertised.
SQL Function
/*
select * from dba.vacuum_sql_function();
Fails:
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL function "vacuum_sql_function" statement 1. 0.000 seconds. (Line 13).
*/
DROP FUNCTION IF EXISTS dba.vacuum_sql_function();
CREATE FUNCTION dba.vacuum_sql_function()
RETURNS VOID
LANGUAGE sql
AS $sql_code$
VACUUM ANALYZE activity;
$sql_code$;
select * from dba.vacuum_sql_function(); -- Fails.
PL/PgSQL Function
/*
select * from dba.vacuum_plpgsql_function();
Fails:
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL statement "VACUUM ANALYZE activity"
PL/pgSQL function vacuum_plpgsql_function() line 4 at SQL statement. 0.000 seconds. (Line 22).
*/
DROP FUNCTION IF EXISTS dba.vacuum_plpgsql_function();
CREATE FUNCTION dba.vacuum_plpgsql_function()
RETURNS VOID
LANGUAGE plpgsql
AS $plpgsql_code$
BEGIN
VACUUM ANALYZE activity;
END
$plpgsql_code$;
select * from dba.vacuum_plpgsql_function();
SQL Procedure
/*
call dba.vacuum_sql_procedure();
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL function "vacuum_sql_procedure" statement 1. 0.000 seconds. (Line 20).
*/
DROP PROCEDURE IF EXISTS dba.vacuum_sql_procedure();
CREATE PROCEDURE dba.vacuum_sql_procedure()
LANGUAGE SQL
AS $sql_code$
VACUUM ANALYZE activity;
$sql_code$;
call dba.vacuum_sql_procedure();
PL/PgSQL Procedure
/*
call dba.vacuum_plpgsql_procedure();
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL statement "VACUUM ANALYZE activity"
PL/pgSQL function vacuum_plpgsql_procedure() line 4 at SQL statement. 0.000 seconds. (Line 23).
*/
DROP PROCEDURE IF EXISTS dba.vacuum_plpgsql_procedure();
CREATE PROCEDURE dba.vacuum_plpgsql_procedure()
LANGUAGE plpgsql
AS $plpgsql_code$
BEGIN
VACUUM ANALYZE activity;
END
$plpgsql_code$;
call dba.vacuum_plpgsql_procedure();
Other Options
Plenty. As I understand it, VACUUM and a handful of other commands are not supported in server-side code running within Postgres. Therefore, your code needs to start from somewhere else. That can be:
Whatever cron you've got in your server's OS.
Any external client you like.
pg_cron (see the scheduling sketch below).
As we're deployed on RDS, those last two options are where I'll look. And there's one more:
Let AUTOVACUUM and an occasional manual VACUUM do their thing.
That's pretty easy to do, and seems to work fine for the bulk of our needs.
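For the pg_cron route, the vacuum can simply be scheduled as its own job a bit after the purge job. A minimal sketch, assuming the pg_cron extension is installed in this database, with purely illustrative schedules:
SELECT cron.schedule('15 3 * * *', $$SELECT dba.purge_event_log(14)$$);
SELECT cron.schedule('30 3 * * *', $$VACUUM ANALYZE dba.event_log$$);
Since pg_cron submits each command as a top-level statement, the "cannot be executed from a function" restriction doesn't apply here.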
Another Idea
If you do want a bit more control and some custom logging, I'm imagining a table like this:
CREATE TABLE IF NOT EXISTS dba.vacuum_list (
    database_name text,
    schema_name text,
    table_name text,
    run boolean,
    run_analyze boolean,
    run_full boolean,
    last_run_dts timestamp);

ALTER TABLE dba.vacuum_list ADD CONSTRAINT
    vacuum_list_pk
    PRIMARY KEY (database_name, schema_name, table_name);
That's just a sketch. The idea is like this:
You INSERT into vacuum_list when a table needs some vacuuming, at least as far as you're concerned.
In my case, that would be an UPSERT as I don't need a full log-like table, just a single row per table of interest with the last outcome and/or pending state.
Periodically, a remote client, etc. connects, reads the table, and executes each specified VACUUM, according to the options specified in the record.
The external client updates the row with the last run timestamp, and whatever else you're including in the row.
Optionally, you could include fields for duration and change in relation size pre:post vacuuming.
That last option is what I'm interested in. None of our VACUUM calls were working for quite some time because there was a months-old dead connection from something server-side. VACUUM appears to run fine in such a case, it just can't delete a whole lot of rows (because of the very old "open" transaction ID, visibility maps, etc.). The only way to see this sort of thing seems to be to run VACUUM VERBOSE and study the output, or to record vacuum time and, more importantly, relation size change, to flag cases where nothing seems to happen when it seems like it should.
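To make the remote-client step concrete, here's a rough sketch of the sort of query the client could run to build the VACUUM commands from dba.vacuum_list (column names as defined above; the boolean run column marks pending entries):
SELECT format('VACUUM%s%s %I.%I;',
              CASE WHEN run_full    THEN ' FULL'    ELSE '' END,
              CASE WHEN run_analyze THEN ' ANALYZE' ELSE '' END,
              schema_name, table_name) AS vacuum_command
FROM dba.vacuum_list
WHERE run;
The client then executes each generated command as a top-level statement and updates the corresponding row (run = false, last_run_dts = now(), and so on).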
VACUUM is "top level" command. It cannot be executed from PL/pgSQL ever or from any other PL.

Is there a postgres function to mutably update a binary data structure?

I have written an AGGREGATE function that approximates a SELECT COUNT(DISTINCT ...) over a UUID column, a kind of poor man's HyperLogLog (and having different perf characteristics).
However, it is very slow because I am using set_bit on a BIT and that has copy-on-write semantics.
So my question is:
is there a way to in-place / mutably update a BIT or bytea?
failing that, are there any binary data structures that allow mutable/in-place set_bit edits?
A constraint is that I can't push C code or extensions to implement this, but I can use extensions that are available in AWS RDS Postgres. If it's not faster than HLL then I'll just use HLL. Note that HLL is optimised for pre-aggregated counts; it isn't terribly fast at doing ad hoc count estimates over millions of rows (although still faster than a raw COUNT DISTINCT).
Below is the code for context, probably buggy too:
CREATE OR REPLACE FUNCTION uuid_approx_count_distinct_sfunc (BIT(83886080), UUID)
RETURNS BIT(83886080) AS $$
DECLARE
    s BIT(83886080) := $1;
BEGIN
    IF s IS NULL THEN s := '0' :: BIT(83886080); END IF;
    RETURN set_bit(s, abs(mod(uuid_hash($2), 83886080)), 1);
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION uuid_approx_count_distinct_ffunc (BIT(83886080))
RETURNS INTEGER AS $$
DECLARE
    i INTEGER := 83886079;
    s INTEGER := 0;
BEGIN
    LOOP
        EXIT WHEN i < 0;
        IF get_bit($1, i) = 1 THEN s := s + 1; END IF;
        i := i - 1;
    END LOOP;
    RETURN s;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE AGGREGATE approx_count_distinct (UUID) (
    SFUNC = uuid_approx_count_distinct_sfunc,
    FINALFUNC = uuid_approx_count_distinct_ffunc,
    STYPE = BIT(83886080),
    COMBINEFUNC = bitor,
    PARALLEL = SAFE
);
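For context, the aggregate is meant to be called like any built-in aggregate (the table and column names here are made up):
SELECT approx_count_distinct(user_uuid) AS approx_users
FROM events;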
Yeah, SQL isn't actually that fast for raw computation. I might try a UDF, perhaps pljava or plv8 (JavaScript), which compile just-in-time to native code and are available on most major hosting providers. Of course, use C (perhaps via LLVM) for maximum performance at maximum pain. Plv8 should take minutes to prototype: just pass an array constructed from array_agg(). Obviously keep the array size to millions of items, or find a way to roll up your sketches (bitwise-AND?).
https://plv8.github.io/#function-calls
https://www.postgresqltutorial.com/postgresql-aggregate-functions/postgresql-array_agg-function/
FYI HyperLogLog is available as an open source extension for PostgreSQL from Citus/Microsoft, and of course available on Azure. https://www.google.com/search?q=hyperloglog+postgres
(You could crib from their code and just change the core algorithm, then test side by side.) Citus is pretty easy to install, so this isn't a bad option.
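For comparison, an ad hoc estimate with the hll extension looks roughly like this (a sketch only; it assumes the extension is available and uses a made-up events(user_uuid uuid) table):
CREATE EXTENSION IF NOT EXISTS hll;

SELECT hll_cardinality(hll_add_agg(hll_hash_text(user_uuid::text))) AS approx_users
FROM events;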

FOR LOOP without a transaction

We are doing a system redesign and, due to the change in design, we need to import data from multiple similar source tables into one table. For this, I am running a loop over the list of tables and importing all the data. However, due to the massive amount of data, I got an out of memory error after around 12 hours of execution and 20 tables. I have now discovered that the loop runs in a single transaction, which I don't need, since the system that fills the data is suspended for that time. I believe this single transaction also makes it take longer. My requirement is to run my queries without any transaction.
DO $$
DECLARE r record;
BEGIN
    FOR r IN SELECT '
        INSERT INTO dbo.tb_requests
            (node_request_id, request_type, id, process_id, data, timestamp_d1, timestamp_d2, create_time, is_processed)
        SELECT lpad(A._id, 32, ''0'')::UUID, (A.data_type + 1) request_type, B.id, B.order_id, data_value, timestamp_d1, timestamp_d2, create_time, TRUE
        FROM dbo.data_store_' || id || ' A
        JOIN dbo.tb_new_processes B
            ON A.process_id = B.process_id
        WHERE A._id != ''0'';
        ' AS log_query
    FROM dbo.list_table
    ORDER BY line_id
    LOOP
        EXECUTE r.log_query;
    END LOOP;
END$$;
This is a sample code block, not the actual code, but I think it gives the idea.
Error Message(Translation from Original Japanese error Message):
ERROR: Out of memory
DETAIL: Request for size 32 failed in memory context "ExprContext".
SQL state: 53200
You cannot run any statement on the server side without a transaction. In newer Postgres releases (11 and later) you can run a COMMIT statement inside a DO statement. It closes the current transaction and starts a new one. This breaks up the very long transaction, and it can solve the memory problem, because Postgres releases some memory at transaction end.
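With that approach, the original loop might look something like this (a sketch only; it requires PostgreSQL 11 or later, and the DO statement must not itself be run inside an outer transaction block):
DO $$
DECLARE r record;
BEGIN
    FOR r IN SELECT '
        INSERT INTO dbo.tb_requests
            (node_request_id, request_type, id, process_id, data, timestamp_d1, timestamp_d2, create_time, is_processed)
        SELECT lpad(A._id, 32, ''0'')::UUID, (A.data_type + 1) request_type, B.id, B.order_id, data_value, timestamp_d1, timestamp_d2, create_time, TRUE
        FROM dbo.data_store_' || id || ' A
        JOIN dbo.tb_new_processes B ON A.process_id = B.process_id
        WHERE A._id != ''0'';
        ' AS log_query
    FROM dbo.list_table
    ORDER BY line_id
    LOOP
        EXECUTE r.log_query;
        COMMIT;  -- ends the transaction for this table and starts a new one;
                 -- the FOR loop's cursor is automatically converted to a holdable cursor
    END LOOP;
END$$;
Each iteration then runs in its own transaction, so memory tied to the transaction is released after every table.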
Or use shell scripts (bash) instead, if that is possible.

Make postgres ignore statement that current version does not support

Background
A service runs the same tasks against a number of similar PostgreSQL instances. Most environments are on version 10, but some are on 9. Upgrading them is not an option, in the short term at least.
Problem
To improve performance, we used the PostgreSQL 10 feature CREATE STATISTICS. It works just fine in the environments on v10, but is not supported on v9.
One way to deal with it could be to duplicate each script that uses CREATE STATISTICS, maintain a copy without those statements, and choose which script to run at the application level. I'd like to avoid that, as it's a lot of duplicated code to maintain.
I've tried to cheat by only creating the statistics if the script finds an appropriate server version (code below), but on v9 it still gets picked up as a syntax error.
DO $$
-- server_version_num is a version number melted to an integer:
--  9.6.6 -> 09.06.06 ->  90606
-- 10.0.1 -> 10.00.01 -> 100001
DECLARE pg_version int := current_setting('server_version_num');
BEGIN
    IF pg_version >= 100000 THEN
        CREATE STATISTICS table_1_related_col_group_a (NDISTINCT)
            ON col_a1, col_a2
            FROM schema_1.table_1;
        CREATE STATISTICS table_2_related_col_group_b (NDISTINCT)
            ON col_b1, col_b2, col_b3
            FROM schema_1.table_2;
    END IF;
END $$ LANGUAGE plpgsql;
Question
Is there a way to run a script that has an unsupported statement like CREATE STATISTICS in it without tipping postgres 9 off?
Use dynamic SQL. The statement inside the string won't be parsed unless it is actually executed.
DO $$
-- server_version_num is a version number melted to an integer:
--  9.6.6 -> 09.06.06 ->  90606
-- 10.0.1 -> 10.00.01 -> 100001
DECLARE pg_version int := current_setting('server_version_num');
BEGIN
    IF pg_version >= 100000 THEN
        EXECUTE 'CREATE STATISTICS table_1_related_col_group_a (NDISTINCT)
                     ON col_a1, col_a2
                     FROM schema_1.table_1;
                 CREATE STATISTICS table_2_related_col_group_b (NDISTINCT)
                     ON col_b1, col_b2, col_b3
                     FROM schema_1.table_2;';
    END IF;
END $$ LANGUAGE plpgsql;

Why is PostgreSQL 9.2 slower than 9.1 in my tests?

I have upgraded from PostgreSQL 9.1.5 to 9.2.1:
"PostgreSQL 9.1.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit"
"PostgreSQL 9.2.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit"
It is on the same machine with default PostgreSQL configuration files (only port was changed).
For testing purpose I have simple table:
CREATE TEMP TABLE test_table_md_speed(id serial primary key, n integer);
Which I test using function:
CREATE OR REPLACE FUNCTION TEST_DB_SPEED(cnt integer) RETURNS text AS $$
DECLARE
    time_start timestamp;
    time_stop timestamp;
    time_total interval;
BEGIN
    time_start := cast(timeofday() AS TIMESTAMP);
    FOR i IN 1..cnt LOOP
        INSERT INTO test_table_md_speed(n) VALUES (i);
    END LOOP;
    time_stop := cast(timeofday() AS TIMESTAMP);
    time_total := time_stop - time_start;
    RETURN extract (milliseconds from time_total);
END;
$$ LANGUAGE plpgsql;
And I call:
SELECT test_db_speed(1000000);
I see strange results. For PostgreSQL 9.1.5 I get "8254.769", and for 9.2.1 I get "9022.219". This means the new version is slower, and I cannot figure out why.
Any ideas why those results differ?
You say both are on the same machine. Presumably the data files for the newer version were added later. Later files tend to be added closer to the center of the platter, where access speeds are slower.
There is a good section on this in Greg Smith's book on PostgreSQL performance, including ways to measure and graph the effect. With clever use of the dd utility you might be able to do some ad hoc tests of the relative speed at each location, at least for reads.
The 9.2 release generally scales up to a large number of cores better than earlier versions, although in some of the benchmarks there was a very slight reduction in the performance of a single query running alone. I didn't see any benchmarks showing an effect anywhere near this big, though; I would bet on it being the result of position on the drive -- which just goes to show how hard it can be to do good benchmarking.
UPDATE: A change made in 9.2.0 to improve performance for some queries made some other queries perform worse. Eventually it was determined that this change should be reverted, which happened in version 9.2.3; so it is worth checking performance after upgrading to that maintenance release. A proper fix, which has been confirmed to fix the problem the reverted patch fixed without causing a regression, will be included in 9.3.0.