postgres create temporary table in function is slow

postgres create temporary table in function is slow - postgresql

I am facing a strange problem. in PostgreSQL, I want to create a temporary table in a function. Using below statement to create:
CREATE TEMPORARY TABLE mytemp ON COMMIT DROP
AS SELECT empid, sal, myfunction(empid,sal) FROM emp;
When I run this by itself, it takes 28 seconds. But when I put it into function code, it takes 3 minutes. I have verified the timing by displaying clock_timestamp() before and after the creation statement. The part of the function is:
begin
/*..... other code lines...*/
v_select:='select empid, sal, myfunction(empid,sal) from emp';
raise notice '%','time - '||clock_timestamp();
execute 'create temporary table mytemp on commit drop as '||v_select;
raise notice '%','time - '||clock_timestamp();
/*..... other code lines...*/
end;
it takes 3 minutes to execute. Please help me, I am keen to know the reason for such a difference.

Related

How do I chain a VACUUM off of a purge routine running with pg_cron?

Postgres 13.4
I've got some pg_cron jobs set up to periodically delete older records out of log-like files. What I'd like to do is to run VACUUM ANALYZE after performing a purge. Unfortunately, I can't work out how to do this in a stored function. Am I missing a trick? Is a stored procedure more appropriate?
As an example, here's one of my purge routines
CREATE OR REPLACE FUNCTION dba.purge_event_log (
retain_days_in integer_positive default 14)
RETURNS int4
AS $BODY$
WITH -- Use a CTE so that we've got a way of returning the count easily.
deleted AS (
-- Normal-looking code for this requires a literal:
-- where your_dts < now() - INTERVAL '14 days'
-- Don't want to use a literal, SQL injection, etc.
-- Instead, using a interval constructor to achieve the same result:
DELETE
FROM dba.event_log
WHERE dts < now() - make_interval (days => $1)
RETURNING *
),
----------------------------------------
-- Save details to a custom log table
----------------------------------------
logit AS (
insert into dba.event_log (name, details)
values ('purge_event_log(' || retain_days_in::text || ')',
'count = ' || (select count(*)::text from deleted)
)
)
----------------------------------------
-- Return result count
----------------------------------------
select count(*) from deleted;
$BODY$
LANGUAGE sql;
COMMENT ON FUNCTION dba.purge_event_log (integer_positive) IS
'Delete dba.event_log records older than the day count passed in, with a default retention period of 14 days.';
The truth is, I don't really care about the count(*) result from this routine, in this case. But I might want a result and an additional action in some other, similar context. As you can see, the routine deletes records, uses a CTE to insert a report into another table, and then returns a result. No matter what, I figure this example is a good way to get me head around the alternatives and options in stored functions. The main thing I want to achieve here is to delete records, and then run maintenance. if this is an awkward fit for a stored function or procedure, I could write out an entry to a vacuum_list table with the table name, and have another job to run though that list.
If there's a smarter way to approach vacuum without the extra, I'm of course interested in that. But I'm also interested in understanding the limits on what operationa you can combine in PL/PgSQL routines.
Pavel Stehule' answer is correct and complete. I decided to follow-up a bit here as I like to dig in on bugs in my code, behaviors in Postgres, etc. to get a better sense of what I'm dealing with. I'm including some notes below for anyone who finds them of use.
COMMAND cannot be executed...
The reference to "VACUUM cannot be executed inside a transaction block" gave me a better way to search the docs for similarly restricted commands. The information below probably doesn't cover everything, but it's a start.
Command Limitation
CREATE DATABASE
ALTER DATABASE If creating a new table space.
DROP DATABASE
CLUSTER Without any parameters.
CREATE TABLESPACE
DROP TABLESPACE
REINDEX All in system catalogs, database, or schema.
CREATE SUBSCRIPTION When creating a replication slot (the default behavior.)
ALTER SUBSCRIPTION With refresh option as true.
DROP SUBSCRIPTION If the subscription is associated with a replication slot.
COMMIT PREPARED
ROLLBACK PREPARED
DISCARD ALL
VACUUM
The accepted answer indicates that the limitation has nothing to do with the specific server-side language used. I've just come across an older thread that has some excellent explanations and links for stored functions and transactions:
Do stored procedures run in database transaction in Postgres?
Sample Code
I also wondered about stored procedures, as they're allowed to control transactions. I tried them out in PG 13 and, no, the code is treated like a stored function, down to the error messages.
For anyone that goes in for this sort of thing, here are the "hello world" samples of sQL and PL/PgSQL stored functions and procedures to test out how VACCUM behaves in these cases. Spoiler: It doesn't work, as advertised.
SQL Function
/*
select * from dba.vacuum_sql_function();
Fails:
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL function "vacuum_sql_function" statement 1. 0.000 seconds. (Line 13).
*/
DROP FUNCTION IF EXISTS dba.vacuum_sql_function();
CREATE FUNCTION dba.vacuum_sql_function()
RETURNS VOID
LANGUAGE sql
AS $sql_code$
VACUUM ANALYZE activity;
$sql_code$;
select * from dba.vacuum_sql_function(); -- Fails.
PL/PgSQL Function
/*
select * from dba.vacuum_plpgsql_function();
Fails:
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL statement "VACUUM ANALYZE activity"
PL/pgSQL function vacuum_plpgsql_function() line 4 at SQL statement. 0.000 seconds. (Line 22).
*/
DROP FUNCTION IF EXISTS dba.vacuum_plpgsql_function();
CREATE FUNCTION dba.vacuum_plpgsql_function()
RETURNS VOID
LANGUAGE plpgsql
AS $plpgsql_code$
BEGIN
VACUUM ANALYZE activity;
END
$plpgsql_code$;
select * from dba.vacuum_plpgsql_function();
SQL Procedure
/*
call dba.vacuum_sql_procedure();
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL function "vacuum_sql_procedure" statement 1. 0.000 seconds. (Line 20).
*/
DROP PROCEDURE IF EXISTS dba.vacuum_sql_procedure();
CREATE PROCEDURE dba.vacuum_sql_procedure()
LANGUAGE SQL
AS $sql_code$
VACUUM ANALYZE activity;
$sql_code$;
call dba.vacuum_sql_procedure();
PL/PgSQL Procedure
/*
call dba.vacuum_plpgsql_procedure();
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL statement "VACUUM ANALYZE activity"
PL/pgSQL function vacuum_plpgsql_procedure() line 4 at SQL statement. 0.000 seconds. (Line 23).
*/
DROP PROCEDURE IF EXISTS dba.vacuum_plpgsql_procedure();
CREATE PROCEDURE dba.vacuum_plpgsql_procedure()
LANGUAGE plpgsql
AS $plpgsql_code$
BEGIN
VACUUM ANALYZE activity;
END
$plpgsql_code$;
call dba.vacuum_plpgsql_procedure();
Other Options
Plenty. As I understand it, VACUUM, and a handful of other commands, are not supported in server-side code running within Postgres. Therefore, you code needs to start from somewhere else. That can be:
Whatever cron you've got in your server's OS.
Any exteral client you like.
pg_cron.
As we're deployed on RDS, those last two options are where I'll look. And there's one more:
Let AUTOVACCUM and an occasional VACCUM do their thing.
That's pretty easy to do, and seems to work fine for the bulk of our needs.
Another Idea
If you do want a bit more control and some custom logging, I'm imagining a table like this:
CREATE TABLE IF NOT EXISTS dba.vacuum_list (
database_name text,
schema_name text,
table_name text,
run boolean,
run_analyze boolean,
run_full boolean,
last_run_dts timestamp)
ALTER TABLE dba.vacuum_list ADD CONSTRAINT
vacuum_list_pk
PRIMARY KEY (database_name, schema_name, table_name);
That's just a sketch. The idea is like this:
You INSERT into vacuum_list when a table needs some vacuuming, at least as far as you're concerned.
In my case, that would be an UPSERT as I don't need a full log-like table, just a single row per table of interest with the last outcome and/or pending state.
Periodically, a remote client, etc. connects, reads the table, and executes each specified VACUUM, according to the options specified in the record.
The external client updates the row with the last run timestamp, and whatever else you're including in the row.
Optionally, you could include fields for duration and change in relation size pre:post vacuuming.
That last option is what I'm interested in. None of our VACUUM calls were working for quite some time as there was a months-old dead connection from something sever-side. VACUUM appears to run fine, in such a case, it just can't delete a whole lot of rows. (Because of the super old "open" transaction ID, visibility maps, etc.) The only way to see this sort of thing seems to be to VACUUM VERBOSE and study the output. Or to record vacuum time and, more important, relation size change to flag cases where nothing seems to happen, when it seems like it should.

VACUUM is "top level" command. It cannot be executed from PL/pgSQL ever or from any other PL.

PostgreSQL : relation [temp table] does not exist error for temporary table

I have a function in PostgreSQL that calls multiple function depends on certain conditions. I create a temporary table in main function dynamically using "Execute" statement and using that temporary table for insertion and selection
in other functions (same dynamic process using "Execute" statement) those I am calling from main function.
However, its working fine as per my requirement. But sometimes it is throwing an error 'relation does not exists' on the temporary table when it is performing selection or insertion on the subroutine(internal function).
SAMPLE
Main Function
CREATE OR REPLACE FUNCTION public.sample_function(
param bigint)
RETURNS TABLE(isfinished boolean)
LANGUAGE 'plpgsql'
COST 100.0
VOLATILE ROWS 1000.0
AS $function$
DECLARE
st_dt DATE;
end_dt DATE;
var4 CHARACTER VARYING := CURRENT_TIME;
var1 character varying;
BEGIN
SELECT SUBSTRING(REPLACE(REPLACE(REPLACE(var4,':',''),'.',''),'+','') FROM 5 FOR 7) INTO var4;
EXECUTE 'CREATE TABLE sampletable'||var4||' (
"emp_id" UUid,
"emp_name" Character Varying( 2044 ),
"start_date" Date,
"end_date" Date)';
select public.innerfunction (st_dt,end_dt,var4)
into var1;
EXECUTE 'DROP TABLE sampletable'||var4;
return query select true ;
END;
- Inner Function
CREATE OR REPLACE FUNCTION public.innerfunction(
st_dt timestamp without time zone,
end_dt timestamp without time zone,
var4 bigint)
RETURNS integer
LANGUAGE 'plpgsql'
COST 100.0
VOLATILE
AS $function$
DECLARE
date1 timestamp without time zone:=st_dt
BEGIN
EXECUTE 'INSERT INTO sampletable'||var4||'
SELECT *
from "abc"'
;
return return_val;
END;
$function$;
- Error Message
ERROR:
relation "sampletable1954209" does not exist
LINE 1: INSERT INTO sampletable1954209
QUERY: INSERT INTO sampletable1954209
SELECT *
from "abc"
;
CONTEXT: PL/pgSQL function innerfunction(timestamp without time zone,
timestamp without time zone) line 51 at EXECUTE
SQL statement "SELECT public.innerfunction(st_dt ,end_dt)"
PL/pgSQL function sample_function(bigint) line 105 at SQL statement
********** Error **********
In above example I created a main function 'sample_function', and I am creating a temporary dynamic table 'sampletable with a random number attached to it. I am using that table on 'innerfunction' for insertion purpose.
When I am calling the main function it working as required but some times it gives the mentioned error 'relation "sampletable1954209" does not exist'.
I am not able to catch the issue.

I just ran into this problem. It turned out to be because the database was behind pgBouncer in transaction pooling mode, which meant each query could conceivably run in a different connection.
There were two symptoms of this behaviour:
I could run CREATE TEMPORARY TABLE test (LIKE sometable); in one connection, and then run SELECT * FROM test in another connection and see the temporary table. This isn't supposed to happen as temporary tables are meant to be session-specific, but because pgBouncer is pooling connections it means multiple sessions can share temporary tables unpredictably, if they are created outside of transactions.
Sometimes the connection would randomly change and my temporary table would disappear. I was creating and accessing the temporary table outside of a transaction which is why pgBouncer thought it was ok to switch connections.
The solution is to wrap everything up inside a transaction, however this was problematic for me because I was calling other code that used transactions. Postgres doesn't support nested transactions, so I was unable to wrap my code in one. (It does approximate nested transactions with savepoints, but this requires your code to use different SQL if it's running inside a transaction or not, which is not practical.)
Another solution is to change the pgBouncer configuration to session pooling instead, which causes it to run the DISCARD ALL statement when a client disconnects, before giving the session to another client. This drops all temporary tables and makes the connections behave more like direct connections.
There's a good write up of the issues at the VMWare support hub.

I see a problem with this code, but it does not quite match your exact error message:
You pass var4 to innerfunction as bigint.
Now if var4 starts with a zero like 0649940, then sample_function will use sampletable0649940, while innerfunction will try to access sampletable649940 because the leading zero is lost in the conversion.
Your error message has a seven-digit number though, so it might be a different problem.
Why don't you use a temporary table and use a fixed name for it? A temporary table is only visible in one session, and there cannot be any name collisions. That's what temporary tables are for.

Sybase error when creating several triggers

I have a weird error in sybase, basically I'm adding a functionality that includes creating several tables and triggers, in order to do this I'm creating a script that verifies if table or trigger exist and drops it to generate it again in case any of them needs and update, with the tables I'm having no issues but is not working with triggers:
Here is the syntaxs:
IF EXISTS (SELECT 1 FROM sysobjects WHERE name="fu_codigos_alertas_UPD")
DROP TRIGGER fu_codigos_alertas_UPD
GO
CREATE TRIGGER fu_codigos_alertas_UPD ON fu_codigos_alertas FOR UPDATE AS
DECLARE
#seq int
BEGIN
/* Code */
END
CREATE TRIGGER fu_codigos_alertas_DEL ON fu_codigos_alertas FOR DELETE AS
DECLARE
#seq int
BEGIN
/* Code */
END
If I execute one or the other alone they work, but if I execute them together it fails and i get:
"DROP TRIGGER command not allowed within a trigger"
Basically the error is when i execute 2 trigger drops, dunno why.
Anyone experimented a similar issue?

postgres trigger function only inserts few records in another table

I have 2 tables hourly and daily and my aim is to calculate average of values from hourly table and save it to daily table. I have written a trigger function like this -
CREATE OR REPLACE FUNCTION public.calculate_daily_avg()
RETURNS trigger AS
$BODY$
DECLARE chrly CURSOR for
SELECT device, date(datum) datum, avg(cpu_util) cpu_util
FROM chourly WHERE date(datum) = current_date group by device, date(datum);
BEGIN
FOR chrly_rec IN chrly
LOOP
insert into cdaily (device, datum, cpu_util)
values (chrly_rec.device, chrly_rec.datum, chrly_rec.cpu_util)
on conflict (device, datum) do update set
device=chrly_rec.device, datum=chrly_rec.datum, cpu_util=chrly_rec.cpu_util;
return NEW;
END LOOP;
EXCEPTION
WHEN NO_DATA_FOUND THEN
RAISE NOTICE 'NO DATA IN chourly FOR %', current_date;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION public.calculate_daily_avg()
OWNER TO postgres;
and a trigger on hourly table like this -
CREATE TRIGGER calculate_daily_avg_trg
BEFORE INSERT OR UPDATE
ON public.chourly
FOR EACH ROW
EXECUTE PROCEDURE public.calculate_daily_avg();
But when I try to insert or update about 3000 records in the hourly table only 3 or 4 devices are inserted and not 3000. (also in trigger I have already tried AFTER INSERT OR UPDATE but even that gives the same result) What I am doing wrong here? Please suggest any better way to write the trigger if you feel I have written it wrongly. Thanks!

I don't suggest using TRIGGER for calculation when INSERT. Try different approach using function execute by cron hourly or daily.
WHY?
Because everytime you INSERT one row. the postgres will always do aggregate function AVG() and LOOPING for insert (based on your flow).
That mean another INSERT statement will waiting until Previous INSERT commited that will suffer your database performance for highly INSERT Transaction. If somehow you manage to BREAK the rule (maybe from configuration) you will get inconsistent data like what happened to you right now.

PL/pgSQL query in PostgreSQL returns result for new, empty table

I am learning to use triggers in PostgreSQL but run into an issue with this code:
CREATE OR REPLACE FUNCTION checkAdressen() RETURNS TRIGGER AS $$
DECLARE
adrCnt int = 0;
BEGIN
SELECT INTO adrCnt count(*) FROM Adresse
WHERE gehoert_zu = NEW.kundenId;
IF adrCnt < 1 OR adrCnt > 3 THEN
RAISE EXCEPTION 'Customer must have 1 to 3 addresses.';
ELSE
RAISE EXCEPTION 'No exception';
END IF;
END;
$$ LANGUAGE plpgsql;
I create a trigger with this procedure after freshly creating all my tables so they are all empty. However the count(*) function in the above code returns 1.
When I run SELECT count(*) FROM adresse; outside of PL/pgSQL, I get 0.
I tried using the FOUND variable but it is always true.
Even more strangely, when I insert some values into my tables and then delete them again so that they are empty again, the code works as intended and count(*) returns 0.
Also if I leave out the WHERE gehoert_zu = NEW.kundenId, count(*) returns 0 which means I get more results with the WHERE clause than without.
--Edit:
Here is an example of how I use the procedure:
CREATE TABLE kunde (
kundenId int PRIMARY KEY
);
CREATE TABLE adresse (
id int PRIMARY KEY,
gehoert_zu int REFERENCES kunde
);
CREATE CONSTRAINT TRIGGER adressenKonsistenzTrigger AFTER INSERT ON Kunde
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE PROCEDURE checkAdressen();
INSERT INTO kunde VALUES (1);
INSERT INTO adresse VALUES (1,1);
It looks like I am getting the DEFERRABLE INITIALLY DEFERRED part wrong. I assumed the trigger would be executed after the first INSERT statement but it happens after the second one, although the inserts are not inside a BEGIN; - COMMIT; - Block.
According to the PostgreSQL Documentation inserts are commited automatically every time if not inside such a block and thus there shouldn't be an entry in adresse when the first INSERT statement is commited.
Can anyone point out my mistake?
--Edit:
The trigger and DEFERRABLE INITIALLY DEFERRED seem to be working all right.
My mistake was to assume that since I am not using a BEGIN-COMMIT-Block each insert would be executed in an own transaction with the trigger being executed afterwards every time.
However even without the BEGIN-COMMIT all inserts get bundled into one transaction and the trigger is executed afterwards.
Given this behaviour, what is the point in using BEGIN-COMMIT?

You need a transaction plus the "DEFERRABLE INITIALLY DEFERRED" because of the chicken and egg problem.
starting with two empty tables:
you cannot insert a single row into the person table, because the it needs at least one address.
you cannot insert a single row into the address table, because the FK constraint needs a corresponding row on the person table to exist
This is why you need to bundle the two inserts into one operation: the transaction. You need the BEGIN+ COMMIT, and the DEFERRABLE allows transient forbidden database states to exists: it causes the check to be evaluated at commit time.

This may seem a bit silly, but the answer is you need to stop deferring the trigger and run it BEFORE the insert. If you run it after the insert, of course there is data in the table.
As far as I can tell this is working as expected.
One further note, you probably dont mean:
RAISE EXCEPTION 'No Exception';
You probably want
RAISE INFO 'No Exception';
Then you can change your settings and run queries in transactions to test that the trigger does what you want it to do. As it is, every insert is going to fail and you have no way to move this into production without editing your procedure.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse