I've got an app built on top of PostgresQL, which makes use of a custom sequence. I think I understand sequences pretty well by now: they are non-transactional, currval is defined only within the current session, etc. But I don't understand this:
2015-10-13 10:37:16 SQLSelect: SELECT nextval('commit_id_seq')
2015-10-13 10:37:16 commit_id_seq: 57
2015-10-13 10:37:16 SQLExecute: UPDATE bid SET is_archived=false,company_id=1436,contact_id=15529,...(etc)...,sharing_policy='' WHERE id = 56229
2015-10-13 10:37:16 ERROR: ERROR: currval of sequence "commit_id_seq" is not yet defined in this session
CONTEXT: SQL statement "INSERT INTO history (table_name, record_id, sec_user_id, created, action, notes, status, before, after, commit_id)
SELECT TG_TABLE_NAME, rec.id, (SELECT id FROM sec_user WHERE name = CURRENT_USER), now(), SUBSTR(TG_OP,1,1), note, stat, oldH, newH, currval('commit_id_seq')"
PL/pgSQL function log_to_history() line 28 at SQL statement
[3]
We log every call to the database, and in the case of the SELECT nextval, I also log the result. The above are the exact calls, except that I trimmed the UPDATE statement (because the original is really long).
So, you can see that we just called nextval on the sequence, got a reasonable number back, and then we do an UPDATE that invokes a trigger function that attempts to use currval on that sequence... and it fails, claiming currval is not defined.
Note that this doesn't usually happen, but once it does start happening, it does so consistently (perhaps until the user disconnects from the DB).
How can this be? And what can I do about it?
Your UPDATE statement obviously calls a trigger. The most plausible cause of this error is that the trigger function is in a different schema from where the sequence is defined and the schema of the sequence is not in the search_path. That gives you two options to resolve this:
Make the schema of the sequence visible to the trigger function using SET search_path TO .... Note that this will make all objects in the schema of the sequence visible, which may be something of a security risk, depending on your database design.
Schema-qualify the sequence name in the trigger function: currval('my_schema.commit_id_seq').
Another plausible cause is connection pooling at your application end. Log the "session ID" (really just the starting time and pid of the current session) by adding %c to your log_line_prefix() parameter in postgresql.conf. In PostgreSQL every command runs in its own transaction unless a transaction is explicitly established. Connection pooling software also works at the transaction level (i.e. you start a transaction and then your connection will stay open until you close it, outside of a transaction there are no guarantees about session persistence). If that is the case you can wrap your entire set of commands in a BEGIN ... COMMIT block (you should probably use a specific call from your pooling software), or better yet, change your code to not depend on a previous nextval() call.
Related
Postgres 13.4
I've got some pg_cron jobs set up to periodically delete older records out of log-like files. What I'd like to do is to run VACUUM ANALYZE after performing a purge. Unfortunately, I can't work out how to do this in a stored function. Am I missing a trick? Is a stored procedure more appropriate?
As an example, here's one of my purge routines
CREATE OR REPLACE FUNCTION dba.purge_event_log (
retain_days_in integer_positive default 14)
RETURNS int4
AS $BODY$
WITH -- Use a CTE so that we've got a way of returning the count easily.
deleted AS (
-- Normal-looking code for this requires a literal:
-- where your_dts < now() - INTERVAL '14 days'
-- Don't want to use a literal, SQL injection, etc.
-- Instead, using a interval constructor to achieve the same result:
DELETE
FROM dba.event_log
WHERE dts < now() - make_interval (days => $1)
RETURNING *
),
----------------------------------------
-- Save details to a custom log table
----------------------------------------
logit AS (
insert into dba.event_log (name, details)
values ('purge_event_log(' || retain_days_in::text || ')',
'count = ' || (select count(*)::text from deleted)
)
)
----------------------------------------
-- Return result count
----------------------------------------
select count(*) from deleted;
$BODY$
LANGUAGE sql;
COMMENT ON FUNCTION dba.purge_event_log (integer_positive) IS
'Delete dba.event_log records older than the day count passed in, with a default retention period of 14 days.';
The truth is, I don't really care about the count(*) result from this routine, in this case. But I might want a result and an additional action in some other, similar context. As you can see, the routine deletes records, uses a CTE to insert a report into another table, and then returns a result. No matter what, I figure this example is a good way to get me head around the alternatives and options in stored functions. The main thing I want to achieve here is to delete records, and then run maintenance. if this is an awkward fit for a stored function or procedure, I could write out an entry to a vacuum_list table with the table name, and have another job to run though that list.
If there's a smarter way to approach vacuum without the extra, I'm of course interested in that. But I'm also interested in understanding the limits on what operationa you can combine in PL/PgSQL routines.
Pavel Stehule' answer is correct and complete. I decided to follow-up a bit here as I like to dig in on bugs in my code, behaviors in Postgres, etc. to get a better sense of what I'm dealing with. I'm including some notes below for anyone who finds them of use.
COMMAND cannot be executed...
The reference to "VACUUM cannot be executed inside a transaction block" gave me a better way to search the docs for similarly restricted commands. The information below probably doesn't cover everything, but it's a start.
Command Limitation
CREATE DATABASE
ALTER DATABASE If creating a new table space.
DROP DATABASE
CLUSTER Without any parameters.
CREATE TABLESPACE
DROP TABLESPACE
REINDEX All in system catalogs, database, or schema.
CREATE SUBSCRIPTION When creating a replication slot (the default behavior.)
ALTER SUBSCRIPTION With refresh option as true.
DROP SUBSCRIPTION If the subscription is associated with a replication slot.
COMMIT PREPARED
ROLLBACK PREPARED
DISCARD ALL
VACUUM
The accepted answer indicates that the limitation has nothing to do with the specific server-side language used. I've just come across an older thread that has some excellent explanations and links for stored functions and transactions:
Do stored procedures run in database transaction in Postgres?
Sample Code
I also wondered about stored procedures, as they're allowed to control transactions. I tried them out in PG 13 and, no, the code is treated like a stored function, down to the error messages.
For anyone that goes in for this sort of thing, here are the "hello world" samples of sQL and PL/PgSQL stored functions and procedures to test out how VACCUM behaves in these cases. Spoiler: It doesn't work, as advertised.
SQL Function
/*
select * from dba.vacuum_sql_function();
Fails:
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL function "vacuum_sql_function" statement 1. 0.000 seconds. (Line 13).
*/
DROP FUNCTION IF EXISTS dba.vacuum_sql_function();
CREATE FUNCTION dba.vacuum_sql_function()
RETURNS VOID
LANGUAGE sql
AS $sql_code$
VACUUM ANALYZE activity;
$sql_code$;
select * from dba.vacuum_sql_function(); -- Fails.
PL/PgSQL Function
/*
select * from dba.vacuum_plpgsql_function();
Fails:
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL statement "VACUUM ANALYZE activity"
PL/pgSQL function vacuum_plpgsql_function() line 4 at SQL statement. 0.000 seconds. (Line 22).
*/
DROP FUNCTION IF EXISTS dba.vacuum_plpgsql_function();
CREATE FUNCTION dba.vacuum_plpgsql_function()
RETURNS VOID
LANGUAGE plpgsql
AS $plpgsql_code$
BEGIN
VACUUM ANALYZE activity;
END
$plpgsql_code$;
select * from dba.vacuum_plpgsql_function();
SQL Procedure
/*
call dba.vacuum_sql_procedure();
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL function "vacuum_sql_procedure" statement 1. 0.000 seconds. (Line 20).
*/
DROP PROCEDURE IF EXISTS dba.vacuum_sql_procedure();
CREATE PROCEDURE dba.vacuum_sql_procedure()
LANGUAGE SQL
AS $sql_code$
VACUUM ANALYZE activity;
$sql_code$;
call dba.vacuum_sql_procedure();
PL/PgSQL Procedure
/*
call dba.vacuum_plpgsql_procedure();
ERROR: VACUUM cannot be executed from a function
CONTEXT: SQL statement "VACUUM ANALYZE activity"
PL/pgSQL function vacuum_plpgsql_procedure() line 4 at SQL statement. 0.000 seconds. (Line 23).
*/
DROP PROCEDURE IF EXISTS dba.vacuum_plpgsql_procedure();
CREATE PROCEDURE dba.vacuum_plpgsql_procedure()
LANGUAGE plpgsql
AS $plpgsql_code$
BEGIN
VACUUM ANALYZE activity;
END
$plpgsql_code$;
call dba.vacuum_plpgsql_procedure();
Other Options
Plenty. As I understand it, VACUUM, and a handful of other commands, are not supported in server-side code running within Postgres. Therefore, you code needs to start from somewhere else. That can be:
Whatever cron you've got in your server's OS.
Any exteral client you like.
pg_cron.
As we're deployed on RDS, those last two options are where I'll look. And there's one more:
Let AUTOVACCUM and an occasional VACCUM do their thing.
That's pretty easy to do, and seems to work fine for the bulk of our needs.
Another Idea
If you do want a bit more control and some custom logging, I'm imagining a table like this:
CREATE TABLE IF NOT EXISTS dba.vacuum_list (
database_name text,
schema_name text,
table_name text,
run boolean,
run_analyze boolean,
run_full boolean,
last_run_dts timestamp)
ALTER TABLE dba.vacuum_list ADD CONSTRAINT
vacuum_list_pk
PRIMARY KEY (database_name, schema_name, table_name);
That's just a sketch. The idea is like this:
You INSERT into vacuum_list when a table needs some vacuuming, at least as far as you're concerned.
In my case, that would be an UPSERT as I don't need a full log-like table, just a single row per table of interest with the last outcome and/or pending state.
Periodically, a remote client, etc. connects, reads the table, and executes each specified VACUUM, according to the options specified in the record.
The external client updates the row with the last run timestamp, and whatever else you're including in the row.
Optionally, you could include fields for duration and change in relation size pre:post vacuuming.
That last option is what I'm interested in. None of our VACUUM calls were working for quite some time as there was a months-old dead connection from something sever-side. VACUUM appears to run fine, in such a case, it just can't delete a whole lot of rows. (Because of the super old "open" transaction ID, visibility maps, etc.) The only way to see this sort of thing seems to be to VACUUM VERBOSE and study the output. Or to record vacuum time and, more important, relation size change to flag cases where nothing seems to happen, when it seems like it should.
VACUUM is "top level" command. It cannot be executed from PL/pgSQL ever or from any other PL.
I'm writing a script for PostgreSQL and since I want it to be executed atomically, I'm wrapping it inside a transaction.
I expected the script to look something like this:
BEGIN
-- 1) Execute some valid actions;
-- 2) Execute some action that causes an error.
EXCEPTION
WHEN OTHERS THEN
ROLLBACK;
END; -- A.k.a. COMMIT;
However, in this case pgAdmin warns me about a syntax error right after the initial BEGIN. If I terminate the command there by appending a semicolon like so: BEGIN; it instead informs me about error near EXCEPTION.
I realize that perhaps I'm mixing up syntax for control structures and transactions, however I couldn't find any mention of how to roll back a failed transaction in the docs (nor in SO for that matter).
I also considered that perhaps the transaction is rolled back automatically on error, but it doesn't seem to be the case since the following script:
BEGIN;
-- 1) Execute some valid actions;
-- 2) Execute some action that causes an error.
COMMIT;
warns me that: ERROR: current transaction is aborted, commands ignored until end of transaction block and I have to then manually ROLLBACK; the transaction.
It seems I'm missing something fundamental here, but what?
EDIT:
I tried using DO as well like so:
DO $$
BEGIN
-- 1) Execute some valid actions;
-- 2) Execute some action that causes an error.
EXCEPTION
WHEN OTHERS THEN
ROLLBACK;
END; $$
pgAdmin hits me back with a: ERROR: cannot begin/end transactions in PL/pgSQL. HINT: Use a BEGIN block with an EXCEPTION clause instead. which confuses me to no end, because that is exactly what I am (I think) doing.
POST-ACCEPT EDIT:
Regarding Laurenz's comment: "Your SQL script would contain a COMMIT. That ends the transaction and rolls it back." - this is not the behavior that I observe. Please consider the following example (which is just a concrete version of an example I already provided in my original question):
BEGIN;
-- Just a simple, self-referencing table.
CREATE TABLE "Dummy" (
"Id" INT GENERATED ALWAYS AS IDENTITY,
"ParentId" INT NULL,
CONSTRAINT "PK_Dummy" PRIMARY KEY ("Id"),
CONSTRAINT "FK_Dummy_Dummy" FOREIGN KEY ("ParentId") REFERENCES "Dummy" ("Id")
);
-- Foreign key violation terminates the transaction.
INSERT INTO "Dummy" ("ParentId")
VALUES (99);
COMMIT;
When I execute the script above, I'm greeted with: ERROR: insert or update on table "Dummy" violates foreign key constraint "FK_Dummy_Dummy". DETAIL: Key (ParentId)=(99) is not present in table "Dummy". which is as expected.
However, if I then try to check whether my Dummy table was created or rolled back like so:
SELECT EXISTS (
SELECT FROM information_schema."tables"
WHERE "table_name" = 'Dummy');
instead of a simple false, I get the same error that I already mentioned twice: ERROR: current transaction is aborted, commands ignored until end of transaction block. Then I have to manually terminate the transaction via issuing ROLLBACK;.
So to me it seems that either the comment mentioned above is false or at least I'm heavily misinterpreting something here.
You cannot use ROLLBACK in PL/pgSQL, except in certain limited cases inside procedures.
You don't need to explicitly roll back in your PL/pgSQL code. Just let the exception propagate out of the PL/pgSQL code, and it will cause an error, which will cause the whole transaction to be rolled back.
Your comments suggest that this code is called from an SQL script. Then the solution would be to have a COMMIT in that SQL script at some place after the PL/pgSQL code. That would end the transaction and roll it back.
I think you must be using an older version, as the exact code from your question works without error for me:
(The above is with PostgreSQL 13.1, and pgAdmin 4.28.)
It also works fine for me, without the exception block:
As per this comment, you can remove the exception block within a function, and if an error occurs, the transaction run within it will automatically be rolled back. That appears to be the case, from my limited testing.
I'm working with a PostgresQL database, in which a trigger function logs changes to a history table. I'm trying to add a column which keeps a logical "commit ID" to group master and detail records together. I've created a (non-temporary) sequence, and before I start the batch of updates, I bump this. All my SQL is logged to a log file, so you can clearly see this happening:
2015-04-16 10:43:37 SQLSelect: SELECT nextval('commit_id_seq')
2015-04-16 10:43:37 commit_id_seq: 8
...but then I attempt the UPDATE, my trigger function attempts to use currval, and it fails:
2015-04-16 10:43:37 ERROR: ERROR: currval of sequence "commit_id_seq" is not yet defined in this session
CONTEXT: SQL statement "INSERT INTO history (table_name, record_id, sec_user_id, created, action, notes, status, before, after, commit_id)
SELECT TG_TABLE_NAME, rec.id, (SELECT oid FROM pg_roles WHERE rolname = CURRENT_USER), now(), SUBSTR(TG_OP,1,1), note, stat, hstore(old), hstore(new), currval('commit_id_seq')"
PL/pgSQL function log_to_history() line 18 at SQL statement
[3]
So my question is basically: WTF?
One of two reasons:
Search_path differences, so you're actually talking about two different sequences.
Different sessions. The "current value" is only defined for the session you call nextval() in.
You can add process-id to the logfile format to check if they are different sessions.
How is it possible to get the current sequence value in postgresql 8.4?
Note: I need the value for the some sort of statistics, just retrieve and store. Nothing related to the concurrency and race conditions in case of manually incrementing it isn't relevant to the question.
Note 2: The sequence is shared across several tables
Note 3: currval won't work because of:
Return the value most recently obtained by nextval for this sequence in the current session
ERROR: currval of sequence "<sequence name>" is not yet defined in this session
My current idea: is to parse DDL, which is weird
You may use:
SELECT last_value FROM sequence_name;
Update:
this is documented in the CREATE SEQUENCE statement:
Although you cannot update a sequence directly, you can use a query
like:
SELECT * FROM name;
to examine the parameters and current state of a
sequence. In particular, the last_value field of the sequence shows
the last value allocated by any session. (Of course, this value might
be obsolete by the time it's printed, if other sessions are actively
doing nextval calls.)
If the sequence is being used for unique ids in a table, you can simply do this:
select max(id) from mytable;
The most efficient way, although postgres specific, is:
select currval('mysequence');
although technically this returns the last value generated by the call to nextval('mysequence'), which may not necessarily be used by the caller (and if unused would leave gaps in an auto increments id column).
As of some point in time, a secret function (which is undocumented ... still), does exactly this:
SELECT pg_sequence_last_value('schema.your_sequence_name');
I have this Trigger in Postgresql that I can't just get to work (does nothing). For understanding, there's how I defined it:
CREATE TABLE documents (
...
modification_time timestamp with time zone DEFAULT now()
);
CREATE FUNCTION documents_update_mod_time() RETURNS trigger
AS $$
begin
new.modification_time := now();
return new;
end
$$
LANGUAGE plpgsql;
CREATE TRIGGER documents_modification_time
BEFORE INSERT OR UPDATE ON documents
FOR EACH ROW
EXECUTE PROCEDURE documents_update_mod_time();
Now to make it a bit more interesting.. How do you debug triggers?
Use the following code within a trigger function, then watch the 'messages' tab in pgAdmin3 or the output in psql:
RAISE NOTICE 'myplpgsqlval is currently %', myplpgsqlval; -- either this
RAISE EXCEPTION 'failed'; -- or that
To see which triggers actually get called, how many times etc, the following statement is the life-saver of choice:
EXPLAIN ANALYZE UPDATE table SET foo='bar'; -- shows the called triggers
Note that if your trigger is not getting called and you use inheritance, it may be that you've only defined a trigger on the parent table, whereas triggers are not inherited by child tables automatically.
To step through the function, you can use the debugger built into pgAdmin3, which on Windows is enabled by default; all you have to do is execute the code found in ...\8.3\share\contrib\pldbgapi.sql against the database you're debugging, restart pgAdmin3, right-click your trigger function, hit 'Set Breakpoint', and then execute a statement that would cause the trigger to fire, such as the UPDATE statement above.
Turns out I was using inheritance in the above problem and forgot to mention it. Now for everybody who might run into this as well, here's some debugging hints:
Use the following code to debug what a trigger is doing:
RAISE NOTICE 'test'; -- either this
RAISE EXCEPTION 'failed'; -- or that
To see which triggers actually get called, how many times etc, the following statement is the life-saver of choice:
EXPLAIN ANALYZE UPDATE table SET foo='bar'; -- shows the called triggers
Then there's the one thing I didn't know before: triggers only fire when updating the exact table they're defined on. If you use inheritance, you MUST define them on the child tables as well!
You can use 'raise notice' statements inside your trigger function to debug it. To debug the trigger not being called at all is another story.
If you add a 'raise exception' inside your trigger function, can you still do inserts/updates?
Also, if your update test occurs in the same transaction as your insert test, now() will be the same (since it's only calculated once per transaction) and therefore the update won't seem to do anything. If that's the case, either do them in separate transactions, or if this is a unit test and you can't do that, use clock_timestamp().
I have a unit test that depends on some time going by between transactions, so at the beginning of the unit test I have something like:
ALTER TABLE documents
ALTER COLUMN modification_time SET DEFAULT clock_timestamp();
Then in the trigger, use "set modification_time = default".
So normally it doesn't do the extra calculation, but during a unit test this allows me to do inserts with pg_sleep in between to simulate time passing and actually have that be reflected in the data.