PostgreSQL: trivial INSERT fails the first time, succeeds afterwards

I am puzzled by a weird Postgres problem I encounter in the trivial database shown below: If I first insert a tag and explicitly specify its ID and then try to insert another tag without passing an ID, then this second insert fails. If I try a third time (again without ID), the insert succeeds.
DROP DATABASE IF EXISTS mydb;
CREATE DATABASE mydb;
\c mydb
DROP SCHEMA public;
CREATE SCHEMA core;
CREATE TABLE core.tag
(
id serial PRIMARY KEY,
title text NOT NULL
);
-- this works: all columns specified explicitly
INSERT INTO core.tag(id, title) VALUES (1, 'known tag');
-- omitting the tag ID fails with
-- ERROR: duplicate key value violates unique constraint "tag_pkey"
-- DETAIL: Key (id)=(1) already exists.
INSERT INTO core.tag(title) VALUES ('unknown tag');
-- this works again ?!?
INSERT INTO core.tag(title) VALUES ('unknown tag');
The issue only seems to occur on a freshly created database and once it does, it does not seem to happen again. I have never come across anything like this - so far, I have just inserted data with or without explicit ID and AFAICS, nothing ever failed like this...
Does anyone have an idea what's going on here ?!?
Environment: PostgreSQL 9.1.3 on Mac OSX 10.7.5

Of course this fails.
What happens?
When you create the table, a sequence is also created that generates the values for the ID column. The sequence starts with 1 but it is only used if you do not specify a value for the ID column.
Now when you run
INSERT INTO core.tag(id, title) VALUES (1, 'known tag');
you bypass Postgres' automatic assignment of the ID value, so the sequence is never advanced and "stays" at one.
Now when you run
INSERT INTO core.tag(title) VALUES ('unknown tag');
Postgres takes the next value from the sequence, which is 1. But that already exists, so the insert fails. The failed insert still consumed that value (sequence changes are not rolled back), so the next value is 2, and the subsequent insert without an ID value gets 2 and succeeds.
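To watch this happen, you can look at the sequence's state between the statements. This is only a sketch, run against the example database from the question; the values in the comments are what a freshly created serial sequence would be expected to show.

```sql
-- Freshly created sequence: nothing has been handed out yet.
SELECT last_value, is_called FROM core.tag_id_seq;   -- 1, false

-- An explicit ID does not touch the sequence.
INSERT INTO core.tag(id, title) VALUES (1, 'known tag');
SELECT last_value, is_called FROM core.tag_id_seq;   -- still 1, false

-- This insert calls nextval(), gets 1, and fails on the primary key...
INSERT INTO core.tag(title) VALUES ('unknown tag');
-- ...but the sequence has moved on anyway.
SELECT last_value, is_called FROM core.tag_id_seq;   -- 1, true

-- The next nextval() therefore returns 2, and this insert succeeds.
INSERT INTO core.tag(title) VALUES ('unknown tag');
```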
The solution is to either never include the ID column in your inserts or, if you do, request the ID from the sequence:
INSERT INTO core.tag(id, title) VALUES (nextval('core.tag_id_seq'), 'known tag');
When a serial column is created, it is automatically associated with a sequence named <table_name>_<column_name>_seq, created in the same schema as the table. That is the name used in the statement above; it is schema-qualified because core is not on the default search_path.
More details about how the serial "data type" works are in the manual: http://www.postgresql.org/docs/current/static/datatype-numeric.html#DATATYPE-SERIAL
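If explicit IDs are unavoidable (for example when loading fixed reference rows), one option is to resynchronise the sequence afterwards. A minimal sketch, assuming the core.tag table from the question; pg_get_serial_sequence looks up the sequence name so it does not have to be spelled out by hand:

```sql
-- Insert rows with fixed IDs.
INSERT INTO core.tag(id, title) VALUES (1, 'known tag');

-- Move the sequence to the highest ID in use,
-- so that the next nextval() returns max(id) + 1.
SELECT setval(pg_get_serial_sequence('core.tag', 'id'),
              (SELECT max(id) FROM core.tag));

-- Both of these now take their ID from the sequence.
INSERT INTO core.tag(title) VALUES ('unknown tag');
INSERT INTO core.tag(id, title) VALUES (DEFAULT, 'another tag');
```

The DEFAULT keyword in the last statement is a way to keep the id column in the column list while still letting the sequence supply the value.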

Related

Postgres: CREATE TABLE IF NOT EXISTS ⇒ 23505

I start multiple programs that all more or less simultaneously do
CREATE TABLE IF NOT EXISTS log (...);
Sometimes this works perfectly. But most of the time, one or more of the programs crash with the error:
23505: duplicate key value violates unique constraint "pg_class_relname_nsp_index".
Can somebody explain to me how the actual Christmas tree CREATE TABLE IF NOT EXISTS is giving me an error message about the table already existing? Isn't that, like, the entire point of this command?? What is going on here? More to the point, how do I get it to actually work correctly?
After this command, there's also a couple of CREATE INDEX IF NOT EXISTS commands. These occasionally fail in a similar way too. But most of the time, it's the CREATE TABLE statement that fails.
You can reproduce this with 2 parallel sessions:
First session:
begin;
create table if not exists log(id bigint generated always as identity, t timestamp with time zone, message text not null);
Notice that the first session has not committed yet, so the table does not really exist for other sessions.
Second session:
begin;
create table if not exists log(id bigint generated always as identity, t timestamp with time zone, message text not null);
The second session will now block, because the name "log" is reserved by the first session, but it is not yet known whether the transaction that reserved it will be committed or not.
Then, when you commit the first session, the second will fail:
ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index"
DETAIL: Key (relname, relnamespace)=(log_id_seq, 2200) already exists.
To avoid this, make sure that the check for the existence of the table is done only after a common advisory lock has been taken:
begin;
select pg_advisory_xact_lock(12345);
-- any bigint value, but has to be the same for all parallel sessions
create table if not exists log(id bigint generated always as identity, t timestamp with time zone, message text not null);
commit;
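If taking an advisory lock is not practical, another option might be to catch the error and carry on. This is an untested sketch of that alternative, not part of the approach above; it assumes the same log table and that the concurrent failure surfaces as unique_violation (23505, as in the error shown) or duplicate_table:

```sql
DO $$
BEGIN
    CREATE TABLE IF NOT EXISTS log(
        id bigint GENERATED ALWAYS AS IDENTITY,
        t timestamp with time zone,
        message text NOT NULL
    );
EXCEPTION
    WHEN unique_violation OR duplicate_table THEN
        -- Another session created the table first; treat that as success.
        NULL;
END
$$;
```

The advisory lock has the advantage of avoiding the error entirely, whereas this version only hides it after the fact.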

Unable to insert table in Postgres due to sequence being out of order

I have a table called person with primary key on id;
I am trying to insert into this table with:
insert into person (first_name, last_name, email, gender, date_of_birth, country_of_birth) values ('Ellissa', 'Gordge', 'ggordge0#gnu.org', 'Male', '2022-03-19', 'Fiji');
There should not be any ID constraint being violated, since id is a BIGSERIAL, yet I am getting a duplicate key error.
It says Key id=(8) already exists, and the ID is incrementing on each attempt to run this command. How can the ID already exist? And why is it not incrementing from the bottom of the list?
If I specify the id in the insert statement, with a number which I know is unique, it works. I just don't understand why it is not doing this automatically, since I am using BIGSERIAL.
Your sequence apparently is out of sync with the values in the column. This can happen when someone did INSERT INTO person(id, …) VALUES (8, …) (or maybe a CSV COPY import, or anything else that provided values for the id column instead of using the default), or when someone reset the sequence after having inserted data.
ALTER SEQUENCE cannot fix this in a single statement, because RESTART WITH does not accept a subquery; the following is rejected with a syntax error:
ALTER SEQUENCE person_id_seq RESTART WITH (SELECT MAX(id)+1 FROM person);
Instead, set the sequence value with setval:
SELECT setval('person_id_seq', MAX(id)+1) FROM person;
Also notice that it is recommended to use an identity column rather than a serial one to avoid this kind of problem.
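As a rough illustration of why an identity column helps, here is a sketch using a hypothetical person_demo table (identity columns need PostgreSQL 10 or later). With GENERATED ALWAYS, an accidental explicit ID is refused rather than silently desynchronising the sequence:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE person_demo (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    first_name text NOT NULL
);

-- The ID comes from the identity sequence.
INSERT INTO person_demo (first_name) VALUES ('Ellissa');

-- Rejected with an error, because id is GENERATED ALWAYS:
-- INSERT INTO person_demo (id, first_name) VALUES (8, 'Ellissa');

-- An explicit ID is still possible, but has to be requested deliberately.
INSERT INTO person_demo (id, first_name)
OVERRIDING SYSTEM VALUE
VALUES (8, 'Gordge');
```

Note that the override in the last statement still does not advance the underlying sequence, so it would need a setval afterwards just like a serial column.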
SELECT pg_catalog.setval(pg_get_serial_sequence('table_name', 'id'), MAX(id)) FROM table_name;
This should kickstart your sequence table back in sync, which should fix everything. Make sure to change 'table_name' to the actual name. Cheers!

Postgres: Can I bypass the error "cannot insert into generated column" using a PostgreSQL INSTEAD OF INSERT rule?

I know this isn't pretty but it would be helpful to bypass the error for insert into a generated column in Postgres. Let's say, we have a table like so:
create table testing (
id int primary key,
fullname_enc bytea,
fullname text generated always as (pgp_sym_decrypt(fullname_enc, 'key')) stored
);
A query like the following returns the expected error: ERROR: cannot insert into column "fullname" DETAIL: Column "fullname" is a generated column.
insert into testing(id, fullname) values (3, 'John Doe');
I want to create a rule on this table on INSERTs like:
create rule encrypter as on insert to testing DO INSTEAD insert into testing (id, fullname_enc) values (new.id, pgp_sym_encrypt(new.fullname, 'key'));
Since we rewrite the query, I was naively hoping this would not result in the error from the engine, but it still does. Any idea how this could be achieved?
The reason for asking this is migration to PostgreSQL 12.
This cannot be achieved, and if it could be achieved somehow, that would be a bug that needs to be fixed. Otherwise, restoring from a dump would change the values.
I think that what you need is a BEFORE trigger that sets fullname.
I hope that this is a mock example and not something that is intended to improve security.

How to check in postgres that lastval() is defined

I'm creating a mini DB handling class and, for convenience, I want my generated INSERT queries to always try to return the last insert ID, or null if not applicable. I tried to use:
INSERT INTO ... RETURNING lastval();
It works fine with tables that have a sequence, but if the table has no sequence-backed key field, the statement inserts nothing and fails with the error: lastval is not yet defined in this session.
How can I check, in the RETURNING clause, whether a sequence generator was used in this session, or otherwise silently return the last ID or null without raising an error?
Note: for usability purposes, I want to avoid directly specifying the PK field name.

postgres autoincrement not updated on explicit id inserts

I have the following table in postgres:
CREATE TABLE "test" (
"id" serial NOT NULL PRIMARY KEY,
"value" text
);
I am doing following insertions:
insert into test (id, value) values (1, 'alpha');
insert into test (id, value) values (2, 'beta');
insert into test (value) values ('gamma');
In the first 2 inserts I am explicitly mentioning the id. However the table's auto increment pointer is not updated in this case. Hence in the 3rd insert I get the error:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (id)=(1) already exists.
I never faced this problem in MySQL with either the MyISAM or the InnoDB engine. Explicit or not, MySQL always updates the autoincrement pointer based on the max row id.
What is the workaround for this problem in postgres? I need it because I want a tighter control for some ids in my table.
UPDATE:
I need it because for some values I need to have a fixed id. For other new entries I don't mind creating new ones.
I think it may be possible by manually incrementing the nextval pointer to max(id) + 1 whenever I am explicitly inserting the ids. But I am not sure how to do that.
That's how it's supposed to work: nextval('test_id_seq') is only called when the system needs a value for this column and you have not provided one. If you provide a value, no such call is performed and consequently the sequence is not "updated".
You could work around this by manually setting the value of the sequence after your last insert with explicitly provided values:
SELECT setval('test_id_seq', (SELECT MAX(id) from "test"));
The name of the sequence is autogenerated and by default is tablename_columnname_seq.
In the recent version of Django, this topic is discussed in the documentation:
Django uses PostgreSQL’s SERIAL data type to store auto-incrementing
primary keys. A SERIAL column is populated with values from a sequence
that keeps track of the next available value. Manually assigning a
value to an auto-incrementing field doesn’t update the field’s
sequence, which might later cause a conflict.
Ref: https://docs.djangoproject.com/en/dev/ref/databases/#manually-specified-autoincrement-pk
There is also a management command, manage.py sqlsequencereset app_label ..., that generates SQL statements for resetting sequences for the given app name(s).
Ref: https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-sqlsequencereset
For example these SQL statements were generated by manage.py sqlsequencereset my_app_in_my_project:
BEGIN;
SELECT setval(pg_get_serial_sequence('"my_project_aaa"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_aaa";
SELECT setval(pg_get_serial_sequence('"my_project_bbb"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_bbb";
SELECT setval(pg_get_serial_sequence('"my_project_ccc"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_ccc";
COMMIT;
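The third argument in these generated statements is setval's is_called flag, which is what lets them cope with an empty table (coalesce falls back to 1 and the flag is false, so the first generated ID is 1). A small sketch of the difference, using a hypothetical sequence named demo_seq:

```sql
-- Hypothetical sequence for illustration only.
CREATE SEQUENCE demo_seq;

-- Two-argument form: is_called defaults to true,
-- so 10 counts as already handed out.
SELECT setval('demo_seq', 10);
SELECT nextval('demo_seq');            -- 11

-- Three-argument form with false: 10 is marked as not yet handed out.
SELECT setval('demo_seq', 10, false);
SELECT nextval('demo_seq');            -- 10
```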
It can be done automatically using a trigger. This way you are sure that the largest value is always used as the next default value.
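CREATE OR REPLACE FUNCTION set_serial_id_seq()
RETURNS trigger AS
$BODY$
BEGIN
-- TG_ARGV[0] is the name of the serial column, passed in CREATE TRIGGER.
-- Set the sequence <table>_<column>_seq to the current maximum of that column.
-- %I quotes the column and table names as identifiers.
EXECUTE format('SELECT setval(''%s_%s_seq'', (SELECT MAX(%I) FROM %I));',
TG_TABLE_NAME,
TG_ARGV[0],
TG_ARGV[0],
TG_TABLE_NAME);
-- The return value of a statement-level AFTER trigger is ignored.
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql;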
CREATE TRIGGER set_mytable_id_seq
AFTER INSERT OR UPDATE OR DELETE
ON mytable
FOR EACH STATEMENT
EXECUTE PROCEDURE set_serial_id_seq('mytable_id');
The function can be reused for multiple tables. Change "mytable" to the table of interest.
For more info regarding triggers:
https://www.postgresql.org/docs/9.1/plpgsql-trigger.html
https://www.postgresql.org/docs/9.1/sql-createtrigger.html