When does Postgres check unique constraints?

I have a column sort_order with a unique constraint on it.
The following SQL fails on Postgres 9.5:
UPDATE test
SET sort_order = sort_order + 1;
-- [23505] ERROR: duplicate key value violates unique constraint "test_sort_order_key"
-- Detail: Key (sort_order)=(2) already exists.
Clearly, if the sort_order values were unique before the update, they will still be unique after it. Why does the statement fail?
The same statement works fine on Oracle and MS SQL, but also fails on MySQL and SQLite.
Here's the complete setup code for a SQL fiddle:
DROP TABLE IF EXISTS test;
CREATE TABLE test (
val TEXT,
sort_order INTEGER NOT NULL UNIQUE
);
INSERT INTO test
VALUES ('A', 1), ('B', 2);

Postgres checks IMMEDIATE constraints at a different time than the SQL standard proposes: per row, rather than at the end of the statement.
Specifically, the documentation for SET CONSTRAINTS states:
NOT NULL and CHECK constraints are always checked immediately when a row is inserted or modified (not at the end of the statement). Uniqueness and exclusion constraints that have not been declared DEFERRABLE are also checked immediately.
Postgres executes this query with a plan that produces a temporary collision on sort_order, and the IMMEDIATE check fails right away. Note that this means that for the same schema and the same data, the same query may work or fail depending on the execution plan.
You'll have to make the constraint DEFERRABLE or DEFERRABLE INITIALLY DEFERRED, which delays verification of the constraint until the end of the transaction or up to the point where a statement SET CONSTRAINTS ... IMMEDIATE is executed.
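For example, a minimal sketch of recreating the constraint as deferrable (the constraint name is taken from the error message above):
ALTER TABLE test DROP CONSTRAINT test_sort_order_key;
ALTER TABLE test ADD CONSTRAINT test_sort_order_key
    UNIQUE (sort_order) DEFERRABLE INITIALLY DEFERRED;
-- The UPDATE now succeeds; uniqueness is verified at COMMIT
-- (or earlier, if SET CONSTRAINTS ... IMMEDIATE is issued).
UPDATE test SET sort_order = sort_order + 1;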
Addendum from @HansGinzel's comment:
for COPY, it seems that (even IMMEDIATE) constraints are tested only after all data are COPYied.

Related

Unexpected creation of duplicate unique constraints in Postgres

I am writing an idempotent schema change script for a Postgres 12 database. However, I noticed that if I include IF NOT EXISTS in an ADD COLUMN statement, then even if the column already exists, it adds a duplicate index for the uniqueness constraint that already exists. Simple example:
-- set up base table
CREATE TABLE IF NOT EXISTS test_table
(id SERIAL PRIMARY KEY
);
-- statement intended to be idempotent
ALTER TABLE test_table
ADD COLUMN IF NOT EXISTS name varchar(50) UNIQUE;
Running this script creates a new index test_table_name_key[n] each time it is run. I can't find anything about this in the Postgres documentation and don't understand why it is allowed to happen. If I break it into two parts, e.g.:
ALTER TABLE test_table
ADD COLUMN IF NOT EXISTS name varchar(50);
ALTER TABLE test_table
ADD CONSTRAINT test_table_name_key UNIQUE (name);
Then the transaction fails because Postgres rejects the creation of a constraint that already exists (which I can then catch in a DO ... EXCEPTION block; see the sketch below). As far as I can tell this is because this approach forces me to give the constraint a name. This contrasts with ALTER COLUMN ... SET NOT NULL, which can be run multiple times without error or side effects as far as I can tell.
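Such a guard might look roughly like this (a sketch; the exact SQLSTATE the duplicate surfaces as may vary, so both likely conditions are caught):
DO $$
BEGIN
    ALTER TABLE test_table
        ADD CONSTRAINT test_table_name_key UNIQUE (name);
EXCEPTION
    -- the existing constraint/index may be reported as either condition
    WHEN duplicate_table OR duplicate_object THEN
        RAISE NOTICE 'constraint test_table_name_key already exists, skipping';
END
$$;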
Question: why does it add a duplicate unique constraint and are there any problems with having multiple identical indexes on a table column? (I think this is a subtle 'error' and only spotted it by chance so am concerned it may arise in a production situation)
You can create multiple unique constraints on the same column as long as they have different names, simply because there is nothing in the PostgreSQL code that forbids that. Each unique constraint will create a unique index with the same name, because that is how unique constraints are implemented.
This can be a valid use case: for example, if the index is bloated, you could create a new constraint and then drop the old one.
But normally, it is useless and does harm, because each index will make data modifications on the table slower.
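To spot duplicates that this behavior may already have created, you can query the system catalogs (a sketch, using the table name from the question):
-- duplicates show up as test_table_name_key, test_table_name_key1, ...
SELECT conname, pg_get_constraintdef(oid)
FROM pg_constraint
WHERE conrelid = 'test_table'::regclass
  AND contype = 'u';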

When does a NOT NULL constraint run and can it wait until the transaction is going to commit?

I have a table being populated during an ETL routine a column at a time.
The mandatory columns (which are foreign keys) are set first, all at once, so the initial state of the table is:
 key | fkey | a
-----+------+------
   1 |    1 | null
After processing the A values, I insert them using SQLAlchemy with the PostgreSQL dialect for a simple upsert:
upsert = sqlalchemy.sql.text("""
INSERT INTO table
(key, a)
VALUES (:key, :a)
ON CONFLICT (key) DO UPDATE SET
a = EXCLUDED.a
""")
But this fails because it apparently tried to insert the fkey value as null.
psycopg2.IntegrityError: null value in column "fkey" violates not-null constraint
DETAIL: Failing row contains (1, null, 0).
Is the syntax really correct? Why is it failing? Does SQLAlchemy have any part in this error, or is it passing the SQL through correctly?
My suspicion is that the constraint checks happen before the CONFLICT resolution kicks in: although the statement would be safe in practice (fkey is guaranteed to be non-null beforehand and won't be overwritten), the constraint check only looks at the tentative insert row and the table constraints.
This is a current documented limitation of PostgreSQL, an area where it breaks the spec.
Currently, only UNIQUE, PRIMARY KEY, REFERENCES (foreign key), and EXCLUDE constraints are affected by this setting. NOT NULL and CHECK constraints are always checked immediately when a row is inserted or modified (not at the end of the statement). Uniqueness and exclusion constraints that have not been declared DEFERRABLE are also checked immediately.
You can't defer a NOT NULL constraint, and it seems you already understand the default behavior, as seen here:
CREATE TABLE foo ( a int NOT NULL, b int UNIQUE, c int );
INSERT INTO foo (a,b,c) VALUES (1,2,3);
INSERT INTO foo (b,c) VALUES (2,3);
ERROR: null value in column "a" violates not-null constraint
DETAIL: Failing row contains (null, 2, 3).
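Since NOT NULL cannot be deferred, the practical fix for the upsert above is to supply a non-null fkey in the VALUES list so that the tentative row passes the check; on conflict only a is updated, so the stored fkey is untouched. A sketch, assuming the fkey value is available as a bind parameter (table stands in for the real table name, as in the question):
INSERT INTO table (key, fkey, a)
VALUES (:key, :fkey, :a)
ON CONFLICT (key) DO UPDATE SET
    a = EXCLUDED.a
Alternatively, if the row is guaranteed to exist by the time a is processed, a plain UPDATE avoids the problem entirely.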

Can I get all foreign key violations in postgres?

I have a table with two columns with foreign key constraints, for example:
CREATE TABLE example
(
id integer PRIMARY KEY,
f1 integer REFERENCES example(id),
f2 integer REFERENCES example(id)
);
If I then perform the insert:
insert into example (id, f1, f2) values (1, 2, 2);
I will obviously get an error, but only for the first failed constraint:
ERROR: insert or update on table "example" violates foreign key constraint "example_f1_fkey"
DETAIL: Key (f1)=(2) is not present in table "example".
My question is: Is it possible to configure postgres so it returns an error with both of the failed key constraints?
Is it possible to configure postgres so it returns an error with both of the failed key constraints?
No, it isn't. The first FK failure aborts the transaction so no further checks are run.
It would be interesting to be able to capture all violations but there's no way to do that in current versions (true in 9.3, at least).
To do it you'd need to be able to selectively change ERROR level reports for CHECK constraints, foreign key constraint checks, etc into WARNINGs that also set a flag that'd force the transaction to abort at the end of the current statement. That might not be too hard to do technically, but it's certainly going to involve a chunk of work on the PostgreSQL source code.
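In the meantime, you can collect the violations yourself before inserting, by checking each referencing column of the candidate row against the target table. A sketch for the example above:
-- reports every referencing column whose target id is missing
SELECT 'f1' AS col, v.f1 AS missing_id
FROM (VALUES (1, 2, 2)) AS v(id, f1, f2)
WHERE NOT EXISTS (SELECT 1 FROM example e WHERE e.id = v.f1)
UNION ALL
SELECT 'f2', v.f2
FROM (VALUES (1, 2, 2)) AS v(id, f1, f2)
WHERE NOT EXISTS (SELECT 1 FROM example e WHERE e.id = v.f2);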

postgres autoincrement not updated on explicit id inserts

I have the following table in postgres:
CREATE TABLE "test" (
"id" serial NOT NULL PRIMARY KEY,
"value" text
)
I am doing following insertions:
insert into test (id, value) values (1, 'alpha')
insert into test (id, value) values (2, 'beta')
insert into test (value) values ('gamma')
In the first 2 inserts I am explicitly specifying the id. However, the table's auto-increment counter is not updated in this case. Hence in the 3rd insert I get the error:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (id)=(1) already exists.
I never faced this problem in MySQL with either the MyISAM or InnoDB engine. Explicit or not, MySQL always updates the auto-increment counter based on the max row id.
What is the workaround for this problem in Postgres? I need it because I want tighter control over some ids in my table.
UPDATE:
I need it because for some values I need a fixed id; for other new entries I don't mind generating new ones.
I think it may be possible to manually advance the sequence to max(id) + 1 whenever I explicitly insert ids, but I am not sure how to do that.
That's how it's supposed to work: nextval('test_id_seq') is only called when the system needs a value for this column and you have not provided one. If you provide a value, no such call is made, and consequently the sequence is not "updated".
You could work around this by manually setting the value of the sequence after your last insert with explicitly provided values:
SELECT setval('test_id_seq', (SELECT MAX(id) from "test"));
The name of the sequence is autogenerated and follows the pattern tablename_columnname_seq (truncated if necessary to fit the identifier length limit).
In the recent version of Django, this topic is discussed in the documentation:
Django uses PostgreSQL’s SERIAL data type to store auto-incrementing primary keys. A SERIAL column is populated with values from a sequence that keeps track of the next available value. Manually assigning a value to an auto-incrementing field doesn’t update the field’s sequence, which might later cause a conflict.
Ref: https://docs.djangoproject.com/en/dev/ref/databases/#manually-specified-autoincrement-pk
There is also the management command manage.py sqlsequencereset app_label ..., which generates SQL statements for resetting sequences for the given app name(s).
Ref: https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-sqlsequencereset
For example these SQL statements were generated by manage.py sqlsequencereset my_app_in_my_project:
BEGIN;
SELECT setval(pg_get_serial_sequence('"my_project_aaa"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_aaa";
SELECT setval(pg_get_serial_sequence('"my_project_bbb"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_bbb";
SELECT setval(pg_get_serial_sequence('"my_project_ccc"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_ccc";
COMMIT;
It can be done automatically using a trigger. This way you are sure that the largest value is always used as the next default value.
CREATE OR REPLACE FUNCTION set_serial_id_seq()
RETURNS trigger AS
$BODY$
BEGIN
    -- TG_ARGV[0] is the serial column's name; the sequence is assumed to
    -- follow the default naming pattern <table>_<column>_seq, with simple
    -- lower-case identifiers throughout.
    EXECUTE format('SELECT setval(''%s_%s_seq'', (SELECT MAX(%s) FROM %s));',
                   TG_TABLE_NAME,
                   TG_ARGV[0],
                   TG_ARGV[0],
                   TG_TABLE_NAME);
    RETURN NULL;  -- return value is ignored for AFTER ... FOR EACH STATEMENT triggers
END;
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER set_mytable_id_seq
AFTER INSERT OR UPDATE OR DELETE
ON mytable
FOR EACH STATEMENT
EXECUTE PROCEDURE set_serial_id_seq('id');
The function can be reused for multiple tables: change "mytable" to the table of interest and pass the name of its serial column ('id' here) as the trigger argument.
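With the trigger in place, a quick check (a sketch, assuming mytable mirrors the test table above with an id serial primary key and a value column):
INSERT INTO mytable (id, value) VALUES (1, 'alpha');  -- explicit id; the trigger resets the sequence
INSERT INTO mytable (value) VALUES ('beta');          -- now gets id 2 instead of failing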
For more info regarding triggers:
https://www.postgresql.org/docs/9.1/plpgsql-trigger.html
https://www.postgresql.org/docs/9.1/sql-createtrigger.html

ERROR: duplicate key value violates unique constraint in PostgreSQL

I am getting a unique constraint error in PostgreSQL while updating a table. I have a table with 3 columns and a unique constraint on one of the columns (internal_state). This table will only ever hold two rows, and the values for internal_state are 1 and 0.
The update query is
UPDATE backfeed_state SET internal_state = internal_state - 1
WHERE EXISTS (SELECT 1 FROM backfeed_state d2 WHERE d2.internal_state = 1 )
Running this query is fine in MS SQL Server, but in Postgres it throws a unique constraint error.
My understanding is that SQL Server checks the constraint only after updating all the rows, whereas Postgres checks it after updating each row. So after updating the first row (internal_state from 1 to 0), Postgres checks the constraint and throws an error before the second row is even updated.
Is there a way to avoid this situation?
From http://www.postgresql.org/docs/9.0/static/sql-createtable.html, section "Non-deferred Uniqueness Constraints": "When a UNIQUE or PRIMARY KEY constraint is not deferrable, PostgreSQL checks for uniqueness immediately whenever a row is inserted or modified."
Changing your unique constraint to deferrable will hold off checking until the end of the update. Either use SET CONSTRAINTS to defer the check per transaction (which is annoyingly repetitive) or drop and re-create the constraint with the DEFERRABLE option (I'm not aware of an ALTER construct to do that without dropping). For example:
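A sketch, assuming the existing constraint is named backfeed_state_internal_state_key (check \d backfeed_state for the real name):
ALTER TABLE backfeed_state DROP CONSTRAINT backfeed_state_internal_state_key;
ALTER TABLE backfeed_state ADD CONSTRAINT backfeed_state_internal_state_key
    UNIQUE (internal_state) DEFERRABLE;
-- DEFERRABLE alone moves the check to the end of each statement, so the
-- UPDATE now succeeds; SET CONSTRAINTS ... DEFERRED would push it to COMMIT.
UPDATE backfeed_state SET internal_state = internal_state - 1
WHERE EXISTS (SELECT 1 FROM backfeed_state d2 WHERE d2.internal_state = 1);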