Does adding a new constraint in PostgreSQL check the rows added before?

Let's suppose I have a table called Clients(ID, Name, Phone) which has several rows in it, some of them with an empty «Phone» column.
If I decide to add a NOT NULL constraint on the «Phone» column of that table, will PostgreSQL check the rows that are already in the table, or will the constraint only apply to rows added after its declaration?

I think the documentation is pretty clear:
SET/DROP NOT NULL
These forms change whether a column is marked to allow null values or
to reject null values. You can only use SET NOT NULL when the column
contains no null values.
So, using this form, you cannot add such a constraint without checking the previous values.
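For example, with the Clients table from the question (a minimal sketch; the exact error text is approximate):
ALTER TABLE Clients ALTER COLUMN Phone SET NOT NULL;
-- ERROR:  column "phone" of relation "clients" contains null values
The statement fails as long as any existing row still has a NULL Phone.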
If you use ADD table_constraint instead, you can do the same thing with a CHECK constraint:
ADD table_constraint [ NOT VALID ]
This form adds a new constraint to a table using the same syntax as
CREATE TABLE, plus the option NOT VALID, which is currently only
allowed for foreign key and CHECK constraints. If the constraint is
marked NOT VALID, the potentially-lengthy initial check to verify that
all rows in the table satisfy the constraint is skipped. The
constraint will still be enforced against subsequent inserts or
updates (that is, they'll fail unless there is a matching row in the
referenced table, in the case of foreign keys; and they'll fail unless
the new row matches the specified check constraints). But the database
will not assume that the constraint holds for all rows in the table,
until it is validated by using the VALIDATE CONSTRAINT option.
So you cannot skip that check for a NOT NULL constraint added with ALTER TABLE. You can, however, do essentially the same thing with a CHECK constraint and bypass the check of existing rows using NOT VALID. Otherwise, the check always takes place.
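A minimal sketch of that CHECK-based workaround, again with the Clients table (the constraint name is my own choice):
ALTER TABLE Clients
ADD CONSTRAINT clients_phone_not_null CHECK (Phone IS NOT NULL) NOT VALID;
-- existing NULLs are not checked, but new and updated rows must satisfy the constraint
-- later, once the old rows have been cleaned up:
ALTER TABLE Clients VALIDATE CONSTRAINT clients_phone_not_null;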

Related

Unexpected creation of duplicate unique constraints in Postgres

I am writing an idempotent schema change script for a Postgres 12 database. However, I noticed that if I include IF NOT EXISTS in an ADD COLUMN statement, then even if the column already exists, a duplicate index is added for the uniqueness constraint that already exists. Simple example:
-- set up base table
CREATE TABLE IF NOT EXISTS test_table
(id SERIAL PRIMARY KEY
);
-- statement intended to be idempotent
ALTER TABLE test_table
ADD COLUMN IF NOT EXISTS name varchar(50) UNIQUE;
Running this script creates a new index test_table_name_key[n] each time it is run. I can't find anything about this in the Postgres documentation and don't understand why it is allowed to happen. If I break it into two parts, e.g.:
ALTER TABLE test_table
ADD COLUMN IF NOT EXISTS name varchar(50);
ALTER TABLE test_table
ADD CONSTRAINT test_table_name_key UNIQUE (name);
Then the transaction fails because Postgres rejects the creation of a constraint which already exists (which I can then catch in a DO ... EXCEPTION block). As far as I can tell, this is because with this approach I am forced to give the constraint a name. This contrasts with ALTER COLUMN ... SET NOT NULL, which can be run multiple times without error or side effects, as far as I can tell.
Question: why does it add a duplicate unique constraint, and are there any problems with having multiple identical indexes on a table column? (I think this is a subtle 'error' and I only spotted it by chance, so I am concerned it may arise in a production situation.)
You can create multiple unique constraints on the same column as long as they have different names, simply because there is nothing in the PostgreSQL code that forbids that. Each unique constraint will create a unique index with the same name, because that is how unique constraints are implemented.
This can be a valid use case: for example, if the index is bloated, you could create a new constraint and then drop the old one.
But normally, it is useless and does harm, because each index will make data modifications on the table slower.
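For the idempotency problem itself, one option is to guard the ADD CONSTRAINT with a catalog check instead of relying on IF NOT EXISTS on the column (a sketch; the constraint name mirrors the auto-generated one from the question, but any name works):
DO $$
BEGIN
  IF NOT EXISTS (
    SELECT 1
    FROM pg_constraint
    WHERE conrelid = 'test_table'::regclass
      AND conname = 'test_table_name_key'
  ) THEN
    ALTER TABLE test_table ADD CONSTRAINT test_table_name_key UNIQUE (name);
  END IF;
END
$$;
Run repeatedly, this adds the unique constraint (and its index) at most once.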

Is it possible to access current column data on conflict

I want the following behaviour when inserting data (conflict on id):
if there is no model with the same id in the db, do INSERT
if there is an entry with the same id in the db and that entry is newer (updated_at field), do NOT UPDATE
if there is an entry with the same id in the db and that entry is older (updated_at field), do UPDATE
I'm using Ecto for that and want to work with constraints, but I cannot find an option to do so in the documentation. Pseudo code of the constraint could look like:
CHECK: NULL(current.updated_at) or incoming.updated_at > current.updated_at
Is such behaviour possible in Postgres?
PostgreSQL does not support CHECK constraints that reference table
data other than the new or updated row being checked. While a CHECK
constraint that violates this rule may appear to work in simple tests,
it cannot guarantee that the database will not reach a state in which
the constraint condition is false (due to subsequent changes of the
other row(s) involved). This would cause a database dump and reload to
fail. The reload could fail even when the complete database state is
consistent with the constraint, due to rows not being loaded in an
order that will satisfy the constraint. If possible, use UNIQUE,
EXCLUDE, or FOREIGN KEY constraints to express cross-row and
cross-table restrictions.
If what you desire is a one-time check against other rows at row
insertion, rather than a continuously-maintained consistency
guarantee, a custom trigger can be used to implement that. (This
approach avoids the dump/reload problem because pg_dump does not
reinstall triggers until after reloading data, so that the check will
not be enforced during a dump/reload.)
That should be simple using the WHERE clause of ON CONFLICT ... DO UPDATE:
INSERT INTO mytable (id, entry) VALUES (42, '2021-05-29 12:00:00')
ON CONFLICT (id)
DO UPDATE SET entry = EXCLUDED.entry
WHERE mytable.entry < EXCLUDED.entry;
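For reference, a sketch of how this maps onto the schema described in the question (table and column names are hypothetical):
CREATE TABLE models (
    id         integer PRIMARY KEY,
    updated_at timestamptz,
    payload    text
);
INSERT INTO models (id, updated_at, payload)
VALUES (1, now(), 'incoming data')
ON CONFLICT (id) DO UPDATE
SET updated_at = EXCLUDED.updated_at,
    payload    = EXCLUDED.payload
WHERE models.updated_at IS NULL
   OR models.updated_at < EXCLUDED.updated_at;
The IS NULL branch covers the NULL(current.updated_at) case from the pseudo code; when the stored row is newer, the WHERE clause is false and the update is silently skipped.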

Make a column NOT NULL in a large table without locking issues?

I want to change a column to NOT NULL:
ALTER TABLE "foos" ALTER "bar_id" SET NOT NULL
The "foos" table has almost 1 000 000 records. It does fairly low volumes of writes, but quite constantly. There are a lot of reads.
In my experience, changing a column in a big table to NOT NULL like this can cause downtime in the app, presumably because it leads to (b)locks.
I've yet to find a good explanation corroborating this, though.
And if it is true, what can I do to avoid it?
EDIT: The docs (via this comment) say:
Adding a column with a DEFAULT clause or changing the type of an existing column will require the entire table and its indexes to be rewritten.
I'm not sure if changing NULL counts as "changing the type of an existing column", but I believe I did have an index on the column the last time I saw this issue.
Perhaps removing the index, making the column NOT NULL, and then adding the index back would improve things?
I think you can do that using a CHECK constraint rather than SET NOT NULL.
ALTER TABLE foos
ADD CONSTRAINT id_not_null CHECK (bar_id IS NOT NULL) NOT VALID;
This will still require an ACCESS EXCLUSIVE lock on the table, but it is very quick because Postgres doesn't validate the constraint (so it doesn't have to scan the entire table). It already ensures that new or changed rows cannot put a NULL value into that column.
Then (after committing the alter table!) you can do:
ALTER TABLE foos VALIDATE CONSTRAINT id_not_null;
This does not require an ACCESS EXCLUSIVE lock and still allows access to the table.
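A quick way to see the effect (assuming the remaining columns of foos have defaults; the error text is approximate):
INSERT INTO foos (bar_id) VALUES (NULL);
-- ERROR:  new row for relation "foos" violates check constraint "id_not_null"
So new NULLs are rejected immediately after the first, fast ALTER TABLE, while the slow validation can happen later without blocking readers or writers.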

DB2 column constraint for inserting values within restricted length

Is there a constraint available in DB2 such that, when a column is restricted to a particular length, values are trimmed to the appropriate length before insertion? For example, if a column has been specified to be of length 5, inserting the value 'overflow' would store 'overf'.
Can a CHECK constraint be used here? My understanding of CHECK constraints is that they either allow or reject an insertion, but they cannot modify values to satisfy the condition.
A constraint isn't going to be able to do this.
A before insert trigger is normally the mechanism you'd use to modify data during an insert before it is actually placed in the table.
However, I'm reasonably sure it won't work in this case. You'd get an SQLCODE -404 (SQLSTATE 22001) "The Sql Statement specified contains a String that is too long." thrown before the trigger gets fired.
I see two possible options
1) Create a view over the table where the column is cast to a larger size. Then create an INSTEAD OF trigger on the view to substring the data during write.
2) Create and use a stored procedure that accepts a larger size and substrings the data then inserts it.
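A rough, untested sketch of option 1 with hypothetical names (DB2 syntax for a view plus an INSTEAD OF trigger):
CREATE TABLE items (code VARCHAR(5));
CREATE VIEW items_wide (code) AS
    SELECT CAST(code AS VARCHAR(100)) FROM items;
CREATE TRIGGER items_wide_ins
    INSTEAD OF INSERT ON items_wide
    REFERENCING NEW AS n
    FOR EACH ROW
    INSERT INTO items (code) VALUES (SUBSTR(n.code, 1, 5));
-- writes go through the view, so oversized values are silently truncated:
INSERT INTO items_wide (code) VALUES ('overflow');   -- stored as 'overf'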

Setting constraint for two unique fields in PostgreSQL

I'm new to Postgres. I wonder what the PostgreSQL way is to set a constraint on a pair of values (so that each pair would be unique). Should I create an INDEX on the bar and baz fields?
CREATE UNIQUE INDEX foo ON table_name(bar, baz);
If not, what is the right way to do that? Thanks in advance.
If each field needs to be unique unto itself, then create unique indexes on each field. If they need to be unique in combination only, then create a single unique index across both fields.
Don't forget to set each field NOT NULL if it should be. NULLs are never unique, so something like this can happen:
create table test (a int, b int);
create unique index test_a_b_unq on test (a,b);
insert into test values (NULL,1);
insert into test values (NULL,1);
and get no error, because the two NULLs do not count as duplicates.
You can do what you are already thinking of: create a unique constraint on both fields. A unique index will be created behind the scenes, and you will get the behavior you need. In addition, that information is visible through information_schema, so tools can infer from the metadata that the pair must be unique. I would recommend this option. You can also use triggers for this, but a unique constraint is far better suited to this specific requirement.
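A minimal sketch of that recommendation, using the names from the question (the constraint name is my own choice):
ALTER TABLE table_name
ADD CONSTRAINT table_name_bar_baz_key UNIQUE (bar, baz);
Behind the scenes this creates a unique index on (bar, baz), and the constraint is visible in information_schema.table_constraints.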