How to ensure validity of foreign keys in Postgres

Using Postgres 10.6
The issue:
Some data in my tables violates the foreign key constraints (not sure how). The constraints are ON DELETE CASCADE ON UPDATE CASCADE
On a pg_dump of the database, those foreign keys are dropped (due to being in an invalid state?)
A pg_restore is done into a blank database, which no longer has the foreign keys
The new database then has all its primary keys updated to values not already used in a second database. In the tables that had invalid data, the foreign keys do not get updated along with them, because the ON UPDATE CASCADE constraint is now missing.
A pg_dump of the new database is done, then the database is deleted
On a pg_restore into a second database which has the foreign key constraints, the data gets imported in an invalid state, and corrupts the new database.
What I want to do is this: every few hours (or once a day, depending on how long the query would take), verify that all the data in all the tables which have foreign keys is valid.
I have read about ALTER TABLE ... VALIDATE CONSTRAINT ..., but this wouldn't fix my issue, as the constraints are not currently marked as NOT VALID. I know I could run statements like:
DELETE FROM a WHERE a.b_id NOT IN (SELECT b.id FROM b);
However, I have 144 tables with foreign keys, so this would be rather tedious. I would also maybe not want to delete the data immediately, but rather log the issue and inform the user about a correction which will happen.
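For example, something along these lines (a sketch with the same placeholder tables a and b) would let me find and log the orphaned rows first instead of deleting them:
SELECT a.*
FROM a
LEFT JOIN b ON b.id = a.b_id
WHERE a.b_id IS NOT NULL
  AND b.id IS NULL;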
Of course, I'd like to know how the original corruption occurred, and prevent that; however at the moment I'm just trying to prevent it from spreading.
Example table:
CREATE TABLE dependencies (
...
from_task int references tasks(id) ON DELETE CASCADE ON UPDATE CASCADE NOT NULL,
to_task int references tasks(id) ON DELETE CASCADE ON UPDATE CASCADE NOT NULL,
...
);
Dependencies will end up with values for to_task and from_task which do not exist in the tasks table.
Note:
Have tried EXPLAIN ANALYZE; nothing odd.
pg_tablespace has just two records: pg_default and pg_global.
relforcerowsecurity and relispartition are both false on both tables.
Arguments to pg_dump (from a C++ call): arguments << "--file=" + fileName << "--username=" + connection.userName() << databaseName << "--format=c"

This is either an index (or table) corruption problem, or the constraint was created as NOT VALID to defer the validity check until later.
pg_dump will never silently "drop" a constraint — perhaps there was an error while restoring the dump that you didn't notice.
The proper fix is to clean up the data that violate the constraint and re-create it.
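A minimal sketch of that clean-up, using the dependencies/tasks example from the question (the constraint name dependencies_from_task_fkey is an assumption; it is simply the default name PostgreSQL would generate):
BEGIN;
ALTER TABLE dependencies DROP CONSTRAINT IF EXISTS dependencies_from_task_fkey;
DELETE FROM dependencies d
WHERE NOT EXISTS (SELECT 1 FROM tasks t WHERE t.id = d.from_task);
ALTER TABLE dependencies
    ADD CONSTRAINT dependencies_from_task_fkey
    FOREIGN KEY (from_task) REFERENCES tasks (id)
    ON DELETE CASCADE ON UPDATE CASCADE;
COMMIT;
Re-creating the constraint forces PostgreSQL to re-check every existing row, so any remaining orphans will show up as an error.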
If it is a data corruption problem, check your hardware.
There is no need to regularly check for data corruption; PostgreSQL is not in the habit of corrupting data by itself.
The best test would be to take a pg_dump regularly and see if restoring the dump causes any errors.
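That said, if you do want a periodic orphan check without hand-writing 144 queries, the checks can be generated from the system catalogs. A sketch that only handles single-column foreign keys:
SELECT format(
    'SELECT %L AS constraint_name, count(*) AS orphans
     FROM %s c
     WHERE c.%I IS NOT NULL
       AND NOT EXISTS (SELECT 1 FROM %s p WHERE p.%I = c.%I);',
    con.conname,
    con.conrelid::regclass, att.attname,
    con.confrelid::regclass, att2.attname, att.attname)
FROM pg_constraint con
JOIN pg_attribute att  ON att.attrelid  = con.conrelid  AND att.attnum  = con.conkey[1]
JOIN pg_attribute att2 ON att2.attrelid = con.confrelid AND att2.attnum = con.confkey[1]
WHERE con.contype = 'f'
  AND array_length(con.conkey, 1) = 1;
Run the generated statements from a scheduled job and alert on any non-zero orphan count.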

Related

Out of shared memory when deleting rows with lots of incoming foreign keys

I develop a multi-tenancy application where we have a single master schema to keep track of tenants, along with 99 app schemas to distribute load. Each of the 33 tables within each app schema also has a tenant column pointing to the master schema. This means there are 3,267 foreign keys pointing to the master schema's tenant id, and roughly 6,000 triggers associated with the tenant table.
Recently, I added a table and started getting this error in the teardown portion of our test suite where we delete the test tenant:
psycopg2.errors.OutOfMemory: out of shared memory
HINT: You might need to increase max_locks_per_transaction.
CONTEXT: SQL statement "SELECT 1 FROM ONLY "test2"."item" x WHERE $1 OPERATOR(pg_catalog.=) "tenant" FOR KEY SHARE OF x"
The error is raised for the query:
SET CONSTRAINTS ALL IMMEDIATE
Raising max_locks_per_transaction as suggested solves the problem, as does deleting some of the app schemas. The obvious solution here would be to reduce the number of redundant schemas or delete the foreign key constraints so we don't have to hold so many locks, but I'm curious if there's something else going on here.
I had imagined that only the rows to be deleted (those associated with the test tenant) would be locked, and so only that tenant's data would be affected. And anyway, by this point there is no data left pointing to the tenant row, so the locking is pretty much redundant in practice.
Update:
For more context, I'm not doing anything really fancy here. Below is a simplified example of what my schema and query look like:
CREATE SCHEMA master;
CREATE TABLE master.tenant (id uuid NOT NULL PRIMARY KEY);
CREATE SCHEMA app_00;
CREATE TABLE app_00.account (id uuid NOT NULL PRIMARY KEY, tenant uuid NOT NULL);
ALTER TABLE app_00.account ADD CONSTRAINT fk_tenant FOREIGN KEY (tenant) REFERENCES master.tenant (id) DEFERRABLE;
CREATE TABLE app_00.item (id uuid NOT NULL PRIMARY KEY, tenant uuid NOT NULL);
ALTER TABLE app_00.item ADD CONSTRAINT fk_tenant FOREIGN KEY (tenant) REFERENCES master.tenant (id) DEFERRABLE;
In reality I'm creating 33 tables in each of the schemas app_00..app_99. Now assume my database is populated with data; the query that is failing with the above error is:
DELETE FROM master.tenant WHERE id = 'some uuid';
With thousands of foreign keys referencing master.tenant (plus partitioning or inheritance, if any is involved), a single DELETE on that table has to check every referencing table. At any rate, your SQL statement has to touch many tables.
Now whenever PostgreSQL touches a table, it places a lock on it to avoid conflicting concurrent executions. If lots of tables are involved, the lock table, which is sized for max_connections * max_locks_per_transaction entries, can be exhausted.
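To see how many locks such a teardown actually takes, you can hold the transaction open and look at pg_locks from a second session (a rough diagnostic sketch, not from the original answer):
-- session 1: run the teardown but do not commit yet
BEGIN;
DELETE FROM master.tenant WHERE id = 'some uuid';
SET CONSTRAINTS ALL IMMEDIATE;  -- the deferred FK checks run here and take their locks

-- session 2: count the locks currently held (filter on pid if other sessions are active)
SELECT locktype, mode, count(*)
FROM pg_locks
GROUP BY locktype, mode
ORDER BY count(*) DESC;

-- session 1: release everything
ROLLBACK;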
The solution is simply to increase max_locks_per_transaction. Don't worry, there is no negative consequence to raising that parameter; only a little more shared memory is allocated during server startup.
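A sketch of raising the limit (the value 256 is just an example; the parameter only takes effect after a server restart):
ALTER SYSTEM SET max_locks_per_transaction = 256;
-- restart PostgreSQL, then confirm:
SHOW max_locks_per_transaction;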

Awkward/wrong PostgreSQL foreign-key definition

As a database developer, I got this notice when I tried to make a data-only dump of a PostgreSQL (10.1) database 'tlesson'.
Notice =>
pg_dump: NOTICE: there are circular foreign-key constraints on this table:
pg_dump: members
Dump command =>
$ pg_dump -U postgres -d translesson -a
A 'tlesson' table 'members' constraint =>
ALTER TABLE ONLY members
ADD CONSTRAINT friend_fk FOREIGN KEY (friend_id) REFERENCES members(member_id);
That is, 'friend_id' column refers own table's primary key as the foreign-key.
Should I drop the 'friend_fk' constraint to remove the notice I'm having?
If you always restore the full dump into an empty database then this isn't a problem, because the generated SQL (or pg_restore) creates the foreign keys only after all the data has been loaded.
However, if you only dump the data (or a single table) without the FKs, the import is only going to work if you manually drop the FK before restoring and re-create it afterwards.
The reason is that it's nearly impossible to generate INSERT statements in the correct order when you have circular references.
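A sketch of that workaround for the members table from the question (the actual restore step depends on how you load the dump, so it is shown only as a comment):
ALTER TABLE members DROP CONSTRAINT friend_fk;
-- restore the data-only dump here, e.g. with psql or pg_restore
ALTER TABLE ONLY members
    ADD CONSTRAINT friend_fk FOREIGN KEY (friend_id) REFERENCES members (member_id);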

Errors creating constraint trigger

Let me start by saying that I'm a Linux/Unix admin. That being said, my manager has tasked me with moving older PostgreSQL databases to a Red Hat server running PostgreSQL 8.4.20. I successfully moved a 7.2.1 db, but I'm running into issues moving a 7.4.20 db.
I use pg_dump -c filename and psql < filename. For the problematic db everything runs until I get to a CREATE CONSTRAINT TRIGGER statement. If I run it as it is in the file I get:
NOTICE: ignoring incomplete trigger group for constraint "" FOREIGN KEY data(ups) REFERENCES upsinfo(ups)
DETAIL: Found referenced table's DELETE trigger.
CREATE TRIGGER
If I run set schema 'pg_catalog'; I get:
ERROR: relation "upsinfo" does not exist
The tables (I think) involved are:
CREATE TABLE upsinfo (
ups text NOT NULL,
ipaddr inet,
rcomm text,
wcomm text,
reachable boolean,
managed boolean,
comments text,
region text
);
CREATE TABLE data (
date timestamp with time zone,
ups text,
mib text,
value text
);
The problem trigger statement:
CREATE CONSTRAINT TRIGGER "<unnamed>"
AFTER DELETE ON upsinfo
FROM data
NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW
EXECUTE PROCEDURE "RI_FKey_cascade_del"('<unnamed>', 'data', 'upsinfo', 'UNSPECIFIED', 'ups', 'ups');
I know that the RI_FKey_cascade_del function is defined differently in the different versions of pg_catalog. Note that search_path is set to ‘public, pg_catalog’ so I’m also confused why I have to set the schema.
Again I’m not a real PostgreSQL DBA so try to be kind.
Oof, those are really old postgres versions, including the version you're upgrading to (8.4 was released in 2009, and support ended in 2014).
The short answer is that, as long as upsinfo and data are being created and populated, you're probably fine, and good to go. But one of your foreign key relationships is broken.
The long answer, well, let me see if I can explain what is going on (or, at least, what I think is going on).
I'm guessing that the original table definition of data included something like FOREIGN KEY (ups) REFERENCES upsinfo (ups) ON DELETE CASCADE. That causes postgres to automatically create some constraint triggers: (1) every time there's a new row in data, make sure that its ups column matches an existing row in upsinfo, and (2) every time you delete a row from upsinfo, delete the corresponding rows in data, based on the matching ups value.
That (not very informative) error message can come up when the foreign key relationship doesn't work. In order for a foreign key to make sense, the referenced value needs to be unique -- there should be only one row in upsinfo for each distinct value of ups. In order for postgres to know that, there needs to be a unique index or primary key on upsinfo.ups.
In this case, one of a couple things could be breaking it:
There's no primary key or unique index on upsinfo.ups (postgres should not have allowed a foreign key, but may have in very old versions)
There used to be a unique index, but it hadn't properly enforced uniqueness, so it didn't get successfully imported (a bug, again likely from a very old version)
In either case, if that foreign key relationship is important, you can try to fix it once the import is complete. Start by trying to make a unique index on upsinfo.ups, and see if you have problems. If you do, resolve the duplicate entries, and try again till it works. Then issue something like:
ALTER TABLE data
ADD FOREIGN KEY (ups) REFERENCES upsinfo (ups) ON DELETE CASCADE;
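If creating that unique index fails, a query along these lines (a sketch of my own, not part of the dump) will show which ups values are duplicated and need to be cleaned up first:
SELECT ups, count(*)
FROM upsinfo
GROUP BY ups
HAVING count(*) > 1;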
Of course, if things are working, it's possible you don't need to fix the foreign key, in which case you're probably able to ignore those errors and just move forward.
Hope that helps, and good luck!
This seems to be part of an ON DELETE CASCADE foreign-key constraint. If I were you I would delete all such statements and replace them with a proper constraint definition on the target table.
Table definition should then look like this:
CREATE TABLE bookings (
boo_id serial NOT NULL,
boo_hotelid character varying NOT NULL,
boo_roomid integer NOT NULL,
CONSTRAINT pk_bookings
PRIMARY KEY (boo_id),
CONSTRAINT fk_bookings_boo_roomid
FOREIGN KEY (boo_roomid)
REFERENCES rooms (roo_id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
) WITHOUT OIDS;
And this part is what will internally create the trigger:
CONSTRAINT fk_bookings_boo_roomid
FOREIGN KEY (boo_roomid)
REFERENCES rooms (roo_id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
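Applied to the tables from the question, that would look roughly like this (a sketch; it assumes upsinfo.ups can be made the primary key first, and the constraint names are made up):
ALTER TABLE upsinfo
    ADD CONSTRAINT pk_upsinfo PRIMARY KEY (ups);
ALTER TABLE data
    ADD CONSTRAINT fk_data_ups FOREIGN KEY (ups)
    REFERENCES upsinfo (ups)
    ON UPDATE CASCADE ON DELETE CASCADE;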
But, to be honest, I don't understand upgrading to a version that is already unsupported. You do know that Postgres is at version 9.5 now, right?

Dropping primary key from a materialized view but unable to recreate it - why?

I have created a materialized view with fast refresh. It has a primary key (with using index) which I want to alter. I ran the following statement in sqlplus:
SQL> alter table
2 MV
3 drop constraint PK_MV;
Table altered.
SQL> alter table
2 MV
3 add constraint PK_MV primary key
4 (
5 A_ID
6 , B_ID
7 )
8 using index
9 tablespace IDX;
alter table
*
ERROR in line 1:
ORA-00955: name is already being used by existing object
It seems that the primary key PK_MV still exists. However, isn't it dropped by the first statement?
Oracle version is Enterprise Edition Release 10.2.0.5.0 - 64bit.
Oracle tends to do certain things in an odd way, out of pure spite, causing odd errors. To make things worse, when errors occur, it tends to give error messages that are anywhere from useless to outright misleading.
In your case, dropping the constraint PK_MV does not also drop the index behind it, so you are still left with a PK_MV index. Then, later, when you try to re-create the constraint, Oracle insists to also create an index for it, and it just won't stand the possibility that an index with that name might already exist.
To make matters worse, the error message does not give you any hints about the nature of the existing object, so it leaves you with the impression that the existing object is a constraint, since that's what you are trying to create, while in fact the existing object is an index, which you never dealt with, have no use for, and probably don't want to know anything about.
Ah, lovely Oracle. My condolences for having to use it.
So, try the following:
alter table MV drop constraint PK_MV cascade;
The cascade keyword will cause the index behind the constraint to also be dropped.
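If the leftover index does survive the drop, explicitly removing it by name should also clear the way for the ADD CONSTRAINT statement from the question (a sketch; it assumes the stray index really is named PK_MV and is not needed for anything else):
DROP INDEX PK_MV;
After that, the original ALTER TABLE MV ADD CONSTRAINT PK_MV ... USING INDEX TABLESPACE IDX statement should no longer raise ORA-00955.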

PostgreSql: duplicate pkey error when inserting a new records to a restored database's table

I used the pg_dump and psql commands to back up my production DB and restore it onto my development server.
Now when I try to simply insert a new record to one of my tables I get the following error message:
ERROR: duplicate key value violates unique constraint
"communication_methods_pkey" DETAIL: Key (id)=(13) already exists.
How come the id is already in use? Do I need to update something in order to get the id increment counter back on the right track?
It sounds like the sequences used to generate the primary key for each table are not at the correct value. It is interesting that pg_dump did not include a sequence setval at the end of it (I believe it is supposed to).
Postgres recommends the following process to correct sequences: https://wiki.postgresql.org/wiki/Fixing_Sequences
Essentially, it takes you through identifying all your sequences and creating a SQL script that sets each one to 1 more than the largest id already present in its table.
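For a single table, the fix looks roughly like this (a sketch using the communication_methods table from the error message and the approach from that wiki page):
SELECT setval(
    pg_get_serial_sequence('communication_methods', 'id'),
    COALESCE(MAX(id) + 1, 1),
    false)
FROM communication_methods;
After this, the next nextval() call returns a value one above the current maximum id, so new inserts no longer collide with the restored rows.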