How to edit a record that results in a uniqueness violation and echo the change to child tables? - postgresql

PostgreSQL 11.1
How can "Editing" of a record be transmitted to dependent records?
Summary of my issue:
The master table, disease, needs a unique constraint on the description column. This unique constraint is needed for foreign key ON UPDATE CASCADE to its children tables.
To allow for a temporary violation of the unique constraint, it must be made deferrable. BUT A DEFERRABLE CONSTRAINT CANNOT BE USED IN A FOREIGN KEY.
Here is the situation.
The database has 100+ tables (and keeps on growing).
Almost all information has been normalized, in that repeating groups of information have been moved to their own tables.
Following normalization, most tables are lists without duplication of records. Duplication of records within a table is not allowed.
All tables have a unique ID assigned to each record (in addition to a unique constraint placed on the record information).
Most tables are dependent on another table. The foreign keys reference the primary key of the table they are dependent on.
Most unique constraints involve a foreign key (which in turn references the primary key of the parent table).
So, assume the following schema:
CREATE TABLE phoenix.disease
(
recid integer NOT NULL DEFAULT nextval('disease_recid_seq'::regclass),
code text COLLATE pg_catalog."default",
description text COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT disease_pkey PRIMARY KEY (recid),
CONSTRAINT disease_code_unique UNIQUE (code)
DEFERRABLE,
CONSTRAINT disease_description_unique UNIQUE (description)
,
CONSTRAINT disease_description_check CHECK (description <> ''::text)
);
CREATE TABLE phoenix.dx
(
recid integer NOT NULL DEFAULT nextval('dx_recid_seq'::regclass),
disease_recid integer NOT NULL,
patient_recid integer NOT NULL,
CONSTRAINT pk_dx_recid PRIMARY KEY (recid),
CONSTRAINT dx_unique UNIQUE (tposted, patient_recid, disease_recid)
,
CONSTRAINT dx_disease_recid_fkey FOREIGN KEY (disease_recid)
REFERENCES phoenix.disease (recid) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE RESTRICT,
CONSTRAINT dx_patients FOREIGN KEY (patient_recid)
REFERENCES phoenix.patients (recid) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE RESTRICT
);
(Columns not involved in this question have been removed. :) )
There are many other child tables of disease with the same basic dependency on the disease table. Note that the primary key of the disease table is a foreign key in the dx table and that the dx table uses this foreign key in a unique constraint. Also note that the dx table is just one table in a long chain of table references. (That is, the dx table also has its primary key referenced by other tables.)
The problem: I wish to "edit" the contents of the parent disease record. By "edit", I mean:
change the data in the description column.
if the result of the change causes a duplication in the disease table, then one of the "duplicated" records will need to be deleted.
Herein lies my problem. There are many different tables that use the primary key of the disease table in their own unique constraint. If those tables ALSO have a foreign key reference to the duplicated record (in disease), then cascading the delete to those tables would be appropriate -- i.e., no duplication of records will occur.
However, if the child table does NOT have a reference to the "correct" record in the parent disease table, then simply deleting the record (by cascade) will result in loss of information.
Example:
Disease Table:
record 1: ID = 1 description = "ABC"
record 2: ID = 2 description = "DEF"
Dx Table:
record 5: ID = 5 refers to ID=1 of Disease Table.
Editing of record 1 in Disease table results in description becoming "DEF"
Disease Table:
record 1: ID = 1 "ABC" --> "DEF"
I have tried making the primary key of the disease table deferrable so that the "correct" ID can be "cascaded" to the child tables. This runs into the following problems:
A foreign key cannot reference a deferrable constraint. The error is: cannot use a deferrable unique constraint for referenced table "disease"
Additionally, the parent table (disease) has no way of knowing ahead of time whether its children already have a reference to the "correct" record (so that deletion is safe), or whether a child needs to change its own column data to reflect the new "correct" ID.
So, how can I allow a change in the parent table (disease) and notify the child tables to change their column values, and to delete rows within themselves should a duplicate record arise?
Lastly, I do not know today what future tables I will need. So I cannot "precode" into the parent table who its children are or will be.
Thank you for any help with this.
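To make the "edit that collapses two records" case concrete, here is a minimal sketch of merging disease 1 into disease 2 by hand, rather than renaming record 1 to "DEF". The tables are simplified, hypothetical stand-ins for phoenix.disease and phoenix.dx (the tposted column is left out, and the patient IDs are invented). It handles a single child table only, so it does not solve the "unknown future children" part of the question, and any tables that reference dx would need the same treatment one level down.

```sql
-- Simplified, hypothetical stand-ins for phoenix.disease and phoenix.dx.
CREATE TABLE disease (
    recid       integer PRIMARY KEY,
    description text NOT NULL UNIQUE CHECK (description <> '')
);

CREATE TABLE dx (
    recid         integer PRIMARY KEY,
    disease_recid integer NOT NULL
        REFERENCES disease (recid) ON UPDATE CASCADE ON DELETE RESTRICT,
    patient_recid integer NOT NULL,
    UNIQUE (patient_recid, disease_recid)
);

INSERT INTO disease VALUES (1, 'ABC'), (2, 'DEF');
-- Patient 100 has both diseases; patient 200 only has disease 1.
INSERT INTO dx VALUES (5, 1, 100), (6, 2, 100), (7, 1, 200);

BEGIN;

-- 1. Remove child rows that would become duplicates once re-pointed
--    (here: dx 5, because patient 100 already has a row for disease 2).
DELETE FROM dx AS d
WHERE d.disease_recid = 1
  AND EXISTS (SELECT 1
              FROM dx AS k
              WHERE k.disease_recid = 2
                AND k.patient_recid = d.patient_recid);

-- 2. Re-point the remaining child rows (here: dx 7) to the surviving record.
UPDATE dx SET disease_recid = 2 WHERE disease_recid = 1;

-- 3. Nothing references disease 1 any more, so ON DELETE RESTRICT allows this.
DELETE FROM disease WHERE recid = 1;

COMMIT;
```

Because the child rows are moved before the parent row is deleted, no constraint has to be deferred and no information is lost for patient 200.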

Related

PostgreSQL declarative partition - unique constraint on partitioned table must include all partitioning columns [duplicate]

I'm trying to create a partitioned table which refers to itself, creating a doubly-linked list.
CREATE TABLE test2 (
id serial NOT NULL,
category integer NOT NULL,
time timestamp(6) NOT NULL,
prev_event integer,
next_event integer
) PARTITION BY HASH (category);
Once I add primary key I get the following error.
alter table test2 add primary key (id);
ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: PRIMARY KEY constraint on table "test2" lacks column "category" which is part of the partition key.
Why does the unique constraint require all partitioning columns to be included?
EDIT: Now I understand why this is needed: https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS
Once I add PK with both columns it works.
alter table test2 add primary key (id, category);
But then adding the FK to itself doesn't work.
alter table test2 add foreign key (prev_event) references test2 (id) on update cascade on delete cascade;
ERROR: there is no unique constraint matching given keys for referenced table "test2"
Since PK is not just id but id-category I can't create FK pointing to id.
Is there any way to deal with this or am I missing something?
I would like to avoid using inheritance partitioning if possible.
EDIT2: It seems this is a known problem. https://www.reddit.com/r/PostgreSQL/comments/di5mbr/postgresql_12_foreign_keys_and_partitioned_tables/f3tsoop/
It seems that there is no straightforward solution: PostgreSQL simply doesn't support this as of v14. One option is to use triggers to enforce 'foreign key' behavior. Another is to use multi-column foreign keys. Both are far from optimal.
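The multi-column foreign key option could look roughly like the sketch below. It assumes PostgreSQL 12 or later (where foreign keys may reference partitioned tables) and, more importantly, that a linked event always lives in the same category as the row pointing to it, which the question does not state. Only prev_event is shown; next_event would follow the same pattern.

```sql
CREATE TABLE test2 (
    id         serial       NOT NULL,
    category   integer      NOT NULL,
    time       timestamp(6) NOT NULL,
    prev_event integer,
    next_event integer,
    -- The primary key must include the partition key column.
    PRIMARY KEY (id, category)
) PARTITION BY HASH (category);

-- Reference the full primary key. The row's own category is reused as the
-- second key column, so prev_event must belong to the same category
-- (an assumption of this sketch).
ALTER TABLE test2
    ADD FOREIGN KEY (prev_event, category)
    REFERENCES test2 (id, category)
    ON UPDATE CASCADE ON DELETE CASCADE;
```

With the default MATCH SIMPLE behavior, a row whose prev_event is NULL is not checked, so the first event in a list can still be stored.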

Postgresql Merging of two data

I have an inventory table with the following row data.
Row One -> Barcode : SH9025H36SP23,
Row Two -> Barcode : SH9025H36SP23N1
Both are referenced as foreign keys in other tables.
Row One is the old barcode, whereas Row Two is the new barcode.
How do I change Row One to Row Two without losing any references?
Thank you for taking your time to answer my question.
Simplified Table Structure for Inventory table
barcode varchar not null
name varchar not null
PRIMARY KEY (barcode)
Referenced by:
TABLE "act" CONSTRAINT "actkey" FOREIGN KEY (fkbarcode) REFERENCES inventory(barcode) ON UPDATE CASCADE ON DELETE RESTRICT
TABLE "do" CONSTRAINT "dokey" FOREIGN KEY (fkbarcode) REFERENCES inventory(barcode) ON UPDATE CASCADE ON DELETE RESTRICT
so on..
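Since both barcodes already exist as primary keys, updating Row One to Row Two's value would violate the primary key, so ON UPDATE CASCADE cannot do the merge by itself. One possible approach, sketched below, is to re-point the referencing rows and then delete the old inventory row. The table definitions are minimal stand-ins built from the structure above; the id columns, names, and sample rows are invented for illustration, and every other referencing table ("so on..") would need its own UPDATE.

```sql
-- Minimal stand-ins for the tables described above.
CREATE TABLE inventory (
    barcode varchar NOT NULL PRIMARY KEY,
    name    varchar NOT NULL
);
CREATE TABLE act (
    id        integer PRIMARY KEY,          -- hypothetical column
    fkbarcode varchar,
    CONSTRAINT actkey FOREIGN KEY (fkbarcode)
        REFERENCES inventory (barcode) ON UPDATE CASCADE ON DELETE RESTRICT
);
-- "do" is a reserved word in PostgreSQL, so it must be double-quoted.
CREATE TABLE "do" (
    id        integer PRIMARY KEY,          -- hypothetical column
    fkbarcode varchar,
    CONSTRAINT dokey FOREIGN KEY (fkbarcode)
        REFERENCES inventory (barcode) ON UPDATE CASCADE ON DELETE RESTRICT
);

INSERT INTO inventory VALUES ('SH9025H36SP23', 'old'), ('SH9025H36SP23N1', 'new');
INSERT INTO act  VALUES (1, 'SH9025H36SP23');
INSERT INTO "do" VALUES (1, 'SH9025H36SP23');

BEGIN;
-- Move every reference from the old barcode to the new one ...
UPDATE act  SET fkbarcode = 'SH9025H36SP23N1' WHERE fkbarcode = 'SH9025H36SP23';
UPDATE "do" SET fkbarcode = 'SH9025H36SP23N1' WHERE fkbarcode = 'SH9025H36SP23';
-- ... then the old row is unreferenced and ON DELETE RESTRICT permits the delete.
DELETE FROM inventory WHERE barcode = 'SH9025H36SP23';
COMMIT;
```

If a referencing table missed by the updates still points at the old barcode, the final DELETE fails and the whole transaction can be rolled back, so no reference is lost silently.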

Citus: How can I add self referencing table in distributed tables list

I'm trying to run create_distributed_table for the tables I need to shard, and almost all of them have a self-relation (parent/child),
but when I run SELECT create_distributed_table('table-name','id');
it throws the error "cannot create foreign key constraint".
Simple steps to reproduce:
CREATE TABLE TEST (
ID TEXT NOT NULL,
NAME CHARACTER VARYING(255) NOT NULL,
PARENT_ID TEXT
);
ALTER TABLE TEST ADD CONSTRAINT TEST_PK PRIMARY KEY (ID);
ALTER TABLE TEST ADD CONSTRAINT TEST_PARENT_FK FOREIGN KEY (PARENT_ID) REFERENCES TEST (ID);
ERROR
citus=> SELECT create_distributed_table('test','id');
ERROR: cannot create foreign key constraint
DETAIL: Foreign keys are supported in two cases, either in between two colocated tables including partition column in the same ordinal in the both tables or from distributed to reference tables
For the time being, it is not possible to shard a table on PostgreSQL without dropping the self referencing foreign key constraints, or altering them to include a separate and new distribution column.
Citus places records into shards based on the hash values of the distribution column values. It is most likely the case that the hashes of parent and child id values are different and hence the records should be stored in different shards, and possibly on different worker nodes. PostgreSQL does not have a mechanism to create foreign key constraints that reference records on different PostgreSQL clusters.
Consider adding a new column tenant_id and adding this column to the primary key and foreign key constraints.
CREATE TABLE TEST (
tenant_id INT NOT NULL,
id TEXT NOT NULL,
name CHARACTER VARYING(255) NOT NULL,
parent_id TEXT, -- nullable, as in the question, so that root rows need no parent
FOREIGN KEY (tenant_id, parent_id) REFERENCES test(tenant_id, id),
PRIMARY KEY (tenant_id, id)
);
SELECT create_distributed_table('test','tenant_id');
Note that parent and child should always be in the same tenant for this to work.

understanding an inheritance in Postgres; why key "fails" in insert/update command

(One image is worth a thousand words.)
I made a few tables that inherit from one another (persons).
I then added a child table (address) and related it only to the "base" table (person).
When I try to insert into the child table a record that relates to a row in an inherited table, the insert statement fails because there is no matching key in the master table.
Yet the records I insert into descendant tables are also available in the base table (so, IMHO, they should be visible/accessible through it).
Please take a look at the attached image. Obviously I am doing something wrong or have missed some point.
Thank you in advance!
Sorry, that's how Postgres table inheritance works. Section 5.10.1, Caveats, of the PostgreSQL documentation explains:
A serious limitation of the inheritance feature is that indexes (including unique constraints) and foreign key constraints only apply to single tables, not to their inheritance children. This is true on both the referencing and referenced sides of a foreign key constraint. Thus, in the terms of the above example:
Specifying that another table's column REFERENCES cities(name) would allow the other table to contain city names, but not capital names. There is no good workaround for this case.
In their example, capitals inherits from cities as organization_employees inherits from person. If person_address REFERENCES person(idt_person) it will not see entries in organization_employees.
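The behavior can be reproduced with a small sketch. The table names and idt_person come from the discussion above; the remaining columns (name, usr, street) and the sample values are assumptions, since the original schema is only available as an image.

```sql
CREATE TABLE person (
    idt_person integer PRIMARY KEY,
    name       text NOT NULL
);

-- Inherits the columns of person, but not its primary key or index.
CREATE TABLE organization_employees (
    usr text
) INHERITS (person);

CREATE TABLE person_address (
    idt_person integer NOT NULL REFERENCES person (idt_person),
    street     text
);

INSERT INTO organization_employees (idt_person, name, usr)
VALUES (1, 'Ann', 'ann');

-- A plain query on the parent also scans its inheritance children,
-- so the new row is visible here:
SELECT idt_person, name FROM person;

-- The row is not physically stored in person itself:
SELECT idt_person, name FROM ONLY person;   -- returns no rows

-- The foreign key check looks at person alone, so this insert fails
-- with a foreign key violation (left commented out so the script runs):
-- INSERT INTO person_address VALUES (1, 'Main St');
```

This is why the rows appear to exist in the base table when queried, yet cannot be referenced from person_address.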
Inheritance is not as useful as it seems, and it's not a way to avoid joins. This can be better done with a join table with some extra columns. It's unclear why an organization would inherit from a person.
create table person (
    id bigserial primary key,
    name text not null,
    verified boolean not null default false,
    vat_nr text,
    foto bytea
);
-- An organization is not a person
create table organization (
    id bigserial primary key,
    name text not null
);
-- Joins a person with an organization
-- Stores information about that relationship
create table organization_employee (
    person_id bigint not null references person(id),
    organization_id bigint not null references organization(id),
    usr text,
    pwd text
);
-- Get each employee, their name, and their org's name.
select
    person.name as person_name,
    organization.name as organization_name
from
    organization_employee
    join person on person_id = person.id
    join organization on organization_id = organization.id;
Use bigserial (bigint) for primary keys; 2 billion comes faster than you think.
Don't enshrine arbitrary business rules in the schema, like how long a name can be. You're not saving any space by limiting it, and every time the business rule changes you have to alter your schema. Use the text type. Enforce arbitrary limits in the application or as constraints.
idt_table_name primary keys make for long, inconsistent column names that are hard to guess. Why is the primary key of person_address not idt_person_address? Why is the primary key of organization_employee idt_person? You can't tell, at a glance, which is the primary key and which is a foreign key. You still need to prepend the table name to disambiguate; for example, if you join person with person_address you need person.idt_person and person_address.idt_person. Confusing and redundant. id (or idt if you prefer) makes it obvious what the primary key is and clearly differentiates it from table_id (or idt_table) foreign keys. SQL already has the means to resolve ambiguities: person.id.

Shared Primary key versus Foreign Key

I have a laboratory analysis database and I'm working on the best data layout. I've seen some suggestions based on similar requirements for using a "Shared Primary Key", but I don't see the advantages over just foreign keys. I'm using PostgreSQL; the tables are listed below.
Sample
___________
sample_id (PK)
sample_type (where in the process the sample came from)
sample_timestamp (when was the sample taken)
Analysis
___________
analysis_id (PK)
sample_id (FK references sample)
method (what analytical method was performed)
analysis_timestamp (when did the analysis take place)
analysis_notes
gc
____________
analysis_id (shared Primary key)
gc_concentration_meoh (methanol concentration)
gc_concentration_benzene (benzene concentration)
spectrophotometer
_____________
analysis_id
spectro_nm (wavelength used)
spectro_abs (absorbance measured)
I could use this design, or I could move the fields from the analysis table into both the gc and spectrophotometer tables, and just use foreign keys between sample, gc, and spectrophotometer tables. The only advantage I see of this design is in cases where I would just want information on how many or what types of analyses were performed, without having to join in the actual results. However, the additional rules to ensure referential integrity between the shared primary keys, and managing extra joins and triggers (on delete cascade, etc) appears to make it more of a headache than the minor advantages. I'm not a DBA, but a scientist, so please let me know what I'm missing.
UPDATE:
A shared primary key (as I understand it) is like a one-to-one foreign key with the additional constraint that each value in the parent tables(analysis) must appear in one of the child tables once, and no more than once.
I've seen some suggestions based on similar requirements for using a
"Shared Primary Key", but I don't see the advantages over just foreign
keys.
If I've understood your comments above, the advantage is that only the first implements the requirement that each row in the parent match a row in one child, and only in one child. Here's one way to do that.
create table analytical_methods (
method_id integer primary key,
method_name varchar(25) not null unique
);
insert into analytical_methods values
(1, 'gc'),(2, 'spec'), (3, 'Atomic Absorption'), (4, 'pH probe');
create table analysis (
analysis_id integer primary key,
sample_id integer not null, --references samples, not shown
method_id integer not null references analytical_methods (method_id),
analysis_timestamp timestamp not null,
analysis_notes varchar(255),
-- This unique constraint lets the pair of columns be the target of
-- foreign key constraints from other tables.
unique (analysis_id, method_id)
);
-- The combination of a) the default value and the check() constraint on
-- method_id, and b) the foreign key constraint on the paired columns
-- analysis_id and method_id guarantee that rows in this table match a
-- gc row in the analysis table.
--
-- If all the child tables have similar constraints, a row in analysis
-- can match a row in one and only one child table.
create table gc (
analysis_id integer primary key,
method_id integer not null
default 1
check (method_id = 1),
foreign key (analysis_id, method_id)
references analysis (analysis_id, method_id),
gc_concentration_meoh integer not null,
gc_concentration_benzene integer not null
);
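To illustrate the "similar constraints" on the other child tables, here is a sketch of the spectrophotometer table following the same pattern. It assumes the analytical_methods and analysis tables defined above already exist; the column types and the sample values are guesses, since the question does not give them.

```sql
-- Same pattern as gc, pinned to method_id = 2 ('spec').
create table spectrophotometer (
    analysis_id integer primary key,
    method_id   integer not null
                default 2
                check (method_id = 2),
    foreign key (analysis_id, method_id)
        references analysis (analysis_id, method_id),
    spectro_nm  integer not null,   -- wavelength used (type assumed)
    spectro_abs numeric not null    -- absorbance measured (type assumed)
);

-- An analysis recorded with method 2 ...
insert into analysis values (10, 1, 2, '2020-01-01 09:00', null);

-- ... can have a spectrophotometer row ...
insert into spectrophotometer (analysis_id, spectro_nm, spectro_abs)
values (10, 540, 0.731);

-- ... but not a gc row: gc forces method_id = 1, and the pair (10, 1)
-- does not exist in analysis, so this would fail with a foreign key
-- violation (left commented out so the script runs):
-- insert into gc (analysis_id, gc_concentration_meoh, gc_concentration_benzene)
-- values (10, 5, 7);
```

Note that these constraints stop a row in analysis from matching more than one child table, but they do not by themselves force a child row to exist for every analysis row.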
It looks like in my case this supertype/subtype model is not the best choice. Instead, I should move the fields from the analysis table into all the child tables, and make a series of simple foreign key relationships. The advantage of the supertype/subtype model comes when using the primary key of the supertype as a foreign key in another table. Since I am not doing this, the extra layer of complexity will not add anything.