Postgres database inheritance, indexes on child tables - postgresql

I have a parent table product and child tables product_1 ... product_N, split by the store ID field. Once a day an UPDATE is run explicitly for each store (each at a different time). Now I want to add an index on some field, and I'm not sure which table should have it. Should the index go on the parent table only, on every child table, or on both?
UPD
UPDATE product p SET
...
FROM newitems n
WHERE n.new_prod='0' AND
n.internal_product_id is not null AND
p.sku = n.sku AND
p.distributor_id=M and
p.store_id=N;
I want to add an index on the sku field to make the join faster.

The index has to be created explicitly on each table. An index on the parent table does not cover the child tables.
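As a minimal sketch of adding the index to every child in one go, the block below loops over the pg_inherits catalog and creates an index on sku for each direct child of product. It assumes the parent is named product as in the question and that every child has the sku column; the index names are left for Postgres to generate.

```sql
-- Sketch: create an index on sku for each direct child of "product".
-- Assumes the table and column names from the question.
DO $$
DECLARE
    child regclass;
BEGIN
    FOR child IN
        SELECT inhrelid::regclass
        FROM pg_inherits
        WHERE inhparent = 'product'::regclass
    LOOP
        -- regclass renders as a properly quoted table name
        EXECUTE format('CREATE INDEX ON %s (sku)', child);
    END LOOP;
END
$$;
```

Children added later still need their own index, since nothing here is inherited automatically.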

When you first create the child table, you can tell Postgres to copy the parent's index definitions. (I have not found a way to activate this after the child table has already been created.)
"The LIKE ... INCLUDING ALL indicates that we will copy in defaults,
primary keys, and index definitions. This now provides a
forward-looking way of managing all notes tables going forward.
Uniqueness criteria remains enforced on a per-table basis." https://dzone.com/articles/table-inheritance-whats-it-good-for
CREATE TABLE notes (
    id serial primary key,
    created_at timestamp not null default now(),
    created_by text not null,
    subject text not null,
    body text not null
);
CREATE INDEX idx_notes_subject ON notes (subject);
CREATE TABLE invoice_notes (
    child_field text not null,
    LIKE notes INCLUDING INDEXES -- copy the parent's index definitions at creation time
) INHERITS (notes);

Related

Avoid scan on attach partition with check constraint

I am recreating an existing table as a partitioned table in PostgreSQL 11.
After some research, I am approaching it using the following procedure so this can be done online while writes are still happening on the table:
1) add a check constraint on the existing table, first as not valid and then validating
2) drop the existing primary key
3) rename the existing table
4) create the partitioned table under the prior table name
5) attach the existing table as a partition to the new partitioned table
My expectation was that the last step would be relatively fast, but I don't really have a number for this. In my testing, it's taking about 30s. I wonder if my expectations are incorrect or if I'm doing something wrong with the constraint or anything else.
Here's a simplified version of the DDL.
First, the inserted_at column is declared like this:
inserted_at timestamp without time zone not null
I want to have an index on the ID even after I drop the PK for existing queries and writes, so I create an index:
create unique index concurrently my_events_temp_id_index on my_events (id);
The check constraint is created in one transaction:
alter table my_events add constraint my_events_2022_07_events_check
check (inserted_at >= '2018-01-01' and inserted_at < '2022-08-01')
not valid;
In the next transaction, it's validated (and the validation is successful):
alter table my_events validate constraint my_events_2022_07_events_check;
Then before creating the partitioned table, I drop the primary key of the existing table:
alter table my_events drop constraint my_events_pkey cascade;
Finally, in its own transaction, the partitioned table is created:
alter table my_events rename to my_events_2022_07;
create table my_events (
id uuid not null,
... other columns,
inserted_at timestamp without time zone not null,
primary key (id, inserted_at)
) partition by range (inserted_at);
alter table my_events attach partition my_events_2022_07
for values from ('2018-01-01') to ('2022-08-01');
That last transaction blocks inserts and takes about 30s for the 12M rows in my test database.
Edit
I wanted to add that in response to the attach I see this:
INFO: partition constraint for table "my_events_2022_07" is implied by existing constraints
That makes me think I'm doing this right.
The problem is not the check constraint, it is the primary key.
If you make the original unique index include both columns:
create unique index concurrently my_events_temp_id_index on my_events (id,inserted_at);
And if you make the new table have a unique index rather than a primary key on those two columns, then the attach is nearly instantaneous.
These seem to me like unneeded restrictions in PostgreSQL: the unique index on one column can't be used to imply uniqueness on both columns, and the unique index on both columns cannot be used to imply the primary key (nor even a unique constraint, only a unique index).
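Put together, the suggestion above might look like the following sketch. It assumes the check constraint has already been added and validated and the old primary key dropped, as in the question; the column list is abbreviated and has to match the existing table, and my_events_id_inserted_at_index is a hypothetical name.

```sql
-- 1. Build the two-column unique index without blocking writes
--    (CONCURRENTLY cannot run inside a transaction block).
create unique index concurrently my_events_temp_id_index
    on my_events (id, inserted_at);

-- 2. Swap in the partitioned table, with a unique index instead of a primary key.
begin;
alter table my_events rename to my_events_2022_07;
create table my_events (
    id uuid not null,
    -- other columns omitted here; they must match the existing table
    inserted_at timestamp without time zone not null
) partition by range (inserted_at);
create unique index my_events_id_inserted_at_index  -- hypothetical name
    on my_events (id, inserted_at);
alter table my_events attach partition my_events_2022_07
    for values from ('2018-01-01') to ('2022-08-01');
commit;
```

The idea is that the attach can reuse the partition's existing matching index rather than building a new one while holding the lock.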

understanding an inheritance in Postgres; why key "fails" in insert/update command

(One image, thousands of words)
I made a few tables that inherit from one another (persons).
Then I added a child table (address) and related it only to the "base" table (person).
When I try to insert into the child table a record that is related to an inherited table, the insert statement fails because there is no key in the master table.
As I insert records into descendant tables, the records are also available in the base table (so, IMHO, they should be visible/accessible through the inherited tables).
Please take a look at the attached image. Obviously I am doing something wrong or am missing some point...
Thank you in advance!
Sorry, that's how Postgres table inheritance works. 5.10.1 Caveats explains.
A serious limitation of the inheritance feature is that indexes (including unique constraints) and foreign key constraints only apply to single tables, not to their inheritance children. This is true on both the referencing and referenced sides of a foreign key constraint. Thus, in the terms of the above example:
Specifying that another table's column REFERENCES cities(name) would allow the other table to contain city names, but not capital names. There is no good workaround for this case.
In their example, capitals inherits from cities as organization_employees inherits from person. If person_address REFERENCES person(idt_person) it will not see entries in organization_employees.
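A small sketch of that caveat, loosely based on the cities/capitals example from the documentation. The weather table and the sample city names are made up for illustration.

```sql
CREATE TABLE cities (
    name text PRIMARY KEY
);
CREATE TABLE capitals (
    state char(2)
) INHERITS (cities);
-- Hypothetical referencing table
CREATE TABLE weather (
    city text REFERENCES cities (name)
);

INSERT INTO cities (name) VALUES ('Las Vegas');
INSERT INTO capitals (name, state) VALUES ('Sacramento', 'CA');

-- Both rows are visible when selecting from the parent...
SELECT name FROM cities;

-- ...but the foreign key only checks rows stored in cities itself.
INSERT INTO weather (city) VALUES ('Las Vegas');   -- succeeds
INSERT INTO weather (city) VALUES ('Sacramento');  -- fails with a foreign key violation
```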
Inheritance is not as useful as it seems, and it's not a way to avoid joins. This can be better done with a join table with some extra columns. It's unclear why an organization would inherit from a person.
create table person (
    id bigserial primary key,
    name text not null,
    verified boolean not null default false,
    vat_nr text,
    foto bytea
);
-- An organization is not a person
create table organization (
    id bigserial primary key,
    name text not null
);
-- Joins a person with an organization
-- Stores information about that relationship
create table organization_employee (
    person_id bigint not null references person(id),
    organization_id bigint not null references organization(id),
    usr text,
    pwd text
);
-- Get each employee, their name, and their org's name.
select
    person.name,
    organization.name
from
    organization_employee
    join person on person_id = person.id
    join organization on organization_id = organization.id;
Use bigserial (bigint) for primary keys; 2 billion comes faster than you think.
Don't enshrine arbitrary business rules in the schema, like how long a name can be. You're not saving any space by limiting it, and every time the business rule changes you have to alter your schema. Use the text type. Enforce arbitrary limits in the application or as constraints.
idt_table_name primary keys makes for long, inconsistent column names hard to guess. Why is the primary key of person_address not idt_person_address? Why is the primary key of organization_employee idt_person? You can't tell, at a glance, which is the primary key and which is a foreign key. You still need to prepend the column name to disambiguate; for example, if you join person with person_address you need person.idt_person and person_address.idt_person. Confusing and redundant. id (or idt if you prefer) makes it obvious what the primary key is and clearly differentiates it from table_id (or idt_table) foreign keys. SQL already has the means to resolve ambiguities: person.id.

How to edit a record that results in a uniqueness violation and echo the change to child tables?

PostgreSQL 11.1
How can "Editing" of a record be transmitted to dependent records?
Summary of my issue:
The master table, disease, needs a unique constraint on the description column. This unique constraint is needed for foreign key ON UPDATE CASCADE to its children tables.
To allow for a temporary violation of the unique constraint, it must be made deferrable. BUT A DEFERRABLE CONSTRAINT CANNOT BE USED IN A FOREIGN KEY.
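A minimal reproduction of that restriction, using hypothetical disease_demo and dx_demo tables rather than the real schema:

```sql
CREATE TABLE disease_demo (
    recid integer PRIMARY KEY,
    description text NOT NULL,
    CONSTRAINT disease_demo_description_unique UNIQUE (description) DEFERRABLE
);

-- Fails: cannot use a deferrable unique constraint for referenced table "disease_demo"
CREATE TABLE dx_demo (
    recid integer PRIMARY KEY,
    disease_description text NOT NULL
        REFERENCES disease_demo (description) ON UPDATE CASCADE
);
```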
Here is the situation.
The database has 100+ tables (and keeps on growing).
Almost all information has been normalized, in that repeating groups of information have been delegated to their own table.
Following normalization, most tables are lists without duplication of records. Duplication of records within a table is not allowed.
All tables have a unique ID assigned to each record (in addition to a unique constraint placed on the record information).
Most tables are dependent on another table. The foreign keys reference the primary key of the table they are dependent on.
Most unique constraints involve a foreign key (which in turn references the primary key of the parent table).
So, assume the following schema:
CREATE TABLE phoenix.disease
(
    recid integer NOT NULL DEFAULT nextval('disease_recid_seq'::regclass),
    code text COLLATE pg_catalog."default",
    description text COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT disease_pkey PRIMARY KEY (recid),
    CONSTRAINT disease_code_unique UNIQUE (code) DEFERRABLE,
    CONSTRAINT disease_description_unique UNIQUE (description),
    CONSTRAINT disease_description_check CHECK (description <> ''::text)
);
CREATE TABLE phoenix.dx
(
    recid integer NOT NULL DEFAULT nextval('dx_recid_seq'::regclass),
    disease_recid integer NOT NULL,
    patient_recid integer NOT NULL,
    CONSTRAINT pk_dx_recid PRIMARY KEY (recid),
    CONSTRAINT dx_unique UNIQUE (tposted, patient_recid, disease_recid),
    CONSTRAINT dx_disease_recid_fkey FOREIGN KEY (disease_recid)
        REFERENCES phoenix.disease (recid) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE RESTRICT,
    CONSTRAINT dx_patients FOREIGN KEY (patient_recid)
        REFERENCES phoenix.patients (recid) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE RESTRICT
);
(Columns not involved in this question have been removed. :) )
There are many other children tables of disease with the same basic dependency on the disease table. Note that the primary key of the disease table is a foreign key to the dx table and that the dx table uses this foreign key in a unique constraint. Also note that the dx table is just one table of a long chain of table references. (That is the dx table also has its primary key referenced by other tables).
The problem: I wish to "edit" the contents of the parent disease record. By "edit", I mean:
change the data in the description column.
if the result of the change causes a duplication in the disease table, then one of the "duplicated" records will need to be deleted.
Herein lies my problem. There are many different tables that use the primary key of the disease table in their own unique constraint. If those tables ALSO have a foreign key reference to the duplicated record (in disease), then cascading the delete to those tables would be appropriate -- i.e., no duplication of records will occur.
However, if the child table does NOT have a reference to the "correct" record in the parent disease table, then simply deleting the record (by cascade) will result in loss of information.
Example:
Disease Table:
record 1: ID = 1 description = "ABC"
record 2: ID = 2 description = "DEF"
Dx Table:
record 5: ID = 5 refers to ID=1 of Disease Table.
Editing of record 1 in Disease table results in description becoming "DEF"
Disease Table:
record 1: ID = 1 "ABC" --> "DEF"
I have tried deferring the primary key of the disease table so as to allow the "correct" ID to be "cascaded" to the child tables. This causes the following errors:
A foreign key cannot depend on a deferrable constraint: cannot use a deferrable unique constraint for referenced table "disease"
Additionally, the parent table (disease) has no way of knowing ahead of time whether its children already have a reference to the "correct" record, so that deletion is allowed, or whether the child needs to change its own column data to reflect the new "correct" id.
So, how can I allow a change in the parent table (disease) and notify the child tables to change their column values, and to delete within themselves should a duplicate record arise?
Lastly, I do not know today what future tables I will need. So I cannot "precode" into the parent table who its children are or will be.
Thank you for any help with this.

How to ensure that two columns in different tables have the same values

What T-SQL DDL is required to create a constraint that ensures that the values in a column in one table are the same as the values in a column in a different table?
I want to do this without using a PK-FK relationship.
The T-SQL DDL at the end of this post is an example of the generic problem that I'm trying to solve.
In this example, I want to know how to add an equality constraint between the two tables that ensures that the set of values in the column:
"PersonMayDriveCar.personName"
is always equal to the set of values in the column
"DriverLicense.personName"
CREATE SCHEMA "Equality Constraint"
GO
CREATE TABLE "Equality Constraint".PersonMayDriveCar
(
carVin nchar(4000) NOT NULL,
personName nchar(70) NOT NULL,
CONSTRAINT PersonMayDriveCar_PK PRIMARY KEY(personName, carVin)
)
GO
CREATE TABLE "Equality Constraint".DriverLicense
(
driverLicenseNr int NOT NULL,
personName nchar(70) NOT NULL,
CONSTRAINT DriverLicense_PK PRIMARY KEY(driverLicenseNr),
CONSTRAINT DriverLicense_UC UNIQUE(personName)
)
GO
I see that you want to maintain referential integrity between the two tables without using a foreign key.
Based on my past experience, I solved such an issue using a trigger.
So you can create a trigger on the DriverLicense table which ensures that any insert or update into the DriverLicense table will be rolled back if the inserted personName doesn't exist in the PersonMayDriveCar table.
You can go through this for a full example:
https://www.mssqltips.com/sqlservertip/4242/sql-server-referential-integrity-without-foreign-keys/
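A rough T-SQL sketch of such a trigger, using the tables from the question. The trigger name and error message are made up, and this checks one direction only (every DriverLicense.personName must exist in PersonMayDriveCar); true set equality would presumably need a matching trigger on PersonMayDriveCar plus handling of deletes.

```sql
-- Hypothetical trigger name; CREATE TRIGGER has to be the first statement in its batch.
CREATE TRIGGER "Equality Constraint".DriverLicense_CheckPerson
ON "Equality Constraint".DriverLicense
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Look for any new or changed row whose personName has no match.
    IF EXISTS (
        SELECT 1
        FROM inserted AS i
        WHERE NOT EXISTS (
            SELECT 1
            FROM "Equality Constraint".PersonMayDriveCar AS p
            WHERE p.personName = i.personName
        )
    )
    BEGIN
        ROLLBACK TRANSACTION;
        THROW 50000, N'personName has no matching row in PersonMayDriveCar.', 1;
    END
END
GO
```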
Adhere to convention:
Use an FK. It’s that simple.
Don’t link these tables together with an FK, because they are both child tables of ...
Create a person table, which is the parent of the other two tables
Try this:
Person
- id (PK)
- name
- other columns
PersonMayDriveCar
- person_id (FK to person)
- other columns
DriverLicense
- person_id (FK to person)
- other columns

Non-unique foreign key in PostgreSQL?

I have a database design question, and I would appreciate either of the following two kinds of help:
1) Explanation why what I'm doing is a bad design decision, and how to design it better
2) Example how to actually implement the desired design in PostgreSQL
In short, what I'm doing is designing a tree-structure where each node should have a revision history like this:
CREATE TABLE Nodes
(
nid BIGSERIAL PRIMARY KEY,
node_id BIGINT NOT NULL,
parent_nodeid BIGINT,
revision_id INTEGER NOT NULL,
.. additional columns with info about this node ..
)
The idea is the following; I might have a structure like:
root node
child node 1
child node 2
When a user edits the information in the "root node"; instead of just replacing the values in the existing log, I want to keep a log of the previous values so I instead create a new "revision" of the row - so the user sometime far in the future can do an "undo" and return to the previous configuration of the node.
What I want to achieve, is that child nodes automatically refer to the new parent node without having to update parent_nodeid of the children as the new revision of the root node should not change the hierarchy of the node tree.
I understand that I cannot add a foreign key from Nodes.parent_nodeid to Nodes.node_id as PostgreSQL requires foreign keys to reference columns with a unique value - but I'm kind of lost on how to add some kind of constraint that at least guarantees that Nodes.parent_nodeid references an existing Nodes.node_id value even though it won't be unique.
Any help/ideas would be highly appreciated!
You do not need a tree structure, as you always have only one level of dependency. Normalized database design:
create table nodes (
    node_id bigserial primary key,
    description text
);
create table revisions (
    revision_id bigserial primary key,
    node_id bigint references nodes,
    description text
);
You need a trigger on nodes that copies the old row into revisions on insert or update and, instead of deleting, copies a row back from revisions, which implements undo.
It is also not clear why you keep two node identifiers, nid and node_id. This seems redundant.
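A minimal sketch of the update half of that trigger, assuming the nodes and revisions tables above. The function and trigger names are hypothetical, and the undo-on-delete part is not shown. On insert there is no old row yet, so this sketch fires on update only.

```sql
-- Hypothetical names: save_node_revision, nodes_save_revision.
CREATE FUNCTION save_node_revision() RETURNS trigger AS $$
BEGIN
    -- Keep the previous version of the row before it is overwritten.
    INSERT INTO revisions (node_id, description)
    VALUES (OLD.node_id, OLD.description);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER nodes_save_revision
AFTER UPDATE ON nodes
FOR EACH ROW
EXECUTE PROCEDURE save_node_revision();
```

With this in place, each update of a node leaves its earlier description in revisions, so an undo can copy the most recent revision row back into nodes.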