Postgresql 13 - Support for partition by reference - postgresql

We have the following partition set up in Oracle which we need to migrate to Postgresql (version 13)-
CREATE TABLE A (
id number(10) not null,
name varchar2(100),
value varchar2(100),
createdat date
constraint a_pk primary key (id))
partition by range (createdat);
CREATE TABLE B (
id number(10) not null,
a_id number(10) not null,
....
....
constraint b_pk primary key (id),
constraint b_a_fk foreign key (a_id) references a (id) on delete cascade
) partition by reference (b_a_fk)
Partition by reference is not supported in Postgresql. Could anyone please advise the alternatives to achieve the same in Postgresql. Basically we need to ensure that when older partitions are dropped from both tables, all records in table "B" should get dropped corresponding to related records in "A".

You have to keep a redundant copy of createdat in b so that you can use it as partitioning key.
To make sure that the related dates are the same, consider the following idea:
you cannot have id as a primary key, since it does not contain the partitioning key createdat
so instead use (id, createdat) as primary key of a
then you can define the foreign key on b on (a_id, createdat), which will automatically guarantee that the related dates are identical
Sure, that solution is not perfect – in particular, you cannot guarantee uniqueness of id. But I think it is the best you can have.

Related

PostgreSQL declarative partition - unique constraint on partitioned table must include all partitioning columns [duplicate]

This question already has an answer here:
ERROR: unique constraint on partitioned table must include all partitioning columns
(1 answer)
Closed last month.
I'm trying to create a partitioned table which refers to itself, creating a doubly-linked list.
CREATE TABLE test2 (
id serial NOT NULL,
category integer NOT NULL,
time timestamp(6) NOT NULL,
prev_event integer,
next_event integer
) PARTITION BY HASH (category);
Once I add primary key I get the following error.
alter table test2 add primary key (id);
ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: PRIMARY KEY constraint on table "test2" lacks column "category" which is part of the partition key.
Why does the unique constrain require all partitioned columns to be included?
EDIT: Now I understand why this is needed: https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS
Once I add PK with both columns it works.
alter table test2 add primary key (id, category);
But then adding the FK to itself doesn't work.
alter table test2 add foreign key (prev_event) references test2 (id) on update cascade on delete cascade;
ERROR: there is no unique constraint matching given keys for referenced table "test2"
Since PK is not just id but id-category I can't create FK pointing to id.
Is there any way to deal with this or am I missing something?
I would like to avoid using inheritance partitioning if possible.
EDIT2: It seems this is a known problem. https://www.reddit.com/r/PostgreSQL/comments/di5mbr/postgresql_12_foreign_keys_and_partitioned_tables/f3tsoop/
Seems that there is no straightforward solution. PostgreSQL simply doesn't support this as of v14. One solution is to use triggers to enforce 'foreign key' behavior. Other is to use multi-column foreign keys. Both are far from optimal.

Postgresql Partition by column without a primary key

I'd like to be able to partition a table by worksite_uuid. But worksite_uuid needs to be nullable and I need uuid to be a primary key.
The only way I've been able create the partition is if worksite_uuid is not nullable like so:
CREATE TABLE test (
uuid uuid DEFAULT uuid_generate_v1mc(),
worksite_uuid text,
notes text
)
PARTITION BY LIST(worksite_uuid)
//then add worksite_uuid and uuid as a primary key
create table test_worksite1 partition of test for values in ('1');
create table test_worksite2 partition of test for values in ('2');
Does anyone know how I can create a partition with only uuid as the primary key and make worksite_uuid nullable?
--
Example: I can't do this
CREATE TABLE test (
uuid uuid DEFAULT uuid_generate_v1mc() PRIMARY KEY,
worksite_uuid text,
notes text
)
PARTITION BY LIST(worksite_uuid)
I get the following error:
Query 1 ERROR: ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: PRIMARY KEY constraint on table "test" lacks column "worksite_uuid" which is part of the partition key.
It is no problem to create a partition for the NULL values:
create table test_worksite_null partition of test for values in (NULL);
But it is impossible to have a primary key column that is nullable. You just cannot do it.
I see two ways out:
You live without a primary key. Instead, create primary keys on each individual partition. That won't guarantee global uniqueness, but almost.
You use another value instead of NULL, for example -1.

Citus: How can I add self referencing table in distributed tables list

I'm trying to run create_distributed_table for tables which i need to shard and almost all of the tables have self relation ( parent child )
but when I run SELECT create_distributed_table('table-name','id');
it throws error cannot create foreign key constraint
simple steps to reproduce
CREATE TABLE TEST (
ID TEXT NOT NULL,
NAME CHARACTER VARYING(255) NOT NULL,
PARENT_ID TEXT
);
ALTER TABLE TEST ADD CONSTRAINT TEST_PK PRIMARY KEY (ID);
ALTER TABLE TEST ADD CONSTRAINT TEST_PARENT_FK FOREIGN KEY (PARENT_ID) REFERENCES TEST (ID);
ERROR
citus=> SELECT create_distributed_table('test','id');
ERROR: cannot create foreign key constraint
DETAIL: Foreign keys are supported in two cases, either in between two colocated tables including partition column in the same ordinal in the both tables or from distributed to reference tables
For the time being, it is not possible to shard a table on PostgreSQL without dropping the self referencing foreign key constraints, or altering them to include a separate and new distribution column.
Citus places records into shards based on the hash values of the distribution column values. It is most likely the case that the hashes of parent and child id values are different and hence the records should be stored in different shards, and possibly on different worker nodes. PostgreSQL does not have a mechanism to create foreign key constraints that reference records on different PostgreSQL clusters.
Consider adding a new column tenant_id and adding this column to the primary key and foreign key constraints.
CREATE TABLE TEST (
tenant_id INT NOT NULL,
id TEXT NOT NULL,
name CHARACTER VARYING(255) NOT NULL,
parent_id TEXT NOT NULL,
FOREIGN KEY (tenant_id, parent_id) REFERENCES test(tenant_id, id),
PRIMARY KEY (tenant_id, id)
);
SELECT create_distributed_table('test','tenant_id');
Note that parent and child should always be in the same tenant for this to work.

Create a Partition Table in Postgresql

I'm trying to find an example to create a partition table.
I have some tables with many tuples and I can classify them according to a value of one column, but, I just find examples using range and date (my column is a varchar and, in other table, is a int/foreign key).
I'm trying to speed my SELECT with this technique.
Here one of my CREATE tables (column Source will be used to partition this table):
CREATE TABLE tb_hit_source (
Hit_SourceId bigserial NOT NULL,
Source varchar(50) NOT NULL,
UniqueId varchar(50) NOT NULL,
tb_hit_HitId int8 NOT NULL,
CONSTRAINT tb_hit_source_ak_1 UNIQUE (Source, tb_hit_HitId, UniqueId) NOT DEFERRABLE INITIALLY IMMEDIATE,
CONSTRAINT tb_hit_source_pk PRIMARY KEY (Hit_SourceId)
);
CREATE INDEX tb_hit_source_idx_1 on tb_hit_source (Source ASC);
CREATE INDEX tb_hit_source_idx_2 on tb_hit_source (tb_hit_HitId ASC);
ALTER TABLE tb_hit_source ALTER COLUMN Hit_SourceId SET DEFAULT nextval('"HitSourceId_seq_tb_hit_source"');;
to create the table do.
CREATE TABLE tb_hit_source (
Hit_SourceId bigserial NOT NULL,
Source varchar(50) NOT NULL,
UniqueId varchar(50) NOT NULL,
tb_hit_HitId int8 NOT NULL,
CONSTRAINT tb_hit_source_ak_1
UNIQUE (Source, tb_hit_HitId, UniqueId) NOT DEFERRABLE,
CONSTRAINT tb_hit_source_pk PRIMARY KEY (Hit_SourceId)
PARTITION BY RANGE (Source);
then to create the partitions use the same value at each end of the range to force a single value partition.
CREATE TABLE tb_hit_source_a PARTITION OF tb_hit_source
FOR VALUES FROM ('a') TO ('a');
etc.
podtgresql 11 offers PARTITION BY LIST (source) allowing the partitions to be declared more simply.
CREATE TABLE tb_hit_source_a PARTITION OF tb_hit_source
FOR VALUES IN ('a');
to create the partitions do
create table part_a (check source='part_a' )inherits (tb_hit_source);
create table part_a (check source='part_b' )inherits (tb_hit_source);
etc.
but if there are going to be many partitions it will probably be more convenient to put them in a separate schema.
create schema hit_source_parts;
create table hit_source_parts.a (check(source='a'))inherits (tb_hit_source);
create table hit_source_parts.b (check(source='b'))inherits (tb_hit_source);
etc.
Any partitions you make will also need the aproptiate indexes.
unique constraints won't work across partitions this is one reaon why most uses of partitioning partition on one of the unique colums, by fragmenting this way uniqueness in each partition also enforces global uniqueness.

Will a primary key index serve as an index for a foreign key when fk columns are subset of pk?

I have a table where part of the primary key is a foreign key to another table.
create table player_result (
event_id integer not null,
pub_time timestamp not null,
name_key varchar(128) not null,
email_address varchar(128),
withdrawn boolean not null,
place integer,
realized_values hstore,
primary key (event_id, pub_time, name_key),
foreign key (email_address) references email(address),
foreign key (event_id, pub_time) references event_publish(event_id, pub_time));
Will the index generated for the primary key suffice to back the foreign key on event_id and pub_time?
Yes.
Index A,B,C
is good for:
A
A,B
A,B,C (and any other combination of the full 3 fields, if default order is unimportant)
but not good for other combinations (such as B,C, C,A etc.).
It will be useful for the referencing side, such that a DELETE or UPDATE on the referenced table can use the PRIMARY KEY of the referencing side as an index when performing checks for the existence of referencing rows or running cascade update/deletes. PostgreSQL doesn't require this index to exist at all, it just makes foreign key constraint checks faster if it is there.
It is not sufficient to serve as the unique constraint for a reference to those columns. You couldn't create a FOREIGN KEY that REFERENCES player_result(event_id, pub_time) because there is no unique constraint on those columns. That pair can appear multiple times in the table so long as each pair has a different name_key.
As #xagyg accurately notes, the unique b-tree index created by the foreign key reference is also only useful for references to columns from the left of the index. It could not be used for a lookup of pub_time, name_key or just name_key, for example.