Postgresql Partition by column without a primary key

I'd like to be able to partition a table by worksite_uuid. But worksite_uuid needs to be nullable and I need uuid to be a primary key.
The only way I've been able to create the partition is if worksite_uuid is not nullable, like so:
CREATE TABLE test (
uuid uuid DEFAULT uuid_generate_v1mc(),
worksite_uuid text,
notes text
)
PARTITION BY LIST (worksite_uuid);
-- then add worksite_uuid and uuid as a primary key
create table test_worksite1 partition of test for values in ('1');
create table test_worksite2 partition of test for values in ('2');
Does anyone know how I can create a partition with only uuid as the primary key and make worksite_uuid nullable?
--
Example: I can't do this
CREATE TABLE test (
uuid uuid DEFAULT uuid_generate_v1mc() PRIMARY KEY,
worksite_uuid text,
notes text
)
PARTITION BY LIST (worksite_uuid);
I get the following error:
Query 1 ERROR: ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: PRIMARY KEY constraint on table "test" lacks column "worksite_uuid" which is part of the partition key.

It is no problem to create a partition for the NULL values:
create table test_worksite_null partition of test for values in (NULL);
But it is impossible to have a primary key column that is nullable. You just cannot do it.
I see two ways out:
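1. Live without a primary key on the partitioned table. Instead, create a primary key on each individual partition. That does not guarantee global uniqueness, but since the UUIDs are generated, a duplicate across partitions is very unlikely.
2. Use another value instead of NULL, for example '-1', and include worksite_uuid in the primary key.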
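The two ways out can be sketched roughly as follows. This is only an illustration: the table names test_opt1 and test_opt2 and the partition names are made up for the example, the uuid-ossp extension is assumed to be installable, and the primary key declared on the partitioned table in the second variant assumes PostgreSQL 11 or later.

```sql
-- Assumption: the uuid-ossp extension (which provides uuid_generate_v1mc) is available.
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Variant 1: no primary key on the parent, one primary key per partition.
CREATE TABLE test_opt1 (
    uuid uuid NOT NULL DEFAULT uuid_generate_v1mc(),
    worksite_uuid text,              -- stays nullable
    notes text
) PARTITION BY LIST (worksite_uuid);

CREATE TABLE test_opt1_worksite1 PARTITION OF test_opt1 FOR VALUES IN ('1');
CREATE TABLE test_opt1_null PARTITION OF test_opt1 FOR VALUES IN (NULL);

-- uuid is unique only within each partition, not across the whole table.
ALTER TABLE test_opt1_worksite1 ADD PRIMARY KEY (uuid);
ALTER TABLE test_opt1_null ADD PRIMARY KEY (uuid);

-- Variant 2: a sentinel value instead of NULL, partition key included in the primary key.
CREATE TABLE test_opt2 (
    uuid uuid DEFAULT uuid_generate_v1mc(),
    worksite_uuid text NOT NULL DEFAULT '-1',   -- '-1' stands for "no worksite"
    notes text,
    PRIMARY KEY (uuid, worksite_uuid)
) PARTITION BY LIST (worksite_uuid);

CREATE TABLE test_opt2_none PARTITION OF test_opt2 FOR VALUES IN ('-1');
CREATE TABLE test_opt2_worksite1 PARTITION OF test_opt2 FOR VALUES IN ('1');
```

Neither variant gives a primary key on uuid alone together with a nullable worksite_uuid; each one gives up one of the two requirements.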

Related

Postgresql 13 - Support for partition by reference

We have the following partition setup in Oracle which we need to migrate to PostgreSQL (version 13):
CREATE TABLE A (
id number(10) not null,
name varchar2(100),
value varchar2(100),
createdat date,
constraint a_pk primary key (id))
partition by range (createdat);
CREATE TABLE B (
id number(10) not null,
a_id number(10) not null,
....
....
constraint b_pk primary key (id),
constraint b_a_fk foreign key (a_id) references a (id) on delete cascade
) partition by reference (b_a_fk);
Partition by reference is not supported in PostgreSQL. Could anyone please advise on alternatives to achieve the same in PostgreSQL? Basically, we need to ensure that when older partitions are dropped from both tables, all records in table "B" that correspond to the dropped records in "A" are dropped as well.
You have to keep a redundant copy of createdat in b so that you can use it as partitioning key.
To make sure that the related dates are the same, consider the following idea:
- You cannot have id as the primary key of a, since it does not contain the partitioning key createdat.
- So instead use (id, createdat) as the primary key of a.
- Then you can define the foreign key on b on (a_id, createdat), which automatically guarantees that the related dates are identical.
Sure, that solution is not perfect – in particular, you cannot guarantee uniqueness of id. But I think it is the best you can have.
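A minimal sketch of that idea for PostgreSQL 13 might look like this. The bigint column types, the yearly partition bounds and the partition names a_2021 and b_2021 are assumptions made for the example, and the elided columns of b are left out.

```sql
CREATE TABLE a (
    id        bigint NOT NULL,
    name      varchar(100),
    value     varchar(100),
    createdat date NOT NULL,
    -- the primary key has to contain the partitioning key
    CONSTRAINT a_pk PRIMARY KEY (id, createdat)
) PARTITION BY RANGE (createdat);

CREATE TABLE b (
    id        bigint NOT NULL,
    a_id      bigint NOT NULL,
    createdat date NOT NULL,          -- redundant copy of a.createdat
    CONSTRAINT b_pk PRIMARY KEY (id, createdat),
    -- the composite foreign key forces b.createdat to match a.createdat
    CONSTRAINT b_a_fk FOREIGN KEY (a_id, createdat)
        REFERENCES a (id, createdat) ON DELETE CASCADE
) PARTITION BY RANGE (createdat);

-- Hypothetical yearly partitions with identical bounds on both tables.
CREATE TABLE a_2021 PARTITION OF a
    FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');
CREATE TABLE b_2021 PARTITION OF b
    FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');

-- Retiring a year: drop the referencing partition first, then the referenced one.
DROP TABLE b_2021;
DROP TABLE a_2021;
```

Because both tables use the same bounds, every row of b that references a row in a_2021 lives in b_2021, so dropping the two partitions together removes the related records from both tables.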

PostgreSQL declarative partition - unique constraint on partitioned table must include all partitioning columns [duplicate]

I'm trying to create a partitioned table which refers to itself, creating a doubly-linked list.
CREATE TABLE test2 (
id serial NOT NULL,
category integer NOT NULL,
time timestamp(6) NOT NULL,
prev_event integer,
next_event integer
) PARTITION BY HASH (category);
Once I add primary key I get the following error.
alter table test2 add primary key (id);
ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: PRIMARY KEY constraint on table "test2" lacks column "category" which is part of the partition key.
Why does the unique constraint require all partitioning columns to be included?
EDIT: Now I understand why this is needed: https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS
Once I add PK with both columns it works.
alter table test2 add primary key (id, category);
But then adding the FK to itself doesn't work.
alter table test2 add foreign key (prev_event) references test2 (id) on update cascade on delete cascade;
ERROR: there is no unique constraint matching given keys for referenced table "test2"
Since the PK is not just id but (id, category), I can't create an FK pointing to id alone.
Is there any way to deal with this or am I missing something?
I would like to avoid using inheritance partitioning if possible.
EDIT2: It seems this is a known problem. https://www.reddit.com/r/PostgreSQL/comments/di5mbr/postgresql_12_foreign_keys_and_partitioned_tables/f3tsoop/
It seems that there is no straightforward solution. PostgreSQL simply doesn't support this as of v14. One solution is to use triggers to enforce 'foreign key' behavior. Another is to use multi-column foreign keys. Both are far from optimal.
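The multi-column foreign key variant could look roughly like this. It is a sketch under one strong assumption that the question does not state: linked events always belong to the same category, so category can serve as the second column of both foreign keys. The partition names and the modulus of 2 are made up for the example, and the ON UPDATE / ON DELETE actions are left out to keep it minimal.

```sql
CREATE TABLE test2 (
    id serial NOT NULL,
    category integer NOT NULL,
    time timestamp(6) NOT NULL,
    prev_event integer,
    next_event integer
) PARTITION BY HASH (category);

-- Hypothetical hash partitions.
CREATE TABLE test2_p0 PARTITION OF test2 FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE test2_p1 PARTITION OF test2 FOR VALUES WITH (MODULUS 2, REMAINDER 1);

-- The primary key must include the partition key.
ALTER TABLE test2 ADD PRIMARY KEY (id, category);

-- Each foreign key reuses the row's own category as its second column,
-- so it matches the composite primary key.
ALTER TABLE test2 ADD FOREIGN KEY (prev_event, category)
    REFERENCES test2 (id, category);
ALTER TABLE test2 ADD FOREIGN KEY (next_event, category)
    REFERENCES test2 (id, category);
```

With the default MATCH SIMPLE behavior, a row whose prev_event or next_event is NULL is not checked, so the ends of the list can leave those columns empty.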

Citus: How can I add self referencing table in distributed tables list

I'm trying to run create_distributed_table for tables which I need to shard, and almost all of the tables have a self-relation (parent/child).
But when I run SELECT create_distributed_table('table-name','id');
it throws the error "cannot create foreign key constraint".
Simple steps to reproduce:
CREATE TABLE TEST (
ID TEXT NOT NULL,
NAME CHARACTER VARYING(255) NOT NULL,
PARENT_ID TEXT
);
ALTER TABLE TEST ADD CONSTRAINT TEST_PK PRIMARY KEY (ID);
ALTER TABLE TEST ADD CONSTRAINT TEST_PARENT_FK FOREIGN KEY (PARENT_ID) REFERENCES TEST (ID);
ERROR
citus=> SELECT create_distributed_table('test','id');
ERROR: cannot create foreign key constraint
DETAIL: Foreign keys are supported in two cases, either in between two colocated tables including partition column in the same ordinal in the both tables or from distributed to reference tables
For the time being, it is not possible to shard a table on PostgreSQL without dropping the self-referencing foreign key constraints, or altering them to include a separate and new distribution column.
Citus places records into shards based on the hash values of the distribution column values. It is most likely the case that the hashes of parent and child id values are different and hence the records should be stored in different shards, and possibly on different worker nodes. PostgreSQL does not have a mechanism to create foreign key constraints that reference records on different PostgreSQL clusters.
Consider adding a new column tenant_id and adding this column to the primary key and foreign key constraints.
CREATE TABLE TEST (
tenant_id INT NOT NULL,
id TEXT NOT NULL,
name CHARACTER VARYING(255) NOT NULL,
parent_id TEXT NOT NULL,
FOREIGN KEY (tenant_id, parent_id) REFERENCES test(tenant_id, id),
PRIMARY KEY (tenant_id, id)
);
SELECT create_distributed_table('test','tenant_id');
Note that parent and child should always be in the same tenant for this to work.

CREATE TABLE LIKE with different primary key for partitioning

I've got an existing table of dogs which I would like to partition by list using the colour column:
CREATE TABLE dogs (
id int PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
colour text,
name text
)
;
Because it's not possible to partition an existing table, I'm going to make a new empty partitioned table and then copy the data across.
CREATE TABLE
_dogs
(
LIKE
dogs
INCLUDING ALL
)
PARTITION BY LIST (colour)
;
This reports:
ERROR: insufficient columns in PRIMARY KEY constraint definition
DETAIL: PRIMARY KEY constraint on table "_dogs" lacks column "colour" which is part of the partition key.
I know I can ignore the primary key like this, but it seems bad to have a table with no primary key!
CREATE TABLE
_dogs
(
LIKE
dogs
INCLUDING ALL
EXCLUDING INDEXES
)
PARTITION BY LIST (colour)
;
What's the best way to proceed so I have a partitioned table which still has a primary key?
Add the partitioning key to the primary key:
ALTER TABLE _dogs
ADD PRIMARY KEY (id, colour);
There is no other option to have a unique constraint on a partitioned table.

Create a Partition Table in Postgresql

I'm trying to find an example to create a partition table.
I have some tables with many tuples that I can classify according to the value of one column, but I can only find examples using range and date (my column is a varchar and, in another table, an int/foreign key).
I'm trying to speed up my SELECT queries with this technique.
Here is one of my CREATE TABLE statements (the column Source will be used to partition this table):
CREATE TABLE tb_hit_source (
Hit_SourceId bigserial NOT NULL,
Source varchar(50) NOT NULL,
UniqueId varchar(50) NOT NULL,
tb_hit_HitId int8 NOT NULL,
CONSTRAINT tb_hit_source_ak_1 UNIQUE (Source, tb_hit_HitId, UniqueId) NOT DEFERRABLE INITIALLY IMMEDIATE,
CONSTRAINT tb_hit_source_pk PRIMARY KEY (Hit_SourceId)
);
CREATE INDEX tb_hit_source_idx_1 on tb_hit_source (Source ASC);
CREATE INDEX tb_hit_source_idx_2 on tb_hit_source (tb_hit_HitId ASC);
ALTER TABLE tb_hit_source ALTER COLUMN Hit_SourceId SET DEFAULT nextval('"HitSourceId_seq_tb_hit_source"');
With declarative partitioning, create the table like this:
CREATE TABLE tb_hit_source (
Hit_SourceId bigserial NOT NULL,
Source varchar(50) NOT NULL,
UniqueId varchar(50) NOT NULL,
tb_hit_HitId int8 NOT NULL,
CONSTRAINT tb_hit_source_ak_1
UNIQUE (Source, tb_hit_HitId, UniqueId) NOT DEFERRABLE,
-- the primary key must include the partitioning column
CONSTRAINT tb_hit_source_pk PRIMARY KEY (Hit_SourceId, Source)
) PARTITION BY LIST (Source);
Then create one partition per value:
CREATE TABLE tb_hit_source_a PARTITION OF tb_hit_source
FOR VALUES IN ('a');
etc.
If the parent is declared with PARTITION BY RANGE (Source) instead, each partition has to span an interval, because the lower bound is inclusive and the upper bound is exclusive; FOR VALUES FROM ('a') TO ('a') is rejected as an empty range. For single-value partitions, LIST is the simpler choice.
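With inheritance-based partitioning (the only option before PostgreSQL 10), keep the parent as a plain table and create the partitions like this:
create table part_a (check (source = 'part_a')) inherits (tb_hit_source);
create table part_b (check (source = 'part_b')) inherits (tb_hit_source);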
etc.
but if there are going to be many partitions it will probably be more convenient to put them in a separate schema.
create schema hit_source_parts;
create table hit_source_parts.a (check(source='a'))inherits (tb_hit_source);
create table hit_source_parts.b (check(source='b'))inherits (tb_hit_source);
etc.
Any partitions you make will also need the appropriate indexes.
Unique constraints won't work across partitions. This is one reason why most uses of partitioning partition on one of the unique columns: by fragmenting this way, uniqueness within each partition also enforces global uniqueness.