Do I need to declare a unique constraint for `bigint generated always as identity`? - postgresql

I'm creating a multi-tenant application and am prepending the tenant_id to all tables that my tenants will access. All of the tables will also have an incrementing surrogate key. Will I need to declare a unique constraint of the surrogate key or is that redundant?
CREATE TABLE tenant (
primary key (tenant_id),
tenant_id bigint generated always as identity
);
CREATE TABLE person (
primary key (tenant_id, person_id)
person_id bigint generated always as identity,
tenant_id bigint not null,
unique (person_id), -- Do I need this?
foreign key (tenant_id) references tenant
);

The primary key of a table should be a minimal set of columns that uniquely identify a table row. So that should be person_id, as it was specifically created for that purpose.
Add another (non-unique) index on tenant_id or (tenant_id, person_id) if you need to speed up searches based on tenant_id.

Related

Citus: How can I add self referencing table in distributed tables list

I'm trying to run create_distributed_table for tables which i need to shard and almost all of the tables have self relation ( parent child )
but when I run SELECT create_distributed_table('table-name','id');
it throws error cannot create foreign key constraint
simple steps to reproduce
CREATE TABLE TEST (
ID TEXT NOT NULL,
NAME CHARACTER VARYING(255) NOT NULL,
PARENT_ID TEXT
);
ALTER TABLE TEST ADD CONSTRAINT TEST_PK PRIMARY KEY (ID);
ALTER TABLE TEST ADD CONSTRAINT TEST_PARENT_FK FOREIGN KEY (PARENT_ID) REFERENCES TEST (ID);
ERROR
citus=> SELECT create_distributed_table('test','id');
ERROR: cannot create foreign key constraint
DETAIL: Foreign keys are supported in two cases, either in between two colocated tables including partition column in the same ordinal in the both tables or from distributed to reference tables
For the time being, it is not possible to shard a table on PostgreSQL without dropping the self referencing foreign key constraints, or altering them to include a separate and new distribution column.
Citus places records into shards based on the hash values of the distribution column values. It is most likely the case that the hashes of parent and child id values are different and hence the records should be stored in different shards, and possibly on different worker nodes. PostgreSQL does not have a mechanism to create foreign key constraints that reference records on different PostgreSQL clusters.
Consider adding a new column tenant_id and adding this column to the primary key and foreign key constraints.
CREATE TABLE TEST (
tenant_id INT NOT NULL,
id TEXT NOT NULL,
name CHARACTER VARYING(255) NOT NULL,
parent_id TEXT NOT NULL,
FOREIGN KEY (tenant_id, parent_id) REFERENCES test(tenant_id, id),
PRIMARY KEY (tenant_id, id)
);
SELECT create_distributed_table('test','tenant_id');
Note that parent and child should always be in the same tenant for this to work.

How to use constraints on ranges with a junction table?

Based on the documentation it's pretty straightforward how to prevent any overlapping reservations in the table at the same time.
CREATE EXTENSION btree_gist;
CREATE TABLE room_reservation (
room text,
during tsrange,
EXCLUDE USING GIST (room WITH =, during WITH &&)
);
However, when you have multiple resources that can be reserved by users, what is the best approach to check for overlappings? You can see below that I want to have users reserve multiple resources. That's why I'm using the junction table Resources_Reservations. Is there any way I can use EXCLUDE in order to check that no resources are reserved at the same time?
CREATE TABLE Users(
id serial primary key,
name text
);
CREATE TABLE Resources(
id serial primary key,
name text
);
CREATE TABLE Reservations(
id serial primary key,
duration tstzrange,
user_id serial,
FOREIGN KEY (user_id) REFERENCES Users(id)
);
CREATE TABLE Resources_Reservations(
resource_id serial,
reservation_id serial,
FOREIGN KEY (resource_id) REFERENCES Resources(id),
FOREIGN KEY (reservation_id) REFERENCES Reservations(id),
PRIMARY KEY (resource_id, reservation_id)
);
I think what you want is doable with a slight model change.
But first let's correct a misconception. You have foreign key columns (user_id, resource_id, etc) defined as SERIAL. This is incorrect, they should be INTEGER. This is because SERIAL is not actually a data type. It is a psuedo-data type that is actually a shortcut for: creating a sequence, creating a column of type integer, and defining the sequence created as the default value. With that out of the way.
I think your Resources_Reservations is redundant. A reservation is by a user, but a reservation without something reserved would just be user information. Bring the resource_id into Reservation. Now a Reservation is by a user for a resource with a duration. Everything your current model contains but less complexity.
Assuming you don't have data that needs saving, then:
create table users(
id serial primary key,
name text
);
create table resources(
id serial primary key,
name text
);
create table reservations(
user_id integer
resource_id integer
duration tstzrange,
foreign key (user_id) references users(id)
foreign key (resource_id) references resources(id),
primary key (resource_id, user_id)
);
You should now be able to create your GIST exclusion.

Postgresql sharding with citus extension not working

I am using Postgresql with citus extension for sharding and unable to shard tables like below.
Below table has a primary key and 2 unique keys. I am trying to shard against column with primary key i.e pid.
Note: I am not allowed to change the table structure. These tables are created by tool.
CREATE TABLE person
(
pid bigint NOT NULL,
name character varying(100),
address_pid bigint NOT NULL,
address_type character varying(100),
CONSTRAINT id_pkey PRIMARY KEY (pid),
CONSTRAINT addr_id UNIQUE (address_pid),
CONSTRAINT addr_type_id UNIQUE (address_type, address_pid)
);
This my sharding query:
select create_distributed_table('person', 'pid');
Error it throw is:
Error: Distributed relations cannot have UNIQUE, EXCLUDE, or PRIMARY KEY constraints that do not include the partition column
Can anyone help me with sharding these kind of tables?
#CraigKerstiens Addition to this question:
How to handle sharding when we have multiple foreign keys like this one.
CREATE TABLE table
(
pid bigint NOT NULL,
search_order integer NOT NULL,
resource_pid bigint NOT NULL,
search_pid bigint NOT NULL,
CONSTRAINT hfj_search_result_pkey PRIMARY KEY (pid),
CONSTRAINT idx_searchres_order UNIQUE (search_pid, search_order),
CONSTRAINT fk_searchres_res FOREIGN KEY (resource_pid)
REFERENCES public.table1 (res_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION,
CONSTRAINT fk_searchres_search FOREIGN KEY (search_pid)
REFERENCES public.table2 (pid) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION
)
Assuming that table1 and table2 are already sharded.
Within Citus at this time you cannot have a unique constraint that doesn't include they column you are partitioning on. In this case, it'd be possible to enforce addresses were unique to the person id, but not globally unique. To do that you could:
CREATE TABLE person
(
pid bigint NOT NULL,
name character varying(100),
address_pid bigint NOT NULL,
address_type character varying(100),
CONSTRAINT id_pkey PRIMARY KEY (pid),
CONSTRAINT addr_id UNIQUE (pid, address_pid),
CONSTRAINT addr_type_id UNIQUE (pid, address_type, address_pid)
);

Will a primary key index serve as an index for a foreign key when fk columns are subset of pk?

I have a table where part of the primary key is a foreign key to another table.
create table player_result (
event_id integer not null,
pub_time timestamp not null,
name_key varchar(128) not null,
email_address varchar(128),
withdrawn boolean not null,
place integer,
realized_values hstore,
primary key (event_id, pub_time, name_key),
foreign key (email_address) references email(address),
foreign key (event_id, pub_time) references event_publish(event_id, pub_time));
Will the index generated for the primary key suffice to back the foreign key on event_id and pub_time?
Yes.
Index A,B,C
is good for:
A
A,B
A,B,C (and any other combination of the full 3 fields, if default order is unimportant)
but not good for other combinations (such as B,C, C,A etc.).
It will be useful for the referencing side, such that a DELETE or UPDATE on the referenced table can use the PRIMARY KEY of the referencing side as an index when performing checks for the existence of referencing rows or running cascade update/deletes. PostgreSQL doesn't require this index to exist at all, it just makes foreign key constraint checks faster if it is there.
It is not sufficient to serve as the unique constraint for a reference to those columns. You couldn't create a FOREIGN KEY that REFERENCES player_result(event_id, pub_time) because there is no unique constraint on those columns. That pair can appear multiple times in the table so long as each pair has a different name_key.
As #xagyg accurately notes, the unique b-tree index created by the foreign key reference is also only useful for references to columns from the left of the index. It could not be used for a lookup of pub_time, name_key or just name_key, for example.

How do you create a super key in a table?

How do I make both the columns sid and ccode in the table below a super key when creating the table?:
Enrolled (sid integer, ccode varchar(6))
Here's what I've tried with but obviously SQL does not allow multiple primary keys to be declared this way:
CREATE TABLE Enrolled
(
sid integer,
ccode varchar(6),
CONSTRAINT enrolled_pkey1 PRIMARY KEY (sid),
CONSTRAINT enrolled_pkey2 PRIMARY KEY (ccode)
);
I'm using pgAdmin3 - Postgres.
You need to create one primary key over both columns:
CREATE TABLE Enrolled
(
sid integer,
ccode varchar(6),
CONSTRAINT enrolled_pkey PRIMARY KEY (sid, code)
);
However, it is usually a good idea to keep a simple serial-type primary key and just use e.g. UNIQUE constraints for field or combination of fields that need to be unique.