Unique constraint that includes serial primary key in postgresql - postgresql

I have a postgresql tables with the following layout:
create table bar(
bar_id serial primary key,
... other columns
)
create table foo(
foo_id serial primary key,
bar_id bigint not null,
... other columns
)
create table baz(
baz_id serial primary key,
foo_id bigint not null references foo(foo_id),
bar_id bigint not null references bar(bar_id),
constraint fk_fb foreign key (foo_id, bar_id) references foo(foo_id, bar_id)
)
I want to refer to both foo_id and bar_id in another table (baz) and have a foreign key constraint, so I need to add a unique constraint to (foo_id, bar_id). The fact that foo_id is a primary key guarantees that the combination of foo_id and bar_id is unique, even if every single value of bar_id is the same. My question is if there is a performance hit to adding the unique constraint on (foo_id, bar_id), or if postgresql is smart enough to know that the fact that foo_id is unique across the table by virtue of being the primary key means that there is no need to do anything with bar_id.
The table foo contains rows that are not present in baz, so dropping the bar_id from the foo table won't work.

There is a performance penalty for adding another UNIQUE constraint, because such a constraint is implemented with an index that needs to be updated for every data modification on the table.
One thing that you could consider if DML performance on this table is at a premium is defining the primary key over both columns. Then you would lose the uniqueness guarantee for foo_id, but you don't have to pay the price for an extra index.
Perhaps you can also come up with an alternative data model that does not require you to reference both columns with a foreign key, like GMB suggested in his answer.

Related

unique constraint to a foreign table

I have two tables, experiment and sample. Samples must be unique within experiments of the same type, but can be shared between experiments that have a different type. I understand how I can add a unique constraint to the samples to make them always unique, but is it possible to create a unique constraint based on the information in the foreign table?
create table experiment (
id integer primary key,
name text,
type text
);
create table sample (
id integer primary key,
name text,
experiment_id integer,
);
alter table sample add constraint exp_fkey foreign key (experiment_id) references experiment(id);
You can create a (redundant) unique constraint on experiment (id, type), add type to sample, create a foreign key constraint from (experiment_id, type) to (id, type) and a unique constraint on sample.type.

Postgresql sharding with citus extension not working

I am using Postgresql with citus extension for sharding and unable to shard tables like below.
Below table has a primary key and 2 unique keys. I am trying to shard against column with primary key i.e pid.
Note: I am not allowed to change the table structure. These tables are created by tool.
CREATE TABLE person
(
pid bigint NOT NULL,
name character varying(100),
address_pid bigint NOT NULL,
address_type character varying(100),
CONSTRAINT id_pkey PRIMARY KEY (pid),
CONSTRAINT addr_id UNIQUE (address_pid),
CONSTRAINT addr_type_id UNIQUE (address_type, address_pid)
);
This my sharding query:
select create_distributed_table('person', 'pid');
Error it throw is:
Error: Distributed relations cannot have UNIQUE, EXCLUDE, or PRIMARY KEY constraints that do not include the partition column
Can anyone help me with sharding these kind of tables?
#CraigKerstiens Addition to this question:
How to handle sharding when we have multiple foreign keys like this one.
CREATE TABLE table
(
pid bigint NOT NULL,
search_order integer NOT NULL,
resource_pid bigint NOT NULL,
search_pid bigint NOT NULL,
CONSTRAINT hfj_search_result_pkey PRIMARY KEY (pid),
CONSTRAINT idx_searchres_order UNIQUE (search_pid, search_order),
CONSTRAINT fk_searchres_res FOREIGN KEY (resource_pid)
REFERENCES public.table1 (res_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION,
CONSTRAINT fk_searchres_search FOREIGN KEY (search_pid)
REFERENCES public.table2 (pid) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION
)
Assuming that table1 and table2 are already sharded.
Within Citus at this time you cannot have a unique constraint that doesn't include they column you are partitioning on. In this case, it'd be possible to enforce addresses were unique to the person id, but not globally unique. To do that you could:
CREATE TABLE person
(
pid bigint NOT NULL,
name character varying(100),
address_pid bigint NOT NULL,
address_type character varying(100),
CONSTRAINT id_pkey PRIMARY KEY (pid),
CONSTRAINT addr_id UNIQUE (pid, address_pid),
CONSTRAINT addr_type_id UNIQUE (pid, address_type, address_pid)
);

Migrating from Postgresql to Postgres-XL: distributed tables design

I need to scale out our application DB due to the amount of data. It's on PostgreSQL 9.3. So, I've found PostgreSQL-XL and it looks awesome, but I'm having a hard time trying to wrap my head around the limitations for distributed tables. To distribute them by replication (where the whole table is replicated in every datanode) is quite OK, but let's say I have two big related tables that need to be "sharded" along the datanodes:
CREATE TABLE foos
(
id bigserial NOT NULL,
project_id integer NOT NULL,
template_id integer NOT NULL,
batch_id integer,
dataset_id integer NOT NULL,
name text NOT NULL,
CONSTRAINT pk_foos PRIMARY KEY (id),
CONSTRAINT fk_foos_batch_id FOREIGN KEY (batch_id)
REFERENCES batches (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_foos_dataset_id FOREIGN KEY (dataset_id)
REFERENCES datasets (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_foos_project_id FOREIGN KEY (project_id)
REFERENCES projects (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_foos_template_id FOREIGN KEY (template_id)
REFERENCES templates (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT uc_foos UNIQUE (project_id, name)
);
CREATE TABLE foo_childs
(
id bigserial NOT NULL,
foo_id bigint NOT NULL,
template_id integer NOT NULL,
batch_id integer,
ffdata hstore,
CONSTRAINT pk_ff_foos PRIMARY KEY (id),
CONSTRAINT fk_fffoos_batch_id FOREIGN KEY (batch_id)
REFERENCES batches (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_fffoos_foo_id FOREIGN KEY (foo_id)
REFERENCES foos (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_fffoos_template_id FOREIGN KEY (template_id)
REFERENCES templates (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE
);
Now Postgres-XL documentation states that:
"(...) in distributed tables, UNIQUE constraints must include the
distribution column of the table"
"(...) the distribution column must be included in PRIMARY KEY"
"(...) column with REFERENCES (FK) should be the distribution column.
(...) PRIMARY KEY must be the distribution column as well."
Their examples are over simplistic and scarse, so can someone please DDL me the two above tables for postgres-XL using DISTRIBUTE BY HASH()?
Or maybe suggest other ways to scale out?
CREATE TABLE foos
( ... ) DISTRIBUTE BY HASH(id);
CREATE TABLE foos_child
( ... ) DISTRIBUTE BY HASH(foo_id);
Now any join on foos.id = foos_child.foo_id can be pushed down and done locally.

Will a primary key index serve as an index for a foreign key when fk columns are subset of pk?

I have a table where part of the primary key is a foreign key to another table.
create table player_result (
event_id integer not null,
pub_time timestamp not null,
name_key varchar(128) not null,
email_address varchar(128),
withdrawn boolean not null,
place integer,
realized_values hstore,
primary key (event_id, pub_time, name_key),
foreign key (email_address) references email(address),
foreign key (event_id, pub_time) references event_publish(event_id, pub_time));
Will the index generated for the primary key suffice to back the foreign key on event_id and pub_time?
Yes.
Index A,B,C
is good for:
A
A,B
A,B,C (and any other combination of the full 3 fields, if default order is unimportant)
but not good for other combinations (such as B,C, C,A etc.).
It will be useful for the referencing side, such that a DELETE or UPDATE on the referenced table can use the PRIMARY KEY of the referencing side as an index when performing checks for the existence of referencing rows or running cascade update/deletes. PostgreSQL doesn't require this index to exist at all, it just makes foreign key constraint checks faster if it is there.
It is not sufficient to serve as the unique constraint for a reference to those columns. You couldn't create a FOREIGN KEY that REFERENCES player_result(event_id, pub_time) because there is no unique constraint on those columns. That pair can appear multiple times in the table so long as each pair has a different name_key.
As #xagyg accurately notes, the unique b-tree index created by the foreign key reference is also only useful for references to columns from the left of the index. It could not be used for a lookup of pub_time, name_key or just name_key, for example.

How to make a Foreigne key that references NOT Primary or Unique column from another table?

I have made classical Employee - Employer table that has one primary key and one foreign key. Foreign key references primary so it is a self referencing table:
CREATE TABLE Worker
(
OIB NUMERIC(2,0),
Name NVARCHAR(10),
Surname NVARCHAR(20),
DateOfEmployment DATETIME2 NOT NULL,
Adress NVARCHAR(20),
City NVARCHAR(10),
SUPERIOR NUMERIC(2,0) UNIQUE,
Constraint PK_Worker PRIMARY KEY(OIB),
CONSTRAINT FK_Worker FOREIGN KEY (Superior) REFERENCES Worker(OIB)
);
Second table that should keep points for all my employees is made like this:
CREATE TABLE Point
(
OIB_Worker NUMERIC(2,0) NOT NULL,
OIB_Superior NUMERIC(2,0) NOT NULL,
Pt_To_Worker tinyint,
Pt_To_Superior tinyint,
Month_ INT NOT NULL,
Year_ INT NOT NULL,
CONSTRAINT FK_Point_Worker FOREIGN KEY (OIB_Worker) References Worker(OIB),
CONSTRAINT FK_Point_Worker_2 FOREIGN KEY (OIB_Superior) References Worker(Superior),
CONSTRAINT PK_Point PRIMARY KEY(OIB_Worker,OIB_Superior,Month_,Year_)
);
It should enable storing grades per month for every employee.
That is, I have two foreign keys, one for Worker.OIB and one for Worker.Superior. Also, I have a composite primary key made of columns Point.OIB_Worker, Point.OIB_superior, Point.Month_ and Point.Year_. Key is composite because it needs to disable entering grades more then once a month.
My question is:
How to make a foreign key from Point to Worker so that any superior can have more then one employee assigned to him?
If you look closely, my implementation works but it can have only one employee per manager.
That is because of a fact that a foreign key has to reference either the primary or the unique column from other table. And my Worker.Superior is UNIQUE, so it can have only unique values (no repetition).
I think many people will find this example interesting as it is a common problem when making a new database.
I think your FK_Point_Worker_2 should also have References Worker(OIB), and you should remove the UNIQUE constraint from Worker.Superior. That way a superior can have more than one worker assigned to him.
Think about it. You have unique constraint on SUPERIOR and you are confused as to why two employees cannot have the same SUPERIOR. That is what a unique constraint does - not allow duplicates.
A FK can only reference a unique column or columns.
A FK_Point_Worker_2 with a References Worker(OIB) does not assure OIB is a Superior.
I would add a unique constraint on Worker on (OIB, SUPERIOR)
and remove the unique constraint on SUPERIOR.
It will always be unique as OIB is unique.
Then have composite FK relationship.
This is an example of a composite FK relationship
ALTER TABLE [dbo].[wfBchFolder] WITH CHECK ADD CONSTRAINT [FK_wfBchFolder_wfBch] FOREIGN KEY([wfID], [bchID])
REFERENCES [dbo].[WFbch] ([wfID], [ID])
GO