Single primary key with BIGINT vs compound primary key including foreign keys - database-schema

I have a parent-child relationship between two tables:
CREATE TABLE symbol (
id INT,
name VARCHAR(50)
);
CREATE TABLE time_series (
id INT,
actual_date DATE,
symbol_id INT,
data1 FLOAT,
data2 VARCHAR(50)
);
time_series has a symbol_id foreign key to symbol, and a unique date column, actual_date.
Now at the moment time_series has an INT primary key column but I have been blitzing the app with load tests and almost reached the limit of available primary keys.
It seems my alternatives are either to increase the size of the primary key column to BIGINT, or to drop the id column and create the primary key on symbol_id and actual_date.
I've read that a BIGINT column primary key is slower than an INT column, and performance is a factor in my decision-making. How much of a hit would I take using BIGINT? Does a compound primary key also perform slower?
In terms of other factors, this time_series table will never have any child tables, so there would be no inconvenient compound foreign keys.
Read this already: Sql Data Type for Primary Key - SQL Server?
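To put the key-space concern in numbers, here is a rough sketch of how long each type lasts when keys are only ever incremented. The insert rate is an assumption chosen for illustration, not a figure from the question; the range limits are those of signed 32-bit and 64-bit integers (4 and 8 bytes of storage per key).

```python
# Largest positive values of a signed 32-bit INT and a signed 64-bit BIGINT.
INT_MAX = 2**31 - 1
BIGINT_MAX = 2**63 - 1

# Assumption for illustration only: a sustained load-test insert rate.
ASSUMED_ROWS_PER_SECOND = 1_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_exhaust(max_key, rows_per_second=ASSUMED_ROWS_PER_SECOND):
    """Years until an incrementing key starting near 1 reaches max_key."""
    return max_key / (rows_per_second * SECONDS_PER_YEAR)


int_years = years_to_exhaust(INT_MAX)
bigint_years = years_to_exhaust(BIGINT_MAX)

print(f"INT lasts about {int_years * 365:.0f} days at the assumed rate")
print(f"BIGINT lasts about {bigint_years:.2e} years at the assumed rate")
```

The sketch says nothing about query speed; it only shows that BIGINT removes the exhaustion problem at the cost of 4 extra bytes per key value.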

Related

Do I need to declare a unique constraint for `bigint generated always as identity`?

I'm creating a multi-tenant application and am prepending the tenant_id to all tables that my tenants will access. All of the tables will also have an incrementing surrogate key. Will I need to declare a unique constraint of the surrogate key or is that redundant?
CREATE TABLE tenant (
primary key (tenant_id),
tenant_id bigint generated always as identity
);
CREATE TABLE person (
primary key (tenant_id, person_id),
person_id bigint generated always as identity,
tenant_id bigint not null,
unique (person_id), -- Do I need this?
foreign key (tenant_id) references tenant
);
The primary key of a table should be a minimal set of columns that uniquely identify a table row. So that should be person_id, as it was specifically created for that purpose.
Add another (non-unique) index on tenant_id or (tenant_id, person_id) if you need to speed up searches based on tenant_id.

How to use constraints on ranges with a junction table?

Based on the documentation it's pretty straightforward how to prevent any overlapping reservations in the table at the same time.
CREATE EXTENSION btree_gist;
CREATE TABLE room_reservation (
room text,
during tsrange,
EXCLUDE USING GIST (room WITH =, during WITH &&)
);
However, when you have multiple resources that can be reserved by users, what is the best approach to check for overlaps? You can see below that I want users to be able to reserve multiple resources. That's why I'm using the junction table Resources_Reservations. Is there any way I can use EXCLUDE to check that no resource is reserved twice at the same time?
CREATE TABLE Users(
id serial primary key,
name text
);
CREATE TABLE Resources(
id serial primary key,
name text
);
CREATE TABLE Reservations(
id serial primary key,
duration tstzrange,
user_id serial,
FOREIGN KEY (user_id) REFERENCES Users(id)
);
CREATE TABLE Resources_Reservations(
resource_id serial,
reservation_id serial,
FOREIGN KEY (resource_id) REFERENCES Resources(id),
FOREIGN KEY (reservation_id) REFERENCES Reservations(id),
PRIMARY KEY (resource_id, reservation_id)
);
I think what you want is doable with a slight model change.
But first, let's correct a misconception. You have the foreign key columns (user_id, resource_id, etc.) defined as SERIAL. This is incorrect; they should be INTEGER, because SERIAL is not actually a data type. It is a pseudo-type that is shorthand for creating a sequence, creating a column of type integer, and setting the sequence as the column's default value. With that out of the way, on to the model.
I think your Resources_Reservations table is redundant. A reservation is made by a user, but a reservation without something reserved would just be user information. Bring the resource_id into Reservations. Now a reservation is by a user, for a resource, with a duration: everything your current model contains, with less complexity.
Assuming you don't have data that needs saving, then:
create table users(
id serial primary key,
name text
);
create table resources(
id serial primary key,
name text
);
create table reservations(
user_id integer,
resource_id integer,
duration tstzrange,
foreign key (user_id) references users(id),
foreign key (resource_id) references resources(id),
primary key (resource_id, user_id)
);
You should now be able to create your GIST exclusion.
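As a rough illustration of what such an exclusion would reject, the sketch below mimics the rule in plain Python. It assumes the constraint takes the form EXCLUDE USING GIST (resource_id WITH =, duration WITH &&), by analogy with the room_reservation example in the question, and that ranges use the default half-open bounds [start, end). The function names are hypothetical, and this is not how PostgreSQL evaluates the constraint internally.

```python
from datetime import datetime, timezone


def overlaps(a_start, a_end, b_start, b_end):
    """Half-open [start, end) overlap test, the assumed meaning of && here."""
    return a_start < b_end and b_start < a_end


def violates_exclusion(existing, new):
    """True if `new` clashes with a stored row: same resource AND overlapping time.

    Rows are (resource_id, start, end) tuples; hypothetical helper for illustration.
    """
    resource_id, start, end = new
    return any(
        r == resource_id and overlaps(s, e, start, end)
        for r, s, e in existing
    )


def at(hour, minute=0):
    # Fixed illustrative date; only the time of day varies.
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


existing = [(1, at(9), at(10))]

same_resource_overlap = violates_exclusion(existing, (1, at(9, 30), at(10, 30)))
back_to_back = violates_exclusion(existing, (1, at(10), at(11)))
other_resource = violates_exclusion(existing, (2, at(9, 30), at(10, 30)))
```

Under these assumptions only the first candidate is rejected: a reservation that starts exactly when the previous one ends does not overlap, and a different resource is never in conflict.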

Unique constraint that includes serial primary key in postgresql

I have PostgreSQL tables with the following layout:
create table bar(
bar_id serial primary key,
... other columns
)
create table foo(
foo_id serial primary key,
bar_id bigint not null,
... other columns
)
create table baz(
baz_id serial primary key,
foo_id bigint not null references foo(foo_id),
bar_id bigint not null references bar(bar_id),
constraint fk_fb foreign key (foo_id, bar_id) references foo(foo_id, bar_id)
)
I want to refer to both foo_id and bar_id in another table (baz) and have a foreign key constraint, so I need to add a unique constraint to (foo_id, bar_id). The fact that foo_id is a primary key guarantees that the combination of foo_id and bar_id is unique, even if every single value of bar_id is the same. My question is if there is a performance hit to adding the unique constraint on (foo_id, bar_id), or if postgresql is smart enough to know that the fact that foo_id is unique across the table by virtue of being the primary key means that there is no need to do anything with bar_id.
The table foo contains rows that are not present in baz, so dropping the bar_id from the foo table won't work.
There is a performance penalty for adding another UNIQUE constraint, because such a constraint is implemented with an index that needs to be updated for every data modification on the table.
One thing that you could consider if DML performance on this table is at a premium is defining the primary key over both columns. Then you would lose the uniqueness guarantee for foo_id, but you don't have to pay the price for an extra index.
Perhaps you can also come up with an alternative data model that does not require you to reference both columns with a foreign key, like GMB suggested in his answer.
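The behaviour of the composite foreign key can be tried out in a small sketch. It uses Python's built-in sqlite3 module as a stand-in, so it is SQLite rather than PostgreSQL, and the column lists are trimmed to the keys; treat it as an illustration of the constraint, not a measurement of the index cost discussed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE bar (bar_id INTEGER PRIMARY KEY);
CREATE TABLE foo (
    foo_id INTEGER PRIMARY KEY,
    bar_id INTEGER NOT NULL,
    UNIQUE (foo_id, bar_id)       -- the extra constraint the question asks about
);
CREATE TABLE baz (
    baz_id INTEGER PRIMARY KEY,
    foo_id INTEGER NOT NULL,
    bar_id INTEGER NOT NULL,
    FOREIGN KEY (foo_id, bar_id) REFERENCES foo (foo_id, bar_id)
);
INSERT INTO bar VALUES (1), (2);
INSERT INTO foo VALUES (10, 1);
""")


def try_insert(foo_id, bar_id):
    """Return True if baz accepts the pair, False if the foreign key rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO baz (foo_id, bar_id) VALUES (?, ?)",
                (foo_id, bar_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


matching = try_insert(10, 1)    # the pair (10, 1) exists in foo
mismatched = try_insert(10, 2)  # foo 10 belongs to bar 1, not bar 2
```

The second insert is rejected because the pair, not just foo_id, must exist in foo, which is the guarantee the two-column foreign key buys.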

Postgresql sharding with citus extension not working

I am using PostgreSQL with the Citus extension for sharding and am unable to shard tables like the one below.
The table below has a primary key and 2 unique keys. I am trying to shard on the primary key column, i.e. pid.
Note: I am not allowed to change the table structure. These tables are created by a tool.
CREATE TABLE person
(
pid bigint NOT NULL,
name character varying(100),
address_pid bigint NOT NULL,
address_type character varying(100),
CONSTRAINT id_pkey PRIMARY KEY (pid),
CONSTRAINT addr_id UNIQUE (address_pid),
CONSTRAINT addr_type_id UNIQUE (address_type, address_pid)
);
This is my sharding query:
select create_distributed_table('person', 'pid');
The error it throws is:
Error: Distributed relations cannot have UNIQUE, EXCLUDE, or PRIMARY KEY constraints that do not include the partition column
Can anyone help me with sharding these kinds of tables?
@CraigKerstiens An addition to this question:
How should sharding be handled when there are multiple foreign keys, as in this table?
CREATE TABLE table
(
pid bigint NOT NULL,
search_order integer NOT NULL,
resource_pid bigint NOT NULL,
search_pid bigint NOT NULL,
CONSTRAINT hfj_search_result_pkey PRIMARY KEY (pid),
CONSTRAINT idx_searchres_order UNIQUE (search_pid, search_order),
CONSTRAINT fk_searchres_res FOREIGN KEY (resource_pid)
REFERENCES public.table1 (res_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION,
CONSTRAINT fk_searchres_search FOREIGN KEY (search_pid)
REFERENCES public.table2 (pid) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION
)
Assuming that table1 and table2 are already sharded.
Within Citus at this time you cannot have a unique constraint that doesn't include the column you are partitioning on. In this case, it would be possible to enforce that addresses are unique per person id, but not globally unique. To do that you could:
CREATE TABLE person
(
pid bigint NOT NULL,
name character varying(100),
address_pid bigint NOT NULL,
address_type character varying(100),
CONSTRAINT id_pkey PRIMARY KEY (pid),
CONSTRAINT addr_id UNIQUE (pid, address_pid),
CONSTRAINT addr_type_id UNIQUE (pid, address_type, address_pid)
);
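The rule quoted in the error message can be expressed as a simple check over constraint column lists. The helper below is hypothetical and is not part of Citus; it only restates the rule so the two table definitions can be compared side by side.

```python
def constraints_missing_column(constraints, distribution_column):
    """Names of constraints whose column list omits the distribution column.

    Hypothetical helper for illustration; Citus performs its own validation.
    """
    return [
        name
        for name, columns in constraints.items()
        if distribution_column not in columns
    ]


# Constraint definitions copied from the original table in the question.
original = {
    "id_pkey": ["pid"],
    "addr_id": ["address_pid"],
    "addr_type_id": ["address_type", "address_pid"],
}

# Constraint definitions from the rewritten table in the answer.
rewritten = {
    "id_pkey": ["pid"],
    "addr_id": ["pid", "address_pid"],
    "addr_type_id": ["pid", "address_type", "address_pid"],
}

offending_before = constraints_missing_column(original, "pid")
offending_after = constraints_missing_column(rewritten, "pid")
```

Note that the rewritten constraints are weaker than the originals: because pid is already unique, adding it to each list means address_pid is no longer required to be unique on its own.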

Will a primary key index serve as an index for a foreign key when fk columns are subset of pk?

I have a table where part of the primary key is a foreign key to another table.
create table player_result (
event_id integer not null,
pub_time timestamp not null,
name_key varchar(128) not null,
email_address varchar(128),
withdrawn boolean not null,
place integer,
realized_values hstore,
primary key (event_id, pub_time, name_key),
foreign key (email_address) references email(address),
foreign key (event_id, pub_time) references event_publish(event_id, pub_time));
Will the index generated for the primary key suffice to back the foreign key on event_id and pub_time?
Yes.
Index A,B,C
is good for:
A
A,B
A,B,C (and any other combination of the full 3 fields, if default order is unimportant)
but not good for other combinations (such as B,C, C,A etc.).
It will be useful for the referencing side, such that a DELETE or UPDATE on the referenced table can use the PRIMARY KEY of the referencing side as an index when performing checks for the existence of referencing rows or running cascade update/deletes. PostgreSQL doesn't require this index to exist at all, it just makes foreign key constraint checks faster if it is there.
It is not sufficient to serve as the unique constraint for a reference to those columns. You couldn't create a FOREIGN KEY that REFERENCES player_result(event_id, pub_time) because there is no unique constraint on those columns. That pair can appear multiple times in the table so long as each pair has a different name_key.
As @xagyg accurately notes, the unique b-tree index that backs the primary key is also only useful for lookups on columns from the left of the index. It could not be used for a lookup on (pub_time, name_key) or on just name_key, for example.
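The leftmost-prefix behaviour can be observed in a small sketch. It uses Python's built-in sqlite3 module as a stand-in, so the planner is SQLite's rather than PostgreSQL's, and the table is trimmed to the key columns plus one payload column. The exact plan text varies between SQLite versions, so the code only inspects how each plan starts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE player_result (
    event_id INTEGER NOT NULL,
    pub_time TEXT    NOT NULL,
    name_key TEXT    NOT NULL,
    place    INTEGER,
    PRIMARY KEY (event_id, pub_time, name_key)
)
""")


def plan(where_clause):
    """Return SQLite's query plan text for a lookup with the given WHERE clause."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT place FROM player_result WHERE " + where_clause
    ).fetchall()
    # The human-readable plan detail is the last column of each row.
    return " | ".join(row[-1] for row in rows)


# Leading columns of the primary key: the index can be searched.
prefix_plan = plan("event_id = 1 AND pub_time = '2020-01-01'")

# Trailing column only: no usable prefix, so the table is scanned.
suffix_plan = plan("name_key = 'x'")

print(prefix_plan)
print(suffix_plan)
```

In this setup the first plan is an index search and the second is a full scan, which matches the A / A,B / A,B,C rule described above.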