uuid as primary key and partition key for hash partitioning - postgresql

I'm setting up a hash-partitioned table in PostgreSQL 12 that will have 256 partitions. I'm using a uuid column as the table's primary key. Is it acceptable to use the same uuid column as the hash partition key?

Not only acceptable, but required: if the uuid column alone is the primary key, the partition key cannot be any other column.
Per the docs, 11.6. Unique Indexes:
"PostgreSQL automatically creates a unique index when a unique constraint or primary key is defined for a table. The index covers the columns that make up the primary key or unique constraint (a multicolumn index, if appropriate), and is the mechanism that enforces the constraint."
Also per the docs, 5.11.2.3. Limitations:
The following limitations apply to partitioned tables:
"Unique constraints on partitioned tables must include all the partition key columns. This limitation exists because PostgreSQL can only enforce uniqueness in each partition individually."
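As a rough illustration of the setup described in the question, the following sketch creates a table that is hash-partitioned on its uuid primary key and then generates the 256 partitions in a loop. The table name items, the column payload, and the partition names items_p0 through items_p255 are made up for this example; only the uuid primary key and the partition count come from the question.

```sql
-- Hypothetical table: the primary key column is also the hash partition key,
-- so the primary key trivially includes all partition key columns.
CREATE TABLE items (
    id      uuid PRIMARY KEY,
    payload text
) PARTITION BY HASH (id);

-- Create one partition per remainder 0..255 (names are illustrative).
DO $$
BEGIN
    FOR r IN 0..255 LOOP
        EXECUTE format(
            'CREATE TABLE items_p%s PARTITION OF items
                 FOR VALUES WITH (MODULUS 256, REMAINDER %s)',
            r, r);
    END LOOP;
END
$$;

-- Count the partitions attached to the parent table.
SELECT count(*) FROM pg_inherits WHERE inhparent = 'items'::regclass;
```

Because every row with a given id always hashes to the same partition, the per-partition unique indexes are enough to guarantee that id is unique across the whole table.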

Related

Postgresql partition table unique index problem

I'm on PostgreSQL 14.
I have this table:
CREATE TABLE sometable (
id integer NOT NULL PRIMARY KEY UNIQUE ,
a integer NOT NULL DEFAULT 1,
b varchar(32) UNIQUE)
PARTITION BY RANGE (id);
But when I try to execute it, I get:
ERROR: unique constraint on partitioned table must include all partitioning columns
If I execute the same table definition without PARTITION BY RANGE (id) and check the indexes, I get:
tablename, indexname, indexdef
sometable, sometable_b_key, CREATE UNIQUE INDEX sometable_b_key ON public.sometable USING btree (b)
sometable, sometable_pkey, CREATE UNIQUE INDEX sometable_pkey ON public.sometable USING btree (id)
So the unique constraints do exist.
What's the problem, and how can I fix it?
On partitioned tables, all primary keys, unique constraints and unique indexes must contain all the partitioning columns. That is because indexes on partitioned tables are implemented by individual indexes on each partition, and there is no way to enforce uniqueness across different indexes. In your table, the primary key on id is fine; the constraint that triggers the error is UNIQUE on b, which does not include id.
If you want to use partitioning, you have to sacrifice some consistency guarantees. There is no way around that. What you can do is create unique constraints on the individual partitions. That will guarantee uniqueness within each partition, but not global uniqueness.
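A minimal sketch of that workaround, reusing the table from the question: the table-level UNIQUE on b is dropped, and a unique constraint is added to one partition instead. The partition name sometable_p0, its range bounds, and the constraint name are assumptions chosen for illustration.

```sql
-- The primary key on id is allowed because id is the partition key.
CREATE TABLE sometable (
    id integer NOT NULL PRIMARY KEY,
    a  integer NOT NULL DEFAULT 1,
    b  varchar(32)
) PARTITION BY RANGE (id);

-- Hypothetical partition covering ids 0 .. 999999.
CREATE TABLE sometable_p0 PARTITION OF sometable
    FOR VALUES FROM (0) TO (1000000);

-- Uniqueness of b is enforced inside this partition only;
-- the same value of b can still appear in another partition.
ALTER TABLE sometable_p0
    ADD CONSTRAINT sometable_p0_b_key UNIQUE (b);
```

Each additional partition would need its own constraint on b, and none of them prevents the same value from being stored in two different partitions.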
This limitation is also mentioned in the docs, 5.11.2.3. Limitations:
The following limitations apply to partitioned tables:
Unique constraints (and hence primary keys) on partitioned tables must
include all the partition key columns. This limitation exists because
the individual indexes making up the constraint can only directly
enforce uniqueness within their own partitions; therefore, the
partition structure itself must guarantee that there are not
duplicates in different partitions.
There is no way to create an exclusion constraint spanning the whole
partitioned table. It is only possible to put such a constraint on
each leaf partition individually. Again, this limitation stems from
not being able to enforce cross-partition restrictions.
https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE

Unique Index on partition table issue

How can we prevent duplicate inserts after partitioning a Postgres table?
Because the unique constraint has to include the partition key, the non-key attribute can now contain duplicates.
Example: ID date
1 1-01-2022
1 02-01-2022
To make ID unique, is a BEFORE INSERT trigger the only option, or are there other ways?
You cannot have a unique constraint on a partitioned table that does not contain the partitioning key. A trigger won't be a reliable solution either, because it would be subject to race conditions (concurrent inserts) unless you are running with the SERIALIZABLE isolation level.
The strongest constraint you can have is a unique constraint on each individual partition.
Beyond that, the best you can do is fill the values from a sequence, so that the values are automatically unique, for example with an identity column.
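A minimal sketch of the sequence approach, assuming a table partitioned by a date column. The names events, events_id_seq, created and events_2022_01 are hypothetical, and an explicit sequence default is used here in place of the identity column mentioned above.

```sql
-- Hypothetical sequence that supplies the id values.
CREATE SEQUENCE events_id_seq;

CREATE TABLE events (
    id      bigint NOT NULL DEFAULT nextval('events_id_seq'),
    created date   NOT NULL,
    -- The primary key must include the partition key column "created".
    PRIMARY KEY (id, created)
) PARTITION BY RANGE (created);

CREATE TABLE events_2022_01 PARTITION OF events
    FOR VALUES FROM ('2022-01-01') TO ('2022-02-01');

-- id is omitted, so each row draws a fresh value from the sequence.
INSERT INTO events (created) VALUES ('2022-01-01'), ('2022-01-02');
```

Note that this makes id unique by convention only: nothing stops a client from supplying an explicit id that already exists in another partition.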

TimescaleDB/PostgreSQL: how to use unique constraint when creating hypertables?

I am trying to create a table in PostgreSQL that will contain lots of data, and for that reason I want to use a TimescaleDB hypertable, as in the example below.
CREATE TABLE "datapoints" (
"tstz" timestamptz NOT NULL,
"id" bigserial UNIQUE NOT NULL,
"entity_id" bigint NOT NULL,
"value" real NOT NULL,
PRIMARY KEY ("id", "tstz", "entity_id")
);
SELECT create_hypertable('datapoints','tstz');
However, this throws the error shown below. As far as I can tell, the error arises because the unique constraint isn't allowed in hypertables, but I really need the uniqueness. Does anyone have an idea how to solve it or work around it?
ERROR: cannot create a unique index without the column "tstz" (used in partitioning)
SQL state: TS103
There is no way to avoid that.
TimescaleDB hypertables are partitioned on the time column, and, as with PostgreSQL partitioning, it is not possible to have a primary key or unique constraint on a partitioned table that does not contain the partitioning key.
The reason behind that is that an index on a partitioned table consists of individual indexes on the partitions (these are the partitions of the partitioned index). Now the only way to guarantee uniqueness for such a partitioned index is to have the uniqueness implicit in the definition, which is only the case if the partitioning key is part of the index.
So you either have to sacrifice the uniqueness constraint on id (which is pretty much given if you use a sequence) or you have to do without partitioning.
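A sketch of the first option, based on the table from the question: the standalone UNIQUE on "id" is removed, and the composite primary key, which already contains "tstz", is kept. Treat it as an illustration rather than a verified fix; the values of "id" still come from the bigserial sequence.

```sql
CREATE TABLE "datapoints" (
    "tstz"      timestamptz NOT NULL,
    "id"        bigserial   NOT NULL,  -- no standalone UNIQUE here
    "entity_id" bigint      NOT NULL,
    "value"     real        NOT NULL,
    -- Includes the partitioning column "tstz", so it is permitted.
    PRIMARY KEY ("id", "tstz", "entity_id")
);

SELECT create_hypertable('datapoints', 'tstz');
```

With this definition the database only guarantees that the combination of "id", "tstz" and "entity_id" is unique; uniqueness of "id" on its own relies on every insert taking its value from the sequence.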

Must the search key of a primary index be or related to the primary key?

From https://stackoverflow.com/a/51087864/3284469:
primary keys can be primary indices.
Must the search key of a primary index be, or be related to, the primary key? Will the answer be different in PostgreSQL and other DBMSs? Thanks.
Postgres doesn't have a "primary index". All indexes are implemented the same way and point directly to the data rows.
Must the search key of a primary index be, or be related to, the primary key?
It must be a search on the expression used to form the primary index.
If the primary index is constrained to be an index on the primary key, then yes; otherwise, no.
Will the answer be different in PostgreSQL and other DBMSs?
Yes, because PostgreSQL does not have a primary index, although a clustered index is a bit like one. The clustered index can be an index on any expression; it need not reference the primary key at all.
With PostgreSQL there is no requirement that a table have any index. But if you want to define foreign key relations between tables, the referenced columns need a primary key or unique constraint, and that constraint is backed by an index.
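As a small illustration of clustering on an index that has nothing to do with the primary key, the sketch below uses a hypothetical orders table. Note that CLUSTER in PostgreSQL physically reorders the table once; the order is not maintained for rows inserted or updated afterwards.

```sql
-- Hypothetical table; only the CLUSTER command itself is the point here.
CREATE TABLE orders (
    id          bigint      PRIMARY KEY,
    customer_id bigint      NOT NULL,
    placed_at   timestamptz NOT NULL
);

-- A secondary index that does not reference the primary key.
CREATE INDEX orders_placed_at_idx ON orders (placed_at);

-- Rewrite the table in the order of the secondary index.
CLUSTER orders USING orders_placed_at_idx;
```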

Postgres - unique index on primary key

On Postgres, a unique index is automatically created for primary key columns. From the docs,
When an index is declared unique, multiple table rows with equal
indexed values are not allowed. Null values are not considered equal.
A multicolumn unique index will only reject cases where all indexed
columns are equal in multiple rows.
From my understanding, it seems like this index only checks uniqueness and isn't actually present for faster access when querying by primary key IDs. Does this mean that the index structure doesn't consist of a sorted table (or a tree) for the primary key column? Is this correct?
In theory a unique or primary key constraint could be enforced without the presence of an index, but it would be a painful process. The index is mainly there for performance purposes.
However some databases (eg Oracle) allow a unique or primary key constraint to be supported by a non-unique index. Primarily this allows the enforcement of the constraint to be deferred until the end of a transaction, so lack of uniqueness can be permitted temporarily during a transaction, but also allows indexes to be built in parallel and with the constraint then defined as a secondary step.
Also, I'm not sure how the internals of a PostgreSQL B-tree index work, but all Oracle B-tree indexes are internally declared to be unique, either:
on the key column(s), for an index that is intended to be UNIQUE, or
on the key column(s) plus the indexed row's ROWID, for a non-unique index.
Quite the contrary: the index is created in order to allow faster access, mainly to check for duplicates when a new record is inserted, but it can also be used by other queries against the PK columns. The best structure for unique key indexes is a B-tree, because the index is updated during the insert: if the RDBMS detects a collision in the leaf, it raises a unique constraint violation.
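One way to check this on a PostgreSQL instance is to look at the index the primary key creates and at the plan of a lookup by key. The table accounts below is hypothetical; pg_indexes and EXPLAIN are standard PostgreSQL features.

```sql
-- Hypothetical table with a single-column primary key.
CREATE TABLE accounts (
    id   bigint PRIMARY KEY,
    name text
);

-- Show the index that backs the primary key; indexdef names the
-- access method (btree) and the indexed column.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'accounts';

-- Show how a lookup by primary key would be executed. The planner
-- can use the primary key index here, though the plan it actually
-- picks depends on table size and statistics.
EXPLAIN SELECT * FROM accounts WHERE id = 42;
```

The indexdef column should show an ordinary unique B-tree index, the same kind of structure that serves regular lookups, rather than a special uniqueness-only structure.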