PostgreSQL - Optimize ordering of columns in table range partitioning with a multiple-column range

I am testing creating a data warehouse for a relatively big dataset. Based on a ~10% sample of the data, I decided to partition some tables that are expected to exceed memory, which is currently 16 GB.
Based on the recommendation in the PostgreSQL docs:
These benefits will normally be worthwhile only when a table would otherwise be very large. The exact point at which a table will benefit from partitioning depends on the application, although a rule of thumb is that the size of the table should exceed the physical memory of the database server.
One particular table I am not sure how to partition. This table is frequently queried in 2 different ways, with a WHERE clause that may include the primary key OR another indexed column, so I figured I need a range partition using the existing primary key and the other column (with the other column added to the primary key).
Knowing that the order of columns matters, and given the information below, my question is:
What is the best order for primary key and range partitioning columns?
Original table:
CREATE TABLE items (
    item_id BIGSERIAL NOT NULL, -- primary key
    src_doc_id bigint NOT NULL, -- every item can exist in one src_doc only and a src_doc can have multiple items
    item_name character varying(50) NOT NULL, -- used in `WHERE` clause with src_doc_id and guaranteed to be unique from source
    attr_1 bool,
    attr_2 bool, -- +15 other columns, all bool or integer types
    PRIMARY KEY (item_id)
);
CREATE INDEX index_items_src_doc_id ON items USING btree (src_doc_id);
CREATE INDEX index_items_item_name ON items USING hash (item_name);
Table size for 10% of the dataset is ~2 GB (result of pg_total_relation_size) with 3M+ rows. Loading and querying performance is excellent, but given that this table is expected to grow to 30M rows and ~20 GB, I do not know what to expect in terms of performance.
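For concreteness, the two access patterns might look like the following (a hypothetical sketch; the literal values are placeholders):
-- Load path: look up an existing item by src_doc_id and item_name.
SELECT item_id
FROM items
WHERE src_doc_id = 42
  AND item_name = 'example-item';

-- Analytics path: look up item information by primary key.
SELECT *
FROM items
WHERE item_id = 123456;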
Partitioned table being considered:
CREATE TABLE items (
    item_id BIGSERIAL NOT NULL,
    src_doc_id bigint NOT NULL,
    item_name character varying(50) NOT NULL,
    attr_1 bool,
    attr_2 bool,
    PRIMARY KEY (item_id, src_doc_id) -- should the order be reversed?
) PARTITION BY RANGE (item_id, src_doc_id); -- should the order be reversed?
CREATE INDEX index_items_src_doc_id ON items USING btree (src_doc_id);
CREATE INDEX index_items_item_name ON items USING hash (item_name);
-- Ranges are not initially known, so MAXVALUE is used as the upper bound.
-- When the upper bound is known, the partition is detached and reattached
-- using the known upper bound, and a new partition is added for the next range.
CREATE TABLE items_00 PARTITION OF items FOR VALUES FROM (MINVALUE, MINVALUE) TO (MAXVALUE, MAXVALUE);
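That detach/reattach roll-over might look like the sketch below (the boundary value 1000000 is purely illustrative):
-- Once the real upper bound of items_00 is known (assumed here to be
-- item_id = 1000000), re-bound it and open a new partition for the next range.
BEGIN;
ALTER TABLE items DETACH PARTITION items_00;
ALTER TABLE items ATTACH PARTITION items_00
    FOR VALUES FROM (MINVALUE, MINVALUE) TO (1000000, MINVALUE);
CREATE TABLE items_01 PARTITION OF items
    FOR VALUES FROM (1000000, MINVALUE) TO (MAXVALUE, MAXVALUE);
COMMIT;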
Table usage
On loading data, the load process (a Python script) looks up existing items based on src_doc_id and item_name and stores item_id, so it does not reinsert existing items. item_id gets referenced in a lot of other tables; no foreign keys are used.
On querying for analytics, item information is always looked up based on item_id.
So I can't decide on the suitable column order for the table's PRIMARY KEY and PARTITION BY RANGE.
Should it be (item_id, src_doc_id) or (src_doc_id, item_id)?

Related

Postgresql partition table unique index problem

Postgres 14.
I have this table:
CREATE TABLE sometable (
    id integer NOT NULL PRIMARY KEY UNIQUE,
    a integer NOT NULL DEFAULT 1,
    b varchar(32) UNIQUE
) PARTITION BY RANGE (id);
But when I try to execute it, I get:
ERROR: unique constraint on partitioned table must include all partitioning columns
If I execute the same table definition without PARTITION BY RANGE (id) and check the indexes, I get:
tablename  indexname        indexdef
sometable  sometable_b_key  CREATE UNIQUE INDEX sometable_b_key ON public.sometable USING btree (b)
sometable  sometable_pkey   CREATE UNIQUE INDEX sometable_pkey ON public.sometable USING btree (id)
So... the unique constraints exist.
What's the problem? How can I fix it?
On partitioned tables, all primary keys, unique constraints and unique indexes must contain the partition expression. That is because indexes on partitioned tables are implemented by individual indexes on each partition, and there is no way to enforce uniqueness across different indexes.
If you want to use partitioning, you have to sacrifice some consistency guarantees. There is no way around that. What you can do is create unique constraints on the partitions. That will guarantee uniqueness within each partition, but not global uniqueness.
This limitation is also mentioned in the docs
5.11.2.3. Limitations
The following limitations apply to partitioned tables:
Unique constraints (and hence primary keys) on partitioned tables must include all the partition key columns. This limitation exists because the individual indexes making up the constraint can only directly enforce uniqueness within their own partitions; therefore, the partition structure itself must guarantee that there are not duplicates in different partitions.
There is no way to create an exclusion constraint spanning the whole partitioned table. It is only possible to put such a constraint on each leaf partition individually. Again, this limitation stems from not being able to enforce cross-partition restrictions.
https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE
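A hedged sketch of the two workarounds described above (names and bounds are illustrative):
-- Workaround 1: widen the unique constraint so it includes the partition
-- column. Note this only makes (b, id) pairs unique, not b on its own.
CREATE TABLE sometable (
    id integer NOT NULL,
    a integer NOT NULL DEFAULT 1,
    b varchar(32),
    PRIMARY KEY (id),
    UNIQUE (b, id)
) PARTITION BY RANGE (id);

-- Workaround 2: enforce uniqueness of b within each partition only.
CREATE TABLE sometable_p0 PARTITION OF sometable
    FOR VALUES FROM (0) TO (1000000);
CREATE UNIQUE INDEX sometable_p0_b_key ON sometable_p0 (b);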

Converting PostgreSQL table to TimescaleDB hypertable

I have a PostgreSQL table which I am trying to convert to a TimescaleDB hypertable.
The table looks as follows:
CREATE TABLE public.data
(
    event_time timestamp with time zone NOT NULL,
    pair_id integer NOT NULL,
    entry_id bigint NOT NULL,
    event_data int NOT NULL,
    CONSTRAINT con1 UNIQUE (pair_id, entry_id),
    CONSTRAINT pair_id_fkey FOREIGN KEY (pair_id)
        REFERENCES public.pairs (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
);
When I attempt to convert this table to a TimescaleDB hypertable using the following command:
SELECT create_hypertable(
'data',
'event_time',
chunk_time_interval => INTERVAL '1 hour',
migrate_data => TRUE
);
I get the error: ERROR: cannot create a unique index without the column "event_time" (used in partitioning)
Question 1: From this post How to convert a simple postgresql table to hypertable or timescale db table using created_at for indexing my understanding is that this is because I have specified a unique constraint (con1) which does not contain the column I am partitioning by - event_time. Is that correct?
Question 2: How should I change my table or hypertable to be able to convert it? I have added some notes on how I plan to use the data and on the structure of the data below.
Data Properties and usage:
There can be multiple entries with the same event_time - those entries would have entry_id's which are in sequence
This means that if I have 2 entries (event_time 2021-05-18::10:16, id 105, <some_data>) and (event_time 2021-05-18::10:16, id 107, <some_data>) then the entry with id 106 would also have event_time 2021-05-18::10:16
The entry_id is not generated by me and I use the unique constraint con1 to ensure that I am not inserting duplicate data
I will query the data mainly on event_time e.g. to create plots and perform other analysis
At this point the database contains around 4.6 billion rows but should contain many more soon
I would like to take advantage of TimescaleDB's speed and good compression
I don't care too much about insert performance
Solutions I have been considering:
Pack all the events which have the same timestamp into an array somehow and keep them in one row. I think this would have downsides for compression and provide less flexibility in querying the data. Also, I would probably end up having to unpack the data on each query.
Remove the unique constraint con1 - then how do I ensure that I don't add the same row twice?
Expand unique constraint con1 to include event_time - would that not somehow decrease performance, while at the same time opening up the error case where I accidentally insert 2 rows with the same entry_id and pair_id but different event_time? (I doubt this is a likely thing to happen though)
You understand correctly that UNIQUE (pair_id, entry_id) doesn't allow creating a hypertable from the table, since unique constraints need to include the partition key, i.e., event_time in your case.
I don't follow how the first option, where records with the same timestamp are packed into a single record, will help with uniqueness.
Removing the unique constraint will allow creating the hypertable, but as you mentioned you will lose the possibility to check the constraint.
Adding the time column, e.g., UNIQUE (pair_id, entry_id, event_time), is quite a common approach, but it allows inserting duplicates with different timestamps, as you mentioned. It will perform worse than option 2 during inserts. You can replace the index on event_time (which you need, since you query on this column, and which is created automatically by TimescaleDB) with a unique index, so you save a little bit, e.g.:
CREATE UNIQUE INDEX indx ON data (event_time, pair_id, entry_id);
Manually create a unique constraint on each chunk table. This will guarantee uniqueness within the chunk, but it will still be possible to have duplicates in different chunks. The main drawback is that you will need to figure out how to create it whenever a new chunk is created.
Unique constraints without the partition key are not supported in TimescaleDB, since checking uniqueness would require accessing all existing chunks, which would kill performance (or it would require a global index, which can be large). I don't think it is a common case for time-series data to have unique constraints, as they are usually related to artificially generated counter-based identifiers.
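As a sketch of that common third option, reusing the conversion command from the question (constraint and table names as above):
-- Replace the original constraint with one that includes the time dimension,
-- then the conversion succeeds.
ALTER TABLE public.data DROP CONSTRAINT con1;
ALTER TABLE public.data ADD CONSTRAINT con1 UNIQUE (pair_id, entry_id, event_time);

SELECT create_hypertable(
    'data',
    'event_time',
    chunk_time_interval => INTERVAL '1 hour',
    migrate_data => TRUE
);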

How to create TimescaleDB Hypertable with time partitioning on non unique timestamp?

I have just started to use TimescaleDB and want to create a hypertable on a table with events.
Originally I thought of following the conventional pattern of:
CREATE TABLE event (
    id serial PRIMARY KEY,
    ts timestamp with time zone NOT NULL,
    details varchar(255) NOT NULL
);
CREATE INDEX event_ts_idx on event(ts);
However, when I tried to create the hypertable with the following query:
SELECT create_hypertable('event', 'ts');
I got: ERROR: cannot create a unique index without the column "ts" (used in partitioning)
After doing some research, it seems that the timestamp itself needs to be the (or part of the) primary key.
However, I do not want the timestamp ts to be unique. It is very likely that these high frequency events will coincide in the same microsecond (the maximum resolution of the timestamp type). It is the whole reason why I am looking into TimescaleDB in the first place.
What is the best practice in this case?
I was thinking of maybe keeping the serial id as part of the primary key, and making it composite like this:
CREATE TABLE event_hyper (
    id serial,
    ts timestamp with time zone NOT NULL,
    details varchar(255) NOT NULL,
    PRIMARY KEY (id, ts)
);
SELECT create_hypertable('event_hyper', 'ts');
This sort of works, but I am unsure if it is the right approach, or if I am creating a complicated primary key which will slow down inserts or create other problems.
What is the right approach when you have possible collision in timestamps when using TimescaleDB hypertables?
There is no need to create a unique constraint on the time dimension (unique constraints are not required). This works:
CREATE TABLE event (
    id serial,
    ts timestamp with time zone NOT NULL,
    details varchar(255) NOT NULL
);
SELECT create_hypertable('event', 'ts');
Note that the primary key on id is removed.
If you want to create a unique constraint or a primary key, then TimescaleDB requires that any unique constraint or primary key include the time dimension. This is similar to PostgreSQL's limitation in declarative partitioning that the partition key must be included in any unique constraint:
Unique constraints (and hence primary keys) on partitioned tables must include all the partition key columns. This limitation exists because PostgreSQL can only enforce uniqueness in each partition individually.
TimescaleDB also enforces uniqueness in each chunk individually. Maintaining uniqueness across chunks can affect ingesting performance dramatically.
The most common approach to fixing the issue with the primary key is to create a composite key that includes the time dimension, as proposed in the question. If the index on the time dimension is not needed (no queries filtering only on time are expected), then that index can be avoided:
CREATE TABLE event_hyper (
    id serial,
    ts timestamp with time zone NOT NULL,
    details varchar(255) NOT NULL,
    PRIMARY KEY (id, ts)
);
SELECT create_hypertable('event_hyper', 'ts', create_default_indexes => FALSE);
It is also possible to use an integer column as the time dimension. It is important that such a column have time-dimension properties: its value increases over time, which is important for insert performance, and queries select on a time range, which is critical for query performance over a large database. The common case is storing a Unix epoch.
Since id in event_hyper is SERIAL, it will increase with time. However, I doubt the queries will select a range on it. For completeness, the SQL would be:
CREATE TABLE event_hyper (
    id serial PRIMARY KEY,
    ts timestamp with time zone NOT NULL,
    details varchar(255) NOT NULL
);
SELECT create_hypertable('event_hyper', 'id', chunk_time_interval => 1000000);
To build on @k_rus's answer, it seems like the generated primary key here is not actually what you're looking for. What meaning does that id have? Isn't it just identifying a unique (details, ts) combination? Or can there meaningfully be two rows with the same timestamp and the same details but different ids, where the difference carries some semantic meaning? That seems somewhat nonsensical to me, in which case I would put a primary key on (details, ts), which should provide you the uniqueness condition you need. I do not know if your ORM will like this; ORMs tend to be overly dependent on generated primary keys because, among other things, not all databases support composite primary keys. But in general, my advice for cases like this is to use a composite primary key with logical meaning.
Now if you actually care about multiple messages with the same details at the same timestamp, I might suggest a table structure something like
CREATE TABLE event_hyper (
    ts timestamp with time zone NOT NULL,
    details varchar(255) NOT NULL,
    count int,
    PRIMARY KEY (details, ts)
);
with which you can do an INSERT ON CONFLICT DO UPDATE in order to increment it.
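A minimal sketch of that upsert (the literal values are placeholders):
INSERT INTO event_hyper (ts, details, count)
VALUES ('2021-05-18 10:16:00+00', 'some event', 1)
ON CONFLICT (details, ts)
DO UPDATE SET count = event_hyper.count + 1;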
I wish that ORMs were better about doing this sort of thing, but you can usually trick ORMs into reading from other tables (or a view over them, because then they think they can't update records there, which is why they need the generated PK). Then it just means there's a little bit of custom ingest code to write that inserts into the hypertable. It's often better to do this anyway because, in general, I've found that ORMs don't always follow best practices for high-volume inserts, and often don't use bulk loading techniques.
So a table like that, with a view that just selects * from the table, should allow you to use the ORM for reads. Write a very small amount of custom code to do ingest into the time-series table and voila - it works. The rest of your relational model, which is the part the ORM excels at, can live in the ORM, with a minor integration here via a bit of custom SQL and a few custom methods.
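The view itself can be trivial, e.g. (an assumed sketch; the view name is made up):
-- ORM reads go through the view; custom ingest code writes to event_hyper.
CREATE VIEW event_view AS
SELECT * FROM event_hyper;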
The limitation is:
You need to make all partition columns (primary and secondary, if any) part of a unique key of the table.
Refer: https://github.com/timescale/timescaledb/issues/447#issuecomment-369371441
There are 2 choices in my opinion:
partition by a single column, which is a unique key (e.g. the primary key), or
partition with a 2nd space partition key, and make the 2 columns a combined unique key.
I got the same problem.
The solution was to avoid this field:
id: 'id'
I think I'm replying a little bit too late, but still.
You can try something like this:
CREATE TABLE event_hyper (
    id serial,
    ts timestamp with time zone NOT NULL,
    details varchar(255) NOT NULL
);
SELECT create_hypertable('event_hyper', 'ts', partitioning_column => 'id', number_partitions => X);
where X is the desired number of hash partitions on column 'id'.
https://docs.timescale.com/api/latest/hypertable/create_hypertable/#optional-arguments
As you can also notice, there is no PRIMARY KEY constraint on table 'event_hyper'.
The output of the create_hypertable() operation should be:
create_hypertable
---------------------------
(1,public,event_hyper,t)

Postgresql 11 Partitioning detail tables based on column in master table in foreign-key relationship

The improvements to declarative range-based partitioning in version 11 seem like they can really work for my use case, but I'm not completely sure how foreign keys work with partitions.
I have tables Files -< Segments -< Entries in which there are hundreds of Segments per File and hundreds of Entries per segment, so Entries is on the order of 10,000 times the size of Files. Files have a CreationDate field and customers will define a retention period so they'll drop old Entries. All of this clearly points to date-based partitions so it's quicker to query the most recent entries first, and easy to drop old ones.
The first issue I encounter is that when I try to create the Files table, it sounds like I have to include createdDate as part of the primary key in order to use it in the RANGE partition:
CREATE TABLE Files2
(
    FileId BIGSERIAL PRIMARY KEY,
    createdDate timestamp with time zone,
    filepath character varying COLLATE pg_catalog."default" NOT NULL
) PARTITION BY RANGE (createdDate)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;
ERROR: insufficient columns in PRIMARY KEY constraint definition
DETAIL: PRIMARY KEY constraint on table "files2" lacks column "createddate" which is part of the partition key.
If I drop the "PRIMARY KEY" from the definition of FileId, I don't get the error, but does that affect the efficiency of child lookups?
I also don't know how to declare the partitions for the child table. PARTITION BY RANGE (Files.createdDate) doesn't work.
Since it's only in version 11 that this use case would even be possible, I haven't found much information about it and would appreciate any pointers! Thanks!
I believe it is a new feature of version 11. There is a document from which you can get the following information:
"A primary key constraint must include a partition key column. Attempting to create a primary key constraint that does not contain partitioned columns results in an error."
https://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL11NewFeaturesGAen20181022-1.pdf
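Following that rule, one way to rewrite the table from the question is to make the primary key composite (a sketch, assuming the composite key is acceptable for child lookups):
CREATE TABLE Files2
(
    FileId BIGSERIAL,
    createdDate timestamp with time zone NOT NULL,
    filepath character varying COLLATE pg_catalog."default" NOT NULL,
    PRIMARY KEY (FileId, createdDate)
) PARTITION BY RANGE (createdDate);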

Index to query sorted values in keyed time range

Suppose I have key/value/timerange tuples, e.g.:
CREATE TABLE historical_values (
    key TEXT,
    value NUMERIC,
    from_time TIMESTAMPTZ,
    to_time TIMESTAMPTZ
);
and would like to be able to efficiently query values (sorted descending) for a specific key and time, e.g.:
SELECT value
FROM historical_values
WHERE key = [KEY]
  AND from_time <= [TIME]
  AND to_time >= [TIME]
ORDER BY value DESC
What kind of index/types should I use to get the best lookup performance? I suspect my solution will involve a tstzrange and a GiST index, but I'm not sure how to make that play well with the key matching and value ordering requirements.
Edit: Here's some more information about usage.
Ideally uses features available in Postgres v9.6.
Relation will contain approx. 1k keys and 5m values per key. Values are large integers (up to 32 bytes), mostly unique. Time ranges span between a few hours and a couple of years. Time horizon is 5 years. No NULL values allowed, but some time ranges are open-ended (could either use NULL or a time far into the future for to_time).
The primary key is the key and time range (as there is only one historical value for a time range, per key).
Common operations are a) updating to_time to "close" a historical value, and b) inserting a new value with from_time = NOW.
All values may be queried. Partitioning is an option.
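For reference, the two common operations might look like this (a sketch that uses 'infinity' as the open-ended to_time, one of the options mentioned above):
-- a) "close" the current historical value for a key:
UPDATE historical_values
SET to_time = now()
WHERE key = 'some-key'
  AND to_time = 'infinity';

-- b) insert the new current value:
INSERT INTO historical_values (key, value, from_time, to_time)
VALUES ('some-key', 42, now(), 'infinity');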
DB design
For a big table like that ("1k keys and 5m values per key") I would suggest optimizing storage like this:
CREATE TABLE hist_keys (
key_id serial PRIMARY KEY
, key text NOT NULL UNIQUE
);
CREATE TABLE hist_values (
hist_value_id bigserial PRIMARY KEY -- optional, see below!
, key_id int NOT NULL REFERENCES hist_keys
, value numeric
, from_time timestamptz NOT NULL
, to_time timestamptz NOT NULL
, CONSTRAINT range_valid CHECK (from_time <= to_time) -- or < ?
);
Also helps index performance.
And consider partitioning. List-partitioning on key_id. Maybe even add sub-partitioning (range partitioning this time) on from_time. Read the manual here.
With one partition per key_id, (and constraint exclusion enabled!) Postgres would only look at the small partition (and index) for the given key, instead of the whole big table. Major win.
But I would strongly suggest upgrading to at least Postgres 10 first, which added "declarative partitioning". It makes managing partitions a lot easier.
Better yet, skip forward to Postgres 11 (currently beta), which adds major improvements for partitioning (incl. performance improvements). Most notably, for your goal of getting the best lookup performance, quoting the chapter on partitioning in the release notes:
Allow faster partition elimination during query processing (Amit Langote, David Rowley, Dilip Kumar)
This speeds access to partitioned tables with many partitions.
Allow partition elimination during query execution (David Rowley, Beena Emerson)
Previously partition elimination could only happen at planning time,
meaning many joins and prepared queries could not use partition elimination.
Index
From the perspective of the value column, the small subset of selected rows is arbitrary for every new query. I don't expect you'll find a useful way to support ORDER BY value DESC with an index. I'd concentrate on the other columns. Maybe add value as the last column to each index if you can get index-only scans out of it (possible for btree and GiST).
Without partitioning:
CREATE UNIQUE INDEX hist_btree_idx ON hist_values (key_id, from_time, to_time DESC);
UNIQUE is optional, but see below.
Note the importance of opposing sort orders for from_time and to_time. See (closely related!):
Optimizing queries on a range of timestamps (two columns)
This is almost the same index as the one implementing your PK on (key_id, from_time, to_time). Unfortunately, we cannot use it as PK index. Quoting the manual:
Also, it must be a b-tree index with default sort ordering.
So I added a bigserial as surrogate primary key in my suggested table design above, plus NOT NULL constraints and the UNIQUE index to enforce your uniqueness rule.
In Postgres 10 or later consider an IDENTITY column instead:
Auto increment table column
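Applied to the design above, that swap might look like this (a sketch; everything else unchanged):
CREATE TABLE hist_values (
  hist_value_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, key_id int NOT NULL REFERENCES hist_keys
, value numeric
, from_time timestamptz NOT NULL
, to_time timestamptz NOT NULL
, CONSTRAINT range_valid CHECK (from_time <= to_time)
);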
You might even do without the PK constraint in this exceptional case, to avoid duplicating the index and keep the table at minimum size. Depends on the complete situation. You may need it for FK constraints or similar. See:
How does PostgreSQL enforce the UNIQUE constraint / what type of index does it use?
A GiST index, like you already suspected, may be even faster. I suggest keeping your original timestamptz columns in the table (16 bytes instead of 32 bytes for a tstzrange) and adding key_id after installing the additional module btree_gist:
CREATE INDEX hist_gist_idx ON hist_values
USING GiST (key_id, tstzrange(from_time, to_time, '[]'));
The expression tstzrange(from_time, to_time, '[]') constructs a range including upper and lower bound. Read the manual here.
Your query needs to match the index:
SELECT value
FROM hist_values
WHERE key_id = [KEY_ID]
  AND tstzrange(from_time, to_time, '[]') @> [TIME]::timestamptz
ORDER BY value DESC;
It's equivalent to your original.
@> being the "range contains element" operator.
With list-partitioning on key_id
With a separate table for each key_id, we can omit key_id from the index, improving size and performance - especially for the GiST index - for which we then also don't need the additional module btree_gist. Results in ~ 1000 partitions and the corresponding indexes:
CREATE INDEX hist999_gist_idx ON hist_values USING GiST (tstzrange(from_time, to_time, '[]'));
Related:
Store the day of the week and time?