Create exclusion constraint with non-commutative operator in Postgres - postgresql

I have a question about exclusion constraints.
I have the following table:
-- auto-generated definition
create table archives_seasonmodel
(
    id serial not null
        constraint archives_seasonmodel_pkey
            primary key,
    series_id integer not null
        constraint archives_seasonmodel_series_id_e05c6f84_fk_archives_
            references archives_tvseriesmodel
            deferrable initially deferred,
    last_watched_episode smallint
        constraint archives_seasonmodel_last_watched_episode_check
            check (last_watched_episode >= 0),
    season_number smallint not null
        constraint archives_seasonmodel_season_number_check
            check (season_number >= 0)
        constraint season_number_gte_1_check
            check (season_number >= 1),
    _order integer not null,
    number_of_episodes smallint not null
        constraint archives_seasonmodel_number_of_episodes_check
            check (number_of_episodes >= 0),
    episodes hstore,
    translation_years daterange not null,
    constraint archives_seasonmodel_series_id_season_number_4368dab7_uniq
        unique (series_id, season_number),
    constraint last_watched_episode_and_number_of_episodes_are_gte_one
        check (((last_watched_episode >= 1) OR (last_watched_episode IS NULL)) AND (number_of_episodes >= 1)),
    constraint mutual_watched_episode_and_number_of_episodes_check
        check (number_of_episodes >= last_watched_episode)
);

alter table archives_seasonmodel
    owner to postgres;

create index archives_seasonmodel_series_id_e05c6f84
    on archives_seasonmodel (series_id);

create index archives_seasonmodel_series_id_season_number_4368dab7_idx
    on archives_seasonmodel (series_id, season_number);

create index exclude_overlapping_seasons_translation_time_check
    on archives_seasonmodel (translation_years, series_id);
In general, this table contains 3 columns of interest:
series_id - positive integer, foreign key to the table archives_tvseriesmodel
season_number - positive integer, the number of the season within the series
translation_years - daterange, the translation date range of each season in the series
General idea:
A series contains multiple seasons.
Each season has a season_number (from 1 to infinity) that represents the season's position within the series.
Each season has a translation_years daterange which represents the start and end dates of the season.
There is an exclusion constraint 'exclude_overlapping_seasons_translation_time_check' that prevents the translation_years dateranges from overlapping each other.
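For reference, the constraint itself would look roughly like this (a sketch; the dump above only shows the index that backs it, and the = operator on series_id inside a GiST exclusion requires the btree_gist extension):
CREATE EXTENSION IF NOT EXISTS btree_gist;

ALTER TABLE archives_seasonmodel
    ADD CONSTRAINT exclude_overlapping_seasons_translation_time_check
    EXCLUDE USING gist (series_id WITH =, translation_years WITH &&);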
But there is another validation problem: I need to maintain translation_years in such a way that, for example, the daterange of season 4 is strictly lower than the daterange of season 5.
For example:
correct:
season_number = 4, translation_years = (2012-01-01, 2013-01-01)
season_number = 5, translation_years = (2013-03-01, 2014-01-01)
incorrect:
season_number = 4, translation_years = (2012-01-01, 2013-01-01)
season_number = 5, translation_years = (2010-01-01, 2011-01-01)
What I tried to do:
ALTER TABLE archives_seasonmodel
    ADD CONSTRAINT test
    EXCLUDE USING gist (series_id WITH =, translation_years WITH <<)
    WHERE (season_number = season_number - 1)
But it says that only commutative operators are allowed in constraints, and << is not a commutative one.
The question is: is it possible to create such a constraint somehow?
Thank you

Perhaps instead of the strictly-left operator (<<), try the range overlap operator (&&). See Range Functions. NOT TESTED, as I do not have a DB available at the moment.
alter table archives_seasonmodel
    add constraint test
    exclude using gist (series_id with =, translation_years with &&)
    where (season_number = season_number - 1);
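For illustration, a minimal self-contained sketch (hypothetical table named seasons, not the full schema above) of what the && version accepts and rejects:
CREATE EXTENSION IF NOT EXISTS btree_gist;  -- again needed for series_id WITH =

CREATE TABLE seasons (
    series_id         integer   NOT NULL,
    season_number     smallint  NOT NULL,
    translation_years daterange NOT NULL,
    EXCLUDE USING gist (series_id WITH =, translation_years WITH &&)
);

INSERT INTO seasons VALUES (1, 4, '[2012-01-01,2013-01-01)');  -- accepted
INSERT INTO seasons VALUES (1, 5, '[2013-03-01,2014-01-01)');  -- accepted: no overlap
INSERT INTO seasons VALUES (1, 6, '[2012-06-01,2012-09-01)');  -- rejected: overlaps season 4
Note that && only rejects overlaps; on its own it does not enforce that season 5 starts after season 4 ends.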

Related

PostgreSQL multicolumn index not fully used

I have a large (~110 million rows) table on PostgreSQL 12.3 whose relevant fields can be described by the following DDL:
CREATE TABLE tbl
(
    item1_id integer,
    item2_id integer,
    item3_id integer,
    item4_id integer,
    type_id integer
)
One of the queries we execute often is:
SELECT type_id, item1_id, item2_id, item3_id, item4_id
FROM tbl
WHERE type_id IS NOT NULL
  AND item1_id IN (1, 2, 3)
  AND (item2_id IN (4, 5, 6) OR item2_id IS NULL)
  AND (item3_id IN (7, 8, 9) OR item3_id IS NULL)
  AND (item4_id IN (10, 11, 12) OR item4_id IS NULL)
Although we have indexes for each of the individual columns, the query is still relatively slow (a couple of seconds). Hoping to optimize this, I created the following index:
CREATE INDEX tbl_item_ids
ON public.tbl USING btree
(item1_id ASC, item2_id ASC, item3_id ASC, item4_id ASC)
WHERE type_id IS NOT NULL;
Unfortunately the query performance barely improved. EXPLAIN tells me this is because, although an index scan is done with this newly created index, only item1_id is used as an Index Cond, whereas all the other filters are applied at the table level (i.e. as a plain Filter).
I'm not sure why the index is not used in its entirety (or at least for more than the item1_id column). Is there an obvious reason for this? Is there a way I can restructure the index or the query itself to help with performance?
A multi-column index can only be used for more than the first column if the condition on the first column uses an equality comparison (=). IN or = ANY does not qualify.
So you will be better off with individual indexes for each column, which can be combined with a bitmap OR.
You should try to avoid OR in the WHERE condition, perhaps with
WHERE coalesce(item2_id, -1) IN (-1, 4, 5, 6)
where -1 is a value that doesn't occur. Then you could use an index on the coalesce expression.
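A sketch of that expression index and the matching predicate (index name assumed, and -1 assumed never to occur in item2_id):
-- Expression index over the coalesce() expression
CREATE INDEX tbl_item2_coalesce ON tbl ((coalesce(item2_id, -1)));

-- The rewritten predicate matches the index expression exactly
SELECT type_id, item1_id, item2_id, item3_id, item4_id
FROM tbl
WHERE coalesce(item2_id, -1) IN (-1, 4, 5, 6);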

Update and insert performance with partial indexes

I have different queries for fetching data from a large table (about 100-200M rows). I've created partial indexes with different predicates to fit each query, since I know the queries in advance.
For example, the table is similar to this:
CREATE TABLE public.contacts (
    id int8 NOT NULL DEFAULT ssng_generate_id(8::bigint),
    created timestamp NOT NULL DEFAULT timezone('UTC'::text, now()),
    contact_pool_id int8 NOT NULL,
    project_id int8 NOT NULL,
    state_id int4 NOT NULL DEFAULT 10,
    order_x int4 NOT NULL,
    next_attempt_date timestamp NULL,
    CONSTRAINT contacts_pkey PRIMARY KEY (id)
);
And there are two types of query:
SELECT * FROM contacts WHERE contact_pool_id = X AND state_id = 10 ORDER BY order_x LIMIT 1;
and
SELECT * FROM contacts WHERE contact_pool_id = X AND state_id = 20 AND next_attempt_date <= now() ORDER BY next_attempt_date LIMIT 1;
For those queries I've created partial indexes:
For state_id = 10 (new contacts)
CREATE INDEX ix_contacts_cpid_orderx_id_for_new ON contacts USING btree (contact_pool_id, order_x, id) WHERE state_id = 10;
For state_id = 20 (available contacts)
CREATE INDEX ix_contacts_cpid_nextattepmdate_id_for_available ON contacts USING btree (contact_pool_id, next_attempt_date, id) WHERE state_id = 20;
For me, those partial indexes are faster than a single index.
And what about update and insert performance? If I change a row with state_id = 20, will it affect only index 2 (for available contacts), or will both of them be affected?
Partial indexes which are not relevant to the tuple will not get updated.
If PostgreSQL can do a HOT update (if the column being changed is not part of an index, and there is room on the same page for the new tuple), then even the relevant index doesn't need to get updated.
Yes, with a partial index you only pay the overhead of modifying the index for rows that meet the WHERE condition, so you will only ever need to modify at most one of the indexes at a time (unless you change state_id from 10 to 20 or vice versa).
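To illustrate which index gets modified (a sketch against the table above, with hypothetical ids):
-- Touches only ix_contacts_cpid_nextattepmdate_id_for_available,
-- since the row satisfies that index's predicate (state_id = 20):
UPDATE contacts SET next_attempt_date = now() + interval '1 hour'
WHERE id = 42 AND state_id = 20;

-- Moves the row from one predicate to the other,
-- so both partial indexes are affected:
UPDATE contacts SET state_id = 20 WHERE id = 43 AND state_id = 10;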

How to prevent overlapping of int ranges

I have a table as follows:
CREATE TABLE appointments (
    id SERIAL PRIMARY KEY,
    date TIMESTAMP NOT NULL,
    start_mn INT NOT NULL,
    end_mn INT NOT NULL,
    EXCLUDE USING gist ((array[start_mn, end_mn]) WITH &&)
)
I want to prevent start_mn and end_mn from overlapping between rows, so I've added a GiST exclusion:
EXCLUDE using gist((array[start_mn, end_mn]) WITH &&)
But inserting the following two rows does not trigger the exclusion:
INSERT INTO appointments(date, start_mn, end_mn) VALUES('2020-08-08', 100, 200);
INSERT INTO appointments(date, start_mn, end_mn) VALUES('2020-08-08', 90, 105);
How can I achieve this exclusion?
If you want to prevent overlapping ranges, you will have to use a range type, not an array.
I also assume that start and end should never overlap on the same day, so you need to include the date column in the exclusion constraint:
CREATE TABLE appointments
(
    id SERIAL PRIMARY KEY,
    date TIMESTAMP NOT NULL,
    start_mn INT NOT NULL,
    end_mn INT NOT NULL,
    EXCLUDE USING gist (int4range(start_mn, end_mn, '[]') WITH &&, "date" WITH =)
)
If start_mn and end_mn are supposed to be "time of the day", then those columns should be defined as time, not as integers.
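As a side note, the = operator on a plain column ("date" here) inside a GiST exclusion constraint is supplied by the btree_gist extension (CREATE EXTENSION IF NOT EXISTS btree_gist;). And if start_mn and end_mn become time columns, one hedged sketch is to build a timestamp range on the fly (column names assumed):
CREATE TABLE appointments (
    id      SERIAL PRIMARY KEY,
    day     date NOT NULL,
    start_t time NOT NULL,
    end_t   time NOT NULL,
    -- date + time yields a timestamp; '[)' keeps back-to-back appointments legal
    EXCLUDE USING gist ((tsrange(day + start_t, day + end_t, '[)')) WITH &&)
);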

How do I add a constraint with a where clause in PostgreSQL?

I have a table with reservations. A reservation consists of a date range and a time range. Reservations also belong to a couple of other models. I would like to add a constraint that makes it impossible for reservations to happen at overlapping times.
I have this:
CREATE TABLE reservations (
    id integer NOT NULL,
    dates daterange,
    times timerange,
    desk_id integer NOT NULL,
    space_id integer
);
ALTER TABLE reservations ADD EXCLUDE USING gist (dates WITH &&, times WITH &&);
It works well. But I want this constraint to be scoped to desk_id and space_id.
It should be possible to save a record for overlapping times/dates when this record is about different desk_id or space_id.
How can I do this?
You can use the exact same mechanism you were using, but also add desk_id and space_id to your exclusion, this time with the = operator (equality) alongside the && operator (meaning overlaps):
ALTER TABLE reservations
    ADD EXCLUDE USING gist
    (desk_id WITH =, space_id WITH =, dates WITH &&, times WITH &&);
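Note that timerange is not a built-in range type, and the = operators on plain columns inside a GiST exclusion come from the btree_gist extension, so something like the following is assumed to exist already:
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TYPE timerange AS RANGE (subtype = time);  -- assumed custom range type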
These inserts will work, because they involve two different desk_id values:
INSERT INTO reservations
    (id, dates, times, desk_id, space_id)
VALUES
    (1, '[20170101,20170101]'::daterange, '[10:00,11:00]'::timerange, 10, 10),
    (2, '[20170101,20170101]'::daterange, '[10:30,11:00]'::timerange, 20, 10);
This insert will fail, because it has a time-range overlap with the same desk_id and space_id:
INSERT INTO reservations
    (id, dates, times, desk_id, space_id)
VALUES
    (3, '[20170101,20170101]'::daterange, '[10:00,11:00]'::timerange, 10, 10);

An empty row with null-like values in a not-null field

I'm using PostgreSQL 9.0 beta 4.
After inserting a lot of data into a partitioned table, I found a weird thing. When I query the table, I can see an empty row with null-like values in 'not-null' fields.
In the query result, the 689th row is empty. The first 3 fields (stid, d, ticker) compose the primary key, so they should not be null. The query I used is this:
select * from st_daily2 where stid=267408 order by d
I can even do a GROUP BY on this data:
select stid, date_trunc('month', d) ym, count(*)
from st_daily2
where stid = 267408
group by stid, date_trunc('month', d)
The GROUP BY result still has the empty row; this time the 1st row is empty.
But if I query where stid or d is null, it returns nothing.
Is this a bug in PostgreSQL 9.0 beta 4, or some data corruption?
EDIT: I added my table definition.
CREATE TABLE st_daily
(
    stid integer NOT NULL,
    d date NOT NULL,
    ticker character varying(15) NOT NULL,
    mp integer NOT NULL,
    settlep double precision NOT NULL,
    prft integer NOT NULL,
    atr20 double precision NOT NULL,
    upd timestamp with time zone,
    ntrds double precision
)
WITH (
    OIDS=FALSE
);

CREATE TABLE st_daily2
(
    CONSTRAINT st_daily2_pk PRIMARY KEY (stid, d, ticker),
    CONSTRAINT st_daily2_strgs_fk FOREIGN KEY (stid)
        REFERENCES strgs (stid) MATCH SIMPLE
        ON UPDATE CASCADE ON DELETE CASCADE,
    CONSTRAINT st_daily2_ck CHECK (stid >= 200000 AND stid < 300000)
)
INHERITS (st_daily)
WITH (
    OIDS=FALSE
);
The data in this table is simulation results. Multiple multithreaded simulation engines written in C# insert data into the database using Npgsql.
psql also shows the empty row.
You'd better leave a posting at http://www.postgresql.org/support/submitbug
Some questions:
Could you show us the table definitions and constraints for the partitions?
How did you load your data?
Do you get the same result when using another tool, like psql?
The answer to your problem may very well lie in your first sentence:
I'm using postgresql 9.0 beta 4.
Why would you do that? Upgrade to a stable release, preferably the latest point-release of the current version. That is 9.1.4 as of today.
I got to the same point: "what in the heck is that blank value?"
No, it's not a NULL, it's a -infinity.
To filter for such a row, use:
WHERE
    CASE WHEN mytestcolumn = '-infinity'::timestamp
           OR mytestcolumn = 'infinity'::timestamp
         THEN NULL
         ELSE mytestcolumn
    END IS NULL
instead of:
WHERE mytestcolumn IS NULL
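A shorter alternative for finding rows with infinite values is PostgreSQL's isfinite() function (a sketch against the d column of the asker's table):
-- isfinite() returns false for both 'infinity' and '-infinity'
SELECT * FROM st_daily2 WHERE NOT isfinite(d);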