Applying a unique constraint on the date of a TIMESTAMP column in PostgreSQL

I have a PostgreSQL table:
CREATE TABLE IF NOT EXISTS table_name
(
expiry_date DATE NOT NULL,
created_at TIMESTAMP with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP(0),
CONSTRAINT user_review_uniq_key UNIQUE (expiry_date, created_at::date) -- my failed attempt at using ::
)
I want to put a unique constraint on this table such that expiry_date and the date of created_at are unique together. The problem is that created_at is a timestamp, not a date.
So is there any way to define a unique constraint such that expiry_date and created_at::date are unique?
My attempt was to use
CONSTRAINT user_review_uniq_key UNIQUE (expiry_date, created_at::date)
which is not valid.

If you do not need a time zone for your created date, create a unique index as follows:
create unique index idx_user_review_uniq_key on table_name (expiry_date, cast(created_at as date));
If you really need the time zone, then you need a little trick (https://gist.github.com/cobusc/5875282):
create unique index idx_user_review_uniq_key on table_name (expiry_date, date(created_at at time zone 'UTC'));
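As a quick sanity check with hypothetical data: the second insert below should fail with a unique-violation error, because both rows share the same expiry_date and the same calendar date of created_at (which defaults to the current timestamp):
INSERT INTO table_name (expiry_date) VALUES ('2024-01-31');
INSERT INTO table_name (expiry_date) VALUES ('2024-01-31');
-- ERROR: duplicate key value violates unique constraint "idx_user_review_uniq_key"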

Related

ERROR: could not create exclusion constraint

I have a table:
CREATE TABLE attendances
(
id_attendance serial PRIMARY KEY,
id_user integer NOT NULL
REFERENCES users (user_id) ON UPDATE CASCADE ON DELETE CASCADE,
entry_date timestamp with time zone DEFAULT NULL,
departure_date timestamp with time zone DEFAULT NULL,
created_at timestamp with time zone DEFAULT current_timestamp
);
I want to add an exclusion constraint preventing attendances from overlapping (there can be multiple rows for the same day, but time ranges cannot overlap).
So I wrote this code to add the constraint:
ALTER TABLE attendances
ADD CONSTRAINT check_attendance_overlaps
EXCLUDE USING GIST (box(
point(
extract(epoch from entry_date at time zone 'UTC'),
id_user
),
point(
extract(epoch from departure_date at time zone 'UTC') - 0.5,
id_user + 0.5
)
)
WITH && );
But when I tried to run it on the database I got this error:
Error: could not create exclusion constraint "check_attendance_overlaps"
To exclude overlapping time ranges per user, work with a multicolumn constraint on id_user and a timestamptz range (tstzrange).
You need the additional module btree_gist once per database:
CREATE EXTENSION IF NOT EXISTS btree_gist;
Then:
ALTER TABLE attendances ADD CONSTRAINT check_attendance_overlaps
EXCLUDE USING gist (id_user WITH =
, tstzrange(entry_date, departure_date) WITH &&)
See:
Store the day of the week and time?
Postgres constraint for unique datetime range
Or maybe SP-GiST instead of GiST; it might be faster. See:
Perform this hours of operation query in PostgreSQL
Of course, there must not already be overlapping rows in the table, or adding the constraint will fail with a similar error message.
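A minimal sketch of the constraint at work, with hypothetical rows (assuming user 1 exists in users): the second insert overlaps the first for the same user and should be rejected.
INSERT INTO attendances (id_user, entry_date, departure_date)
VALUES (1, '2024-01-01 09:00+00', '2024-01-01 17:00+00');
INSERT INTO attendances (id_user, entry_date, departure_date)
VALUES (1, '2024-01-01 16:00+00', '2024-01-01 18:00+00');
-- ERROR: conflicting key value violates exclusion constraint "check_attendance_overlaps"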

Multicolumn index vs single column index for time series data in Postgres

This table started out as short-term storage for meter data before it was going to be validated and added to some long-term storage tables.
It turns out the client wants to keep this data for a long time since we saved it, and it is growing fast.
create table metering_meterreading
(
id bigserial not null, -- primary key
created_at timestamp with time zone not null,
updated_at timestamp with time zone not null,
timestamp timestamp with time zone not null, -- BTREE index
value numeric(15, 3) not null,
meter_device_id uuid not null, -- FK to meter_device, BTREE index
series_id uuid not null, -- FK to series, BTREE index
organization_id uuid not null -- FK to org, BTREE index
);
I am planning on dropping the primary key, since (org_id, meter_device_id, series_id, timestamp) makes a row unique. It was just added by my ORM (Django) and I didn't care when we started.
But since I pretty much always filter on organization, meter_device, and series to get a range of time series data, I am wondering if it would be more efficient to have a multicolumn index on (organization_id, meter_device_id, series_id, timestamp) instead of the separate indexes.
I read somewhere that the column you filter by range should be the rightmost in the index.
This is still not a super-efficient table for time series data, since it will grow large, but I am planning on fixing that by partitioning on range, or maybe even using Timescale. But before partitioning I would like lookups in it to be as efficient as possible.
I also saw an example somewhere that used a separate table to identify the metric:
create table metric
(
id bigserial primary key,
organization_id uuid not null, -- FK to org
meter_device_id uuid not null, -- FK to meter_device
series_id uuid not null, -- FK to series
UNIQUE (organization_id, meter_device_id, series_id)
);
create table metering_meterreading
(
metric_id bigint not null, -- FK to metric, BTREE index
timestamp timestamp with time zone not null, -- BTREE index
value numeric(15, 3) not null,
created_at timestamp with time zone not null,
updated_at timestamp with time zone not null
);
But I am not sure if that is actually better than just putting them all in one table. It might impact the ingestion rate, since there is another table involved now.
If (org_id, meter_device_id, series_id, timestamp) uniquely determines a table row, use a multicolumn primary key over all of them. You then automatically have a four-column index on these columns. Just make sure that timestamp is last in the list; that index will then support your query ideally.
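A sketch of what that could look like; the constraint name metering_meterreading_pkey is an assumption (check the actual name Django generated, e.g. with \d metering_meterreading):
ALTER TABLE metering_meterreading
DROP CONSTRAINT metering_meterreading_pkey, -- name assumed, verify first
ADD PRIMARY KEY (organization_id, meter_device_id, series_id, "timestamp");
-- "timestamp" is quoted only because the column shares its name with the type
You can then also drop the now-unused id column if nothing references it.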

Add timestamp column with default NOW() for new rows only

I have a table with thousands of rows. Since the table wasn't created with a created_at column initially, there is no way of getting the creation timestamps of the existing rows. It is crucial, though, to start getting timestamps for future rows.
Is there a way I can add a timestamp column with default value NOW() so that it won't populate values for previous rows, only for future ones?
If I do the ALTER query below, it populates all rows with the timestamp:
ALTER TABLE mytable ADD COLUMN created_at TIMESTAMP DEFAULT NOW()
You need to add the column with a default of null, then alter the column to have default now().
ALTER TABLE mytable ADD COLUMN created_at TIMESTAMP;
ALTER TABLE mytable ALTER COLUMN created_at SET DEFAULT now();
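A quick check of the behavior, using a hypothetical column some_column: rows that existed before the change keep NULL in created_at, while rows inserted afterwards get the default.
INSERT INTO mytable (some_column) VALUES ('new row'); -- some_column is hypothetical
SELECT some_column, created_at FROM mytable;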
Alternatively, you could add the default rule with the ALTER TABLE,
ALTER TABLE mytable ADD COLUMN created_at TIMESTAMP DEFAULT NOW();
then immediately set all currently existing rows to null:
UPDATE mytable SET created_at = NULL;
From this point on, the DEFAULT will take effect.
For example, I will create a table called users_parent as below and give a column named date a default value of NOW():
create table users_parent (
user_id varchar(50),
full_name varchar(240),
login_id_1 varchar(50),
date timestamp NOT NULL DEFAULT NOW()
);
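A quick check of the default, with hypothetical values:
insert into users_parent (user_id, full_name, login_id_1) values ('u1', 'Jane Doe', 'jane');
select date from users_parent; -- returns the insertion time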
A minor optimization:
select pg_typeof(now()); -- returns: timestamp with time zone
So now() includes the time zone, and timestamptz is the better fit:
begin;
ALTER TABLE mytable ADD COLUMN created_at TIMESTAMPTZ;
ALTER TABLE mytable ALTER COLUMN created_at SET DEFAULT now();
commit;
Try something like:
ALTER TABLE table_name ALTER COLUMN created_at SET DEFAULT now();
replacing table_name with the name of your table. (The original suggestion, ADD CONSTRAINT [DF_table_name_Created] DEFAULT (getdate()) FOR [created_at], is SQL Server syntax and does not work in PostgreSQL.)

How can I use a WHERE BETWEEN clause in an INSERT query?

I have the following table.
CREATE TABLE public.ad
(
id integer NOT NULL DEFAULT nextval('ad_id_seq'::regclass),
uuid uuid NOT NULL DEFAULT uuid_generate_v4(),
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL,
cmdb_id integer,
platform character varying(100),
bidfloor numeric(15,6),
views integer NOT NULL DEFAULT 1,
year integer,
month integer,
day integer,
CONSTRAINT ad_pkey PRIMARY KEY (id),
CONSTRAINT ad_cmdb_id_foreign FOREIGN KEY (cmdb_id)
REFERENCES public.cmdb (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT ad_id_unique UNIQUE (uuid)
)
WITH (
OIDS=FALSE
);
Without going into too much detail, this table logs all the requests and impressions of advertisements on electronic screens throughout the country. It is also used to generate reports and contains roughly 50 million records.
Currently, the reports are filtered on the created_at timestamp. You can imagine that with roughly 50 million records the query gets slow, even with an index on the created_at column. Reports are generated by selecting in the UI of the system the dates between which you want to request the data.
The year, month and day columns are new columns that I just added to make the reporting more efficient. Instead of indexing on the date, I want the system to index on year, month and day as separate values.
The newly added columns are still empty. I want to run a query that inserts a value where the created_at column is between two dates. For example:
INSERT INTO ad (year) VALUES (2016) WHERE created_at BETWEEN '2016-01-01 00:00:00' AND '2016-12-31 23:59:59';
This doesn't work, of course. I cannot seem to find anything on the internet where an INSERT statement makes use of a WHERE BETWEEN clause. I also tried subqueries and the WITH clause to generate a series of years between 2012 and 2020 using generate_series. None of it worked out.
You don't want to insert new rows; you should update your table.
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE column_name BETWEEN value1 AND value2;
Otherwise you'll end up with 100 million rows ;)
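Applied to the ad table from the question, a sketch could look like this. The per-year form mirrors your original attempt; the single-statement form (an assumption that you want all three columns filled for every row) derives the values from created_at directly:
UPDATE ad SET year = 2016
WHERE created_at BETWEEN '2016-01-01 00:00:00' AND '2016-12-31 23:59:59';
-- or, in one pass over the whole table:
UPDATE ad
SET year = EXTRACT(year FROM created_at),
month = EXTRACT(month FROM created_at),
day = EXTRACT(day FROM created_at);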

Postgres LIKE unique constraint possible?

I'm new to Postgres and am creating a table (metrics_reaches) using pgAdmin III.
In my table, I have an insertion_timestamp column of timestamp with time zone type.
I'd like to create a UNIQUE constraint that, amongst other fields, checks only the date portion of the insertion_timestamp and not the time.
Is there a way to do that? Here's what my script looks like at the moment (see the last CONSTRAINT).
-- Table: metrics_reaches
-- DROP TABLE metrics_reaches;
CREATE TABLE metrics_reaches
(
organizations_id integer NOT NULL,
applications_id integer NOT NULL,
countries_id integer NOT NULL,
platforms_id integer NOT NULL,
...
insertion_timestamp timestamp with time zone NOT NULL,
id serial NOT NULL,
CONSTRAINT metrics_reaches_pkey PRIMARY KEY (id),
CONSTRAINT metrics_reaches_applications_id_fkey FOREIGN KEY (applications_id)
REFERENCES applications (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT metrics_reaches_countries_id_fkey FOREIGN KEY (countries_id)
REFERENCES countries (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT metrics_reaches_organizations_id_fkey FOREIGN KEY (organizations_id)
REFERENCES organizations (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT metrics_reaches_platforms_id_fkey FOREIGN KEY (platforms_id)
REFERENCES platforms (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT metrics_reaches_organizations_id_key UNIQUE (organizations_id, applications_id, countries_id, platforms_id, insertion_timestamp)
)
WITH (
OIDS=FALSE
);
ALTER TABLE metrics_reaches
OWNER TO postgres;
Try a CAST(). PostgreSQL does not allow expressions in a table UNIQUE constraint, so drop the last CONSTRAINT from the table definition and create a unique index instead:
CREATE UNIQUE INDEX metrics_reaches_organizations_id_key ON metrics_reaches (
organizations_id,
applications_id,
countries_id,
platforms_id,
CAST(insertion_timestamp AS date)
);
This is really a comment to Frank's answer, but it's too long for the comment box.
If you are being paranoid, you need to watch the local timezone carefully when dealing with date casts:
bookings=> SET timezone='GMT';
SET
bookings=> SELECT now() at time zone 'GMT', (now() at time zone 'GMT')::date, now(), now()::date;
timezone | timezone | now | now
---------------------------+------------+------------------------------+------------
2013-05-30 19:36:04.23684 | 2013-05-30 | 2013-05-30 19:36:04.23684+00 | 2013-05-30
(1 row)
bookings=> set timezone='GMT-7';
SET
bookings=> SELECT now() at time zone 'GMT', (now() at time zone 'GMT')::date, now(), now()::date;
timezone | timezone | now | now
----------------------------+------------+-------------------------------+------------
2013-05-30 19:36:13.723558 | 2013-05-30 | 2013-05-31 02:36:13.723558+07 | 2013-05-31
(1 row)
Now, PG is smart enough to know this is a problem, and if you try to build the unique index with a plain date cast of a timestamptz then you should see something like:
ERROR: functions in index expression must be marked IMMUTABLE
If you try to cast after applying "at time zone" then it really is immutable and you can have your constraint.
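For instance, a sketch of the immutable-safe form of the index above (assuming UTC is the zone that defines "a day" for your business):
CREATE UNIQUE INDEX metrics_reaches_organizations_id_key ON metrics_reaches (
organizations_id,
applications_id,
countries_id,
platforms_id,
CAST((insertion_timestamp AT TIME ZONE 'UTC') AS date)
);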
Of course the other option is to wrap the cast in a function and mark the function as immutable. If you're going to lie to the system like that though, don't come complaining when your database behaves oddly a year from now.