postgresql 10 date range contraint - postgresql

I want to make a date range constraint in postgresql 10. In postgresql 9.6 this worked:
CREATE TABLE project_lines (
id SERIAL PRIMARY KEY,
project_id INTEGER NOT NULL REFERENCES projects(id),
description VARCHAR(200) NOT NULL,
start_time TIMESTAMP NOT NULL,
end_time TIMESTAMP CHECK(end_time > start_time),
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
CONSTRAINT overlapping_times EXCLUDE USING GIST(
project_id WITH =,
tstzrange(start_time, COALESCE(end_time, 'infinity')) WITH &&
)
);
But in postgresql 10 I am getting this error:
functions in index expression must be marked IMMUTABLE
How can I make this constraint working?

The start_time and end_time columns have type TIMESTAMP, while the tstzrange expects TIMESTAMPTZ (with time zone). Apparently this conversion happens automatically, but it's not considered "IMMUTABLE".
The documentation at
https://www.postgresql.org/docs/10/static/xfunc-volatility.html says
A common
error is to label a function IMMUTABLE when its results depend on a configuration parameter. For example, a function that manipulates timestamps might well have results that depend on the TimeZone setting. For safety, such functions should be labeled STABLE instead.
You should probably use tsrange instead, or explicitly convert to timestamp with time zone (in a way that doesn't depend on server settings):
tstzrange(start_time at time zone 'utc', COALESCE(end_time at time zone 'utc', 'infinity')) WITH &&
I don't know what has changed between versions, though (and I get the same error message with 9.6.6).

Related

How to set the resolution of the TSTZRANGE Postgres type?

Postgres supports range types(permalink) such as int4range or tstzrange. The documentation for tstzrange is "Range of timestamp with time zone".
The default precision for timestamps is to use 6 digits for second fractions (microsecond resolution), however my application uses 3 digits for its timestamps (millisecond) through timestamp (3) with time zone. For consistency I would like to also use millisecond-resolution for my time periods.
What is the default tstzrange resolution? How can I configure it?
I tried to define my table using tstzrange(3) but this form is not supported. Applied to the example from the documentation, it would look like:
CREATE TABLE reservation (
during tstzrange(3),
EXCLUDE USING GIST (during WITH &&)
);
As of Postgres 13, the time resolution for tstzrange uses 6 digit for second fractions (microsecond resolution). This resolution cannot be changed.
To create a range between timestamps with a custom resolution, you have to define a new range type. Here is the query to create a timestamp range with a millisecond resolution:
CREATE TYPE tstzrange3 AS RANGE (
subtype = timestamp (3) with time zone
);
This defines both the tstzrange3 type as well as a constructor function with the same name.
You can use this type like any other range:
CREATE TABLE reservation (
during tstzrange3,
EXCLUDE USING GIST (during WITH &&)
);
INSERT INTO reservation(during) VALUES(tstzrange3(NOW(), NULL));

Exclusion constraint that allows overlapping at the boundaries

I tried to have a PostgreSQL constraint so that there will be no overlap between two date intervals. My requirement is that the date c_from for one entry can be the same as c_until for another date.
Eg: "01/12/2019 12/12/2019" and "12/12/2019 31/21/2019" are still date ranges that do not conflict. I have "[]" in my query but it seems not to work.
user_no INTEGER NOT NULL REFERENCES usr,
c_from DATE DEFAULT NOW(),
c_until DATE DEFAULT 'INFINITY',
CONSTRAINT unique_user_per_daterange EXCLUDE USING gist (user_no WITH =, daterange(c_from, c_until, '[]') WITH && )
When I have the date range above, I get this error:
(psycopg2.IntegrityError) conflicting key value violates exclusion constraint "unique_user_per_daterange"
Could you please help?
Use ranges that do not include one of the ends:
daterange(c_from, c_until, '[)')
Then they won't conflict, even if one interval ends at the same point where another begins.

Properly handle TIME WITH TIME ZONE in PostgreSQL

We have a table that is filled with data from a legacy report of another system. The columns of that table reflect the same structure of the report.
Here are a abbreviated structure of the table:
CREATE TABLE IF NOT EXISTS LEGACY_TABLE (
REPORT_DATE DATE NOT NULL,
EVENT_ID BIGINT PRIMARY KEY NOT NULL,
START_HOUR TIMESTAMP WITHOUT TIME ZONE,
END_HOUR TIME WITHOUT TIME ZONE,
EXPECTED_HOUR TIME WITHOUT TIME ZONE
);
We are refactoring this table to deal with different time zones of different clients. The new structure would be something like:
CREATE TABLE IF NOT EXISTS LEGACY_TABLE (
REPORT_DATE DATE NOT NULL,
EVENT_ID BIGINT PRIMARY KEY NOT NULL,
START_HOUR TIMESTAMP WITH TIME ZONE,
END_HOUR TIME WITH TIME ZONE,
EXPECTED_HOUR TIME WITH TIME ZONE
);
These hour fields represents a specific point in time during the day represented by the REPORT_DATE column. What I mean by that is that every TIME column represents a moment during the day specified in REPORT_DATE.
Some other points to consider:
We don't know why the START_HOUR is in TIMESTAMP format in the report we receive from the legacy system. But we import the data the way it comes to us.
The fields in the report are formatted according to the timezone of the client, so to refactor this table we need to combine the timezone of the client (we have this info) to properly insert the timestamps/times in UTC.
But now to the problem. The value of these columns are used to compute another values multiple times in our system, something like the following:
START_HOUR - END_HOUR (the result of this operation is currently being casted to TIME WITHOUT TIME ZONE)
START_HOUR < END_HOUR
START_HOUR + EXPECTED_HOUR
EXPECTED_HOUR - END_HOUR
EXPECTED_HOUR < '05:00'
After some research I found that is not recommended to use the type TIME WITH TIME ZONE (Postgres time with time zone equality) and now I'm a bit confused about what is the best way to refactor this table to deal with different time zones and handle the different column operations that we need to.
Besides that, I already know that is safe to subtract two columns of type TIMESTAMP WITH TIME ZONE. This subtraction operation is taking into account DST changes (Subtracting two columns of type timestamp with time zone) but how about the others? And the one subtracting a TIME from a TIMESTAMP?.
And about the table refactoring, should we use TIME WITH TIME ZONE anyways? Should we continue using TIME WITHOUT TIME ZONE? Or is better to forget the type TIME altogether and combine the DATE with the TIME and change the columns to TIMESTAMP WITH TIME ZONE?
I think these questions are related because the new column types we choose to use, will define how we operate with the columns.
You asserted that:
every TIME column represents a moment during the day specified in REPORT_DATE.
So you never cross the a dateline within the same row. I suggest to save 1x date 3x time and the time zone (as text or FK column):
CREATE TABLE legacy_table (
event_id bigint PRIMARY KEY NOT NULL
, report_date date NOT NULL
, start_hour time
, end_hour time
, expected_hour time
, tz text -- time zone
);
Like you already found, timetz (time with time zone) should generally be avoided. It cannot deal with DST rules properly (daylight saving time).
So basically what you already had. Just drop the date component from start_hour, that's dead freight. Cast timestamp to time to cut off the date. Like: (timestamp '2018-03-25 1:00:00')::time
tz can be any string accepted by the AT TIME ZONE construct, but to deal with different time zones reliably, it's best to use time zone names exclusively. Any name you find in the system catalog pg_timezone_names.
To optimize storage, you could collect allowed time zone names in a small lookup table and replace tz text with tz_id int REFERENCES my_tz_table.
Two example rows with and without DST:
INSERT INTO legacy_table VALUES
(1, '2018-03-25', '1:00', '3:00', '2:00', 'Europe/Vienna') -- sadly, with DST
, (2, '2018-03-25', '1:00', '3:00', '2:00', 'Europe/Moscow'); -- Russians got rid of DST
For representation purposes or calculations you can do things like:
SELECT (report_date + start_hour) AT TIME ZONE tz AT TIME ZONE 'UTC' AS start_utc
, (report_date + end_hour) AT TIME ZONE tz AT TIME ZONE 'UTC' AS end_utc
, (report_date + expected_hour) AT TIME ZONE tz AT TIME ZONE 'UTC' AS expected_utc
-- START_HOUR - END_HOUR
, (report_date + start_hour) AT TIME ZONE tz
- (report_date + end_hour) AT TIME ZONE tz AS start_minus_end
FROM legacy_table;
You might create one or more views to readily display strings as needed. The table is for storing the information you need.
Note the parentheses! Else the operator + would bind before AT TIME ZONE due to operator precedence.
And behold the results:
db<>fiddle here
Since the time is manipulated in Vienna (like any place where silly DST rules apply), you get "surprising" results.
Related:
Accounting for DST in Postgres, when selecting scheduled items
Ignoring time zones altogether in Rails and PostgreSQL

INTERVAL vs TIMESTAMP (or similar)

I am wondering: Do I have any advantages when using INTERVAL instead of using to TIMESTAMP attributes from and to?
I am asking because I am having a table time_period that is supposed to be function as an abstract way of storing time interval information on a day and/or hour level:
CREATE TABLE time_period (
-- PRIMARY KEY
id BIGSERIAL PRIMARY KEY,
-- ATTRIBUTES
valid_for_days INT NOT NULL,
day_from TIMESTAMP WITH TIME ZONE NOT NULL,
day_to TIMESTAMP WITH TIME ZONE NOT NULL,
time_from TIMESTAMP WITH TIME ZONE NOT NULL,
time_to TIMESTAMP WITH TIME ZONE NOT NULL
);
Would I be better off using INTERVAL here?
The difference between two timestamps is an interval:
select now() - '2015-01-01';
?column?
--------------------------
370 days 18:45:20.312403
You only need two timestamps to have an interval not four.
If you store that as an interval you will lose the timestamp reference.

date_trunc on timestamp column returns nothing

I have a strange problem when retrieving records from db after comparing a truncated field with date_trunc().
This query doesn't return any data:
select id from my_db_log
where date_trunc('day',creation_date) >= to_date('2014-03-05'::text,'yyyy-mm-dd');
But if I add the column creation_date with id then it returns data(i.e. select id, creation_date...).
I have another column last_update_date having same type and when I use that one, still does the same behavior.
select id from my_db_log
where date_trunc('day',last_update_date) >= to_date('2014-03-05'::text,'yyyy-mm-dd');
Similar to previous one. it also returns record if I do id, last_update_date in my select.
Now to dig further, I have added both creation_date and last_updated_date in my where clause and this time it demands to have both of them in my select clause to have records(i.e. select id, creation_date, last_update_date).
Does anyone encountered the same problem ever? This similar thing works with my other tables which are having this type of columns!
If it helps, here is my table schema:
id serial NOT NULL,
creation_date timestamp without time zone NOT NULL DEFAULT now(),
last_update_date timestamp without time zone NOT NULL DEFAULT now(),
CONSTRAINT db_log_pkey PRIMARY KEY (id),
I have asked a different question earlier that didn't get any answer. This problem may be related to that one. If you are interested on that one, here is the link.
EDITS:: EXPLAIN (FORMAT XML) with select * returns:
<explain xmlns="http://www.postgresql.org/2009/explain">
<Query>
<Plan>
<Node-Type>Result</Node-Type>
<Startup-Cost>0.00</Startup-Cost>
<Total-Cost>0.00</Total-Cost>
<Plan-Rows>1000</Plan-Rows>
<Plan-Width>658</Plan-Width>
<Plans>
<Plan>
<Node-Type>Result</Node-Type>
<Parent-Relationship>Outer</Parent-Relationship>
<Alias>my_db_log</Alias>
<Startup-Cost>0.00</Startup-Cost>
<Total-Cost>0.00</Total-Cost>
<Plan-Rows>1000</Plan-Rows>
<Plan-Width>658</Plan-Width>
<Node/s>datanode1</Node/s>
<Coordinator-quals>(date_trunc('day'::text, creation_date) >= to_date('2014-03-05'::text, 'yyyy-mm-dd'::text))</Coordinator-quals>
</Plan>
</Plans>
</Plan>
</Query>
</explain>
"Impossible" phenomenon
The number of rows returned is completely independent of items in the SELECT clause. (But see #Craig's comment about SRFs.) Something must be broken in your db.
Maybe a broken covering index? When you throw in the additional column, you force Postgres to visit the table itself. Try to re-index:
REINDEX TABLE my_db_log;
The manual on REINDEX. Or:
VACUUM FULL ANALYZE my_db_log;
Better query
Either way, use instead:
select id from my_db_log
where creation_date >= '2014-03-05'::date
Or:
select id from my_db_log
where creation_date >= '2014-03-05 00:00'::timestamp
'2014-03-05' is in ISO 8601 format. You can just cast this string literal to date. No need for to_date(), works with any locale. The date is coerced to timestamp [without time zone] automatically when compared to creation_date (being timestamp [without time zone]). More details about timestamps in Postgres here:
Ignoring timezones altogether in Rails and PostgreSQL
Also, you gain nothing by throwing in date_trunc() here. On the contrary, your query will be slower and any plain index on the column cannot be used (potentially making this much slower)