We have a table that is filled with data from a legacy report of another system. The columns of that table reflect the same structure of the report.
Here are a abbreviated structure of the table:
CREATE TABLE IF NOT EXISTS LEGACY_TABLE (
REPORT_DATE DATE NOT NULL,
EVENT_ID BIGINT PRIMARY KEY NOT NULL,
START_HOUR TIMESTAMP WITHOUT TIME ZONE,
END_HOUR TIME WITHOUT TIME ZONE,
EXPECTED_HOUR TIME WITHOUT TIME ZONE
);
We are refactoring this table to deal with different time zones of different clients. The new structure would be something like:
CREATE TABLE IF NOT EXISTS LEGACY_TABLE (
REPORT_DATE DATE NOT NULL,
EVENT_ID BIGINT PRIMARY KEY NOT NULL,
START_HOUR TIMESTAMP WITH TIME ZONE,
END_HOUR TIME WITH TIME ZONE,
EXPECTED_HOUR TIME WITH TIME ZONE
);
These hour fields represents a specific point in time during the day represented by the REPORT_DATE column. What I mean by that is that every TIME column represents a moment during the day specified in REPORT_DATE.
Some other points to consider:
We don't know why the START_HOUR is in TIMESTAMP format in the report we receive from the legacy system. But we import the data the way it comes to us.
The fields in the report are formatted according to the timezone of the client, so to refactor this table we need to combine the timezone of the client (we have this info) to properly insert the timestamps/times in UTC.
But now to the problem. The value of these columns are used to compute another values multiple times in our system, something like the following:
START_HOUR - END_HOUR (the result of this operation is currently being casted to TIME WITHOUT TIME ZONE)
START_HOUR < END_HOUR
START_HOUR + EXPECTED_HOUR
EXPECTED_HOUR - END_HOUR
EXPECTED_HOUR < '05:00'
After some research I found that is not recommended to use the type TIME WITH TIME ZONE (Postgres time with time zone equality) and now I'm a bit confused about what is the best way to refactor this table to deal with different time zones and handle the different column operations that we need to.
Besides that, I already know that is safe to subtract two columns of type TIMESTAMP WITH TIME ZONE. This subtraction operation is taking into account DST changes (Subtracting two columns of type timestamp with time zone) but how about the others? And the one subtracting a TIME from a TIMESTAMP?.
And about the table refactoring, should we use TIME WITH TIME ZONE anyways? Should we continue using TIME WITHOUT TIME ZONE? Or is better to forget the type TIME altogether and combine the DATE with the TIME and change the columns to TIMESTAMP WITH TIME ZONE?
I think these questions are related because the new column types we choose to use, will define how we operate with the columns.
You asserted that:
every TIME column represents a moment during the day specified in REPORT_DATE.
So you never cross the a dateline within the same row. I suggest to save 1x date 3x time and the time zone (as text or FK column):
CREATE TABLE legacy_table (
event_id bigint PRIMARY KEY NOT NULL
, report_date date NOT NULL
, start_hour time
, end_hour time
, expected_hour time
, tz text -- time zone
);
Like you already found, timetz (time with time zone) should generally be avoided. It cannot deal with DST rules properly (daylight saving time).
So basically what you already had. Just drop the date component from start_hour, that's dead freight. Cast timestamp to time to cut off the date. Like: (timestamp '2018-03-25 1:00:00')::time
tz can be any string accepted by the AT TIME ZONE construct, but to deal with different time zones reliably, it's best to use time zone names exclusively. Any name you find in the system catalog pg_timezone_names.
To optimize storage, you could collect allowed time zone names in a small lookup table and replace tz text with tz_id int REFERENCES my_tz_table.
Two example rows with and without DST:
INSERT INTO legacy_table VALUES
(1, '2018-03-25', '1:00', '3:00', '2:00', 'Europe/Vienna') -- sadly, with DST
, (2, '2018-03-25', '1:00', '3:00', '2:00', 'Europe/Moscow'); -- Russians got rid of DST
For representation purposes or calculations you can do things like:
SELECT (report_date + start_hour) AT TIME ZONE tz AT TIME ZONE 'UTC' AS start_utc
, (report_date + end_hour) AT TIME ZONE tz AT TIME ZONE 'UTC' AS end_utc
, (report_date + expected_hour) AT TIME ZONE tz AT TIME ZONE 'UTC' AS expected_utc
-- START_HOUR - END_HOUR
, (report_date + start_hour) AT TIME ZONE tz
- (report_date + end_hour) AT TIME ZONE tz AS start_minus_end
FROM legacy_table;
You might create one or more views to readily display strings as needed. The table is for storing the information you need.
Note the parentheses! Else the operator + would bind before AT TIME ZONE due to operator precedence.
And behold the results:
db<>fiddle here
Since the time is manipulated in Vienna (like any place where silly DST rules apply), you get "surprising" results.
Related:
Accounting for DST in Postgres, when selecting scheduled items
Ignoring time zones altogether in Rails and PostgreSQL
Related
Seeking help understanding Postgres timezone conversion and comparison
query
http://sqlfiddle.com/#!17/9eecb/79009
SELECT
now()
, ( now() at time zone 'AEDT' ) AS now_AEDT
, ( ( now() at time zone 'AEDT' ) at time zone 'AEDT' ) AS now_AEDT_AEDT
, ( now() at time zone 'UTC' ) + interval '1' second
> ( now() at time zone 'AEDT' ) AS should_be_true
Result
now : 2021-07-30T01:03:10.707834Z (i assume system is UTC?)
now_AEDT : 2021-07-30T12:03:10.707834Z
now_AEDT_AEDT : 2021-07-30T01:03:10.707834Z (why is this different?)
should_be_true : false (after adding a second; this should be true, no?)
questions / confusions
why does double-setting the timezone change the time? (now_AEDT_AEDT)
how are timestamps with timezones handled in comparison? It seems the zone is ignored and just the values are compared?
"at time zone" does opposite things when applied to timestamptz versus a timestamp. So applying it twice in a row just gives you back the original. First it does something, then it undoes it.
When applied to timestamptz, it converts the time to look like what it would be expressed as in the indicated time zone, and datatypes it as a timestamp without timezone (except in sqlfiddle, where it seems to do something slightly different, but without changing the overall effect). When applied to a timestamp without timezone, it assumes that that time expressed was already in the indicated time zone, and converts it back to the system time with it datatyped as timestamptz.
how are timestamps with timezones handled in comparison? It seems the zone is ignored and just the values are compared?
You aren't comparing timestamps with timezones. You are comparing timestamps without timezones.
select pg_typeof(now() at time zone 'AEDT');
pg_typeof
-----------------------------
timestamp without time zone
So yes, it ignores the timezones in your comparison, because they are no longer there anymore.
I am getting my data: date and time in UTC, in a csv file format in separate columns. Since I will need to convert this zone to date and time of the place where I live, currently in summer to UTC+2, and maybe some other zones I was wondering what is the best practice to insert data in postgres when we are talking about type of data. Should I place both of my data in a single column or keep them separate as types: date and time, and if not should I use timestamp or timestampz (or something else).
use timestamptz it will store your time stamp in UTC. and will display it to the client according to it's locale.
https://www.postgresql.org/docs/current/static/datatype-datetime.html
For timestamp with time zone, the internally stored value is always in
UTC (Universal Coordinated Time, traditionally known as Greenwich Mean
Time, GMT). An input value that has an explicit time zone specified is
converted to UTC using the appropriate offset for that time zone. If
no time zone is stated in the input string, then it is assumed to be
in the time zone indicated by the system's TimeZone parameter, and is
converted to UTC using the offset for the timezone zone.
When a timestamp with time zone value is output, it is always
converted from UTC to the current timezone zone, and displayed as
local time in that zone. To see the time in another time zone, either
change timezone or use the AT TIME ZONE construct (see Section 9.9.3).
updated with another good point from Lukasz, I had to mention:
Also in favor of single column is the fact that if you would store
both date and time in separate columns you would still need to combine
them and convert to timestamp if you wanted to change time zone of
date.
Not doing that would lead to date '2017-12-31' with time '23:01:01' would in other time zone in fact be not only different time, but different date with all YEAR and MONTH and DAY different
another update As per Laurenz notice, don't forget the above docs quote
An input value that has an explicit time zone specified is converted to UTC using the appropriate offset for that time zone. Which means you have to manage the input dates carefully. Eg:
t=# create table t(t timestamptz);
CREATE TABLE
t=# set timezone to 'GMT+5';
SET
t=# insert into t select '2017-01-01 00:00:00';
INSERT 0 1
t=# insert into t select '2017-01-01 00:00:00' at time zone 'UTC';
INSERT 0 1
t=# insert into t select '2017-01-01 00:00:00+02';
INSERT 0 1
t=# select * from t;
t
------------------------
2017-01-01 00:00:00-05
2017-01-01 05:00:00-05
2016-12-31 17:00:00-05
(3 rows)
I have a log table in a PostgreSQL database with an event column of type timestamp without time zone.
Now I have a bash script, which creates a CSV file from the log database:
...
psql .. -c "COPY (SELECT event, ... FROM logtable order by event desc) TO STDOUT WITH CSV" logdb > log.csv
...
This is executed on the cloud server on which the DB is hosted and therefore, the timestamp strings in log.csv are in local time of the timezone of the server.
However, I like to have the timestamp strings to represent the time of my own time zone. So I shall be able to let psql transform the timestamp -> string to a given timezone. How can I achieve this?
First of all, you should use timestamptz instead of timestamp whenever working with multiple times zones. Would avoid the problem completely.
Details:
Ignoring timezones altogether in Rails and PostgreSQL
You can use the AT TIME ZONE construct like #NuLo suggests, it may even work, but not exactly as described.
AT TIME ZONE converts the type timestamp (timestamp without time zone) to timestamptz (timestamp with time zone) and vice versa. The text representation of a timestamptz value depends on the current setting of the time zone in the session in which you run the command. These two timestamptz values are 100 % identical (denote the same point in time):
'2015-09-02 15:55:00+02'::timestamptz
'2015-09-02 14:55:00+01'::timestamptz
But the text representation is not. The display is for different time zones. If you take this string literal and feed it to a timestamp type, the time zone part is just ignored and you end up with different values. Hence, if you run your COPY statement in a session with the same time zone setting as your original timestamp values are for, the suggested operation happens to work.
The clean way, however, is to produce correct timestamp values to begin with by applying AT TIME ZONE twice:
SELECT event AT TIME ZONE 'my_target_tz' AT TIME ZONE 'my_source_tz', ...
FROM logtable
ORDER BY event desc;
'my_target_tz' is "your own time zone" and 'my_source_tz' the time zone of the of the cloud server in the example. To make sure that DST is respected use time zone names, not time zone abbreviations. The documentation:
A time zone abbreviation, for example PST. Such a specification merely
defines a particular offset from UTC, in contrast to full time zone names
which can imply a set of daylight savings transition-date rules as well.
Related:
Accounting for DST in Postgres, when selecting scheduled items
Time zone names with identical properties yield different result when applied to timestamp
Or, much better yet, use timestamptz everywhere and it works correctly automatically.
I really need to store local date and time (aircraft wheels up date and time) and the utc offset so I can convert to utc to calculate intervals (flight duration). People travel on local time and you need utc for cross time zone computations. I had hoped to use timestamptz but it absolutely does not work for this purpose. It converts everything to a function of the postgres time zone. In my case this is yyyy-dd-mm hh:mi:ss-07.
However, I investigated timetz just to cover all the bases. It stores exactly what I need. It preserves the local time while providing the offset to utc. Except that now I need two columns instead of one to store the information.
My questions are:
Why do the timestamptz and the timetz functions give different results?
Is there a way to make the timestamptz include the local time zone offset rather than the system time zone offset?
Here are the queries that illustrate the difference:
select cast('2015-05-01 11:25:00 america/caracas' as timestamptz)
-- 2015-05-01 08:55:00-07
;
select cast('2015-05-01 11:25:00 america/caracas' as timetz)
-- 11:25:00-04:30
;
I personally find the wording timestamp with time zone confusing when trying to understand PostgreSQL's timestamptz, because it doesn't store any time zone. According to the docs:
All timezone-aware dates and times are stored internally in UTC. They are converted to local time in the zone specified by the TimeZone configuration parameter before being displayed to the client.
Notice, on that page, that the storage characteristics and limits of timestamp and timestamptz are identical.
In my head, to keep things straight, I translate timestamptz to "timestamp in the real world", and plain timestamp to "you probably didn't mean to use this type" (because, so far, I've only found myself needing to store timestamps that are related to the real world.)
So:
Why do the timestamptz and the timetz functions give different results?
It appears to be because PostgreSQL didn't feel they were allowed to make timetz work like timestamptz does:
The type time with time zone is defined by the SQL standard, but the definition exhibits properties which lead to questionable usefulness.
My guess is that some of these "properties" are the ones that they don't like, and you do.
And:
Is there a way to make the timestamptz include the local time zone offset rather than the system time zone offset?
There's no offset being stored for timestamptz values. They're just real-world timestamps. So the way to store the local time zone is the way you've already thought of: store it separately.
create table my_table (
happened timestamptz,
local_time_zone varchar
);
select happened at time zone 'UTC', happened at time zone local_time_zone
from my_table;
I have a column added_at of type timestamp without time zone. I want it's default value to be the current date-time but without time zone. The function now() returns a timezone as well.
How do I solve that problem?
SELECT now()::timestamp;
The cast converts the timestamptz returned by now() to the corresponding timestamp in your time zone - defined by the timezone setting of the session. That's also how the standard SQL function LOCALTIMESTAMP is implemented in Postgres.
If you don't operate in multiple time zones, that works just fine. Else switch to timestamptz for added_at. The difference?
Ignoring time zones altogether in Rails and PostgreSQL
BTW, this does exactly the same, just more noisy and expensive:
SELECT now() AT TIME ZONE current_setting('timezone');
Well you can do something like:
SELECT now() AT TIME ZONE current_setting('TimeZone');
SELECT now() AT TIME ZONE 'Europe/Paris';
SELECT now() AT TIME ZONE 'UTC';
Not sure how that makes any sense for a column "added_at". You almost always want an absolute timestamp (timestamp with time zone) not a floating one.
Edit responding to points below:
Yes, should use timestamp with time zone (absolute time) unless you have a good reason not to.
The client timezone is given by SHOW TimeZone or current_setting(...) as shown above.
Do take some time to skim the manuals - they cover all this quite well.
"Current Date/Time":
CURRENT_TIME and CURRENT_TIMESTAMP deliver values with time zone; LOCALTIME and LOCALTIMESTAMP deliver values without time zone.
New, and Native Answer in 2020
In PostgreSQL, If you only want the current date-time by calling CURRENT_TIMESTAMP() without time zone, and fractional digits in the seconds field which come after the decimal point of the seconds field?
(Tested on PostgreSQL v12.4)
Then use this:
SELECT CURRENT_TIMESTAMP(0)::TIMESTAMP WITHOUT TIME ZONE;
If you define your column's data type as timestamp (not as timestamptz), then you can store the timestamp without time zone, in that case you don't neet to add TIMESTAMP WITHOUT TIME ZONE
Like this:
CREATE TABLE foo (created timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP(0))
In the above function, 0 is passed to get rid of the fractional digits in the seconds field.
If your application doesn't care about timezone, you can use SELECT LOCALTIMESTAMP for it.
Ex:
SELECT LOCALTIMESTAMP
-- Result: 2023-01-30 17:43:33.628952