PostgreSQL: Can't get timezones with DST to work properly - postgresql

I'm trying to figure out how Postgres handles DST in combination with intervals. Specifically I want to allow my users to create events with recurring dates - e.g. everyday at 16:00 local time.
For what I'm doing I need to store the first date in the user's local time, and then add a number of days to it, without changing the time of day in the user's local time. I was hoping that timestamptz with a full timezone name (so it knows when to apply DST?) combined with simple 1 day intervals would do the job - but it fails at my simple example:
Germany uses CET (+1:00) and switches to CEST (+2:00) at 2:00 in the morning on March 28.
Tunesia uses CET all year.
Thus I expected, that using a timestamptz on March 27 and adding 1 day to it, I'd see a different utc-offset in Berlin, and no change in Tunis - but they both changed the offset equally, as if Tunis was using DST:
select
'2021-03-27 16:00:00 Africa/Tunis'::timestamptz as "tunis_before_dst",
'2021-03-27 16:00:00 Africa/Tunis'::timestamptz + INTERVAL '1 day' as "tunis_after_dst",
'2021-03-27 16:00:00 Europe/Berlin'::timestamptz as "berlin_before_dst",
'2021-03-27 16:00:00 Europe/Berlin'::timestamptz + INTERVAL '1 day' as "berlin_after_dst"
results in:
tunis_before_dst: '2021-03-27 16:00:00+01'
tunis_after_dst: '2021-03-28 16:00:00+02'
berlin_before_dst: '2021-03-27 16:00:00+01'
berlin_after_dst: '2021-03-28 16:00:00+02'
Looking through pg_timezone_names, I can see that my Postgres instance is aware of Africa/Tunis not having DST - so I'm wondering why it's changing the UTC offset for it then.
I guess it's obvious that timezones and DST are very confusing to me, but am I doing something wrong handling them or is timezonetz not what I think it is?

Rant first. The concept of DST is breathtaking nonsense. Even the name is obvious BS. "Daylight Saving Time". No daylight has been saved. I can't believe the EU still did not manage to get rid of it, even though the overwhelming majority wants it gone, and it's been consensus to scrap it for a while now.
With that out of my system, the primary misunderstanding is this: you assume that the data type timestamp with time zone would store time zone information. It does not. Becomes obvious here:
Thus I expected, [...] I'd see a different utc-offset in Berlin, and no change in
Tunis - but they both changed the offset equally
The time zone offset you see in the output is the offset determined by the current timezone setting of your session. Time zone serves as input / output modifier / decorator. Postgres always stores UTC time internally. And no time zone information whatsoever.
The type name is a bit deceiving there. It has been known to fool the best:
Time zone storage in data type "timestamp with time zone"
Once you've grasped that concept, the rest should become obvious.
To preserve the local time (wall clock time of day), use the data type timestamp without time zone (timestamp), or even just time (never use the broken timetz), and store time zone information additionally - ideally the time zone name ('Europe/Berlin' like you have it), not a time zone abbreviation or a numeric offset.
timestamp with time zone (timestamptz) is the right choice to store unique points in time, independent of any time zones. The time zone offset is just an input modifier. Both of the following literals result in the same timestamptz value exactly, because both time zones happen to apply the same offset at this time of the year:
'2021-03-27 16:00:00 Africa/Tunis'::timestamptz
'2021-03-27 16:00:00 Europe/Berlin'::timestamptz
But these differ by one hour, because the German offset has changed according to the local DST regime:
'2021-03-28 16:00:00 Africa/Tunis'::timestamptz
'2021-03-28 16:00:00 Europe/Berlin'::timestamptz
Related:
Ignoring time zones altogether in Rails and PostgreSQL
Preserve timezone in PostgreSQL timestamptz type

Related

Strange time zone in PostgreSQL timestamp conversion

This SQL:
select to_timestamp(extract(epoch from '0001-01-01 00:00:00'::timestamp))
produces this output:
0001-01-01 08:06:00+08:06
I realize that to_timestamp() always adds a time zone, hence the additional 8 hours and +8 in the time zone segment. But what is the :06? And where did the extra 6 minutes come from?
EDIT
If I initially execute set local timezone to 'UTC'; then I get expected results.
Before UTC was invented, each city had its own local time, mostly with a difference of just some minutes among each other.
Just after standardization of timezones (and the respective adoption by everybody), the local times were set to the values we know today.
That's why you get these strange results for ancient dates, specially before the year 1900.
Actually, Taipei changed from UTC+08:06 to UTC+08:00 only in Jan 1st of 1896, so dates before it will have the +08:06 offset.
If you set your timezone to UTC this doesn't happen, basically because UTC's offset is zero and never changes.

PostgreSQL/JDBC and TIMESTAMP vs. TIMESTAMPTZ

I've been going through a lot of pain dealing with Timestamps lately with JPA. I have found that a lot of my issues have been cleared up by using TIMESTAMPTZ for my fields instead of TIMESTAMP. My server is in UTC while my JVM is in PST. It seems almost impossible with JPA to normalize on UTC values in the database when using TIMESTAMP WITHOUT TIMEZONE.
For me I use these fields for stuff like "when was the user created", "when did they last use their device", "when was the last time they got an alert", etc. These are typically events so they are instance in time sorts of values. And because they will now by TIMESTAMPTZ I can always query them for a particular zone if I don't want them UTC.
So my question is, for a Java/JPA/PostgreSQL server, when WOULD I want to use TIMESTAMP over TIMESTAMPTZ? What are the use cases for it? Right now I have a hard time seeing why I'd ever want to use TIMESTAMP and because of that I'm concerned that I'm not grasping its value.
Generally use TIMESTAMPTZ
Here's advice from David E. Wheeler, a Postgres expert, in a blog post whose title says it all:Always Use TIMESTAMP WITH TIME ZONE (TIMESTAMPTZ)
If you are tracking actual moments, specific points on the timeline, use TIMESTAMP WITH TIME ZONE.
One Exception: Partitioning
Wheeler’s sole exception is when partitioning on timestamps, because of technical limitations. A rare exception for most of us.
For information about partitioning, see doc and see the Wiki.
Misnomer
The data types names timestamp with time zone and timestamp without time zone are misnomers. In both cases the date-time value is stored in UTC (no time zone offset). Read that previous sentence again. UTC, always. The "with time zone" phrase means "with attention paid to time zone", not "store the time zone alongside this value". The difference between the types is whether any time zone should be applied either during storage (INSERT or UPDATE) or retrieval (SELECT query). (This behavior is described for Postgres -- Other databases vary widely in this regard.)
More precisely, one should say that TIMESTAMP WITHOUT TIME ZONE stores date-time values with no time zone. But without any time frame reference, anyone looking at that data would have to assume (hope, pray?) that the values are UTC. But again, moot as you should almost never use this type.
Read the doc carefully, and experiment a bit to clarify your understanding.
Unzoned
If you want to store the general idea of a possible time rather than a specific moment, use the other type, TIMESTAMP WITHOUT TIME ZONE.
For example, Christmas starts this year at the first moment of December 25th, 2017. That would be 2017-12-25T
00:00:00 with no indicator of time zone nor offset-from-UTC. This value is only a vague idea about possible moments. It has no meaning until we apply a time zone (or offset). So we store this using TIMESTAMP WITHOUT TIME ZONE.
The elves staffing Santa’s Special Events Logistics Department apply the time zones as part of their planning process. The earliest time zone is currently Pacific/Kiribati, 14 hours ahead of UTC. The elves schedule Santa’s first arrival there. The elves schedule a flight plan taking the reindeer on to other time zones where midnight comes shortly after, such as Pacific/Auckland. They continue going westward as each zone’s midnight arrives. Hours later in Asia/Kolkata, still later in Europe/Paris, still more hours later in America/Montreal and so on.
Each of these specific delivery moments would be recorded by the elves using WITH TIME ZONE, while that general idea of Christmas would by stored as WITHOUT TIME ZONE.
Another use in business apps for WITHOUT TIME ZONE is scheduling appointments farther out than several weeks. Politicians around the world have an inexplicable predilection for messing with the clock and redefining time zone rules. They join Daylight Saving Time (DST), leave DST, start DST on a different date, or end DST on a different date, or shift their clocks by 15 minutes or half-hour. All of these have been done in last several years by Turkey, United States, Russia, Venezuela, and others.
The politicians often make these changes with little forewarning. So if you are scheduling a dental appointment for six months out at 13:00, that should probably be stored as TIMESTAMP WITHOUT TIME ZONE or otherwise the politicians may effectively be changing you appointment to noon, or 2 PM, or 13:30.
You could use it to represent what Joda-Time and the new Java 8 time APIs call a LocalDateTime. A LocalDateTime doesn't represent a precise point on the timeline. It's just a set of fields, from year to nanoseconds. It is "a description of the date, as used for birthdays, combined with the local time as seen on a wall clock".
You could use it to represent, for example, the fact that your precise birth date is 1975-07-19 at 6 PM. Or that, all across the world, the next new year is celebrated on 2015-01-01 at 00:00.
To represent precise moments, like the moment Armstrong walked on the moon, a timestamp with timezone is indeed more appropriate. Regardless of the timezone of the JVM and the timezone of the database, it should return you the correct moment.
Update for the answers above: partitioning is no longer an exceptional case in PG11 thanks to pruning.
https://www.postgresql.org/docs/11/ddl-partitioning.html#DDL-PARTITION-PRUNING
Personally successfully tested queries against PG11 AWS RDS. Also the official PG wiki states the use of timestamp without timezone is a bad idea:
https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_timestamp_.28without_time_zone.29_to_store_UTC_times
With the Java 8 date & time API I wouldn't blindly jump into a timestamptz camp.
If you map timestamp <=> LocalDateTime you always get the same value regardless default Java application timezone. Regardless how many calls TimeZone.setDefault(TimeZone.getTimeZone("TZ")) mixing different TZ you put in between SELECT/INSERT you will get the same LocalDateTime in Java at any time and date/time components will be the same as in Postgresql TO_CHAR(ts, 'YYYY-MM-DD HH24:MI:SS').
If you map timestamptz <=> LocalDateTime Postgresql JDBC driver (supporting JDBC 4.2 spec) converts LocalDateTime to UTC using default Java timezone when saving value to DB. If you save it in one default TZ and read in another you get different "local" results.
Airplane departure time is local to an airport. If you don't need to compare departure time between different cities timestamptz & UTC doesn't make sense, you just print exact city local time in a ticket. With timestamp it is possible to keep date/time as is, avoiding double TZ correction due to Java app default TZ + city specific TZ (business logic).
timestamptz is useful when you heavily convert TZ in SQL. With only timestamp you write:
date_trunc('day', x.datecol AT TIME ZONE 'UTC' AT TIME ZONE x.timezone)
AT TIME ZONE x.timezone AT TIME ZONE 'UTC'
while with timestamptz there is no need to mention that time is in UTC (if you follow such convention, probably you should xD):
date_trunc('day', x.datecol AT TIME ZONE x.timezone)
AT TIME ZONE x.timezone
Operator AT TIME ZONE is overloaded:
timestamp AT TIME ZONE 'X' => timestamptz
timestamptz AT TIME ZONE 'X' => timestamp
Postgresql JDBC + Java 8 date&time API spec.

Why do Postgres timezones change around 1967/68?

On 9.3.3, if one runs:
select
EXTRACT(TIMEZONE FROM timestamp with time zone '1911-03-01 00:00 -8:00:00'),
EXTRACT(TIMEZONE FROM timestamp with time zone '1911-05-15 00:00 -8:00:00'),
EXTRACT(TIMEZONE FROM timestamp with time zone '1917-03-01 00:00 -8:00:00'),
EXTRACT(TIMEZONE FROM timestamp with time zone '1917-05-15 00:00 -8:00:00'),
EXTRACT(TIMEZONE FROM timestamp with time zone '1967-03-01 00:00 -8:00:00'),
EXTRACT(TIMEZONE FROM timestamp with time zone '1967-05-15 00:00 -8:00:00'),
EXTRACT(TIMEZONE FROM timestamp with time zone '1968-03-01 00:00 -8:00:00'),
EXTRACT(TIMEZONE FROM timestamp with time zone '1968-05-15 00:00 -8:00:00');
One gets the following results:
0;0;
0;3600;
0;3600;
3600;3600
(The first time is the founding day of Las Vegas, the next few are some I used to debug the issue)
It seems there is no offset around 1911, an offset between 1911 and 1967 during summer but not winter and then always has one from 1968 onwards. This seems a little weird. Does anyone have any idea what is going on with the offsets here and whether this is expected behaviour or if there is something in my linux's setup that I could possibly change?
Time zones change for all sorts of reasons.
Daylight savings rules change.
Sometimes timezone offsets change, too, if nations redefine their time zone for political reasons.
The canonical time zone information database is the tz or "zoneinfo" database, which used to be called the Olsen database. The zoneinfo DB is on the IANA site. There are a variety of programs to dump human readable versions of the DB.
You can use timestamp without time zone if you wish to store a particular moment in wall-clock time, without concern for time zone.
timestamp with time zone is sensitive to the system TimeZone setting on input and output, and is stored in UTC time as absolute seconds. So it's converted for input and output. If you want different conversions or to override the conversion you can use the AT TIME ZONE operator.
The rules for your time zone are established by law, and the law changes.
If you want to store the INSTANT at which the (local) clocks in Las Vegas marked 00:00:00 in the day in which the city was founded, and assuming that Las Vegas was using an offset of -8 hours, then you should store 1911-03-01 00:00 -8:00:00::timezonetx in a timezonetx field. Be aware, however, that what is really stored is only the "universal instant", when you read it you cannot know to which "local time at Las Vegas" it corresponds (unless you explicityl convert it, after reading it, to a timezone).
Depends on your timezone, really. In 1966, DST was first implemented nationally in the US to take place in 1967. States needed to pass laws to end it, meaning that in many areas, 1967 is the only year which used DST. This can lead to some very interesting glitches, where every date except those in 1967/68 behaves normally.

PostgreSQL date() with timezone

I'm having an issue selecting dates properly from Postgres - they are being stored in UTC, but
not converting with the Date() function properly.
Converting the timestamp to a date gives me the wrong date if it's past 4pm PST.
2012-06-21 should be 2012-06-20 in this case.
The starts_at column datatype is timestamp without time zone. Here are my queries:
Without converting to PST timezone:
Select starts_at from schedules where id = 40;
starts_at
---------------------
2012-06-21 01:00:00
Converting gives this:
Select (starts_at at time zone 'pst') from schedules where id = 40;
timezone
------------------------
2012-06-21 02:00:00-07
But neither convert to the correct date in the timezone.
Basically what you want is:
$ select starts_at AT TIME ZONE 'UTC' AT TIME ZONE 'US/Pacific' from schedules where id = 40
I got the solution from this article is below, which is straight GOLD!!! It explains this non-trivial issue very clearly, give it a read if you wish to understand pstgrsql TZ management better.
Expressing PostgreSQL timestamps without zones in local time
Here is what is going on. First you should know that 'PST timezone is 8 hours behind UTC timezone so for instance Jan 1st 2014, 4:30 PM PST (Wed, 01 Jan 2014 16:00:30 -0800) is equivalent to Jan 2nd 2014, 00:30 AM UTC (Thu, 02 Jan 2014 00:00:30 +0000). Any time after 4:00pm in PST slips over to the next day, interpreted as UTC.
Also, as Erwin Brandstetter mentioned above, postresql has two type of timestamps data type, one with a timezone and one without.
If your timestamps include a timezone, then a simple:
$ select starts_at AT TIME ZONE 'US/Pacific' from schedules where id = 40
will work. However if your timestamp is timezoneless, executing the above command will not work, and you must FIRST convert your timezoneless timestamp to a timestamp with a timezone, namely a UTC timezone, and ONLY THEN convert it to your desired 'PST' or 'US/Pacific' (which are the same up to some daylight saving time issues. I think you should be fine with either).
Let me demonstrate with an example where I create a timezoneless timestamp. Let's assume for convenience that our local timezone is indeed 'PST' (if it weren't then it gets a tiny bit more complicated which is unnecessary for the purpose of this explanation).
Say I have:
$ select timestamp '2014-01-2 00:30:00' AS a, timestamp '2014-01-2 00:30:00' AT TIME ZONE 'UTC' AS b, timestamp '2014-01-2 00:30:00' AT TIME ZONE 'UTC' AT TIME ZONE 'PST' AS c, timestamp '2014-01-2 00:30:00' AT TIME ZONE 'PST' AS d
This will yield:
"a"=>"2014-01-02 00:30:00" (This is the timezoneless timestamp)
"b"=>"2014-01-02 00:30:00+00" (This is the UTC TZ timestamp, note that up to a timezone, it is equivalent to the timezoneless one)
"c"=>"2014-01-01 16:30:00" (This is the correct 'PST' TZ conversion of the UTC timezone, if you read the documentation postgresql will not print the actual TZ for this conversion)
"d"=>"2014-01-02 08:30:00+00"
The last timestamp is the reason for all the confusion regarding converting timezoneless timestamp from UTC to 'PST' in postgresql. When we write:
timestamp '2014-01-2 00:30:00' AT TIME ZONE 'PST' AS d
We are taking a timezoneless timestamp and try to convert it to 'PST TZ (we indirectly assume that postgresql will understand that we want it to convert the timestamp from a UTC TZ, but postresql has plans of its own!). In practice, what postgresql does is it takes the timezoneless timestamp ('2014-01-2 00:30:00) and treats it as if it WERE ALREADY a 'PST' TZ timestamp (i.e: 2014-01-2 00:30:00 -0800) and converts that to UTC timezone!!! So it actually pushes it 8 hours ahead instead of back! Thus we get (2014-01-02 08:30:00+00).
Anyway, this last (un-intuitive) behavior is the cause of all confusion. Read the article if you want a more thorough explanation, I actually got results which are a bit different then their on this last part, but the general idea is the same.
I don't see the exact type of starts_at in your question. You really should include this information, it is the key to the solution. I'll have to guess.
PostgreSQL always stores UTC time for the type timestamp with time zone internally. Input and output (display) are adjusted to the current timezone setting or to the given time zone. The effect of AT TIME ZONE also changes with the underlying data type. See:
Ignoring time zones altogether in Rails and PostgreSQL
If you extract a date from type timestamp [without time zone], you get the date for the current time zone. The day in the output will be the same as in the display of the timestamp value.
If you extract a date from type timestamp with time zone (timestamptz for short), the time zone offset is "applied" first. You still get the date for the current time zone, which agrees with the display of the timestamp. The same point in time translates to the next day in parts of Europe, when it is past 4 p.m. in California for instance. To get the date for a certain time zone, apply AT TIME ZONE first.
Therefore, what you describe at the top of the question contradicts your example.
Given that starts_at is a timestamp [without time zone] and the time on your server is set to the local time. Test with:
SELECT now();
Does it display the same time as a clock on your wall? If yes (and the db server is running with correct time), the timezone setting of your current session agrees with your local time zone. If no, you may want to visit the setting of timezone in your postgresql.conf or your client for the session. Details in the manual.
Be aware that the timezone offset used the opposite sign of what's displayed in timestamp literals. See:
Peculiar time zone handling in a Postgres database
To get your local date from starts_at just
SELECT starts_at::date
Tantamount to:
SELECT date(starts_at)
BTW, your local time is at UTC-7 right now, not UTC-8, because daylight savings time is in effect (not among the brighter ideas of the human race).
Pacific Standard TIME (PST) is normally 8 hours "earlier" (bigger timestamp value) than UTC (Universal Time Zone), but during daylight saving periods (like now) it can be 7 hours. That's why timestamptz is displayed as 2012-06-21 02:00:00-07 in your example. The construct AT TIME ZONE 'PST' takes daylight saving time into account. These two expressions yield different results (one in winter, one in summer) and may result in different dates when cast:
SELECT '2012-06-21 01:00:00'::timestamp AT TIME ZONE 'PST'
, '2012-12-21 01:00:00'::timestamp AT TIME ZONE 'PST'
I know this is an old one but You may want to consider using AT TIME ZONE "US/Pacific" when casting to avoid any PST/PDT issues. So
SELECT starts_at::TIMESTAMPTZ AT TIME ZONE "US/Pacific"
FROM schedules
WHERE ID = '40';

Time zone with daylight savings times in PostgreSQL

We're deploying our own stream gauges (a lot like this USGS gauge: http://waterdata.usgs.gov/usa/nwis/uv?site_no=03539600) so us kayakers know whether or not there's enough water to paddle the stream and don't waste time and gas to drive out there. We hope install a few of these across the southeast whitewater region which spans the eastern and central time zones.
I'm storing the time a record is inserted using the default value of current_time for the record. I'd like to later display the data using the MM/DD/YYYY HH12:MI AM TZ format, which outputs reading like 03/12/2012 01:00 AM CDT. I'd also like for the output to be aware of changes in day light savings time, so the last part of the previous sentence would change between CST and CDT when we 'spring forward' and 'fall back'. This change occurred on 3/11/2012 this year and I've included dates on both sides of this DST line below. I'm using my Windows 7 laptop for development and we will later be deploying on a Unix box. Postgres has apparently detected that my Windows computer is set to eastern US time zone. I'm trying this with a 'timestamp without time zone' field and a 'timestamp with time zone' field but can't get it to work.
I've tried using 'at time zone' in my selects and every thing is working until it's time to display the time zone. The actual hour is part of the time stamp is correctly subtracted by an hour when I ask for the time in CDT. But EDT is displayed in the output.
SELECT reading_time as raw,
reading_time at time zone 'CDT',
to_char(reading_time at time zone 'CDT',
'MM/DD/YYYY HH12:MI AM TZ') as formatted_time
FROM readings2;
"2012-04-29 17:59:35.65";"2012-04-29 18:59:35.65-04";"04/29/2012 06:59 PM EDT"
"2012-04-29 17:59:40.19";"2012-04-29 18:59:40.19-04";"04/29/2012 06:59 PM EDT"
"2012-03-10 00:00:00";"2012-03-10 00:00:00-05";"03/10/2012 12:00 AM EST"
"2012-03-11 00:00:00";"2012-03-11 00:00:00-05";"03/11/2012 12:00 AM EST"
"2012-03-12 00:00:00";"2012-03-12 01:00:00-04";"03/12/2012 01:00 AM EDT"
I'm storing the time zone that each of our gauges is located in a character varying field a separate table. I considered just appending this value to the end of the time output, but I want it to change from from CST to CDT without my intervention.
Thanks for your help.
Instead of using time zone abbreviations like CDT or CST, you could consider using full Olson-style time zone names. In the case of central time, you could choose a time zone. Either one that matches your location, such as America/Chicago, or just US/Central. This ensures PostgreSQL uses the Olson tz database to automatically figure out whether daylight saving time applies at any given date.
You definitely want a TIMESTAMP WITH TIME ZONE column (which is also known as timestamptz in PostgreSQL). That will store the timestamp in UTC, so that it represents a particular moment in time. Contrary to what the name suggests, it does not save a time zone in the column -- you can view the retrieved timestamp in the time zone of your choosing with the AT TIME ZONE phrase.
The semantics of TIMESTAMP WITHOUT TIME ZONE are confusing and nearly useless. I strongly recommend you don't use that type at all for what you are describing.
I'm really confused by the part of the question which talks about storing the timestamp in a CHARACTER VARYING column. That seems as though it might be part of the problem. If you can store it in timestamptz right from the start I suspect that you will have fewer problems. Barring that, it would be safest to use the -04 notation for offset from UTC; but that seems like more work to me for no benefit.
You can create a table of known timezones in the format suggested in Guan Yang's answer, and then use a foreign key column to this table. Valid timezones can be obtained from pg_timezone_names I've gone into more detail in this related answer.