I'm developing an app where a user can request that an email be sent to them at a specific time every day in their timezone. For example User A lives in London and schedules an email at 2pm every day London time and User B lives in New York and schedules an email at 2pm New York time.
I'm wondering what way I should configure my database postgres such that a scheduler can fire every minute and query for all emails to be sent at that minute regardless of what timezone their in.
The one thing I want to avoid is having to run multiple queries, once per timezone.
Due to the (rather idiotic, quite frankly) rules for daylight saving times (DST) across the world, a local time can mean all kind of things in absolute (UTC time).
Save a time (not timetz!) and the time zone name (not the abbreviation) for when to send the emails. Tricky details under this related question:
Time zone names with identical properties yield different result when applied to timestamp
CREATE TABLE event (
event_id serial PRIMARY KEY
, alarm_time time -- local alarm time
, tz text -- time zone name
, ...
);
Use the following expression to "cook" the exact daily point in time, taking local DST settings into account:
SELECT current_date + alarm_time AT TIME ZONE tz;
Example:
SELECT current_date + '2:30'::time AT TIME ZONE 'Europe/London' AS alarm_ts
Returns:
alarm_ts
2014-05-19 02:30:00+02
Use timestamp with time zone (timestamptz) across your whole application. Be sure to understand how it works. This comprehensive post may be of help (also explains the AT TIME ZONE construct:
Ignoring timezones altogether in Rails and PostgreSQL
Just to be clear, once you have "cooked" the daily UTC time, you can translate it to and work with any local time just as well. But it might be less confusing to do all the rest in UTC.
Related
CURRENT SITUATION:
I have a table of wildfire incidents with a timestamp with time zone (timestamptz) field to track when the observation occurred.
Everything in the data creation process is in UTC: the incoming data from the source, the app server that inserts the data, the insert python code (appends a "Z" to the time), and the database server are all in UTC.
The incidents' geographic extent spans several time zones in the US, Canada, and Mexico.
PROBLEM:
I've been querying on a day's worth of data in UTC time, but need to extract out data relative to local time. The midnight to midnight range will be different in each time zone.
My use case now is one day, but I was asked to consider arbitrary time ranges. E.g.: find all incidents in the hottest part of the day (say 10:00 to 18:00) local time.
This table is quite large and I have an index on the timestamptz field right now. Any changes I make will need to work with an index.
Account for daylight saving time.
I have a method to get the time zone for each record, so I don't need help with that.
I created a test table for this with a timestamptz field ts_with and a varchar field for the time zone tz. The following query returns what I want, so I feel like I'm making progress.
SELECT
name, test_tz.ts_with, test_tz.tz,
TIMEZONE(test_tz.tz, test_tz.ts_with) as timezone_with
FROM fire_info.test_tz
WHERE TIMEZONE(test_tz.tz, test_tz.ts_with) BETWEEN
'2018-08-07 00:00:00' AND '2018-08-07 23:59:59';
QUESTIONS:
Will this use my index? I'm thinking the timezone function will avoid it. Any solution for that? I'm considering adding another condition to the where clause that selects on timestamptz buffered by a day on either side. That would use the index and then the timezone function isn't sorting through too much data (~6k records per day during fire season). Would PG figure that out?
The timezone function is giving me DST offsets (e.g.: Denver is currently UTC-06). I assume I'll get standard time after DST ends. If I run a query in December for data in August, will it apply standard time or DST?
thanks!!!
The way you wrote the query, it cannot use an index on ts_with.
To use an index, the condition would have to be of the form ts_with <operator> <constant>, and there is no way to rewrite the query in that fashion.
So you should create a second index on timezone(test_tz.tz, test_tz.ts_with).
I'm writing a "send me messages at this time" app. I'm storing recurrence information in this manner:
Schedules
----------
days_of_week: [3, 4, 5]
hours_of_day: [8, 13, 22]
time_zone: "Pacific Time (US & Canada)"
Works fine in displaying, but I need to write a frequent cron job that grabs all schedules for "right now (utc)". So, if the the cron job is running at 09:00 UTC Monday, I need to grab all schedules where
Monday is in days_of_week (where days_of_week #> ARRAY[1])
hours_of_day is at 09:00 UTC. This is given hours_of_day is stored as an array of integers, but we are also storing the user's time_zone.
So the user may say: "deliver me a message at 9am Monday" (which we store as [9]), but that means 9am in their time zone.
Questions:
Any way to query all schedules given these parameters?
If not, is there a better way to structure the data to ensure easier querying through Postgres? The schema is flexible.
Thanks in advance!
Postgres has superb facilities for working with timezones, and I've written something very similar to what you're asking about here using the AT TIME ZONE construct. In addition to your fields, I use a last_scheduled_at to flag when a schedule was last "executed"--i.e., when the last successful cron job ran for that schedule to avoid double-scheduling, and a deleted_at for logical deletion of schedules.
My schema for schedules was similar, except that I had only a single hour. I stored days in an array, like you, and the timezone as text. The fields in my schedules table are dows, hour, and timezone.
This was the query:
SELECT
s.*
FROM
schedules s
WHERE
ARRAY[extract(dow from timestamptz (now() at time zone timezone))] && dows
AND hour = extract(hour from timestamptz (now() at time zone timezone))
AND (s.last_scheduled_at IS NULL
OR s.last_scheduled_at < (now() - interval '12 hours'))
AND s.deleted_at IS NULL
LIMIT
1000
I use && (overlaps) rather than #> (contains), but either works. You'll probably also want the limit so you can process the work in batches (keep running this and you're done for hour X if you get zero results; make sure you're done well before the hour is up). You'll also probably want to pass the timestamp as a parameter to this query--I've inlined it here as now() to simplify things, but passing the time as a parameter makes testing this a lot easier.
Note also that Postgres can be picky with time zone names and abbreviations and its behavior with daylight saving time can be counterintuitive: e.g., Pacific Standard Time and Pacific Daylight Time are treated as two distinct time zones (for the purposes of AT TIME ZONE):
maciek=# select now() at time zone 'pst';
timezone
----------------------------
2015-10-09 23:14:51.856813
(1 row)
maciek=# select now() at time zone 'pdt';
timezone
----------------------------
2015-10-10 00:14:54.402524
(1 row)
That is, Daylight Saving Time is always there, whether you are currently observing it or not. If you're letting people enter the time zone directly, it's good to either reject these or automatically coerce these to 'America/Los_Angeles' (or whatever time zone they happen to map to), which will handle these conversions for you automatically according to the time zone rules your Postgres version has (make sure you update to point releases promptly if accuracy is critical here for areas that have frequent time zone changes). The list of time zone names used by Postgres can be found in the Olson database. The Postgres tables pg_timezone_names
and pg_timezone_abbrevs may also be of interest.
I have column 'event_date_time' in my 'events' table with type 'timestamp with timezone'. My python flask application is saving date like '2014-08-30 02:17:02+00:00' but postgres automatically converts it to '2014-08-30 07:17:02+05'. It converts the timestamp to my local timezone i-e Pakistan. I want to save it without converting.
I have tried
set timezone='UTC'
and it does change timezone to 'UTC' but pgadmin3 is still saving the converted time.
I am using MAC OS and Postgresql 9.3.
The reason pgadmin is displaying hours +5 is because your system timezone is set to this.
When you save a "timestamp with time zone" value at GMT + or - any value, the system offsets whatever timezone your input was to GMT (or UTC), so that when you go to retrieve it, you can specify the timezone you want it displayed in.
For example let's establish a current time for say... New York.
select now()::timestamp with time zone at time zone 'America/New_York';
At the time of asking it returned '2014-08-23 08:50:57.136817'. 8:50 Saturday morning, or 8:51 if you're being pedantic.
Now if we take that same time and display it in GMT we will see a different result:
select '2014-08-23 08:50:57.136817 America/New_York'::timestamp with time zone at time zone 'GMT';
Now have a new time of '2014-08-23 12:50:57.136817'... 5 hours into the "future"!
Finally let's get the original timestamp and display it in what I believe is the Pakistan time zone (PKT) and see what it shows
select '2014-08-23 08:50:57.136817 America/New_York'::timestamp with time zone at time zone 'PKT';
The result? '2014-08-23 17:50:57.136817' further into the future still!
Again I must stress the reason it can do this is because it is always converting the input time offset to UTC or GMT. Postgres processes all of its "timestamp with time zone" data types in this way. It is designed to avoid time zone problems such as daylight savings and so on.
Your issue appears to be that python is inserting the time at an offset of +00, and if this was supposed to be a local time then you will be 5 hours off as far as postgres is concerned. Without knowing exactly what queries python is making, I would assume you may want to look at that to make sure it is giving you the correct time, presumably set timezone='PKT' should be a fix. Either way, when you are viewing timestamp with time zone using a browser such as pgadmin, the timestamp is being converted to your local timezone and this is why you see +5.
Alternatively if you do wish to see those times at +00 then you must specify that you want this in your SELECT queries.
I've been going through a lot of pain dealing with Timestamps lately with JPA. I have found that a lot of my issues have been cleared up by using TIMESTAMPTZ for my fields instead of TIMESTAMP. My server is in UTC while my JVM is in PST. It seems almost impossible with JPA to normalize on UTC values in the database when using TIMESTAMP WITHOUT TIMEZONE.
For me I use these fields for stuff like "when was the user created", "when did they last use their device", "when was the last time they got an alert", etc. These are typically events so they are instance in time sorts of values. And because they will now by TIMESTAMPTZ I can always query them for a particular zone if I don't want them UTC.
So my question is, for a Java/JPA/PostgreSQL server, when WOULD I want to use TIMESTAMP over TIMESTAMPTZ? What are the use cases for it? Right now I have a hard time seeing why I'd ever want to use TIMESTAMP and because of that I'm concerned that I'm not grasping its value.
Generally use TIMESTAMPTZ
Here's advice from David E. Wheeler, a Postgres expert, in a blog post whose title says it all:Always Use TIMESTAMP WITH TIME ZONE (TIMESTAMPTZ)
If you are tracking actual moments, specific points on the timeline, use TIMESTAMP WITH TIME ZONE.
One Exception: Partitioning
Wheeler’s sole exception is when partitioning on timestamps, because of technical limitations. A rare exception for most of us.
For information about partitioning, see doc and see the Wiki.
Misnomer
The data types names timestamp with time zone and timestamp without time zone are misnomers. In both cases the date-time value is stored in UTC (no time zone offset). Read that previous sentence again. UTC, always. The "with time zone" phrase means "with attention paid to time zone", not "store the time zone alongside this value". The difference between the types is whether any time zone should be applied either during storage (INSERT or UPDATE) or retrieval (SELECT query). (This behavior is described for Postgres -- Other databases vary widely in this regard.)
More precisely, one should say that TIMESTAMP WITHOUT TIME ZONE stores date-time values with no time zone. But without any time frame reference, anyone looking at that data would have to assume (hope, pray?) that the values are UTC. But again, moot as you should almost never use this type.
Read the doc carefully, and experiment a bit to clarify your understanding.
Unzoned
If you want to store the general idea of a possible time rather than a specific moment, use the other type, TIMESTAMP WITHOUT TIME ZONE.
For example, Christmas starts this year at the first moment of December 25th, 2017. That would be 2017-12-25T
00:00:00 with no indicator of time zone nor offset-from-UTC. This value is only a vague idea about possible moments. It has no meaning until we apply a time zone (or offset). So we store this using TIMESTAMP WITHOUT TIME ZONE.
The elves staffing Santa’s Special Events Logistics Department apply the time zones as part of their planning process. The earliest time zone is currently Pacific/Kiribati, 14 hours ahead of UTC. The elves schedule Santa’s first arrival there. The elves schedule a flight plan taking the reindeer on to other time zones where midnight comes shortly after, such as Pacific/Auckland. They continue going westward as each zone’s midnight arrives. Hours later in Asia/Kolkata, still later in Europe/Paris, still more hours later in America/Montreal and so on.
Each of these specific delivery moments would be recorded by the elves using WITH TIME ZONE, while that general idea of Christmas would by stored as WITHOUT TIME ZONE.
Another use in business apps for WITHOUT TIME ZONE is scheduling appointments farther out than several weeks. Politicians around the world have an inexplicable predilection for messing with the clock and redefining time zone rules. They join Daylight Saving Time (DST), leave DST, start DST on a different date, or end DST on a different date, or shift their clocks by 15 minutes or half-hour. All of these have been done in last several years by Turkey, United States, Russia, Venezuela, and others.
The politicians often make these changes with little forewarning. So if you are scheduling a dental appointment for six months out at 13:00, that should probably be stored as TIMESTAMP WITHOUT TIME ZONE or otherwise the politicians may effectively be changing you appointment to noon, or 2 PM, or 13:30.
You could use it to represent what Joda-Time and the new Java 8 time APIs call a LocalDateTime. A LocalDateTime doesn't represent a precise point on the timeline. It's just a set of fields, from year to nanoseconds. It is "a description of the date, as used for birthdays, combined with the local time as seen on a wall clock".
You could use it to represent, for example, the fact that your precise birth date is 1975-07-19 at 6 PM. Or that, all across the world, the next new year is celebrated on 2015-01-01 at 00:00.
To represent precise moments, like the moment Armstrong walked on the moon, a timestamp with timezone is indeed more appropriate. Regardless of the timezone of the JVM and the timezone of the database, it should return you the correct moment.
Update for the answers above: partitioning is no longer an exceptional case in PG11 thanks to pruning.
https://www.postgresql.org/docs/11/ddl-partitioning.html#DDL-PARTITION-PRUNING
Personally successfully tested queries against PG11 AWS RDS. Also the official PG wiki states the use of timestamp without timezone is a bad idea:
https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_timestamp_.28without_time_zone.29_to_store_UTC_times
With the Java 8 date & time API I wouldn't blindly jump into a timestamptz camp.
If you map timestamp <=> LocalDateTime you always get the same value regardless default Java application timezone. Regardless how many calls TimeZone.setDefault(TimeZone.getTimeZone("TZ")) mixing different TZ you put in between SELECT/INSERT you will get the same LocalDateTime in Java at any time and date/time components will be the same as in Postgresql TO_CHAR(ts, 'YYYY-MM-DD HH24:MI:SS').
If you map timestamptz <=> LocalDateTime Postgresql JDBC driver (supporting JDBC 4.2 spec) converts LocalDateTime to UTC using default Java timezone when saving value to DB. If you save it in one default TZ and read in another you get different "local" results.
Airplane departure time is local to an airport. If you don't need to compare departure time between different cities timestamptz & UTC doesn't make sense, you just print exact city local time in a ticket. With timestamp it is possible to keep date/time as is, avoiding double TZ correction due to Java app default TZ + city specific TZ (business logic).
timestamptz is useful when you heavily convert TZ in SQL. With only timestamp you write:
date_trunc('day', x.datecol AT TIME ZONE 'UTC' AT TIME ZONE x.timezone)
AT TIME ZONE x.timezone AT TIME ZONE 'UTC'
while with timestamptz there is no need to mention that time is in UTC (if you follow such convention, probably you should xD):
date_trunc('day', x.datecol AT TIME ZONE x.timezone)
AT TIME ZONE x.timezone
Operator AT TIME ZONE is overloaded:
timestamp AT TIME ZONE 'X' => timestamptz
timestamptz AT TIME ZONE 'X' => timestamp
Postgresql JDBC + Java 8 date&time API spec.
We're deploying our own stream gauges (a lot like this USGS gauge: http://waterdata.usgs.gov/usa/nwis/uv?site_no=03539600) so us kayakers know whether or not there's enough water to paddle the stream and don't waste time and gas to drive out there. We hope install a few of these across the southeast whitewater region which spans the eastern and central time zones.
I'm storing the time a record is inserted using the default value of current_time for the record. I'd like to later display the data using the MM/DD/YYYY HH12:MI AM TZ format, which outputs reading like 03/12/2012 01:00 AM CDT. I'd also like for the output to be aware of changes in day light savings time, so the last part of the previous sentence would change between CST and CDT when we 'spring forward' and 'fall back'. This change occurred on 3/11/2012 this year and I've included dates on both sides of this DST line below. I'm using my Windows 7 laptop for development and we will later be deploying on a Unix box. Postgres has apparently detected that my Windows computer is set to eastern US time zone. I'm trying this with a 'timestamp without time zone' field and a 'timestamp with time zone' field but can't get it to work.
I've tried using 'at time zone' in my selects and every thing is working until it's time to display the time zone. The actual hour is part of the time stamp is correctly subtracted by an hour when I ask for the time in CDT. But EDT is displayed in the output.
SELECT reading_time as raw,
reading_time at time zone 'CDT',
to_char(reading_time at time zone 'CDT',
'MM/DD/YYYY HH12:MI AM TZ') as formatted_time
FROM readings2;
"2012-04-29 17:59:35.65";"2012-04-29 18:59:35.65-04";"04/29/2012 06:59 PM EDT"
"2012-04-29 17:59:40.19";"2012-04-29 18:59:40.19-04";"04/29/2012 06:59 PM EDT"
"2012-03-10 00:00:00";"2012-03-10 00:00:00-05";"03/10/2012 12:00 AM EST"
"2012-03-11 00:00:00";"2012-03-11 00:00:00-05";"03/11/2012 12:00 AM EST"
"2012-03-12 00:00:00";"2012-03-12 01:00:00-04";"03/12/2012 01:00 AM EDT"
I'm storing the time zone that each of our gauges is located in a character varying field a separate table. I considered just appending this value to the end of the time output, but I want it to change from from CST to CDT without my intervention.
Thanks for your help.
Instead of using time zone abbreviations like CDT or CST, you could consider using full Olson-style time zone names. In the case of central time, you could choose a time zone. Either one that matches your location, such as America/Chicago, or just US/Central. This ensures PostgreSQL uses the Olson tz database to automatically figure out whether daylight saving time applies at any given date.
You definitely want a TIMESTAMP WITH TIME ZONE column (which is also known as timestamptz in PostgreSQL). That will store the timestamp in UTC, so that it represents a particular moment in time. Contrary to what the name suggests, it does not save a time zone in the column -- you can view the retrieved timestamp in the time zone of your choosing with the AT TIME ZONE phrase.
The semantics of TIMESTAMP WITHOUT TIME ZONE are confusing and nearly useless. I strongly recommend you don't use that type at all for what you are describing.
I'm really confused by the part of the question which talks about storing the timestamp in a CHARACTER VARYING column. That seems as though it might be part of the problem. If you can store it in timestamptz right from the start I suspect that you will have fewer problems. Barring that, it would be safest to use the -04 notation for offset from UTC; but that seems like more work to me for no benefit.
You can create a table of known timezones in the format suggested in Guan Yang's answer, and then use a foreign key column to this table. Valid timezones can be obtained from pg_timezone_names I've gone into more detail in this related answer.