I am creating a form (using Typeform, but it doesn't really matter) in which I need to understand the time zone of my customers. So far I have left this as an open question, but I would like to provide an extensive list, to ensure consistency in my database.
I could provide a list of time zone abbreviations (such as this one), but then I'd have the problem of daylight saving time. That is, a customer in the UK would see both "Greenwich Mean Time" and "British Summer Time" on the list, and would answer differently depending on the time of year.
How can I produce a meaningful, non-redundant plaintext list of time zones?
Related
I have a node.js application that stores many dates in a database. They are stored in the ISO format, such as '2016-11-02T16:30:12-04:00'.
Some date fields are just dates; others are date/times. An example of a date/time would be "last modified" for a record, whereas a person's birthday is just a date.
The question is about best practices for storage and query patterns on these things. Because a date always has a time, you must choose how to store for example a birthday. Following the 5 laws of API dates and times this is of course done in UTC.
There are edge cases though where proper API behavior seems unclear. Suppose someone submits a birthdate to the API of '2016-11-02T16:30:12-04:00'. This is bad news, because a search like /users?birthdate=2016-11-02 will fail, as that date will get converted to '2016-11-02T00:00:00Z' and fail to match in the DB. What then should correct behavior be?
1. When someone POSTs a user, convert date fields into dates at midnight UTC, and then have the convention that querying birthdates should assume the same?
2. Convert date queries for certain fields into implicit ranges, i.e. searching for 2016-11-02 is really looking for 2016-11-02T00:00:00Z <= x <= 2016-11-02T23:59:59Z?
3. Match only on the exact moment, and rely on the client to know that a birthday of '2016-11-02T16:30:12-04:00' really means 4:30 PM EDT, and does not mean just November 2nd?
What's the established pattern / best practice here for distinguishing between dates and datetimes?
I have been studying REST best practices and standards for a while, and I can't recall reading anything about this apart from the recommendation to use the ISO standard. From your description, it seems to depend on the application and its use cases.
I would go for your option #2: if a GET request comes with a date but no time, treat it as a query for the whole day, and do the conversion in the server code that handles the GET. You may want to support both a "date" and a separate "time" query string parameter if the precise time occasionally matters. This also helps keep clients unaware of the database storage format you choose, and may even allow you to support localized date formats.
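As a rough illustration of option #2, here is a minimal Python sketch that expands a date-only query into a half-open UTC range. The function names `day_range_utc` and `matches_day` are made up for this example, and it assumes stored values are ISO strings that carry an offset, as in the question. A half-open range (start inclusive, end exclusive) is used instead of 23:59:59 so that fractional seconds are not missed.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Tuple


def day_range_utc(day_text: str) -> Tuple[datetime, datetime]:
    """Return (start, end) covering the UTC day named by a YYYY-MM-DD string.

    The range is half-open: start <= x < end.
    """
    day = date.fromisoformat(day_text)
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    return start, start + timedelta(days=1)


def matches_day(stored_text: str, day_text: str) -> bool:
    """True if a stored, offset-aware ISO timestamp falls inside the queried UTC day."""
    start, end = day_range_utc(day_text)
    stored = datetime.fromisoformat(stored_text)  # must include an offset
    return start <= stored < end


# 16:30 at -04:00 is 20:30 UTC on the same day, so it matches.
print(matches_day("2016-11-02T16:30:12-04:00", "2016-11-02"))
# 22:30 at -04:00 is already 02:30 UTC on November 3rd.
print(matches_day("2016-11-02T22:30:00-04:00", "2016-11-02"))
```

The second call shows why this option is awkward for birthdays in particular: the same local calendar day can land on either side of the UTC boundary.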
The problem here is the use of UTC, which implies that there's a time associated with the value. There isn't: a birthdate is considered (in iCalendar) a 'floating date' and does not have a specific time associated with it.
If your birthdate is November 3rd, and you move to Australia, your birthdate does not actually change to November 2nd, because your birthdate does not have a time, does not have a timezone, and is the same wherever you are in the world.
The solution is simple. If you allow users to submit a date/time for birthday searches, then you should just 'cut off' the time and timezone. Assume that you're only going to be using the date portion and just search your database based on that.
Ideally, though, you don't allow users to submit a time at all. I think this just creates confusion. Just force API clients to submit a date only.
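The 'cut off' approach might look something like the following Python sketch. The helper name `floating_date` is hypothetical; the point is only that the calendar date is taken exactly as the client wrote it, with no conversion to UTC first.

```python
from datetime import date, datetime, timezone


def floating_date(text: str) -> date:
    """Keep only the calendar date as written; ignore any time and offset."""
    if "T" in text:
        # .date() returns the date in the value's own offset, not in UTC.
        return datetime.fromisoformat(text).date()
    return date.fromisoformat(text)


submitted = "2016-11-02T22:30:00-04:00"

# Cutting off the time keeps the day the client meant.
print(floating_date(submitted))      # 2016-11-02

# Converting to UTC first would silently move the birthday to the next day.
print(datetime.fromisoformat(submitted).astimezone(timezone.utc).date())  # 2016-11-03
```

A search for `2016-11-02` then compares plain dates, and the stored value never needs a time or a zone.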
Those '5 laws' are an extreme over-simplification and don't apply to many situations.
An API specifies that dates should be sent as ISO 8601, but we have a requirement to send "forever" as a date, and the standard does not seem to cover this. Can anyone suggest a better solution than Dec 31 9999? Is there a different standard that would be more appropriate?
Quoting ISO 8601:2004(E):
3.5 Expansion
By mutual agreement of the partners in information interchange, it is permitted to expand the component identifying the calendar year, which is otherwise limited to four digits. This enables reference to dates and times in calendar years outside the range supported by complete representations, i.e. before the start of the year [0000] or after the end of the year [9999].
And also relevant may be section 3.7 Mutual agreement which basically says you're free to define your own representations as long as you don't interfere with the representations defined in ISO 8601. So 9999-12-32 or 9999-13-00 could be mutually agreed upon for your proposed forever value.
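To make the mutual-agreement idea concrete, here is a small Python sketch. The sentinel string `9999-12-32` and the helper names `encode_end` and `decode_end` are assumptions for illustration; the actual value would be whatever the parties agree on. Because the sentinel is deliberately not a valid calendar date, it has to be checked before normal parsing.

```python
from datetime import date
from typing import Optional

# Hypothetical agreed-upon sentinel; intentionally not a real calendar date.
FOREVER = "9999-12-32"


def encode_end(end: Optional[date]) -> str:
    """Serialize an end date, using the sentinel when there is no end."""
    return FOREVER if end is None else end.isoformat()


def decode_end(text: str) -> Optional[date]:
    """Parse an end date; the sentinel comes back as None ("forever")."""
    if text == FOREVER:
        return None
    return date.fromisoformat(text)


print(encode_end(None))            # 9999-12-32
print(decode_end("2016-11-02"))    # 2016-11-02
print(decode_end(FOREVER))         # None
```

Any consumer that is not aware of the agreement would reject the sentinel as an invalid date, which is the legacy-code risk worth checking before adopting it.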
As to what's common practice, I'd say it depends.
I'd go for 3.7 whenever possible. But it's important to assess your role within the whole set-up. E.g. if you're using a 3rd party API within your own set of components for the sake of convenience or future compatibility, there should be no problem at all. If you're part of a bigger system and you'd have to convince tens of other system parties/components/modules/etc. I'd say it's not worth the trouble.
Also very important to check legacy code. And at least sketch out a plan on how to do the migration in case it breaks set-ups beyond belief. That could be anything from documenting your API "extension" to actually sending patches to the legacy code maintainers.
Are births and deaths modeled as events for a person in a genealogy profile or as attributes of the person? What are the pros and cons of each approach?
If you consider that every event has artifacts to go with it, they really should be events, so you can have all of the documents, etc. associated with them.
On the other hand, can you imagine a person record that doesn't have birth/death dates as attributes? You wouldn't want to have to do a join with the events that give you birth/death just so you could sort by those dates.
So there are the pros and cons, but there's also the idea that you can have both. If you are willing to live with a database that isn't completely normalized, you can have them as events and, for each person with birth/death events, copy those values into the attributes.
Keep in mind, of course, that you could have multiple birth/death events for a person, records that might be in conflict, in which case only the one that the user has indicated is meant to be the person's birth/death attribute would be copied.
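A minimal in-memory sketch of the "have both" idea, in Python. The class and field names (`Event`, `Person`, `preferred`, `refresh_attributes`) are invented for this example; in a real database the copy step would more likely be a trigger or an application-level update.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Event:
    kind: str                                          # e.g. "birth" or "death"
    on: date
    sources: List[str] = field(default_factory=list)   # documents backing the event
    preferred: bool = False                            # user marked this record as the one to trust


@dataclass
class Person:
    name: str
    events: List[Event] = field(default_factory=list)
    birth_date: Optional[date] = None                  # denormalized copy, for sorting
    death_date: Optional[date] = None                  # denormalized copy, for sorting

    def refresh_attributes(self) -> None:
        """Copy the user-preferred birth/death events into the attributes."""
        self.birth_date = self._preferred("birth")
        self.death_date = self._preferred("death")

    def _preferred(self, kind: str) -> Optional[date]:
        for event in self.events:
            if event.kind == kind and event.preferred:
                return event.on
        return None


person = Person("Ada Example", events=[
    Event("birth", date(1900, 5, 1), sources=["parish register"]),
    Event("birth", date(1900, 5, 3), sources=["census"], preferred=True),
])
person.refresh_attributes()
print(person.birth_date)   # 1900-05-03
```

Sorting people by `birth_date` then needs no join, while every conflicting record and its sources stay available as events.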
"Events" in genealogy (and in genealogy software) are generally considered something that takes place at a given time and place. They can be events for an individual, e.g. Birth, death, baptism, naturalization, emigration, etc., or for a family (husband/wife), e.g. Marriage, Engagement, Divorce.
"Attributes" (or "facts") are generally considered to be something that is true, e.g. Scholastic Achievement, Tribal Origin, Occupation, Religious Affiliation, Title.
These are how GEDCOM defines them and how they try to get programmers to program them.
Personally, my concept of an "event" is a transition, a change of state, e.g. going from before someone was born to once they are alive. It need not be a short period of time, but may take a long time, e.g. World War II was an event. And events can contain other events (e.g. the specific battles in World War II).
One more example is hair color, which is considered an attribute. But someone can be born with blond hair, have it fall out and be replaced with brown hair, and then as they get older it turns grey before falling out again. Hair colors are attributes that are true over a certain time, and are "fuzzy" while the event that changes one to another is happening.
My concept of an "attribute" is that it has a time period attached to it. The attribute is the state, which can be changed by events, e.g. "Occupation" changes with the "getting fired" event and "Unemployed" takes over until the "getting hired" event occurs.
So attributes are between events, and events separate different attributes.
What I am basically saying is that in my genealogy program, I really don't make a distinction between events and attributes. I treat them the same. Either may include a date or time period and events usually include a place and attributes usually don't.
Because of their similarities, I don't see any need to model them separately.
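One way to picture treating the two the same is a single record type, sketched here in Python. The name `Fact` and its fields are hypothetical, and the sketch assumes that a missing `end` means a single date rather than an open-ended period.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    """One record type for both events and attributes."""
    kind: str                      # "birth", "occupation", "hair color", ...
    start: Optional[date] = None
    end: Optional[date] = None     # None means a single date (assumption)
    place: Optional[str] = None    # usually set for events, usually empty for attributes
    value: Optional[str] = None    # e.g. "blacksmith" for an occupation

    def holds_on(self, day: date) -> bool:
        """True if `day` falls on the record's date or within its period."""
        if self.start is None:
            return False
        last = self.end if self.end is not None else self.start
        return self.start <= day <= last


birth = Fact("birth", start=date(1900, 5, 1), place="Leeds")
job = Fact("occupation", start=date(1920, 1, 1), end=date(1935, 6, 30), value="blacksmith")

print(job.holds_on(date(1930, 1, 1)))     # True
print(birth.holds_on(date(1900, 5, 2)))   # False
```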
Is there a published data structure for storing periodic or recurring dates? Something that can handle:
1. The pump needs recycling every five days.
2. Payday is every second Friday.
3. Thanksgiving Day is the second Monday in October (US: the fourth Thursday in November).
4. Valentine's Day is every February 14th.
5. Solstice is (usually) every June 21st and December 21st.
6. Easter is the Sunday after the first full moon on or after the day of the vernal equinox (okay, this one's a bit of a stretch).
I reckon cron's internal data structure can handle #1, #4, #5 (two rules), and maybe #2, but I haven't had a look at it. MS Outlook and other calendars seem to be able to handle the first five, but I don't have that source code lying around.
Use an iCalendar implementation library (there are ones for Ruby, Java, PHP, Python, and .NET), and then add support for calculating special dates.
With all these variations in the way you specify the recurrence, I would shy away from one single data structure implementation to accommodate all 5 scenarios.
Instead, I would (and have for a previous project) build simple structures that address each type of recurrence. You could wrap them all up so that it feels like a single data structure, but under the hood they could do whatever they like. By implementing an interface, I was able to treat each type of recurrence similarly so it felt like a one-size-fits-all data structure. I could ask any instance for all the dates of recurrence within a certain time frame, and that did the trick.
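The shape of that approach could look roughly like this in Python. This is not the code from that project; the class names (`EveryNDays`, `NthWeekdayOfMonth`, `FixedAnnual`) and the `occurrences(start, end)` method are invented for the sketch, and both bounds are treated as inclusive.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Protocol


class Recurrence(Protocol):
    """Common interface: list every occurrence with start <= d <= end."""
    def occurrences(self, start: date, end: date) -> List[date]: ...


@dataclass
class EveryNDays:
    anchor: date      # a known occurrence
    interval: int     # days between occurrences

    def occurrences(self, start: date, end: date) -> List[date]:
        gap = max(0, (start - self.anchor).days)
        steps = -(-gap // self.interval)          # ceiling division
        d = self.anchor + timedelta(days=steps * self.interval)
        out = []
        while d <= end:
            out.append(d)
            d += timedelta(days=self.interval)
        return out


@dataclass
class NthWeekdayOfMonth:
    month: int
    weekday: int      # Monday=0 ... Sunday=6, as in date.weekday()
    n: int            # 1 = first, 2 = second, ...

    def occurrences(self, start: date, end: date) -> List[date]:
        out = []
        for year in range(start.year, end.year + 1):
            first = date(year, self.month, 1)
            shift = (self.weekday - first.weekday()) % 7
            d = first + timedelta(days=shift + 7 * (self.n - 1))
            if start <= d <= end:
                out.append(d)
        return out


@dataclass
class FixedAnnual:
    month: int
    day: int          # note: February 29 would raise in non-leap years

    def occurrences(self, start: date, end: date) -> List[date]:
        days = (date(y, self.month, self.day) for y in range(start.year, end.year + 1))
        return [d for d in days if start <= d <= end]


rules: List[Recurrence] = [
    EveryNDays(date(2016, 1, 1), 14),    # payday every second Friday
    NthWeekdayOfMonth(10, 0, 2),         # second Monday in October
    FixedAnnual(2, 14),                  # Valentine's Day
]
for rule in rules:
    print(rule.occurrences(date(2016, 10, 1), date(2016, 10, 31)))
```

Something like Easter would be one more class behind the same method, with its own calculation inside.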
I'd also want to know more about how these dates need to be used before settling on a specific implementation.
If you want to build a data structure by hand, I'd recommend a hash table in which each holiday or event is a key and its next occurrence is the value. If an event has multiple occurrences, the value can instead point to a linked list holding all of them (lookup and insertion would then both run in O(1) on average).
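A bare-bones version of that idea in Python might look like the following. It uses a `dict` as the hash table and a plain Python list in place of the linked list, which is a substitution made for brevity; the function names are made up for the example.

```python
from collections import defaultdict
from datetime import date
from typing import DefaultDict, List, Optional

# Hash table: event name -> list of known occurrences.
Store = DefaultDict[str, List[date]]


def add_occurrence(store: Store, name: str, when: date) -> None:
    """Average O(1): one hash lookup plus an append."""
    store[name].append(when)


def next_occurrence(store: Store, name: str, after: date) -> Optional[date]:
    """Earliest stored occurrence strictly after `after`, or None."""
    later = [d for d in store.get(name, []) if d > after]
    return min(later) if later else None


store: Store = defaultdict(list)
add_occurrence(store, "Valentine's Day", date(2016, 2, 14))
add_occurrence(store, "Valentine's Day", date(2017, 2, 14))
print(next_occurrence(store, "Valentine's Day", date(2016, 6, 1)))   # 2017-02-14
```

Note that this only stores dates that have already been computed; it does not describe the recurrence rule itself.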
Should I store it in a single timestamp/datetime field or separate date and time fields? I prefer to store it as a single timestamp field but I need to know when a user didn't enter a time portion.
How to best store this in the database? I'm using postgresql.
There are definitely reasons why this is a bad idea. There are also reasons why your choices are limited. It's a bad idea because it's a single data item and for the more practical reason that you can't store a timezone if you have two fields.
As mentioned above, nulls are the obvious benefit of using two fields.
You might also want to consider using a single datetime field and storing a flag to indicate whether or not the user entered a time. This could be a boolean flag. However, you will still need to think about how you are going to use this data - entering only a date into a datetime field will lead to a time component being set to midnight. This will have implications in sorting and selection. Additionally, if you are storing timezones, you will have to be very careful when you use the data.
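The single-field-plus-flag variant can be sketched in Python as follows. The names `Appointment`, `at`, `has_time`, and `from_input` are hypothetical; in PostgreSQL the equivalent would be a timestamp column next to a boolean column.

```python
from dataclasses import dataclass
from datetime import date, datetime, time
from typing import Optional


@dataclass
class Appointment:
    at: datetime       # time part is midnight when the user gave only a date
    has_time: bool     # False means "the user did not enter a time"


def from_input(day: date, clock: Optional[time] = None) -> Appointment:
    """Build a record from user input, remembering whether a time was given."""
    chosen = time.min if clock is None else clock
    return Appointment(at=datetime.combine(day, chosen), has_time=clock is not None)


date_only = from_input(date(2016, 11, 2))
at_midnight = from_input(date(2016, 11, 2), time(0, 0))

# Both store the same timestamp; only the flag tells them apart.
print(date_only.at == at_midnight.at)              # True
print(date_only.has_time, at_midnight.has_time)    # False True
```

This also shows the sorting caveat: a date-only entry sorts as if it happened at midnight unless queries take the flag into account.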
In order to fulfill your requirement of knowing whether or not a time was entered you will need to have two fields. You do not need the second field to be a time though.
The obvious answer is to use two separate fields; then you can use NULL values.
If you choose to use one field you will need to choose a magic time part that signifies "didn't enter a real time", which has the danger of coinciding with a real time (however unlikely).
Also, if you intend to use the date and time part separately often, then it might also be convenient to use separate fields; otherwise you will often need to use selection functions for extracting the relevant part of a field.