Time and date standards? - date

What standard called for the use of HH:mm as the 24-hour clock whereas hh:mm is the 12-hour clock?
Likewise, I also often see dates noted as MM/dd/yyyy where MM is in caps to distinguish it from mm which indicates minutes.
Does anyone know what standard this nomenclature is based upon?

You are probably referring to the CLDR project, which defines the LDML standard. LDML stands for "Locale Data Markup Language" and is published as "Unicode Technical Standard #35"; see also the title of the linked document:
http://www.unicode.org/reports/tr35/tr35-dates.html#Date_Field_Symbol_Table
However, you have not said exactly which language or library you use. Be aware of subtle differences. For example, in Java the old class SimpleDateFormat exceptionally uses the pattern symbol "u" for "Day number of week", while CLDR (and the newer class DateTimeFormatter) interprets "u" as "extended year (without era)".
By the way, I would never use "hh:mm" without "a" (the am/pm marker in English-speaking countries) or "B" for day periods (if your library supports it), because otherwise the 12-hour clock is ambiguous.
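Since the question does not name a library, here is a minimal Java sketch of the points above, assuming java.time and the US locale. The class and method names (ClockPatternDemo, hour24, and so on) are made up for illustration.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; only the pattern letters come from the answer above.
class ClockPatternDemo {

    // HH = hour of day, 00 to 23 (24-hour clock).
    static String hour24(LocalTime time) {
        return DateTimeFormatter.ofPattern("HH:mm", Locale.US).format(time);
    }

    // hh = clock hour of am/pm, 01 to 12, with "a" as the AM/PM marker.
    static String hour12(LocalTime time) {
        return DateTimeFormatter.ofPattern("hh:mm a", Locale.US).format(time);
    }

    // hh without "a": morning and afternoon times become indistinguishable.
    static String hour12NoMarker(LocalTime time) {
        return DateTimeFormatter.ofPattern("hh:mm", Locale.US).format(time);
    }

    // Legacy SimpleDateFormat: "u" is the day number of the week (1 = Monday).
    static String legacyU(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("u", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }

    // DateTimeFormatter: "u" is the year.
    static String modernU(LocalDate date) {
        return DateTimeFormatter.ofPattern("u", Locale.US).format(date);
    }

    public static void main(String[] args) {
        LocalTime afternoon = LocalTime.of(15, 30);
        LocalTime night = LocalTime.of(3, 30);
        System.out.println(hour24(afternoon));          // 15:30
        System.out.println(hour12(afternoon));          // 03:30 PM
        System.out.println(hour12NoMarker(afternoon));  // 03:30
        System.out.println(hour12NoMarker(night));      // 03:30
        // 1970-01-01 (epoch) was a Thursday.
        System.out.println(legacyU(new Date(0L)));              // 4
        System.out.println(modernU(LocalDate.of(1970, 1, 1)));  // 1970
    }
}
```

The two hour12NoMarker calls return the same text for 03:30 and 15:30, which is the ambiguity mentioned above.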

Related

LibreOffice : how shall I change the names of the weekdays in date format

In LibreOffice, I wish to change the weekdays names to have something shorter.
In French, the date format "ddd dd/mm/yy" applied to today 2021-08-17 gives : "mar. 17/08/21". I would prefer "ma 17/08/21" to have narrower columns.
So I wish to change the existing weekday names "lun., mar., mer., jeu., ven., sam., dim." to something shorter: "lu, ma, me, je, ve, sa, di", through the format used by LibreOffice.
I played with French locales (Swiss French, etc.) but it is not satisfactory. I tried to change the list for the sort but it has no effect on the date format.
Is it possible to change the list LibreOffice uses for the weekdays ? How shall I proceed ?
You cannot do this with normal formatting (unless you require a change to the national date standard).
However, you can easily get the desired date representation using the TEXT() (TEXTE()) and REGEX() functions.
=REGEX(TEXT(A1;"OOO JJ/MM/AA");"(..)([^\.]*\.)(.+)";"$1$3";"g")
Or shorter notation using LEFT() (GAUCHE()):
=LEFT(TEXT(A1;"OOO");2)&TEXT(A1;"\ JJ/MM/AA")
Remember to check that the cell format is French, otherwise you will get an error (not every language uses O-J-A characters)

`uuuu` versus `yyyy` in `DateTimeFormatter` formatting pattern codes in Java?

The DateTimeFormatter class documentation says about its formatting codes for the year:
u       year         year          2004; 04
y       year-of-era  year          2004; 04
…
Year: The count of letters determines the minimum field width below which padding is used. If the count of letters is two, then a reduced two digit form is used. For printing, this outputs the rightmost two digits. For parsing, this will parse using the base value of 2000, resulting in a year within the range 2000 to 2099 inclusive. If the count of letters is less than four (but not two), then the sign is only output for negative years as per SignStyle.NORMAL. Otherwise, the sign is output if the pad width is exceeded, as per SignStyle.EXCEEDS_PAD.
No other mention of “era”.
So what is the difference between these two codes, u versus y, year versus year-of-era?
When should I use something like this pattern uuuu-MM-dd and when yyyy-MM-dd when working with dates in Java?
Seems that example code written by those in the know use uuuu, but why?
Other formatting classes such as the legacy SimpleDateFormat have only yyyy, so I am confused why java.time brings this uuuu for “year of era”.
Within the scope of java.time-package, we can say:
It is safer to use "u" instead of "y" because DateTimeFormatter will otherwise insist on having an era in combination with "y" (= year-of-era). So using "u" avoids some possibly unexpected exceptions in strict formatting/parsing. See also this SO-post. Another minor thing which "u" improves compared with "y" is the printing/parsing of negative Gregorian years (in the far past).
Otherwise we can clearly state that using "u" instead of "y" breaks long-standing habits in Java-programming. It is also not intuitively clear that "u" denotes any kind of year because a) the first letter of the English word "year" is not in agreement with this symbol and b) SimpleDateFormat has used "u" for a different purpose since Java-7 (ISO-day-number-of-week). Confusion is guaranteed - for ever?
We should also see that using eras (symbol "G") in context of ISO is in general dangerous if we consider historic dates. If "G" is used with "u" then both fields are unrelated to each other. And if "G" is used with "y" then the formatter is satisfied but still uses proleptic gregorian calendar when the historic date mandates different calendars and date-handling.
Background information:
When developing and integrating JSR 310 (the java.time packages), the designers decided to use the Common Locale Data Repository (CLDR)/LDML spec as the basis of the pattern symbols in DateTimeFormatter. The symbol "u" was already defined in CLDR as the proleptic Gregorian year, so this meaning was adopted for the then-upcoming JSR-310 (but not for SimpleDateFormat, for backwards compatibility reasons).
However, this decision to follow CLDR was not quite consistent because JSR-310 also introduced new pattern symbols which didn't and still don't exist in CLDR, see also this old CLDR-ticket. The suggested symbol "I" was changed by CLDR to "VV" and finally adopted by JSR-310, along with the new symbols "x" and "X". But "n" and "N" still don't exist in CLDR, and since this old ticket is closed, it is not clear at all whether CLDR will ever support them in the sense of JSR-310. Furthermore, the ticket does not mention the symbol "p" (padding instruction in JSR-310, but not defined in CLDR). So we still have no perfect agreement between pattern definitions across different libraries and languages.
And about "y": We should also not overlook the fact that CLDR associates this year-of-era with at least some kind of mixed Julian/Gregorian year and not with the proleptic Gregorian year as JSR-310 does (leaving the oddity of negative years aside). So there is no perfect agreement between CLDR and JSR-310 here either.
In the javadoc section Patterns for Formatting and Parsing for DateTimeFormatter it lists the following 3 relevant symbols:
Symbol  Meaning      Presentation  Examples
------  -------      ------------  --------
G       era          text          AD; Anno Domini; A
u       year         year          2004; 04
y       year-of-era  year          2004; 04
Just for comparison, these other symbols are easy enough to understand:
D       day-of-year   number       189
d       day-of-month  number       10
E       day-of-week   text         Tue; Tuesday; T
The day-of-year, day-of-month, and day-of-week are obviously the day within the given scope (year, month, week).
So, year-of-era means the year within the given scope (era), and right above it era is shown with an example value of AD (the other value of course being BC).
year is the signed year, where year 0 is 1 BC, year -1 is 2 BC, and so forth.
To illustrate: When was Julius Caesar assassinated?
March 15, 44 BC (using pattern MMMM d, y GG)
March 15, -43 (using pattern MMMM d, u)
The distinction will of course only matter if year is zero or negative, and since that is rare, most people don't care, even though they should.
Conclusion: If you use y you should also use G. Since G is rarely used, the correct year symbol is u, not y, otherwise a non-positive year will show incorrectly.
This is known as defensive programming:
Defensive programming is a form of defensive design intended to ensure the continuing function of a piece of software under unforeseen circumstances.
Note that DateTimeFormatter is consistent with SimpleDateFormat:
Letter  Date or Time Component  Presentation  Examples
------  ----------------------  ------------  --------
G       Era designator          Text          AD
y       Year                    Year          1996; 96
Negative years have always been a problem, and they have now fixed it by adding u.
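The Julius Caesar example can be reproduced with a short sketch, assuming the US locale and the proleptic ISO calendar that LocalDate uses. The class name YearSymbolDemo and the helper format are invented for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical demo class comparing "y" (year-of-era) with "u" (signed year).
class YearSymbolDemo {

    // Formats a date with the given pattern in the US locale.
    static String format(LocalDate date, String pattern) {
        return DateTimeFormatter.ofPattern(pattern, Locale.US).format(date);
    }

    public static void main(String[] args) {
        // Proleptic year -43 is 44 BC, because year 0 is 1 BC.
        LocalDate idesOfMarch = LocalDate.of(-43, 3, 15);

        System.out.println(format(idesOfMarch, "MMMM d, y GG")); // March 15, 44 BC
        System.out.println(format(idesOfMarch, "MMMM d, u"));    // March 15, -43

        // Without G, "y" silently drops the era; "u" keeps the sign.
        System.out.println(format(idesOfMarch, "yyyy-MM-dd"));   // 0044-03-15
        System.out.println(format(idesOfMarch, "uuuu-MM-dd"));   // -0043-03-15
    }
}
```

The third line is the case to watch for: yyyy without G prints 0044, which a reader cannot tell apart from 44 AD.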
Long story short
For 99 % of purposes you can toss a coin, it will make no difference whether you use yyyy or uuuu (or whether you use yy or uu for 2-digit year).
It depends on what you want to happen in case a year earlier than 1 CE (1 AD) occurs. The point being that in 99 % of programs such a year will never occur.
Two other answers have already presented the facts of how u and y work very nicely, but I still felt something was missing, so I am contributing the slightly more opinion-based answer.
For formatting
Assuming that you don’t expect a year before 1 CE to be formatted, the best thing you can do is to check this assumption and react appropriately in case it breaks. For example, depending on circumstances and requirements, you may print an error message or throw an exception. One very soft failure path might be to use a pattern with y (year of era) and G (era) in this case and a pattern with either u or y in the normal, current era case. Note that if you are printing the current date or the date your program was compiled, you can be sure that it is in the common era and may opt to skip the check.
For parsing
In many (most?) cases parsing also means validating, meaning you have no guarantees about what your input string looks like. Typically it comes from the user or from another system. An example: a date string comes as 2018-09-29. Here the choice between uuuu and yyyy should depend on what you want to happen in case the string contains a year of 0 or a negative year (e.g., 0000-08-17 or -012-11-13). Assuming that this would be an error, the immediate answer is: use yyyy in order for an exception to be thrown in this case. Finer still: use uuuu and perform a range check of the parsed date after parsing. The latter approach allows both for finer validation and for a better error message in case of a validation error.
Special case (already mentioned by Meno Hochschild): If your formatter uses strict resolver style and contains y without G, parsing will always fail because, strictly speaking, year of era is ambiguous without era: 1950 might mean 1950 CE or 1950 BCE (1950 BC). So in this case you need u (or you need to supply a default era, which is possible through a DateTimeFormatterBuilder).
Long story short again
Explicit range check of your dates, specifically your years, is better than relying on the choice between uuuu and yyyy for catching unexpected very early years.
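A minimal sketch of the "parse with uuuu, then range-check" approach might look like this. The class CheckedDateParser, its methods and the lower bound of 1900-01-01 are assumptions chosen for illustration; pick a bound that fits your own requirements.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Hypothetical helper: strict parsing with uuuu plus an explicit range check.
class CheckedDateParser {

    private static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("uuuu-MM-dd")
                    .withResolverStyle(ResolverStyle.STRICT);

    // Assumed business rule: nothing before 1900 is plausible input.
    private static final LocalDate EARLIEST = LocalDate.of(1900, 1, 1);

    // Throws DateTimeParseException for malformed or impossible dates and
    // IllegalArgumentException for well-formed dates that are too early.
    static LocalDate parseChecked(String text) {
        LocalDate date = LocalDate.parse(text, FORMATTER);
        if (date.isBefore(EARLIEST)) {
            throw new IllegalArgumentException(
                    "Date " + date + " is before the earliest allowed date " + EARLIEST);
        }
        return date;
    }

    // Convenience wrapper that reports acceptance as a boolean.
    static boolean isAcceptable(String text) {
        try {
            parseChecked(text);
            return true;
        } catch (DateTimeParseException | IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseChecked("2018-09-29"));  // 2018-09-29
        System.out.println(isAcceptable("0000-08-17"));  // false: parses, but fails the range check
        System.out.println(isAcceptable("2018-02-30"));  // false: rejected by strict resolving
    }
}
```

Keeping the two failure modes as separate exception types lets the caller tell "this is not a date" apart from "this date is out of range".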
Short comparison, if you need strict parsing:
Examples with invalid Date 31.02.2022
System.out.println(DateTimeFormatter.ofPattern("dd.MM.yyyy").withResolverStyle(ResolverStyle.STRICT).parse("31.02.2022"));
prints "{MonthOfYear=2, DayOfMonth=31, YearOfEra=2022},ISO"
System.out.println(DateTimeFormatter.ofPattern("dd.MM.uuuu").withResolverStyle(ResolverStyle.STRICT).parse("31.02.2022"));
throws java.time.DateTimeException: Invalid date 'FEBRUARY 31'
So you must use 'dd.MM.uuuu' to get the expected behaviour.
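For comparison, here is a sketch that parses all the way to a LocalDate and also tries the default-era variant via DateTimeFormatterBuilder mentioned in another answer. The class name StrictYearDemo and the helper parsesToDate are made up for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.time.temporal.ChronoField;

// Hypothetical demo of strict resolving with y, u, and y plus a default era.
class StrictYearDemo {

    static final DateTimeFormatter STRICT_Y =
            DateTimeFormatter.ofPattern("dd.MM.yyyy")
                    .withResolverStyle(ResolverStyle.STRICT);

    static final DateTimeFormatter STRICT_U =
            DateTimeFormatter.ofPattern("dd.MM.uuuu")
                    .withResolverStyle(ResolverStyle.STRICT);

    // Era 1 is the current era (CE/AD) in the ISO chronology.
    static final DateTimeFormatter STRICT_Y_WITH_ERA =
            new DateTimeFormatterBuilder()
                    .appendPattern("dd.MM.yyyy")
                    .parseDefaulting(ChronoField.ERA, 1)
                    .toFormatter()
                    .withResolverStyle(ResolverStyle.STRICT);

    // True if the text can be turned into a LocalDate by the formatter.
    static boolean parsesToDate(DateTimeFormatter formatter, String text) {
        try {
            LocalDate.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // yyyy without an era cannot be resolved to a date in strict mode,
        // not even for a valid day.
        System.out.println(parsesToDate(STRICT_Y, "28.02.2022"));          // false
        System.out.println(parsesToDate(STRICT_U, "28.02.2022"));          // true
        System.out.println(parsesToDate(STRICT_U, "31.02.2022"));          // false
        System.out.println(parsesToDate(STRICT_Y_WITH_ERA, "28.02.2022")); // true
        System.out.println(parsesToDate(STRICT_Y_WITH_ERA, "31.02.2022")); // false
    }
}
```

Note that the formatter-level parse(...) call shown above returns unresolved fields for yyyy instead of throwing, while LocalDate.parse fails because no date could be resolved.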

What is the canonical representation of zero in ISO 8601 durations?

http://en.wikipedia.org/wiki/ISO_8601#Durations
It's not clear what the most correct representation of zero in ISO 8601 durations is.
Possible candidates:
PT0S
This site:
http://www.ostyn.com/standards/scorm/samples/ISOTimeForSCORM.htm
says
PT0H0M0S
Or probably the simplest is
P
But what is most correct? Is there a canonical zero duration representation?
The single letter "P" is certainly wrong because at least one duration element must be present.
The SCORM-specification requires "PT0H0M0S" only because of backwards compatibility with earlier SCORM-Versions, not because ISO mandates it. Citation from the link you have given:
the SCORM 2004 1.3.1 conformance test suite was coded to require the PT0H0M0S format for the initial zero value of the total attempt
time; using that format is therefore recommended where compatibility
with early implementations of SCORM 2004 is required.
So if you don't use SCORM then the expression "PT0S" is completely sufficient. However, I don't remember any place in the original ISO-8601 paper where they have specified what a zero duration has to look like. On the contrary, ISO-8601 also describes alternative duration formats like "P0000-00-00T00:00".
There is more than one canonical representation if we interpret the word "canonical" as "conforming to ISO-8601".
Update (after looking in the original ISO-paper):
ISO-8601 mandates at least one element for a zero duration (4.4.3.2.c - page 21):
If the number of years, months, days, hours, minutes or seconds in any
of these expressions equals zero, the number and the corresponding
designator may be absent; however, at least one number and its
designator shall be present.
Paragraph 4.4.3.3 says:
The complete representation of the expression for duration in the
alternative format is as follows:
Basic format: PYYYYMMDDThhmmss or PYYYYDDDThhmmss
Extended format: PYYYY-MM-DDThh:mm:ss or PYYYY-DDDThh:mm:ss
Also keep in mind that not all software is capable of supporting all format variations.
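As one data point for how a library treats these variants, here is a sketch using Java's java.time. This shows only what that one library does and is not a statement about what ISO 8601 requires. The class name ZeroDurationDemo and the helper isValidDuration are invented for illustration.

```java
import java.time.Duration;
import java.time.Period;
import java.time.format.DateTimeParseException;

// Hypothetical demo of zero durations in java.time.
class ZeroDurationDemo {

    // True if java.time.Duration accepts the text.
    static boolean isValidDuration(String text) {
        try {
            Duration.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The zero value is printed in the shortest time-based form.
        System.out.println(Duration.ZERO);                       // PT0S
        // The SCORM-style form is accepted and is equal to zero.
        System.out.println(Duration.parse("PT0H0M0S").isZero()); // true
        // A date-based zero amount prints with a day element instead.
        System.out.println(Period.ZERO);                         // P0D
        // A bare "P" has no element at all and is rejected.
        System.out.println(isValidDuration("P"));                // false
        // The alternative format is not supported by Duration.parse.
        System.out.println(isValidDuration("P0000-00-00T00:00:00")); // false
    }
}
```

The last line illustrates the caveat about software support: a representation the standard describes may still be rejected by a given parser.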

In an ISO 8601 date, is the T character mandatory?

I'm wondering if the following date is ISO8601 compliant :
2012-03-02 14:57:05.456+0500
(for sure, 2012-03-02T14:57:05.456+0500 is compliant, but not that much human readable !)
IOW, is the T between date and time mandatory ?
It's required unless the "partners in information interchange" agree to omit it.
Quoting an earlier version of the ISO 8601 standard, section 4.3.2:
The character [T] shall be used as time designator to indicate the
start of the representation of the time of day component in these
expressions. [...]
NOTE By mutual agreement of the partners in information interchange,
the character [T] may be omitted in applications where there is no
risk of confusing a date and time of day representation with others
defined in this International Standard.
Omitting it is fairly common, but leaving it in is advisable if the representation is meant to be machine-readable and you don't have a clear agreement that you can omit it.
But according to Wikipedia:
In ISO 8601:2004 it was permitted to omit the "T" character by mutual agreement as in "200704051430", but this provision was removed in ISO 8601-1:2019. Separating date and time parts with other characters such as space is not allowed in ISO 8601, but allowed in its profile RFC 3339.
UPDATE : Mark Amery's comment makes a good point, that permission to omit the [T] does not necessarily imply permission to replace it with a space. So this:
2012-03-02T14:57:05.456+0500
is clearly compliant, and this:
2012-03-0214:57:05.456+0500
was permitted by earlier versions of the standard if the partners agreed to omit the T, but this:
2012-03-02 14:57:05.456+0500
apparently is not (though it's much more readable than the version with the T simply omitted).
Personally, if ISO 8601 compliance were required, I'd include the T, and if it weren't then I'd use a space (or a hyphen if it's going to be part of a file name).
See also RFC 3339 section 5.6, mentioned in Charles Burns's answer.
That date is not ISO-8601 compliant as Keith Thompson indicated, but it is compliant with RFC 3339, a profile of ISO 8601.
Sort of. See NOTE at the bottom of the following text from RFC 3339:
date-time = full-date "T" full-time
NOTE: Per [ABNF] and ISO8601, the "T" and "Z" characters in this
syntax may alternatively be lower case "t" or "z" respectively.
This date/time format may be used in some environments or contexts
that distinguish between the upper- and lower-case letters 'A'-'Z'
and 'a'-'z' (e.g. XML). Specifications that use this format in
such environments MAY further limit the date/time syntax so that
the letters 'T' and 'Z' used in the date/time syntax must always
be upper case. Applications that generate this format SHOULD use
upper case letters.
NOTE: ISO 8601 defines date and time separated by "T".
Applications using this syntax may choose, for the sake of
readability, to specify a full-date and full-time separated by
(say) a space character.
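If you receive both variants in practice, a parser can accept each with its own explicit pattern. The following Java sketch assumes java.time and the exact layout from the question (millisecond fraction, offset without colon). The class name SeparatorDemo and its members are made up for illustration.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical demo: one formatter per separator convention.
class SeparatorDemo {

    // "xx" is an offset in the form +HHMM (no colon), as in the question.
    static final DateTimeFormatter WITH_T =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSxx");

    static final DateTimeFormatter WITH_SPACE =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSSxx");

    static OffsetDateTime parse(String text, DateTimeFormatter formatter) {
        return OffsetDateTime.parse(text, formatter);
    }

    // True if the formatter accepts the text.
    static boolean accepts(String text, DateTimeFormatter formatter) {
        try {
            parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        OffsetDateTime a = parse("2012-03-02T14:57:05.456+0500", WITH_T);
        OffsetDateTime b = parse("2012-03-02 14:57:05.456+0500", WITH_SPACE);
        System.out.println(a);           // 2012-03-02T14:57:05.456+05:00
        System.out.println(a.equals(b)); // true
        // The "T" formatter does not silently accept a space.
        System.out.println(accepts("2012-03-02 14:57:05.456+0500", WITH_T)); // false
    }
}
```

Both inputs produce the same value; only the agreed text representation differs, which is why the separator is a matter of agreement between the exchanging parties.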

Bug in Zend_Date (back in time)

I have a very strange problem, Zend_Date is converting my timestamp to a year earlier.
In my action:
// Timestamp
$intTime = 1293922800;
// Zend_Date object
$objZendDate = new Zend_Date($intTime);
// Get date
echo date('Y-m-d',$intTime).'<br>';
echo $objZendDate->get('YYYY-MM-dd');
This outputs:
2011-01-02
2010-01-02
Can anyone tell me what I'm doing wrong?
From the ZF issue tracker it seems this is a known issue:
Recently a lot of ZF users are filing a bug that Zend_Date returns the wrong year, 2009 instead of 2008. This is however expected behaviour, and NOT A BUG!
From the FAQ:
When using own formats in your code you could come to a situation where you get for example 29.12.2009, but you expected to get 29.12.2008.
There is one year difference: 2009 instead of 2008. You should use the lower cased year constant. See this example:
$date->toString('dd.MM.yyyy');
instead of
$date->toString('dd.MM.YYYY');
From the manual
Note that the default ISO format differs from PHP's format which can be irritating if you have not used in previous. Especially the format specifiers for Year and Minute are often not used in the intended way.
For year there are two specifiers available which are often mistaken. The Y specifier for the ISO year and the y specifier for the real year. The difference is small but significant. Y calculates the ISO year, which is often used for calendar formats. See for example the 31. December 2007. The real year is 2007, but it is the first day of the first week in the week 1 of the year 2008. So, if you are using 'dd.MM.yyyy' you will get '31.December.2007' but if you use 'dd.MM.YYYY' you will get '31.December.2008'. As you see this is no bug but a expected behaviour depending on the used specifiers.
For minute the difference is not so big. ISO uses the specifier m for the minute, unlike PHP which uses i. So if you are getting no minute in your format check if you have used the right specifier.
To add to zwip's answer, what happens behind the scenes is that your date format YYYY-MM-dd is actually translated into o\-m\-d, which is then passed to PHP's date() function internally with the timestamp you provided.
As mentioned in the other answer, and in the documentation for the o format on the date format page, calculating the year from the ISO week can sometimes give a year that differs by one from the value you expect.
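The calendar year versus ISO week-based year distinction is not specific to Zend_Date or PHP. As an illustration in Java (not the asker's language), the sketch below reproduces both dates from this thread. The class name IsoWeekYearDemo is invented, and the UTC+1 offset is an assumption made so that the timestamp lands on 2011-01-02 as in the question's output.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.temporal.IsoFields;

// Hypothetical demo: calendar year versus ISO week-based year.
class IsoWeekYearDemo {

    // The "real" year, comparable to yyyy in Zend_Date.
    static int calendarYear(LocalDate date) {
        return date.getYear();
    }

    // The year the ISO week belongs to, comparable to YYYY in Zend_Date.
    static int isoWeekYear(LocalDate date) {
        return date.get(IsoFields.WEEK_BASED_YEAR);
    }

    // Timestamp from the question; the UTC+1 offset is an assumption.
    static LocalDate dateOfTimestamp() {
        return Instant.ofEpochSecond(1293922800L)
                .atOffset(ZoneOffset.ofHours(1))
                .toLocalDate();
    }

    public static void main(String[] args) {
        LocalDate questionDate = dateOfTimestamp();
        System.out.println(questionDate);               // 2011-01-02
        System.out.println(calendarYear(questionDate)); // 2011
        // 2011-01-02 is a Sunday and still belongs to week 52 of 2010.
        System.out.println(isoWeekYear(questionDate));  // 2010

        // The manual's example: a Monday that already starts week 1 of 2008.
        LocalDate manualDate = LocalDate.of(2007, 12, 31);
        System.out.println(calendarYear(manualDate));   // 2007
        System.out.println(isoWeekYear(manualDate));    // 2008
    }
}
```

The two years differ only in the few days around New Year, which is why this kind of bug tends to surface once a year.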