Required binary date specifications

I want to create a binary date trawler that searches binary files for given dates when both the date format and the structure of the file are unknown (unless there is something out there that already does this).
I'm looking for reference material that specifies as many binary date formats as possible.
I know of some date types:
DOS, FILETIME, OLE, ANSI SQL, C, Mac Abs, HFS, APFS, Java
But where can I find the binary specification for these and any other date formats?
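As a rough sketch of what such a trawler could look like, the Python below encodes a target date in three well-documented layouts (32-bit Unix time_t, Windows FILETIME, and the DOS/FAT packed date) and searches a byte buffer for each. The function names are made up for illustration, and the sketch assumes little-endian storage and UTC; a real tool would also need big-endian variants, local-time offsets, and the other formats listed above.

```python
import struct
from datetime import datetime, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def encode_unix32(dt):
    """Unix time_t: seconds since 1970-01-01 UTC, 32-bit little-endian."""
    return struct.pack("<I", int((dt - UNIX_EPOCH).total_seconds()))


def encode_filetime(dt):
    """Windows FILETIME: 100 ns ticks since 1601-01-01 UTC, 64-bit little-endian."""
    delta = dt - FILETIME_EPOCH
    seconds = delta.days * 86400 + delta.seconds
    ticks = seconds * 10_000_000 + delta.microseconds * 10
    return struct.pack("<Q", ticks)


def encode_dos_date(dt):
    """DOS/FAT date: bits 15-9 year minus 1980, bits 8-5 month, bits 4-0 day."""
    packed = ((dt.year - 1980) << 9) | (dt.month << 5) | dt.day
    return struct.pack("<H", packed)


def find_all(haystack, needle):
    """Return every offset at which needle occurs in haystack."""
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        start = haystack.find(needle, start + 1)
    return offsets


def trawl(data, dt):
    """Map each candidate encoding name to the offsets where it matches."""
    patterns = {
        "unix32-le": encode_unix32(dt),
        "filetime-le": encode_filetime(dt),
        "dos-date-le": encode_dos_date(dt),
    }
    return {name: find_all(data, pattern) for name, pattern in patterns.items()}
```

Short patterns such as the 2-byte DOS date will produce many false positives in real files, so hits are only candidates to be checked against neighbouring fields.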

Related

What does T mean in "YYYY-mm-DDTHH:MM"?

I am trying to pull some data from Twitter, and the date format is "YYYY-mm-DDTHH:MM". What does T mean in "YYYY-mm-DDTHH:MM"?
The T isn't a placeholder for a value; it's a literal character in the output that marks the start of the time part.
For example: 2021-04-20T13:03
The format is part of the ISO 8601 international standard.
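To illustrate, here is a small Python sketch (Python is my choice here, the question doesn't name a language) showing that the T is just a separator between the date and time parts. datetime.fromisoformat accepts this form directly.

```python
from datetime import datetime

stamp = "2021-04-20T13:03"

# The standard library parses the ISO 8601 form, T included.
parsed = datetime.fromisoformat(stamp)

# Splitting on the literal T yields the date part and the time part.
date_part, time_part = stamp.split("T")

print(parsed.year, parsed.month, parsed.day, parsed.hour, parsed.minute)
print(date_part, time_part)
```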

Inverse function to format-date

In XSLT 2.0, the function format-date converts a date to a string in a desired format, e.g.
<xsl:value-of select = "format-date(xs:date('2000-01-01'), '[D01] [MN,*-3] [Y0001]', 'en', (), ())"/>
results in
01 JAN 2000.
My question is: which function takes 01 JAN 2000 as input and outputs 2000-01-01?
As noted above:
XPath 3.1 adds parse-ietf-date() which handles many of the date formats used in internet standards such as email (which are often very US-oriented). But there are too many varieties of date formats out there for a general solution to be viable.
It's much easier to define a syntax for converting one input format to a wide variety of output formats than to do the converse. A syntax that is sufficiently powerful to do the job properly would end up being very similar to doing it "by hand" using the replace() function and regular expressions.
It's quite easy to DIY.
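The "by hand" approach amounts to a regular expression plus a month lookup table. The sketch below shows the idea in Python rather than XSLT (the same steps map onto replace() and a lookup in XPath); to_iso_date and MONTHS are illustrative names, and it assumes English three-letter month abbreviations and a two-digit day.

```python
import re

# English three-letter month abbreviations mapped to month numbers 1..12.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}

# Two-digit day, three-letter month, four-digit year, e.g. "01 JAN 2000".
PATTERN = re.compile(r"^(\d{2}) ([A-Za-z]{3}) (\d{4})$")


def to_iso_date(text):
    """Convert 'DD MON YYYY' to 'YYYY-MM-DD'; raise ValueError otherwise."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised date: {text!r}")
    day, month_name, year = match.groups()
    month = MONTHS.get(month_name.upper())
    if month is None:
        raise ValueError(f"unknown month: {month_name!r}")
    return f"{year}-{month:02d}-{day}"


print(to_iso_date("01 JAN 2000"))
```

The sketch does not check that the day is valid for the month; that would be an extra step.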
References
XSLT Date Formatting
Parsing Date/Time Information from Google XML feed using XSL Stylesheet

Is the --MM-DD format for month-day part of ISO 8601?

The Java 8 date/time API, java.time, has a MonthDay class to represent a month and a day together. So too the Joda-Time library offers a MonthDay class.
In java.time, the MonthDay.toString() method is documented as:
Outputs this month-day as a String, such as --12-03.
Most of the classes in java.time have their toString() method output the standard ISO 8601 representation of the concept they represent (YYYY-MM-DD for LocalDate, for example), so I would expect this --MM-DD format to be standard as well.
But I could not find this in the ISO 8601 standard.
Is month-day a concept defined by ISO 8601, and if so, is --MM-DD the standard format?
Background: I'm developing a date/time API in another language.
Yes, in paragraph 4.4, ISO 8601:1988 says:
Note - The hyphen is also used to indicate omitted components
So the format --MM-DD conforms to ISO 8601, where the year is omitted/missing, and paragraph 5.2.1.3d also contains an example of that very format.
This format disappeared in later editions of the ISO 8601 standard. However, the java.time classes continue to support the format along with the MonthDay class.
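For an API in another language, supporting this format takes little code. Below is a minimal Python sketch of parsing and formatting --MM-DD; parse_month_day and format_month_day are hypothetical names, and allowing 29 February is my assumption, since a month-day carries no year to check against.

```python
import re

# Two leading hyphens stand in for the omitted year.
MONTH_DAY = re.compile(r"^--(\d{2})-(\d{2})$")

# Maximum day per month; February allows 29 because no year is known.
MAX_DAY = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def parse_month_day(text):
    """Parse '--MM-DD' into a (month, day) tuple; raise ValueError if invalid."""
    match = MONTH_DAY.match(text)
    if match is None:
        raise ValueError(f"not in --MM-DD form: {text!r}")
    month, day = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12 or not 1 <= day <= MAX_DAY[month - 1]:
        raise ValueError(f"month or day out of range: {text!r}")
    return month, day


def format_month_day(month, day):
    """Format a month and day as '--MM-DD'."""
    return f"--{month:02d}-{day:02d}"


print(parse_month_day("--12-03"))
print(format_month_day(12, 3))
```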

Parsing COPY...WITH BINARY results

I'm using this:
COPY( select field1, field2, field3 from table ) TO 'C://Program Files/PostgreSql//8.4//data//output.dat' WITH BINARY
to export some fields to a file; one of them is a bytea field. Now I need to read the file with a custom-made program.
How can I parse this file?
The general format of a file generated by COPY...BINARY is explained in the documentation, and it's non-trivial.
bytea contents are the easiest to deal with, since they're not encoded.
Each other datatype has its own encoding rules, which are not described in the documentation but in the source code. From the doc:
To determine the appropriate binary format for the actual tuple data you should consult the PostgreSQL source, in particular the *send and *recv functions for each column's data type (typically these functions are found in the src/backend/utils/adt/ directory of the source distribution).
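The container layout itself (as opposed to the per-type encodings) is described in the COPY documentation: an 11-byte signature, a 32-bit flags field, a 32-bit header-extension length, then one record per tuple with a 16-bit field count and a 32-bit length before each field (-1 meaning NULL), ending with a 16-bit -1 trailer, all in network byte order. A minimal Python sketch of a reader follows; parse_copy_binary is an illustrative name, and the sketch assumes the file was written without OIDs and returns every field as raw bytes, which is only directly usable for bytea.

```python
import struct

# Fixed 11-byte signature at the start of every COPY BINARY file.
SIGNATURE = b"PGCOPY\n\xff\r\n\x00"
OID_FLAG = 1 << 16  # flags bit 16: tuples carry an OID (not handled here)


def parse_copy_binary(data):
    """Return a list of rows; each field is raw bytes, or None for NULL."""
    if data[:11] != SIGNATURE:
        raise ValueError("not a COPY BINARY file")
    flags, ext_len = struct.unpack_from("!II", data, 11)
    if flags & OID_FLAG:
        raise ValueError("files written WITH OIDS are not handled in this sketch")
    pos = 19 + ext_len  # skip signature, flags, length word, and extension area
    rows = []
    while True:
        (field_count,) = struct.unpack_from("!h", data, pos)
        pos += 2
        if field_count == -1:  # file trailer
            break
        row = []
        for _ in range(field_count):
            (length,) = struct.unpack_from("!i", data, pos)
            pos += 4
            if length == -1:
                row.append(None)  # NULL: no value bytes follow
            else:
                row.append(data[pos:pos + length])
                pos += length
        rows.append(row)
    return rows
```

For non-bytea columns, the raw bytes still have to be decoded according to the type's *send function, as the quoted documentation says.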
It might be easier to use the text format rather than binary (just remove the WITH BINARY). The text format is better documented and is designed for better interoperability. The binary format is intended more for moving data between PostgreSQL installations, and even there it has version incompatibilities.
The text format writes the bytea field as if it were text, encoding any non-printable characters in \nnn octal representation (except for a few special cases that it encodes with C-style backslash escapes, such as \n and \t). These are listed in the COPY documentation.
The only caveat is that you need to be absolutely sure the character encoding is the same when saving the file as when reading it, so that the printable characters map to the same numbers. I'd stick to SQL_ASCII, as it keeps things simpler.

Which Perl module can handle a variety of date formats containing unicode characters?

My requirement is to parse XML files that contain a wide variety of timestamps, depending on the locales in which they were written. They may contain Unicode characters in the case of Chinese or Korean locales. I have to parse these timestamps and put them in a standard format, something like 2009-11-26 12:40:54, to store them in an Oracle database. Sometimes I may not even know the locale, and yet I have to parse the timestamps.
I am looking for a module that automatically detects the timestamp format (including Unicode characters for am and pm in the local language) and converts it to epoch time, so that I can convert it back into whatever format I like.
I have gone through similar questions in this forum. A few suggested the DateFormat module and the Date::Parse module. The Perl distribution I am using is 5.10, so Date::Manip doesn't come as a core module. I am supposed to use just the basic core modules and a few CPAN modules (on request; I cannot ask for all of them),
so please suggest a good module that meets all my requirements.
Thanks in advance
DateTime::Locale should do what you want.
You might have a look at Date::Manip. I don't know if it supports the languages you need, but there is some UTF-8 support in it. In any case, once you get the dates extracted, it has a UnixDate method to easily output them in whatever format you want. It also resolves time zones.
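Whichever module you pick, the usual fallback when the locale is unknown is to try a list of candidate formats in turn, convert the first match to epoch time, and re-format from there. The sketch below shows that idea in Python rather than Perl, purely as an illustration; the format list is an assumption, the timestamps are treated as UTC, and genuinely ambiguous inputs (day/month versus month/day) cannot be resolved this way.

```python
import calendar
from datetime import datetime, timezone

# Hypothetical candidate formats, tried in order; extend as needed.
CANDIDATE_FORMATS = [
    "%Y-%m-%d %H:%M:%S",
    "%d/%m/%Y %H:%M:%S",
    "%Y年%m月%d日 %H:%M:%S",  # literal CJK date markers
]


def to_epoch(text):
    """Return epoch seconds for the first matching format (input taken as UTC)."""
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
        return calendar.timegm(parsed.timetuple())
    raise ValueError(f"no known format matches {text!r}")


def to_standard(epoch):
    """Format epoch seconds as 'YYYY-MM-DD HH:MM:SS' in UTC."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(to_standard(to_epoch("2009年11月26日 12:40:54")))
```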