OpenRefine toDate() conversion fail - date

I have a sensor log file with dates in the form Mon Nov 30 18:21:40 UTC 2020 that I'd like to convert to OpenRefine dates.
Per GREL Date Functions, I thought the correct transformation would be value.toDate('E M d H:m:s z y'), but I consistently get "Error: Unable to convert to a date".
I've tried simple things like replacing UTC with GMT, without success.
What clue am I missing?

That's a weird date format. I'm not sure why a sensor log wouldn't just use ISO 8601.
Try using value.toDate('EEE MMM d H:m:s Z y').
It's not super obvious from the docs that you need multiple characters, but if you look at the examples at the bottom of this page, you can see them used there.
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html
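For comparison, here is a minimal sketch of parsing the same timestamp outside OpenRefine, in Python rather than GREL. The function name `parse_sensor_timestamp` is made up for illustration, and the sketch assumes English day and month names (the default "C" locale). As with the Java pattern, each field needs its own token: abbreviated weekday, abbreviated month, and zone name.

```python
from datetime import datetime

def parse_sensor_timestamp(text):
    """Parse a timestamp such as 'Mon Nov 30 18:21:40 UTC 2020'."""
    # %a = abbreviated weekday, %b = abbreviated month, %Z = zone name.
    # Assumes English names, i.e. the default "C" locale (an assumption).
    return datetime.strptime(text, "%a %b %d %H:%M:%S %Z %Y")

parsed = parse_sensor_timestamp("Mon Nov 30 18:21:40 UTC 2020")
print(parsed.isoformat())  # 2020-11-30T18:21:40
```

Note that the result is a naive datetime: `%Z` accepts the text "UTC" but does not attach a time zone to the value.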

Related

How do I build an expression using ADF expression language to dynamically generate a date in yyyymmdd format for a specific time zone?

I want to do something relatively straightforward, but I'm stuck.
Given today's date is 02 May, 2021 in my current timezone (Pacific Standard Time), build the string 20210502 (yyyymmdd format) dynamically.
What is the simplest way to do this in ADF? I tried the following, but it returns an "invalid expression" error:
@substring(formatString(getutcdate()),0,8)
I'm also not sure how to make it flexible, so that I can supply a different time zone (such as Pacific Standard Time) if I want.
You can create a timezone variable and pass that value to convertFromUtc or convertTimeZone function. And you can choose format as you need. Here is the format specifiers list.
You can follow this:
expression: @replace(split(convertFromUtc(utcnow(),variables('timezone'),'u'),' ')[0],'-','')
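The same "shift into the target zone, then format" idea can be sketched in Python (not ADF expression language). The helper name `yyyymmdd_in_zone` and the fixed UTC-7 offset are assumptions made for illustration; a real zone object such as `zoneinfo.ZoneInfo("America/Los_Angeles")` would handle daylight saving automatically.

```python
from datetime import datetime, timedelta, timezone

def yyyymmdd_in_zone(instant, tz):
    """Format an aware datetime as yyyymmdd after shifting it into tz."""
    return instant.astimezone(tz).strftime("%Y%m%d")

# Fixed UTC-7 offset standing in for Pacific time on this date (assumption).
pacific = timezone(timedelta(hours=-7))

# 02:30 UTC on 3 May is still the evening of 2 May on the US west coast.
instant = datetime(2021, 5, 3, 2, 30, tzinfo=timezone.utc)
print(yyyymmdd_in_zone(instant, pacific))       # 20210502
print(yyyymmdd_in_zone(instant, timezone.utc))  # 20210503
```

Passing the zone as a parameter plays the same role as the `timezone` variable in the ADF expression.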

unexpected result converting date using datestr

Can anyone tell me why if I type in MATLAB
datestr('17-03-2016','dd-mmmm-yyyy')
I get
06-September-0022
From the datestr docs
DateString = datestr(___,formatOut) specifies the format of the output text using formatOut. You can use formatOut with any of the input arguments in the above syntaxes.
So in your example the 'dd-mmmm-yyyy' is specifying the output format, not the input format.
Also
DateString = datestr(DateStringIn) converts DateStringIn to text in the format, day-month-year hour:minute:second. All dates and times represented in DateStringIn must have the same format.
where
'dd-mm-yyyy' is not in the list of allowed DateStringIn formats AND the documentation explicitly recommends using datenum to ensure correct behaviour. (Note: I underlined the wrong must in the sentence, it's the second must I wanted to emphasise)
So Sandar_Usama's answer of
datestr(datenum('17-03-2016','dd-mm-yyyy'),'dd-mmmm-yyyy')
is the officially correct method straight out of the docs.
Bottom line, always read the documentation.
Use this instead: datestr(datenum('17-03-2016','dd-mm-yyyy'),'dd-mmmm-yyyy')
To address the last unanswered point in this question, why does datenum behave like this?
>> datestr(datenum('17-03-2016'))
ans =
06-Sep-0022
Without being told explicitly how to treat the input, datestr and datenum try to match it against the expected formats. Since none of the documented formats match (see @dan's answer), that fails.
Although what it does next is undocumented, at least up to whatever version of Matlab we are running, it falls back to a "last resort" attempt to give you a date number.
Matlab will try to parse different month names from your input, remove non-numeric characters, and then extract the date and time elements from the string. In your case, they are 17, 03, and 2016. The first is expected to be either the month or the year. Since there's no 17th month, it is treated as the year. Then 03 is the month, and 2016 is the day.
Now, March 2016th, 17 is not a valid date, but Matlab will cut it some slack and read it as 1985 days past March 31st, 17. And that gives us September 6th, 22.
Because Matlab's timestamp is a floating-point number counting the days since its epoch, you can reproduce that answer, using valid dates, like so:
>> datestr(datenum('0017-03-31') + 1985)
ans =
06-Sep-0022
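The arithmetic can be checked independently of Matlab. The sketch below is Python, and `roll_over_day` is a hypothetical helper that mimics the rollover described above; it is not a model of Matlab's actual (undocumented) parser.

```python
from datetime import date, timedelta

def roll_over_day(year, month, day):
    """Count an out-of-range day number forward from the 1st of the month."""
    return date(year, month, 1) + timedelta(days=day - 1)

# "March 2016th" of year 17 rolls forward into year 22.
print(roll_over_day(17, 3, 2016))              # 0022-09-06

# Same result as 1985 days past March 31st, 17 (2016 - 31 = 1985).
print(date(17, 3, 31) + timedelta(days=1985))  # 0022-09-06
```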

D3 time.parse with colon in time-zone format

I'm trying to use d3.time.format on a date like this:
2016-02-10T12:40:16-05:00
Basically, a UTC date with a timezone offset.
The issue is, the %Z formatter in D3 looks for the timezone written as follows: -0500. In other words, the colon is missing.
Does anyone know a workaround?
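One possible workaround, independent of the parsing library, is to remove the colon from the offset before handing the string to the parser. The sketch below shows the idea in Python rather than D3; `strip_offset_colon` is a made-up helper name, and the regular expression assumes the offset is the last thing in the string.

```python
import re
from datetime import datetime

def strip_offset_colon(timestamp):
    """Turn a trailing '-05:00' style offset into '-0500'."""
    return re.sub(r"([+-]\d{2}):(\d{2})$", r"\1\2", timestamp)

normalized = strip_offset_colon("2016-02-10T12:40:16-05:00")
print(normalized)  # 2016-02-10T12:40:16-0500

# The normalized string now matches a colon-free offset directive.
parsed = datetime.strptime(normalized, "%Y-%m-%dT%H:%M:%S%z")
print(parsed.isoformat())  # 2016-02-10T12:40:16-05:00
```

The same regular expression should carry over to a JavaScript `String.prototype.replace` call ahead of the D3 parse, though that is untested here.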

Bug in Zend_Date (back in time)

I have a very strange problem: Zend_Date is converting my timestamp to a date one year earlier.
In my action:
// Timestamp
$intTime = 1293922800;
// Zend_Date object
$objZendDate = new Zend_Date($intTime);
// Get date
echo date('Y-m-d',$intTime).'<br>';
echo $objZendDate->get('YYYY-MM-dd');
This outputs:
2011-01-02
2010-01-02
Can anyone tell me what I'm doing wrong?
From the ZF issue tracker it seems this is a known issue:
Recently a lot of ZF users are filing a bug that Zend_Date returns the wrong year, 2009 instead of 2008. This is however expected behaviour, and NOT A BUG!
From the FAQ:
When using own formats in your code you could come to a situation where you get for example 29.12.2009, but you expected to get 29.12.2008.
There is one year difference: 2009 instead of 2008. You should use the lower cased year constant. See this example:
$date->toString('dd.MM.yyyy');
instead of
$date->toString('dd.MM.YYYY');
From the manual
Note that the default ISO format differs from PHP's format which can be irritating if you have not used in previous. Especially the format specifiers for Year and Minute are often not used in the intended way.
For year there are two specifiers available which are often mistaken. The Y specifier for the ISO year and the y specifier for the real year. The difference is small but significant. Y calculates the ISO year, which is often used for calendar formats. See for example the 31. December 2007. The real year is 2007, but it is the first day of the first week in the week 1 of the year 2008. So, if you are using 'dd.MM.yyyy' you will get '31.December.2007' but if you use 'dd.MM.YYYY' you will get '31.December.2008'. As you see this is no bug but a expected behaviour depending on the used specifiers.
For minute the difference is not so big. ISO uses the specifier m for the minute, unlike PHP which uses i. So if you are getting no minute in your format check if you have used the right specifier.
To add to zwip's answer, what happens behind the scenes is that your date format YYYY-MM-dd is actually translated into o\-m\-d, which is then passed to PHP's date() function internally with the timestamp you provided.
As mentioned in the other answer, and in the documentation for the o format on the date format page, calculating the year from the ISO week number can sometimes give a year that is one off from the value you expect.
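The difference between the calendar year and the ISO week-numbering year is easy to reproduce outside PHP. This is a small Python sketch (the helper name `calendar_and_iso_year` is made up) using the two dates from this thread:

```python
from datetime import date

def calendar_and_iso_year(d):
    """Return (calendar year, ISO 8601 week-numbering year) for a date."""
    return d.year, d.isocalendar()[0]

# 2 January 2011 is a Sunday, so it still belongs to ISO week 52 of 2010.
print(calendar_and_iso_year(date(2011, 1, 2)))    # (2011, 2010)

# 31 December 2007 is a Monday and already starts ISO week 1 of 2008.
print(calendar_and_iso_year(date(2007, 12, 31)))  # (2007, 2008)
```

The two values only disagree for a few days around New Year, which is why the problem tends to go unnoticed for most of the year.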

Secure way to figure out if a given date format has a 12h or 24h format?

I know this sucks. Date stuff sucks hard. But: Imagine a date format like "dd-MM-yyyy h:mm" how would you tell for sure what time mode that is? AM / PM or 24 hour? I'd say: If there is no "a" in the date format, then that's no AM / PM stuff and therefore it's nice 24h stuff. What do you think?
If you are given a date, such as 11:15, you can't know whether it is AM or PM. Just as you don't know whether when I say Deer, I mean one or more than one. As a program designer, you have to remove ambiguities or make assumptions. You could either force the data to have AM/PM, or tell the provider of the time to give it to you in 24 hour format, or you can assume that they are smart enough to realize that without AM/PM you have no way of knowing. Not knowing your situation, I can't tell you how to proceed, but there are issues that transcend plain old programming. Like whether 1,000,000,000 is a billion or a milliard or a trillion or whether a ton is 1000 kilograms or ....
You should rather check for a M or m and not an a.
But "dd-MM-yyyy hh:mm" is surely an ambiguous format.
That is, parsing a date that just looks like dd-MM-yyyy hh:mm can't tell you about the 12/24 format.
You could assume it's 24h format, otherwise something is missing or it would look like "dd-MM-yyyy hh:mm X", where X is 'AM' or 'PM'.
The only truly unambiguous format is ISO 8601 'yyyy-MM-dd hh:mm' with 24h times.
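If the format string follows the Java SimpleDateFormat / Unicode pattern conventions (an assumption; the question does not say which library is in use), the hour letter itself carries the answer: h and K are 12-hour fields, while H and k are 24-hour fields. A rough Python sketch of that check, with a made-up function name:

```python
def hour_cycle(pattern):
    """Classify a SimpleDateFormat-style pattern as '12h', '24h', or None."""
    in_quote = False
    for ch in pattern:
        if ch == "'":
            # Text between single quotes is literal, not a pattern letter.
            in_quote = not in_quote
        elif not in_quote:
            if ch in "hK":   # hour in am/pm (1-12 or 0-11)
                return "12h"
            if ch in "Hk":   # hour in day (0-23 or 1-24)
                return "24h"
    return None  # the pattern contains no hour field at all

print(hour_cycle("dd-MM-yyyy h:mm"))   # 12h
print(hour_cycle("dd-MM-yyyy HH:mm"))  # 24h
print(hour_cycle("dd-MM-yyyy"))        # None
```

This only inspects the pattern. As the answers above point out, a bare value such as 11:15 remains ambiguous without knowing the pattern or an AM/PM marker.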