Perl module for parsing natural language time duration specifications (similar to the "at" command)? - perl

I'm writing a perl script that takes a "duration" option, and I'd like to be able to specify this duration in a fairly flexible manner, as opposed to only taking a single unit (e.g. number of seconds). The UNIX at command implements this kind of behavior, by allowing specifications such as "now + 3 hours + 2 days". For my program, the "now" part is implied, so I just want to parse the stuff after the plus sign. (Note: the at command also parses exact date specifications, but I only want to parse durations.)
Is there a perl module for parsing duration specifications like this? I don't need the exact syntax accepted by at, just any reasonable syntax for specifying time durations.
Edit: Basically, I want something like DateTime::Format::Flexible for durations instead of dates.

Take a look at DateTime::Duration and DateTime::Format::Duration:
use DateTime;
use DateTime::Duration;
use DateTime::Format::Duration;
# The pattern must match the layout of the input string
my $formatter = DateTime::Format::Duration->new(
    pattern => '%e days, %H hours'
);
my $dur = $formatter->parse_duration('2 days, 5 hours');
# Add the parsed duration to the current time
my $dt = DateTime->now->add_duration($dur);

Time::ParseDate has pretty flexible syntax for relative times. Note that it always returns an absolute time, so you can't tell the difference between "now + 3 hours + 2 days" and "3 hours + 2 days" (both of those are valid inputs to parsedate, and will return the same value). You could subtract the current time (time) from the result if you want a duration instead.
Also, it doesn't return DateTime objects, just a UNIX epoch time. I don't know if that's a problem for your application.

I ended up going with Time::Duration::Parse.

How to format the time as following "2018-03-15T23:47:15+01:00"

With java.time, I'm trying to format the time as "2018-03-15T23:47:15+01:00".
With this formatter I'm close to the result in Scala.
val formatter: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ssZ")
ZonedDateTime.now() // 2018-03-14T19:25:23.397+01:00
ZonedDateTime.now().format(formatter) // => 2018-03-14 19:25:23+0100
But I cannot insert the extra character "T" between the day and hour.
What does this "T" mean, BTW?
How can I format the time as "2018-03-15T23:47:15+01:00"?
Notes:
In case you wonder why LocalDateTime cannot be formatted
Format LocalDateTime with Timezone in Java8
Try this
val ZONED_DATE_TIME_ISO8601_FORMATTER3 = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSxxx")
ZonedDateTime.now().format(ZONED_DATE_TIME_ISO8601_FORMATTER3)
// 2018-03-14T19:35:54.321+01:00
See here
Offset X and x: This formats the offset based on the number of pattern letters. One letter outputs just the hour, such as '+01', unless the minute is non-zero in which case the minute is also output, such as '+0130'. Two letters outputs the hour and minute, without a colon, such as '+0130'. Three letters outputs the hour and minute, with a colon, such as '+01:30'. Four letters outputs the hour and minute and optional second, without a colon, such as '+013015'. Five letters outputs the hour and minute and optional second, with a colon, such as '+01:30:15'. Six or more letters throws IllegalArgumentException. Pattern letter 'X' (upper case) will output 'Z' when the offset to be output would be zero, whereas pattern letter 'x' (lower case) will output '+00', '+0000', or '+00:00'.
Converting the ZonedDateTime to OffsetDateTime - as suggested in the other answers - works, but if you want to use a DateTimeFormatter, there's a built-in constant that does the job:
ZonedDateTime.now().format(DateTimeFormatter.ISO_OFFSET_DATE_TIME)
But it's important to note some differences between all the approaches. Suppose that the ZonedDateTime contains a date/time equivalent to 2018-03-15T23:47+01:00 (the seconds and milliseconds are zero).
All the approaches covered in the answers will give you different results.
toString() omits seconds and milliseconds when they are zero. So this code:
ZonedDateTime zdt = // 2018-03-15T23:47+01:00
zdt.toOffsetDateTime().toString()
prints:
2018-03-15T23:47+01:00
only hour and minute, because seconds and milliseconds are zero
The built-in formatter omits only the milliseconds when they are zero, but it always prints the seconds, regardless of their value. So this:
zdt.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME)
prints:
2018-03-15T23:47:00+01:00
seconds printed, even when zero; milliseconds omitted
And the formatter that uses an explicit pattern will always print all the fields specified, regardless of their values. So this:
zdt.format(DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSxxx"))
prints:
2018-03-15T23:47:00.000+01:00
seconds and milliseconds are printed, regardless of their values
You'll also find a difference in values such as 2018-03-15T23:47:10.120+01:00 (note the 120 milliseconds). toString() and ofPattern will give you:
2018-03-15T23:47:10.120+01:00
While the built-in DateTimeFormatter.ISO_OFFSET_DATE_TIME will print only the first 2 digits:
2018-03-15T23:47:10.12+01:00
Just be aware of these details when choosing which approach to use.
As your question already shows, you may just rely on ZonedDateTime.toString() for getting a string like 2018-03-14T19:25:23.397+01:00. BTW, that string is in ISO 8601 format, the international standard. Only two minor modifications may be needed:
If you don’t want the fraction of second — well, I don’t see what harm it does, it agrees with ISO 8601, so whoever receives your ISO 8601 string should be happy to have it. But if you don’t want it, you may apply myZonedDateTime.truncatedTo(ChronoUnit.SECONDS) to get rid of it.
ZonedDateTime.toString() often appends a zone name, for example 2018-03-14T19:25:23+01:00[Europe/Paris], which is not part of the ISO 8601 standard. To avoid that, convert to OffsetDateTime before using its toString method: myZonedDateTime.toOffsetDateTime().toString() (or myZonedDateTime.truncatedTo(ChronoUnit.SECONDS).toOffsetDateTime().toString()).
Building your own formatter through a format pattern string is very flexible when this is what you need. However, very often we can get by with less (and then should, for easier maintainability of our code): toString methods or built-in formatters, including both the ISO ones and the localized ones that we can get from DateTimeFormatter.ofLocalizedDate(), ofLocalizedTime() and ofLocalizedDateTime().
What does this "T" mean BTW ?
The T is part of the ISO 8601 format. It separates the date part from the time-of-day part. You may think of it as T for time since it denotes the start of the time part. If there is only a date (2018-04-25) or only a time-of-day (21:45:00), the T is not used, but when we have both, the T is required. You may think that the format might have been specified without the T, and you are probably right. When it comes to the format for periods/durations it is indispensable, however, and also needed when there are no days: P3M means a period of 3 months, while PT3M means 3 minutes.
Link: Read more in the Wikipedia article on ISO 8601.

Why does strptime from Time::Piece not parse my format?

My colleague (who has left the company) has written a bunch of scripts, including batch and Perl scripts, and I'm getting rid of the regional settings dependencies.
In the last Perl script, he's written the following piece of code:
my $format = "%d.%m.%Y %H:%M";
my $today_converted = Time::Piece->strptime($today, $format) - ONE_HOUR - ONE_HOUR - ONE_HOUR - ONE_HOUR - ONE_HOUR;
(the idea is to get five hours before midnight of that particular date)
The value of $today seems to be "03/04/2017" (which stands for the third of April, in European date format), which the Time::Piece implementation does not seem to understand:
Error parsing time at C:/Perl64/lib/Time/Piece.pm line 481.
Which format can I use that the Time::Piece Perl implementation understands?
In the format you have dots . as the date delimiter, but in the data you have slashes /. That's why it doesn't parse. It needs an exact match.
I think it's worth clarifying that strptime() will parse most date and time formats - that's the point of the method. But you need to define the format of the date string that you are parsing. That's what the second parameter to strptime() (in this case, your $format variable) is for.
The letters used in the format are taken from a standard list of definitions which is used by every implementation of strptime() (and its inverse, strftime()). See man strptime on your system for a complete list of the available options.
In your case, the format is %d.%m.%Y %H:%M - which means that it will parse timestamps which have the day, month and year separated by dots, followed by a space and the hours and minutes separated by a colon. If you want to parse timestamps in a different format, then you will need to change the definition of $format.
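As a minimal sketch of that change, assuming $today really holds a slash-separated day/month/year string with no time part (as the sample value 03/04/2017 suggests), the format can be rewritten to mirror the data. The variable names follow the question; the final strftime call is only there to show the result.

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds;   # exports ONE_HOUR

# Sample value from the question: 3 April 2017, European day/month order.
my $today  = "03/04/2017";

# The format must mirror the input exactly: slashes, and no time part.
my $format = "%d/%m/%Y";

# strptime yields midnight (UTC) of that date; step back five hours.
my $today_converted = Time::Piece->strptime($today, $format) - 5 * ONE_HOUR;

print $today_converted->strftime("%Y-%m-%d %H:%M"), "\n";   # 2017-04-02 19:00
```

If the input did carry a time of day, the format would need matching %H:%M fields as well.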

How can I use Perl to do datetime comparisons and calculate deltas?

I extracted year, month, day, hour, minute, second, and millisecond data from human readable text (it wasn't in a timestamp format, but rather something like "X started at HH:MM:SS.SSS on DD MMM YYYY and ended at HH:MM:SS.SSSS on DD MMM YYYY"), so for each recorded event, I have each of the values.
However, I need to turn those into some kind of timestamp so I can do math on it - I want to determine how long the event lasted (end time - start time). I was hoping the time function would take parameters so I can create two arbitrary times, but that doesn't appear to be the case.
If possible, I would like to stick with functions available in the core Perl libraries or scripts that I can add to the project, since getting CPAN modules installed on the target machines would just make a headache for everyone, if it is even possible to get the modules through the security restrictions.
You want the CPAN module DateTime. Here's an introduction.
On a Debian GNU/Linux or Ubuntu system, simply run:
apt-get install libdatetime-perl
to install the module.
You can do it with Time::Local. It's basically the reverse of the built-in "localtime" function, so you can generate a timestamp from a standard date.
In terms of built-ins these may be helpful:
POSIX (for mktime and strftime)
Time::Piece, Time::Local and Time::Seconds. These are all standard in Perl 5.10, but may not be available by default on earlier systems.
That said, time/date calculations are complex. If the only obstacle is a few headaches installing modules (rather than a company policy forbidding them), I would really recommend looking at CPAN.
Edit: I see from your comment on another post that there are company restrictions. You should update your original post, since there's a big difference between "headaches" and "security restrictions." In any case, DateTime and Date::Manip are worth looking at. Even if you don't install them, you can get a lot out of reading their source.
If you were only interested in comparing times, building fixed-width strings such as
my $ts1 = sprintf( '%4.4d%2.2d%2.2d%2.2d%2.2d%2.2d%3.3d',
$year1, $month1, $mday1, $hour1, $min1, $sec1, $ms1 );
and
my $ts2 = sprintf( '%4.4d%2.2d%2.2d%2.2d%2.2d%2.2d%3.3d',
$year2, $month2, $mday2, $hour2, $min2, $sec2, $ms2 );
and comparing them with cmp would be sufficient.
To do arithmetic on these times, use Time::Local to get seconds since the epoch and then add $ms1/1000 to that value.
use Time::Local;
# $mon1 must be the 0-based month (0..11), i.e. one less than the calendar month
my $time1 = timelocal($sec1, $min1, $hour1, $mday1, $mon1, $year1) + $ms1/1000;
You can use POSIX::mktime to turn broken-up time into a timestamp. Be aware that the month is 0-based, and the year is 1900-based, so adjust accordingly. :-)
use POSIX qw(mktime);
$timestamp = mktime($sec, $min, $hour, $day, $month - 1, $year - 1900);
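Putting the pieces together, here is a rough sketch of the start/end subtraction using only Time::Local from the core distribution. The helper name to_epoch and the sample values are made up for illustration. It uses timegm (treating the times as UTC) so that a daylight-saving change cannot distort the difference; swap in timelocal if the timestamps are local and that matters to you.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: build a fractional epoch value from extracted parts.
# $month is 1..12 here; timegm wants 0..11. The year is passed as four digits.
sub to_epoch {
    my ($year, $month, $mday, $hour, $min, $sec, $ms) = @_;
    return timegm($sec, $min, $hour, $mday, $month - 1, $year) + $ms / 1000;
}

# Made-up sample event that crosses midnight.
my $start = to_epoch(2009, 3, 1, 23, 59, 58, 250);
my $end   = to_epoch(2009, 3, 2,  0,  0,  1, 750);

my $elapsed = $end - $start;
printf "Event lasted %.3f seconds\n", $elapsed;   # Event lasted 3.500 seconds
print "start is earlier than end\n" if $start < $end;
```

Because both values are plain numbers, ordinary numeric comparison and subtraction are all that is needed once the conversion is done.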

What's the opposite of the localtime function in Perl?

In Perl, localtime takes a Unix timestamp and gives back year/month/day/hour/min/sec etc. I'm looking for the opposite of localtime: I have the parts, and I'd like to build a unix timestamp from them.
You can use the timelocal function in the Time::Local module (it ships with Perl and is also available on CPAN).
NAME
Time::Local - efficiently compute time from local and GMT time
SYNOPSIS
$time = timelocal($sec,$min,$hour,$mday,$mon,$year);
$time = timegm($sec,$min,$hour,$mday,$mon,$year);
DESCRIPTION
This module provides functions that are the inverse of built-in perl functions localtime() and gmtime(). They accept a date as a six-element array, and return the corresponding time(2) value in seconds since the system epoch (Midnight, January 1, 1970 GMT on Unix, for example). This value can be positive or negative, though POSIX only requires support for positive values, so dates before the system's epoch may not work on all operating systems.
It is worth drawing particular attention to the expected ranges for the values provided. The value for the day of the month is the actual day (ie 1..31), while the month is the number of months since January (0..11). This is consistent with the values returned from localtime() and gmtime().
Note: POSIX::mktime is just a wrapper around your C library's mktime() function. Time::Local is a pure-Perl implementation, and always returns results matching Perl's localtime. Also, Time::Local offers timegm (the inverse of gmtime), while mktime only works in local time. (Well, you could try changing $ENV{TZ}, but that doesn't work on some systems.)
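To illustrate the "inverse" relationship, here is a small round-trip sketch. It uses gmtime and timegm rather than the local-time pair so that the output does not depend on the machine's time zone; the sample timestamp is arbitrary.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my $epoch = 1521157635;                 # 2018-03-15 23:47:15 UTC

# gmtime splits the timestamp: (sec, min, hour, mday, mon 0..11, year-1900, ...)
my @parts = gmtime($epoch);

# timegm takes the first six fields and rebuilds the same timestamp.
my $rebuilt = timegm(@parts[0 .. 5]);

printf "%04d-%02d-%02d %02d:%02d:%02d\n",
    $parts[5] + 1900, $parts[4] + 1, @parts[3, 2, 1, 0];   # 2018-03-15 23:47:15
print "round trip ok\n" if $rebuilt == $epoch;
```

When the parts come from your own parsing instead of gmtime, remember to subtract 1 from a calendar month before passing it in.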
POSIX::mktime
DateTime on CPAN might be of some use. It also has a lot of time manipulation/translation methods.
Just create the DateTime using your parts and call $datetime->epoch (or $datetime->strftime("%s")).

How I can check for a valid date format in Perl?

I am getting a date field from the database in one of my variables. At the moment I am using the following code to check whether the date is in "yyyy-mm-dd" format:
if ( $dat =~ /\d{3,}-\d\d-\d\d/ )
My question: is there a better way to accomplish this?
Many Thanks
The OWASP Validation Regex Repository's version of dates in US format with support for leap years:
^(?:(?:(?:0?[13578]|1[02])(/|-|.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(/|-|.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:0?2(/|-|.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:(?:0?[1-9])|(?:1[0-2]))(/|-|.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$
The Regular Expression Library contains a simpler version along the lines of the other suggestions, which is translated to your problem:
^\d{4}-\d{1,2}-\d{1,2}$
As noted by others, if this is a date field from a database, it should be coming in a well-defined format, so you can use a simple regex, such as that given by toolkit.
But that has the disadvantage that it will accept invalid dates, such as 2009-02-30. Again, if you're handling dates that successfully made it into a date-typed field in a DB, you should be safe.
A more robust approach would be to use one of the many Date/Time modules from CPAN. Probably Date::Manip would be a good choice, and in particular check out the ParseDate() function.
http://metacpan.org/pod/Date::Manip
How about
/\d{2}\d{2}?-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])/
\d could match number characters from other languages. And is YYY really a valid year? If it must be four digits, dash, two digits, dash, two digits, I'd prefer /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/ or /^[12][0-9]{3}-[0-9]{2}-[0-9]{2}$/. Be aware of space characters around the string you're matching.
Of course, this doesn't check the reasonableness of the characters that are there, except for the first character in the second example. If that's required, you'll do well to just pass it to a date parsing module and then check its output for logical results.
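One way to combine the two steps without installing anything is sketched below: an anchored regex checks the shape, and Time::Local (a core module) is then asked to build a timestamp, which it refuses to do for impossible dates. The helper name is_valid_ymd is made up for this example, and the approach assumes years that Time::Local interprets literally (four digits, 1000 or later).

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: true if $str is a real calendar date in YYYY-MM-DD form.
sub is_valid_ymd {
    my ($str) = @_;

    # Shape check first; [0-9] avoids matching non-ASCII digits.
    my ($y, $m, $d) = $str =~ /\A([0-9]{4})-([0-9]{2})-([0-9]{2})\z/
        or return 0;

    # timegm croaks on impossible values such as month 13 or 30 February.
    return eval { timegm(0, 0, 0, $d, $m - 1, $y); 1 } ? 1 : 0;
}

for my $candidate ('2009-02-28', '2009-02-30', '2008-02-29', '2009-13-01') {
    printf "%s => %s\n", $candidate,
        is_valid_ymd($candidate) ? 'valid' : 'invalid';
}
```

The \A and \z anchors also reject strings with surrounding whitespace, so trim the value first if that is expected.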
The best lightweight solution is to use Date::Calc's check_date subroutine. Here's an example:
use strict;
use warnings;
use Date::Calc qw[check_date];
## string in YYYY-MM-DD format, you can have any format
## you like, just parse it
my @dt_dob = unpack("A4xA2xA2",$str_dob_date);
unless(check_date(@dt_dob)) {
warn "Oops! invalid date!";
}
I hope that was helpful :-)
Well you can start with:
/\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|30|31)/
I would very strongly recommend AGAINST writing your own regular expression to do this. Date/time parsing is simple, but there are some tricky aspects, and this is a problem that has been solved hundreds of times. No need for you to design, write, and debug yet another solution.
If you want a regular expression, the best solution is probably to use my Regexp::Common::time plugin for the Regexp::Common module. You can specify simple or complex, rigid or fuzzy date/time matching, and it has a very extensive test suite.
If you just want to parse specific date formats, you may be better off using one of the many parsing/formatting plugins for Dave Rolsky's excellent DateTime module.
If you want to validate the date/time values after you have matched them, I would suggest my Time::Normalize module.
Hope this helps.
I think using a regex without any additional checks is much too complicated! I use a little sub to do it:
sub check_date {
my $date_string = shift;
# Check the string format and get year, month and day out of it.
# Best to use a regex.
return 0 unless $date_string =~ m/^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/;
# 31. in a month with 30 days
return 0 if ($3 >= 31 and ($2 == 4 or $2 == 6 or $2 == 9 or $2 == 11));
# February 30. or 31.
return 0 if ($3 >= 30 and $2 == 2);
# February 29. in not a leap year.
return 0 if ($2 == 2 and $3 == 29
and not ($1 % 4 == 0 and ($1 % 100 != 0 or $1 % 400 == 0)));
# Date is valid
return 1;
}
I got the idea (and most of the code) from regular-expressions.info. There are other examples too.