I have a web application that I want to run some system tests on, and in order to do that I'm going to need to move the system time around. The application used DateTime all the way through.
Has anyone got any recommendations for how to change the time that DateTime->now reports? The only thing that comes to mind is subclassing DateTime and messing about with all the 'use' lines, but this seems rather invasive.
Note on answers:
All three will work fine, but the Hook::LexWrap one is the one I've chosen because (a) I want to move the clock rather than jiggle it a bit (which is more the purpose of what Time::Mock and friends do); (b) I do, consistently, use DateTime, and I'm happy to have errors come out if I've accidentally not used it; and (c) Hook::LexWrap is simply more elegant than a hack in the symbol table, for all that it does the same thing. (Also, it turns out to be a dependency of some module I already installed, so I didn't even have to CPAN it...)
Rather than taking the high-level approach and wrapping DateTime specifically, you might want to look into the modules Test::MockTime and Time::Mock, which override the low-level functions that DateTime etc. make use of, and (with any luck) will do the right thing on any time-sensitive code. To me it seems like a more robust way to test.
I think Hook::LexWrap is overkill for this situation. It's easier to just redefine such a simple function.
use DateTime;
my $offset;
BEGIN {
$offset = 24 * 60 * 60; # Pretend it's tomorrow
no warnings 'redefine';
sub DateTime::now
{
shift->from_epoch( epoch => ($offset + scalar time), #_ )
}
} # end BEGIN
You can replace my $offset with our $offset if you need to access the $offset from outside the file which contains this code.
You can adjust $offset at any time, if you want to change DateTime's idea of the current time during the run.
The calculation of $offset should probably be more complicated than shown above. For example, to set the "current time" to an absolute time:
my $want = DateTime->new(
year => 2009,
month => 9,
day => 14,
hour => 12,
minute => 0,
second => 0,
time_zone => 'America/Chicago',
);
my $current = DateTime->from_epoch(epoch => scalar time);
$offset = $want->subtract_datetime_absolute($current)->in_units('seconds');
But you probably do want to calculate a fixed number of seconds to add to the current time, so that time will advance normally after that. The problem with using add( days => 1 ); in the redefined now method is that things like DST changes will cause the time to jump at the wrong pseudotime.
You can use code injection via Hook::LexWrap to intercept the now() method.
use Hook::LexWrap;
use DateTime;
# Use real now
test();
{
my $wrapper = wrap 'DateTime::now',
post => sub {
$_[-1] = DateTime->from_epoch( epoch => 0 );
};
# Use fake now
test();
}
# use real now again
test();
sub test {
my $now = DateTime->now;
print "The time is $now\n";
}
When designing a new class with testability in mind, the ideal solution is to be able to inject new date objects.
However, for existing code using DateTime->now and DateTime->today a possible, suitably scoped, solution is below. I include it here as a way to do this without introducing Hook::LexWrap as a dependency and without affecting the behaviour globally.
{
no strict 'refs';
no warnings 'redefine';
local *{'DateTime::today'} = sub {
return DateTime->new(
year => 2012,
month => 5,
day => 31
);
};
say DateTime->today->ymd(); # 2012-05-31
};
say DateTime->today->ymd(); # today
Related
I'm starting out with two dates.
my $date1 = 01/01/2016;
my $date2 = 05/15/2016;
I need to put all the dates of the Sundays between two dates.
Any idea where I should start?
Your solution is good, but it potentially consumes a lot of memory creating an array of Sundays that might never be used. DateTime objects are not small nor cheap.
An alternative is an iterator, a function which every time it's called generates the next Sunday. It generates each Sunday on demand rather than calculating them all beforehand. This saves memory and supports potentially infinite Sundays.
use strict;
use warnings;
use v5.10;
sub sundays_iterator {
my($start, $end) = #_;
# Since we're going to modify it, copy it.
$start = $start->clone;
# Move to Sunday, if necessary.
$start->add( days => 7 - $start->day_of_week );
# Create an iterator using a closure.
# This will remember the values of $start and $end as
# they were when the function was returned.
return sub {
# Clone the date to return it without modifications.
# We always start on a Sunday.
my $date = $start->clone;
# Move to the next Sunday.
# Do this after cloning because we always start on Sunday.
$start->add( days => 7 );
# Return Sundays until we're past the end date
return $date <= $endDate ? $date : ();
};
}
That returns a closure, an anonymous subroutine which remembers the lexical variables it was created with. Sort of like an inside out object, a function with data attached. You can then call it like any subroutine reference.
my $sundays = sundays_iterator($startDate, $endDate);
while( my $sunday = $sundays->() ) {
say $sunday;
}
The upside is it saves a lot of memory, this can be especially important if you're taking the dates as user input: a malicious attacker can ask you for an enormous range consuming a lot of your server's memory.
It also allows you to separate generating the list from using the list. Now you have a generic way of generating Sundays within a date range (or, with a slight tweak, any day of the week).
The downside is it's likely to be a bit slower than building an array in a loop... but probably not noticeably so. Function calls are relatively slow in Perl, so making one function call for each Sunday will be slower than looping, but calling those DateTime methods (which call other methods which call other methods) will swamp that cost. Compared to using DateTime, calling the iterator function is a drop in the bucket.
You should start by picking a module. I'm partial to DateTime, using DateTime::Format::Strptime for the parsing.
use DateTime qw( );
use DateTime::Format::Strptime qw( );
my $start_date = "01/01/2016";
my $end_date = "05/15/2016";
my $format = DateTime::Format::Strptime->new(
pattern => '%m/%d/%Y',
time_zone => 'floating', # Best not to use "local" for dates with no times.
on_error => 'croak',
);
my $start_dt = $format->parse_datetime($start_date)->set_formatter($format);
my $end_dt = $format->parse_datetime($end_date )->set_formatter($format);
my $sunday_dt = $start_dt->clone->add( days => 7 - $start_dt->day_of_week );
while ($sunday_dt <= $end_dt) {
print "$sunday_dt\n";
$sunday_dt->add( days => 7 );
}
Note: You really shouldn't use DateTime->new as Bill used and Schwern endorsed. It's not the recommended use of DateTime because it creates code that's far more complicated and error-prone. As you can see, using a formatter cut the code size in half.
Note: Schwern is advocating the use of an iterator, replacing the last four lines of my answer with something 4 times longer (all the code in his answer). There's no reason for that high level complexity! He goes into length saying how much memory the iterator is saving, but it doesn't save any at all.
DateTime::Set makes constructing an iterator easy:
use DateTime::Format::Strptime ();
use DateTime::Set ();
my $start_date = "01/01/2016";
my $end_date = "05/15/2016";
my $format = DateTime::Format::Strptime->new(
pattern => '%m/%d/%Y',
time_zone => 'local',
on_error => 'croak',
);
my $iterator = DateTime::Set->from_recurrence(
start => $format->parse_datetime($start_date)->set_formatter($format),
end => $format->parse_datetime($end_date)->set_formatter($format),
recurrence => sub { $_[0]->add( days => 7 - $_[0]->day_of_week || 7 ) }, # next Sunday after $_[0]
);
while ( my $date = $iterator->next ) {
say $date;
}
This is what I came up with but please let me know if there is a better way.
use DateTime;
my $date1 = "1/1/2016";
my $date2 = "5/15/2016";
my ($startMonth, $startDay, $startYear) = split(/\//, $date1);
my ($endMonth, $endDay, $endYear) = split(/\//, $date2);
my $startDate = DateTime->new(
year => $startYear,
month => $startMonth,
day => $startDay
);
my $endDate = DateTime->new(
year => $endYear,
month => $endMonth,
day => $endDay
);
my #sundays;
do {
my $date = DateTime->new(
year => $startDate->year,
month => $startDate->month,
day => $startDate->day
);
push #sundays, $date if ($date->day_of_week == 7);
$startDate->add(days => 1);
} while ($startDate <= $endDate);
foreach my $sunday (#sundays) {
print $sunday->strftime("%m/%d/%Y");
}
I am writing a small tool to parse some application logs for to collect data that is going to be used as the inputs for Zabbix monitoring. I am just wanting to keep data from the logs that are within the past two hours.
The format of the logs is pretty simple, the fields are separated by white space and the first three fields are used to determine the time when the logging was written.
Here is an example of the first three fields of a log line:
Jan 5 13:42:07
What I set out to do was to utilize one of my favorite modules, DateTime. Where I convert the above into a DateTime object and then compare that object to another DateTime object when the utility would be invoked.
Everything was fine an dandy and working nicely until I actually set the utility against the a portion of the logs it would actually be parsing -- only a couple gigabytes in size. The test run was being done on a kitchen invoked Ubuntu virtual box instance on my laptop, so the resources are -- as expected -- rather limited. The script would halt with the words 'Killed' displayed.
Looking into /var/log/messages I would see log lines describing the process being killed due to resource issues.
When I invoked the process again, and then switching to another screen instance to watch top, I noticed that the memory percentage would grow, that swap space would being to be consumed all until the script would again stop with the 'Killed' message.
When I would rerun the script with the DateTime portion commented out, the script would execute as expected.
In the script I have a subroutine which would be called to create a DateTime object based upon the information found in the first three fields of the log line. I have tried where I create the object at the beginning of the subroutine then undef it prior to returning a value at the end of the subroutine, I have tried it where I create a global object ( using our ) and then use the DateTime set_* methods to modify what I thought would be a single object's values.
I have read that perl does not clean up hash memory so that it can be reused by the program--I feel that this is the base of the issue that I am running into.
At this point, I am feel the need to get input of others and that is the reason for this post. All comments and criticisms would be appreciated.
This utility was running on Perl v5.14.2.
This code produces the memory leak:
#!/usr/bin/perl -w
use strict;
use DateTime;
my $month = 1;
my $day = 6;
my $hour = 20;
my $minute = 30;
my $second = 00;
for (my $count = 0; $count <= 25_000_000; $count++) {
my $epoch = &get_epoch( $month, $day, $hour, $minute, $second );
}
sub get_epoch {
my $mon = shift;
my $day = shift;
my $hour = shift;
my $min = shift;
my $sec = shift;
my $temp_dt = DateTime->new(
year => 2015,
month => $mon,
day => $day,
hour => $hour,
minute => $min,
second => $sec,
nanosecond => 500_000_000,
time_zone => 'UTC',
);
return( $temp_dt->epoch );
}
This is a bug in Params::Validate 1.15 and will be fixed very soon.
There are some programs/scripts that need to be run at specific times in a timezone different from the system timezone.
A la crontab in Perl, but one that honors a timezone and DST rules in a region different from that in which the system is configured.
Here is the use case : I will create an excel sheet with the time in PT in column B and the corresponding program/Perl script to run in column C.
Nothing specific about this information being in a Excel sheet - could be plain text file/"crontab" entry too.
A Perl script will read in the data from the excel sheet and run/spawn those scripts at the correct time.
The thing to keep at mind is that the Perl script should run correctly regardless of what timezone the system that it is running on is.
Regardless of whether the script is running on a Box in NY or IL or CA, it should spawn the scripts at the time mentioned in the file entries as per the Pacific Standard Time with DST at mind.
It is very important, as I said before, of it being aware, "automagically" ( without me doing any explicit programmming ) of the latest DST rules for the PT region.
What would you suggest?
Maybe I can visit some website that shows current time in that region and scan the time value from it, and run the scripts when it's the correct time?
Any such Perl screen scraper friendly site?
Or maybe I can use some smart Perl module, like Schedule::Cron
For the record, a large number of good suggestions came by at http://www.perlmonks.org/index.pl?node_id=772934, however, they, in typical at/cron fashion, work as per the system configured timezone.
In general, if you care about timezones, represent times internally in some universal format and convert times for display purposes only.
Applying this to your problem, write a crontab whose times are expressed in GMT. On each worker machine, convert to local time and install the crontab.
Front matter:
#! /usr/bin/perl
use warnings;
use strict;
use feature qw/ switch /;
use Time::Local qw/ timegm /;
For the conversions this program supports, use today's date and substitute the time from the current cronjob. Return the adjusted hour and day-of-week offset:
sub gmtoday {
my($gmmin,$gmhr,$gmmday,$gmmon,$gmwday) = #_;
my #gmtime = gmtime $^T;
my(undef,undef,$hour,$mday,$mon,$year,$wday) = #gmtime;
my #args = (
0, # sec
$gmmin eq "*" ? "0" : $gmmin,
$gmhr,
$mday,
$mon,
$year,
);
my($lhour,$lwday) = (localtime timegm #args)[2,6];
($lhour, $lwday - $wday);
}
Take the five-field time specification from the current cronjob and convert it from GMT to local time. Note that a fully general implementation would support 32 (i.e., 2 ** 5) cases.
sub localcron {
my($gmmin,$gmhr,$gmmday,$gmmon,$gmwday) = #_;
given ("$gmmin,$gmhr,$gmmday,$gmmon,$gmwday") {
# trivial case: no adjustment necessary
when (/^\d+,\*,\*,\*,\*$/) {
return ($gmmin,$gmhr,$gmmday,$gmmon,$gmwday);
}
# hour and maybe minute
when (/^(\d+|\*),\d+,\*,\*,\*$/) {
my($lhour) = gmtoday #_;
return ($gmmin,$lhour,$gmmday,$gmmon,$gmwday);
}
# day of week, hour, and maybe minute
when (/^(\d+|\*),\d+,\*,\*,\d+$/) {
my($lhour,$wdoff) = gmtoday #_;
return ($gmmin,$lhour,$gmmday,$gmmon,$gmwday+$wdoff);
}
default {
warn "$0: unhandled case: $gmmin $gmhr $gmmday $gmmon $gmwday";
return;
}
}
}
Finally, the main loop reads each line from the input and generates the appropriate output. Note that we do not destroy unhandled times: they instead appear in the output as comments.
while (<>) {
if (/^\s*(?:#.*)?$/) {
print;
next;
}
chomp;
my #gmcron = split " ", $_, 6;
my $cmd = pop #gmcron;
my #localcron = localcron #gmcron;
if (#localcron) {
print join(" " => #localcron), "\t", $cmd, "\n"
}
else {
print "# ", $_, "\n";
}
}
For this sorta-crontab
33 * * * * minute only
0 0 * * * minute and hour
0 10 * * 1 minute, hour, and wday (same day)
0 2 * * 1 minute, hour, and wday (cross day)
the output is the following when run in the US Central timezone:
33 * * * * minute only
0 18 * * * minute and hour
0 4 * * 1 minute, hour, and wday (same day)
0 20 * * 0 minute, hour, and wday (cross day)
In the schedule, store the number of seconds from the epoch when each run should occur rather than a date/time string.
Expanding a little:
#!/usr/bin/perl
use strict; use warnings;
use DateTime;
my $dt = DateTime->new(
year => 2010,
month => 3,
day => 14,
hour => 2,
minute => 0,
second => 0,
time_zone => 'America/Chicago',
);
print $dt->epoch, "\n";
gives me
Invalid local time for date in time zone: America/Chicago
because 2:00 am on March 14, 2010 is when the switch occurs. On the other hand, using hour => 3, I get: 1268553600. Now, in New York, I use:
C:\Temp> perl -e "print scalar localtime 1268553600"
Sun Mar 14 04:00:00 2010
So, the solution seems to be to avoid scheduling these events during non-existent times in your local time zone. This does not require elaborate logic: Just wrap the DateTime constructor call in an eval and deal with the exceptional time.
While I certainly think that there are likely "cleaner" solutions, would the following work?
set the cron to run the scripts several hours ahead of the possible range of times you actually want the script to run
handle the timezone detection in the script and have it sleep for the appropriate amount of time
Again, I know this is kinda kludgey but I thought I would put it out there.
Use the DateTime module to calculate times.
So if your setup says to run a script at 2:30 am every day, you will need logic to:
Try to create a DateTime object for 2:30am in timezone America\Los_Angeles.
If no object add 5 minutes to the time and try again. Give up after 2 hours offset.
Once you have a DateTime object, you can do comparisons with DateTime->now or extract an epoch time from your object and compare that with the results of time.
Note that I chose 2:30 am, since that time won't exist at least 1 day a year. That's why you need to have a loop that adds an offset.
Ever since British Summer Time ended in the UK last week my application has been seeing a very interesting bug. Here's an isolated Perl script which demonstrates the issue:
#!/usr/bin/perl
use strict; use warnings;
use DateTime::Format::W3CDTF;
use DateTime::Format::ISO8601;
my $tz = 'Europe/London';
sub print_formatted_date {
my $date = shift;
my $tz_date = DateTime::Format::ISO8601->new->parse_datetime( $date );
$tz_date->set_time_zone( $tz );
print "tz_date: $tz_date\n";
$tz_date->set_formatter( DateTime::Format::W3CDTF->new );
print "tz_date with W3C formatter: $tz_date\n";
}
print_formatted_date( '2009-10-25' );
print "\n";
print_formatted_date( '2009-10-26' );
The output of this is:
tz_date: 2009-10-25T00:00:00
tz_date with W3C formatter: 2009-10-25T00:00:00+01:00
tz_date: 2009-10-26T00:00:00
tz_date with W3C formatter: 0
Notice that for dates which fall outside of BST the W3C formatter is rendering them as '0'.
This is an issue for me because a third party library which we use is using DateTime::Format::W3CDTF to format paramters during a SOAP call. Because the formatting is failing the calls are failing.
Anyone have any clues? I'm no Perl guru so any help would be really appreciated. Could this be a bug in the DateTime::Format::W3CDTF library?
Looking at the implementation of W3CDTF I think this might actually be a bug in the library:
sub format_datetime
{
my ( $self, $dt ) = #_;
my $base = sprintf( '%04d-%02d-%02dT%02d:%02d:%02d',
$dt->year, $dt->month, $dt->day,
$dt->hour, $dt->minute, $dt->second );
my $tz = $dt->time_zone;
return $base if $tz->is_floating;
return $base . 'Z' if $tz->is_utc;
if (my $offset = $dt->offset()) {
return $base . offset_as_string($offset );
}
}
Note that if $tz->is_utc is false, but $dt->offset() is 0 then there is neither return codepath is being hit, which I guess in Perl means that a nil is implicitly returned. I think that scenario is what my sample script is hitting - 'Europe/London' is not technically UTC, but it does still have an offset of 0.
UPDATE
After a bit more research I found that this same bug has already been reported (2 years ago!). The bug report includes a patch which appears to resolve the issue (although I haven't tested that personally).
UPDATE 2
A fix for this has been released
The log4j logs I have contain timestamps in the following format:
2009-05-10 00:48:41,905
I need to convert it in perl to millseconds since epoch, which in this case would be 124189673005, using the following gawk function. How do I do it in perl?
I have little or no experience in perl, so appreciate if someone can post an entire script that does this
function log4jTimeStampToMillis(log4jts) {
# log4jts is of the form 2009-03-02 20:04:13,474
# extract milliseconds that is after the command
split(log4jts, tsparts, ",");
millis = tsparts[2];
# remove - : from tsstr
tsstr = tsparts[1];
gsub("[-:]", " ", tsstr);
seconds = mktime(tsstr);
print log4jts;
return seconds * 1000 + millis;
}
Though I almost always tell people to go use one of the many excellent modules from the CPAN for this, most of them do have one major drawback - speed. If you're parsing a large number of log files in real-time, that can sometimes be an issue. In those cases, rolling your own can often be a more suitable solution, but there are many pitfalls and nuances that must be considered and handled properly. Hence the preference for using a known-correct, proven, reliable module written by somebody else. :)
However, before I even considered my advice above, I looked at your code and had converted it to perl in my head... therefore, here is a more-or-less direct conversion of your gawk code into perl. I've tried to write it as simply as possible, so as to highlight some of the more delicate parts of dealing with dates and times in perl by hand.
# import the mktime function from the (standard) POSIX module
use POSIX qw( mktime );
sub log4jTimeStampToMillis {
my ($log4jts, $dst) = #_;
# extract the millisecond field
my ($tsstr, $millis) = split( ',', $log4jts );
# extract values to pass to mktime()
my #mktime_args = reverse split( '[-: ]', $tsstr );
# munge values for posix compatibility (ugh)
$mktime_args[3] -= 1;
$mktime_args[4] -= 1;
$mktime_args[5] -= 1900;
# print Dumper \#mktime_args; ## DEBUG
# convert, make sure to account for daylight savings
my $seconds = mktime( #mktime_args, 0, 0, $dst );
# return that time as milliseconds since the epoch
return $seconds * 1000 + $millis;
}
One important difference between my code and yours - my log4jTimeStampToMillis subroutine takes two parameters:
the log timestamp string
whether or not that timestamp is using daylight savings time ( 1 for true, 0 for false )
Of course, you could just add code to detect if that time falls in DST or not and adjust automatically, but I was trying to keep it simple. :)
NOTE: If you uncomment the line marked DEBUG, make sure to add "use Data::Dumper;" before that line in your program so it will work.
Here's an example of how you could test that subroutine:
my $milliseconds = log4jTimeStampToMillis( "2009-05-10 00:48:41,905", 1 );
my $seconds = int( $milliseconds / 1000 );
my $local = scalar localtime( $seconds );
print "ms: $milliseconds\n"; # ms: 1241844521905
print "sec: $seconds\n"; # sec: 1241844521
print "local: $local\n"; # local: Sat May 9 00:48:41 2009
You should take advantage of the great DateTime package, specifically use DateTime::Format::Strptime:
use DateTime;
use DateTime::Format::Strptime;
sub log4jTimeStampToMillis {
my $log4jts=shift(#_);
#see package docs for how the pattern parameter works
my $formatter= new DateTime::Format::Strptime(pattern => '%Y-%m-%d %T,%3N');
my $dayObj = $formatter->parse_datetime($log4jts);
return $dayObj->epoch()*1000+$dayObj->millisecond();
}
print log4jTimeStampToMillis('2009-05-10 10:48:41,905')."\n";
#prints my local version of the TS: 1241952521905
This saves you the pain of figuring out DST yourself (although you'll have to pass your server's TZ to Strptime via the time_zone parameter). It also saves you from dealing with leap everything if it becomes relevant (and I'm sure it will).
Haven't used it, but you might want to check out Time::ParseDate.
SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
Date time = dateFormat.parse(log4jts);
long millis = time.getTime();