Hash Logic within the script - perl

I am trying to understand the logic of the following script, especially how it stores content in the hashes and how the time scan works. Any suggestions for making it shorter would also be welcome.
#!perl
use strict;
use warnings;
my $MAX_AGE = 60; # minutes
my @mth = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my @f = localtime();
my $TODAY = sprintf "%02d/%s/%4d",$f[3],$mth[$f[4]],$f[5]+1900;
my $START_MINUTE = $f[2]*60+$f[1] - $MAX_AGE;
##
my %users;
my %conn;
while (<DATA>) {
    if( /\bAT\b/ ) {
        my( $conn, $uid ) = /conn=(\d+).*uid=(.*?),/;
        $conn{$conn} = $uid;
    }
    if( /ABB/ ) {
        my ($timestamp, $conn) = /\[(.*?)\] conn=(\d+)/;
        my ($date,$h,$m,undef) = split ':',$timestamp,4;
        next unless ($date eq $TODAY);
        my $minutes = $h*60 + $m;
        if ($minutes >= $START_MINUTE){
            my $uid = $conn{$conn};
            ++$users{$uid};
        }
    }
}
for my $uid (keys %users) {
    my $count = $users{$uid};
    print "$count\n" if $count > 6;
}
__DATA__
[04/Jun/2013:13:06:13 -0600] conn=13570 op=14 msgId=13 - AT dn="uid=ad1222,o=xyz.com" method=128 version=3
[04/Jun/2013:15:06:13 -0600] conn=13570 op=14 msgId=15 - RESULT ABB

There are two places where data is put in a hash
my( $conn, $uid ) = /conn=(\d+).*uid=(.*?),/;
$conn{$conn} = $uid;
}
This is straightforward: the regexp extracts $conn and $uid and sets a hash entry with $conn as the key and $uid as the value. In this statement
$conn{$conn}
^^^^^^     ^ this is a hash
      ^^^^^ this is a completely different scalar
Overall the expression $conn{$conn} refers to a single element of the hash %conn with the scalar key $conn. There are two different variables here with basically the same name!
If you are looking for improvements, stylistically the hash should be called %uid, as its values are uids.
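A minimal sketch of the point above, showing that the hash and the scalar are unrelated variables that merely share a name. The hash name %uid_for is a hypothetical alternative used only for illustration, and the values are taken from the sample log line.

```perl
use strict;
use warnings;

my %conn;                    # hash: connection number => uid
my $conn = 13570;            # scalar: one connection number, unrelated to %conn

$conn{$conn} = 'ad1222';     # one element of %conn, keyed by the scalar $conn

# The same data stored under a more descriptive (hypothetical) hash name.
my %uid_for;
$uid_for{$conn} = 'ad1222';

print "$conn{$conn} $uid_for{13570}\n";    # prints "ad1222 ad1222"
```

With the renamed hash, an expression such as $uid_for{$conn} reads almost like a sentence, which is the stylistic gain being suggested.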
if ($minutes >= $START_MINUTE){
my $uid = $conn{$conn};
++$users{$uid};
This is a bit more of the "that crazy Perl" stuff, although really it is straightforward and widely used. All it does is increment the hash entry for the key $uid. If there is no entry for $users{$uid} yet, the statement automatically creates it and sets the value to 1.
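As a small illustration of that counting idiom, here is a sketch that increments hash entries that do not exist yet. The uid "bob" is made up for the example.

```perl
use strict;
use warnings;

my %users;

# Each ++ creates the entry on first sight (undef counts as 0), then adds 1.
++$users{$_} for qw(ad1222 bob ad1222 ad1222);

for my $uid ( sort keys %users ) {
    print "$uid: $users{$uid}\n";
}
# prints:
# ad1222: 3
# bob: 1
```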
Update to discuss the "time scan":
my $MAX_AGE = 60; # minutes
my @mth = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my @f = localtime();
my $TODAY = sprintf "%02d/%s/%4d",$f[3],$mth[$f[4]],$f[5]+1900;
my $START_MINUTE = $f[2]*60+$f[1] - $MAX_AGE;
This builds $TODAY, which is today's date in the same format as the dates in the log, and $START_MINUTE, which is the number of minutes since midnight at the time the script is run, minus $MAX_AGE. In other words, $START_MINUTE marks the start of the time window being scanned.
Later in the script the time of day is extracted and the minutes since midnight are found in a similar way (hour * 60 + minutes)
To "improve" this part of the script, strftime could be used instead of the @mth array and the sprintf line.
The calculations for the minutes could be moved to a sub called something like minutes_since_midnight.
It is a bit difficult to say how to improve the use of the hashes, as it's not clear what they are used for outside the context of the program segment shown.
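The two suggestions above could look roughly like this. It is only a sketch: it assumes POSIX strftime is acceptable, and the sub name minutes_since_midnight is the hypothetical name proposed above rather than anything from the original script.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: minutes elapsed since midnight for an hour/minute pair.
sub minutes_since_midnight {
    my ( $hour, $minute ) = @_;
    return $hour * 60 + $minute;
}

my $MAX_AGE = 60;    # minutes
my @now     = localtime();

# %b is the abbreviated month name; it follows the current locale.
my $today        = strftime( "%d/%b/%Y", @now );
my $start_minute = minutes_since_midnight( $now[2], $now[1] ) - $MAX_AGE;

print "$today $start_minute\n";
```

The same sub can be reused inside the loop for the hour and minute taken from each log line. Note that during the first $MAX_AGE minutes after midnight the start minute is negative, and the date check in the original script still skips every entry from the previous day.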
Hope that more or less answers your question!!

any suggestion on the improvement to make it more short
You don't want to make it shorter to improve it. You need to make it easier to understand in order to improve it. You already had problems understanding the logic, and making it shorter isn't going to help there.
Let's take this:
my $MAX_AGE = 60; # minutes
my @mth = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my @f = localtime();
my $TODAY = sprintf "%02d/%s/%4d",$f[3],$mth[$f[4]],$f[5]+1900;
my $START_MINUTE = $f[2]*60+$f[1] - $MAX_AGE;
I could go through the logic and attempt to figure it out, but more likely, I'll be making the same logical assumptions of the original author. Instead, we can improve readability by using good variable names, and by expanding the logic:
my @month_list = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my ( $sec, $min, $hour, $month, $mday, $year ) = localtime; #Whoops!
my $full_year = $year + 1900;
my $text_month = $month_list[$month];
my $today = sprintf "%02d/%s/%4d", $mday, $text_month, $full_year;
This is longer to type, but efficiency wise, it's just as efficient. Just because you can cram a bunch of operations on a single line doesn't make it faster to execute. However, mine is much easier to read and easier to maintain which will save you many hours of work. For example, my parsing of localtime is taken directly from the Perldoc on localtime. If you find an issue, and you think it could be due to my parsing of localtime, you could quickly compare my code to the Perldoc.
In fact, there is an error. Take a look at the localtime documentation and compare it to what I have, and you'll see I have $month and $mday mixed up.
Even better would be to use Time::Piece. In fact, Time::Piece would have made parsing the timestamp much cleaner too.
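As a rough sketch of what that could look like for the timestamps in the log, the following parses one sample timestamp with Time::Piece. To keep it simple, it assumes the "-0600" offset has already been stripped and that only the date and clock fields are needed.

```perl
use strict;
use warnings;
use Time::Piece;

# Sample timestamp from the log, with the "-0600" offset left out for simplicity.
my $stamp = "04/Jun/2013:13:06:13";

# strptime treats the parsed fields as UTC; that is fine here because only
# the date and clock fields are read back out.
my $t = Time::Piece->strptime( $stamp, "%d/%b/%Y:%H:%M:%S" );

my $minutes = $t->hour * 60 + $t->min;
print $t->ymd, " $minutes\n";    # prints "2013-06-04 786"
```

This replaces both the split on ':' and the manual date string comparison, since the object can be asked for whichever field is needed.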
So, please understand that shorter code isn't better if it's just harder to understand, and it is usually not any more efficient in execution.


Perl convert localtime to unix (epoch) time

Using Perl, I am trying to estimate the time since a file was created.
I would like to convert the local time to Unix (epoch) time, then take the Unix time of the file and subtract.
The problem I face is that when I convert localtime to Unix time, it is converted incorrectly!
my $current = str2time (localtime(time));
print $current;
The results I get are
2768504400 = Sun, 23 Sep 2057 21:00:00 GMT
2421349200 = Sun, 23 Sep 2046 21:00:00 GMT
Do I have to feed str2time with a specific date format?
You're doing something bizarre here: localtime(time) takes the epoch time (time) and converts it to a string.
And then you convert it back.
Just use time()
Or perhaps better yet -M which tells you how long ago a file was modified. (In days, so you'll have to multiply up).
e.g.:
my $filename = "sample.csv";
my $modification = -M $filename;
print $modification * 86400;
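An alternative sketch that stays in epoch seconds throughout, assuming the modification time is what you are after. The sub name file_age_seconds is hypothetical, and sample.csv is the file name from the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: seconds since the file was last modified,
# or undef if the file cannot be examined.
sub file_age_seconds {
    my ($path) = @_;
    # Element 9 of stat() is the last-modification time in epoch seconds.
    my $mtime = ( stat($path) )[9];
    return defined $mtime ? time() - $mtime : undef;
}

my $filename = "sample.csv";
my $age      = file_age_seconds($filename);

if ( defined $age ) {
    print "$filename was modified $age seconds ago\n";
}
else {
    print "Cannot stat $filename: $!\n";
}
```

Both values are already epoch seconds, so no conversion through localtime is needed.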
But if you really want to take the time and convert it back again - you'll need to look at how localtime(time) returns the result.
If you do:
print localtime(time);
You get:
5671624811542661
Because localtime is being evaluated in a list context, it returns a list of values (which you can use without needing to parse):
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
localtime(time);
If you do it in a scalar context, it returns a string denoting the time:
print "".localtime(time);
Gives:
Thu Sep 24 16:09:33 2015
But note - that might vary somewhat depending on your current locale. That's probably why str2time is doing odd things - because it makes certain assumptions about formats that don't always apply. The big gotcha is this:
When both the month and the date are specified in the date as numbers they are always parsed assuming that the month number comes before the date. This is the usual format used in American dates.
You would probably be better off instead using Time::Piece and strftime to get a fixed format:
e.g.
use Time::Piece;
print localtime(time) -> strftime ( "%Y-%m-%d %H:%M:%S" );
Note - Time::Piece overloads localtime so you can actually use it (fairly) transparently. Of course, then you can also do:
print localtime(time) -> epoch;
And do without all the fuss of converting back and forth.
You forgot to ask localtime for a scalar (string) result instead of a list.
use Date::Parse;
my $current = str2time (scalar(localtime(time)));
print $current, "\n";
print scalar(localtime($current)),"\n";
perldoc -f localtime
Converts a time as returned by the time function to a 9-element
list with the time analyzed for the local time zone. Typically
used as follows:
# 0 1 2 3 4 5 6 7 8
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
...
In scalar context, "localtime()" returns the ctime(3) value:
$now_string = localtime; # e.g., "Thu Oct 13 04:54:34 1994"

Perl sort with time stamp using hash

I'm new to Perl and need help with sorting using a hash, or any other method by which this can be done in Perl.
I have an input file like the one below and would like to generate the output file as shown.
I'm wondering whether this can be done by putting the data in a hash and then comparing. Please also explain the steps, for learning purposes, if possible.
If the file has duplicate or triplicate entries with different timestamps, the output should list only the entry with the latest timestamp.
Input file
A May 19 23:59:14
B May 19 21:59:14
A May 22 07:59:14
C Apr 10 12:23:00
B May 11 10:23:34
The output should be
A May 22 07:59:14
B May 19 21:59:14
C Apr 10 12:23:00
You can use your data (A, B, etc.) as the key and the timestamp as the value in a Perl hash.
Then read the input file and compare the timestamps after converting them to a comparable time value. This way you keep only the latest entry for each key and discard the others. Print the result at the end.
A hash is good for coalescing duplicates.
However sorting by time stamp requires converting the 'text' representation to an actual time. Time::Piece is one of the better options for doing this
#!/usr/local/bin/perl
use strict;
use warnings;
use Time::Piece;
my %things;
while (<DATA>) {
    my ( $letter, $M, $D, $T ) = split;
    my $timestamp =
        Time::Piece->strptime( "$M $D $T 2015", "%b %d %H:%M:%S %Y" );
    if ( not defined $things{$letter}
        or $things{$letter} < $timestamp )
    {
        $things{$letter} = $timestamp;
    }
}
foreach my $thing ( sort keys %things ) {
    print "$thing => ", $things{$thing}, "\n";
}
__DATA__
A May 19 23:59:14
B May 19 21:59:14
A May 22 07:59:14
C Apr 10 12:23:00
B May 11 10:23:34
Note though - your timestamps are ambiguous because they omit the year. You have to deal with this some way. I've gone for the easy road of just inserting 2015. That's not good practice - at the very least you should use some way of discovering 'current year' automatically - but bear in mind that at some points in the year, this will Just Break.
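One possible way to discover the year automatically is sketched below. It assumes every timestamp lies within the past twelve months, so a date that would fall in the future is pushed back a year. The sub name parse_without_year is hypothetical, and the comparison ignores timezone differences because strptime reads its input as UTC.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: parse "Mon DD HH:MM:SS" as the most recent occurrence
# of that date that is not later than $now (a Time::Piece object).
sub parse_without_year {
    my ( $text, $now ) = @_;
    my $format = "%b %d %H:%M:%S %Y";
    my $year   = $now->year;
    my $parsed = Time::Piece->strptime( "$text $year", $format );

    # A date later than "now" is assumed to belong to the previous year.
    $parsed = Time::Piece->strptime( "$text " . ( $year - 1 ), $format )
        if $parsed > $now;
    return $parsed;
}

my $now = gmtime;    # Time::Piece object for the current time
print parse_without_year( "May 19 23:59:14", $now )->year, "\n";
```

This still guesses wrong for entries older than a year, so it is a mitigation rather than a real fix for the missing year.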
You can format output date using the strftime method within Time::Piece - this is merely the default output.

perl print current year in 4 digit format

How do I get the current year in 4-digit format? This is what I have tried:
#!/usr/local/bin/perl
@months = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec );
@days = qw(Sun Mon Tue Wed Thu Fri Sat Sun);
$year = $year+1900;
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime();
print "DBR_ $year\\$months[$mon]\\Failures_input\\Failures$mday$months[$mon].csv \n";
This prints DBR_ 114\Apr\Failures_input\Failures27Apr.csv
How do I get 2014?
I am using version 5.8.8 build 820.
use Time::Piece;
my $t = Time::Piece->new();
print $t->year;
Move the line
$year = $year+1900;
to after the call to localtime(), so that the code becomes:
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime();
$year = $year+1900;
The best way is to use the core library Time::Piece. It overrides localtime so that the result in scalar context is a Time::Piece object, on which you can use the many methods that the module supplies. (localtime in list context, as you have used it in your own code, continues to provide the same nine-element list.)
The strftime method allows you to format a date/time as you wish.
This very brief program produces the file path that I think you want (I doubt if there should be a space after DBR_?) Note that there is no need to double up backslashes inside a single-quoted string unless it is the last character of the string.
use strict;
use warnings;
use Time::Piece;
my $path = localtime->strftime('DBR_%Y\%b\Failures_input\Failures%d%b.csv');
print $path;
output
DBR_2014\Apr\Failures_input\Failures27Apr.csv
One option to get the 4 digit year:
#!/usr/bin/perl
use POSIX qw(strftime);
$year = strftime "%Y", localtime;
printf("year %02d", $year);
You can also use
use Date::Calc;
my ($y,$m,$d) = Date::Calc::Today();
$y variable will contain 2019
$m variable will contain 8
$d variable will contain 9
at the time of writing this answer ( 9th August 2019 )
The simplest way, I find, to get the year is:
my $this_year = (localtime)[5] + 1900;

getting minutes difference between two Time::Piece objects

I have the code:
use Time::Piece;
use Time::Seconds;
my $timespan = $latest_time - $timestamp;
print $latest_time . "\n";
print $timestamp . "\n";
print $timespan->minutes;
where $latest_time = Time::Piece->new; and $timestamp = Time::Piece->strptime();
and I get the results:
Thu Mar 27 09:40:19 2014
Thu Mar 27 09:40:00 2014
-479.683333333333
What went wrong? There should be 0 minutes for $timespan, correct? Where is -479 coming from?
Reproducing the "bug"
This issue arises because strptime defaults to UTC instead of to the local timezone. This can be demonstrated in the following code which takes a current time, prints it out, then reparses it and shows the difference:
use strict;
use warnings;
use Time::Piece;
my $now = Time::Piece->new();
print $now->strftime(), "\n";
my $fmt = "%Y-%m-%d %H:%M:%S";
my $nowstr = $now->strftime($fmt);
my $parsed = Time::Piece->strptime("$nowstr", $fmt);
print "($nowstr)\n";
print $parsed->strftime(), "\n";
my $diff = $now - $parsed;
print $diff->hours, " hours difference\n";
Outputs:
Wed, 26 Mar 2014 21:42:08 Pacific Daylight Time
(2014-03-26 21:42:08)
Wed, 26 Mar 2014 21:42:08 UTC
7 hours difference
One hackish solution - getting parsed times to read as local
Now, in hacking around, I've discovered one potential hack for this on my strawberry perl system. It's by calling strptime like this: $now->strptime.
my $nowstr = "2014-03-26 21:51:00"; #$now->strftime($fmt);
my $parsed = $now->strptime("$nowstr", $fmt); #Time::Piece->strptime("$nowstr", $fmt);
print "($nowstr)\n";
print $parsed->strftime(), "\n";
my $diff = $now - $parsed;
print $diff->hours, " hours difference\n";
To confirm that strptime was actually using the time I set it, I gave it one that was 6 minutes before the current time. The output is as follows:
Wed, 26 Mar 2014 21:57:00 Pacific Daylight Time
(2014-03-26 21:51:00)
Wed, 26 Mar 2014 21:51:00 Pacific Standard Time
0.1 hours difference
The parsed time will inherit the c_islocal value from $now. $now just needs to be initialized with either localtime or ->new() and not gmtime of course.
As you can see one claims DST while the other does not, but date math is still done correctly. I was able to figure out this hack by looking at the source for strptime, _mktime, and new.
Hopefully, at the very least my code to reproduce the error will be helpful to someone with more experience with Time::Piece, and I'd love a better solution.
When I use strftime("%H:%M:%S %Z") for both $latest_time and $timestamp they both show the same timezone, but the $latest_time - $timestamp operation shows that there is a difference between the timezones, as tobyink pointed out. This might be a bug in the Time::Piece module.
So it seems that Time::Piece->new; gets the current machine time including the timezone.
So either I fix the timezone in Time::Piece->new or include a timezone value when I use Time::Piece->strptime(); to fix the timezone problem. Thanks for the info, tobyink.
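To illustrate the underlying point, here is a sketch in which both objects are produced by Time::Piece->strptime, so both are read as UTC and no timezone offset enters the subtraction. The two times are the ones printed in the question; the format string is an assumption.

```perl
use strict;
use warnings;
use Time::Piece;

my $fmt = "%Y-%m-%d %H:%M:%S";

# Both objects come from Time::Piece->strptime, so both are read as UTC
# and no timezone offset leaks into the difference.
my $latest_time = Time::Piece->strptime( "2014-03-27 09:40:19", $fmt );
my $timestamp   = Time::Piece->strptime( "2014-03-27 09:40:00", $fmt );

my $timespan = $latest_time - $timestamp;    # a Time::Seconds object
printf "%d seconds, %.2f minutes\n", $timespan->seconds, $timespan->minutes;
# prints "19 seconds, 0.32 minutes"
```

The -479 in the question comes from mixing a local-time object with one that strptime read as UTC. Whichever fix is chosen, the two operands need to agree on the timezone.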

How do you read the system time and date in Perl?

I need to read the system clock (time and date) and display it in a human-readable format in Perl.
Currently, I'm using the following method (which I found here):
#!/usr/local/bin/perl
@months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
@weekDays = qw(Sun Mon Tue Wed Thu Fri Sat Sun);
($second, $minute, $hour, $dayOfMonth, $month, $yearOffset, $dayOfWeek, $dayOfYear, $daylightSavings) = localtime();
$year = 1900 + $yearOffset;
$theTime = "$hour:$minute:$second, $weekDays[$dayOfWeek] $months[$month] $dayOfMonth, $year";
print $theTime;
When you run the program, you should see a much more readable date and time like this:
9:14:42, Wed Dec 28, 2005
This seems like it's more for illustration than for actual production code. Is there a more canonical way?
Use the localtime function:
In scalar context, localtime() returns
the ctime(3) value:
$now_string = localtime; # e.g., "Thu Oct 13 04:54:34 1994"
You can use localtime to get the time and the POSIX module's strftime to format it.
While it would be nice to use Date::Format's strftime because it has less overhead, the POSIX module is distributed with Perl and is thus pretty much guaranteed to be on a given system.
use POSIX;
print POSIX::strftime( "%A, %B %d, %Y", localtime());
# Should print something like Wednesday, January 28, 2009
# ...if you're using an English locale, that is.
# Note that this and Date::Format's strftime are pretty much identical
As everyone else said, "localtime" is how you tame dates in an easy and straightforward way.
But just to give you one more option, there is the DateTime module, which has become a bit of a favorite of mine.
use DateTime;
my $dt = DateTime->now;
my $dow = $dt->day_name;
my $dom = $dt->mday;
my $month = $dt->month_abbr;
my $chr_era = $dt->year_with_christian_era;
print "Today is $dow, $month $dom $chr_era\n";
This would print "Today is Wednesday, Jan 28 2009AD". Just to show off a few of the many things it can do.
use DateTime;
print DateTime->now->ymd;
It prints out "2009-01-28"
Like someone else mentioned, you can use localtime, but I would parse it with Date::Format. It'll give you the timestamp formatted in pretty much any way you need it.
The simplest one-liner print statement to print localtime in clear, readable format is:
print scalar localtime (); #Output: Fri Nov 22 14:25:58 2019