I'm new to Perl and need help with sorting, using a hash or any other method available in Perl.
I have an input file like the one below and would like to generate the output file as shown.
I'm wondering whether this can be done by putting the data in a hash and then comparing. Please also explain the steps if possible, for learning purposes.
If the file has duplicate or triplicate entries with different timestamps, only the entry with the latest timestamp should be listed.
Input file
A May 19 23:59:14
B May 19 21:59:14
A May 22 07:59:14
C Apr 10 12:23:00
B May 11 10:23:34
The output should be
A May 22 07:59:14
B May 19 21:59:14
C Apr 10 12:23:00
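You can use your data (A, B, etc.) as the key and the timestamp as the value in a Perl hash.
Then read the input file and compare timestamps after converting them to a comparable time value (Perl has no built-in time datatype, so use epoch seconds or a time object). This way you keep only the latest entry for each key and discard the others. Print the result at the end.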
A hash is good for coalescing duplicates.
However, sorting by timestamp requires converting the text representation to an actual time. Time::Piece is one of the better options for doing this:
#!/usr/local/bin/perl
use strict;
use warnings;
use Time::Piece;
my %things;
while (<DATA>) {
    my ( $letter, $M, $D, $T ) = split;
    my $timestamp =
        Time::Piece->strptime( "$M $D $T 2015", "%b %d %H:%M:%S %Y" );
    if ( not defined $things{$letter}
        or $things{$letter} < $timestamp )
    {
        $things{$letter} = $timestamp;
    }
}
foreach my $thing ( sort keys %things ) {
    print "$thing => ", $things{$thing}, "\n";
}
__DATA__
A May 19 23:59:14
B May 19 21:59:14
A May 22 07:59:14
C Apr 10 12:23:00
B May 11 10:23:34
Note though - your timestamps are ambiguous because they omit the year. You have to deal with this some way. I've gone for the easy road of just inserting 2015. That's not good practice - at the very least you should use some way of discovering 'current year' automatically - but bear in mind that at some points in the year, this will Just Break.
You can format output date using the strftime method within Time::Piece - this is merely the default output.
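One way to avoid hard-coding 2015 is sketched below. It assumes that no log entry can lie in the future, so a timestamp that would fall after "now" in the current year is taken to belong to the previous year. The helper name parse_without_year is made up for illustration, and leap-day timestamps get no special handling here.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: parse "May 19 23:59:14" and pick the most recent
# year that does not place the timestamp after $now (a Time::Piece object).
sub parse_without_year {
    my ( $text, $now ) = @_;
    my $format = '%b %d %H:%M:%S %Y';
    my $year   = $now->year;
    my $t      = Time::Piece->strptime( "$text $year", $format );

    # Assumption: a timestamp in the future must belong to the previous year
    $t = Time::Piece->strptime( "$text " . ( $year - 1 ), $format )
        if $t > $now;
    return $t;
}

# A fixed reference time keeps the demonstration reproducible
my $now = Time::Piece->strptime( '2015-06-01 00:00:00', '%Y-%m-%d %H:%M:%S' );
for my $stamp ( 'May 22 07:59:14', 'Dec 31 23:00:00' ) {
    print parse_without_year( $stamp, $now )->strftime('%Y-%m-%d %H:%M:%S'), "\n";
}
```

In a real script $now would be the current time rather than a fixed value, and both sides of the comparison would need to be in the same time zone.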
I am trying to convert a date from epoch to year month day and get the correct date.
my $day = 18322;
my ($y, $m, $d) = (gmtime 86400*$day)[5,4,3];
The epoch date is 1583020800. The conversion gives: $y is 120, $m is 2, $d is 1.
I guess that if I add 1900 to $y I get the correct year, and if I add 1 to $m I get the correct month; the day $d needs no adjustment. Is this correct? I am taking over code from someone else and have no idea what [5,4,3] does.
Epoch time 1583020800 is Sun Mar 1 00:00:00 2020.
You can use gmtime, but it's awkward. It returns a list of values, some of which need to be converted: the year is the number of years since 1900 and the month starts at 0. This is because it is a thin wrapper around struct tm from C, the programming language Perl is written in.
my($y,$m,$d) = (gmtime(1583020800))[5,4,3];
$y += 1900;
$m += 1;
printf "%04d-%02d-%02d\n", $y, $m, $d;
Instead, use the built in Time::Piece.
use v5.10;
use Time::Piece;
my $time = Time::Piece->gmtime(1583020800);
say $time->ymd;
Or the more powerful DateTime.
use v5.10;
use DateTime;
my $dt = DateTime->from_epoch(epoch => 1583020800);
say $dt->ymd;
The (...)[5,4,3] is a literal list slice. The thing inside the parens creates a list, but this selects only elements 5, 4, and 3.
The gmtime docs point to localtime, which shows you the position of each thing in its list:
localtime
Converts a time as returned by the time function to a 9-element
list with the time analyzed for the local time zone. Typically
used as follows:
# 0 1 2 3 4 5 6 7 8
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
localtime(time);
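To see the slice syntax on its own, here is a small sketch. The @letters array is only an illustration; the second half applies the same slice to the list that gmtime returns.

```perl
use strict;
use warnings;

# A slice picks elements by index, in the order the indexes are listed
my @letters = ( 'a' .. 'i' );
my @picked  = @letters[ 5, 4, 3 ];    # ('f', 'e', 'd')
print "@picked\n";

# The same syntax applied directly to the list gmtime returns:
# index 5 is the year, 4 the month, 3 the day of the month
my ( $year, $mon, $mday ) = ( gmtime(1583020800) )[ 5, 4, 3 ];
printf "%04d-%02d-%02d\n", $year + 1900, $mon + 1, $mday;    # 2020-03-01
```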
I would use Time::Piece as in Schwern's answer. But just to cover all bases, you can use the strftime() function from POSIX.pm as well.
use feature 'say';
use POSIX qw[strftime];
say strftime('%Y-%m-%d', gmtime(1583020800));
Output:
2020-03-01
You can pass different format strings to strftime().
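As a rough illustration, a few other format strings applied to the same epoch value (the variable names are just for this sketch):

```perl
use strict;
use warnings;
use POSIX qw(strftime);

my @utc = gmtime(1583020800);

my $iso  = strftime( '%Y-%m-%d',          @utc );    # 2020-03-01
my $dmy  = strftime( '%d/%m/%Y',          @utc );    # 01/03/2020
my $full = strftime( '%Y-%m-%d %H:%M:%S', @utc );    # 2020-03-01 00:00:00
my $yday = strftime( '%j',                @utc );    # 061 (day of the year)

print "$_\n" for $iso, $dmy, $full, $yday;
```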
My script calculates the difference in days between two dates. However, I keep running into errors. The solution must work on every OS. Ideally it would use UNIX epoch time, but if that is impossible another approach is fine.
I tried:
Time::ParseDate - does not work on MS Windows
Time::Local - does not work on dates from the 31st of the month
Sample code:
#!/usr/bin/perl -w
use strict;
use Time::Local;
use POSIX;
sub toepoch {
    my @a = split /[- :]/, $_[0];
    $a[0] =~ s/^.{2}//;
    if (! defined $a[5]) {
        $a[5] = 00
    }
    my $b = timelocal($a[5], $a[4], $a[3], $a[2], $a[1], $a[0]);
    return $b;
}
my $days = sprintf("%d",(&toepoch('2018-03-31 11:00') - &toepoch('2018-04-02 11:00') / 86400));
print $days;
Output: Day '31' out of range 1..30 at epoch.pl line 12.
What module should I check in next? I remind you that the solution must work on UNIX and MS Windows systems.
From the documentation for Time::Local:
It is worth drawing particular attention to the expected ranges for
the values provided. The value for the day of the month is the actual
day (ie 1..31), while the month is the number of months since January
(0..11). This is consistent with the values returned from
"localtime()" and "gmtime()".
So by supplying timelocal the list (0, 00, 11, 31, 03, 18) you're asking for day 31 of month index 3, which is April (the fourth month). That doesn't work, since April only ever has 30 days. If only the error message included the month it's assuming!
When doing the conversion, you need to subtract 1 from the month so that its value stays within 0..11, adjusting the year if necessary.
(Alternately you can use timelocal_nocheck() to be allowed to input month -1 and have the function do the conversion to the previous year. Although if you did use that function, you'd have had a bug that was a lot harder to track down, since it would have automatically converted 31st of April to 1st of May and you'd have no idea why your time difference is only 1 day.)
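The silent rollover is easy to demonstrate. The sketch below uses timegm_nocheck (the UTC counterpart of timelocal_nocheck) so that the result does not depend on the local time zone:

```perl
use strict;
use warnings;
use Time::Local qw(timegm_nocheck);
use POSIX qw(strftime);

# Day 31 of month index 3 (April) does not exist. The _nocheck variant
# accepts it anyway and rolls over into the next month instead of dying.
my $epoch = timegm_nocheck( 0, 0, 11, 31, 3, 2018 );
my $date  = strftime( '%Y-%m-%d %H:%M', gmtime $epoch );
print "$date\n";    # 2018-05-01 11:00
```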
Secondly, you have a misplaced parenthesis on the calculation line, so you divide only the latter time by 86400.
My edited code:
use strict;
use warnings;
use Time::Local;
use POSIX;
sub toepoch {
    my @a = split /[- :]/, $_[0];
    $a[0] =~ s/^.{2}//;
    if (! defined $a[5]) {
        $a[5] = 00
    }
    --$a[1];
    if ($a[1] < 0) {
        --$a[0];
        $a[1] += 12;
    }
    my $b = timelocal($a[5], $a[4], $a[3], $a[2], $a[1], $a[0]);
    return $b;
}
my $days = sprintf("%d",(&toepoch('2018-03-31 11:00') - &toepoch('2018-04-02 11:00')) / 86400);
print $days;
Output:
-2
EDIT:
I assume you know what you're doing when using format %d for the value - it drops the fractional part (rounding toward zero), meaning if you had dates
2018-03-31 11:00
2018-04-02 10:59
that is, just 1 minute short of 2 days, your program would report the time difference as "-1".
To round to nearest whole number, use the format %.0f instead.
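A small sketch of the difference between the two formats, using the example dates above. It uses timegm rather than timelocal so that the numbers do not depend on the local time zone or daylight saving.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# timegm takes (sec, min, hour, mday, month 0..11, year)
my $from = timegm( 0, 0,  11, 31, 2, 2018 );    # 2018-03-31 11:00 UTC
my $to   = timegm( 0, 59, 10, 2,  3, 2018 );    # 2018-04-02 10:59 UTC

my $days      = ( $from - $to ) / 86400;        # just under -2
my $truncated = sprintf '%d',   $days;          # fractional part dropped
my $rounded   = sprintf '%.0f', $days;          # rounded to nearest

print "truncated: $truncated, rounded: $rounded\n";    # truncated: -1, rounded: -2
```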
Using Perl, I am trying to estimate the time since a file was created.
I would like to convert the local time to Unix time (epoch), then take the Unix time of the file and subtract.
The problem I face is that when I convert localtime to Unix time, it is converted incorrectly!
my $current = str2time (localtime(time));
print $current;
The results I get are
2768504400 = Sun, 23 Sep 2057 21:00:00 GMT
2421349200 = Sun, 23 Sep 2046 21:00:00 GMT
Do I have to feed str2time with a specific date format?
You're doing something bizarre here: localtime(time) takes the epoch time (time) and converts it to a human-readable form.
And then you convert it back.
Just use time().
Or perhaps better yet, use -M, which tells you how long ago a file was modified. (In days, so you'll have to multiply up.)
e.g.:
my $filename = "sample.csv";
my $modification = -M $filename;
print $modification * 86400;    # days to seconds
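The "just use time()" route can be sketched as follows, assuming the modification time is what you want to measure against (it is what -M reports as well). The helper name age_in_seconds is made up for this example, and the temporary file exists only to make the demonstration self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: seconds since the file was last modified
sub age_in_seconds {
    my ($filename) = @_;
    my $mtime = ( stat $filename )[9];    # element 9 of stat is the mtime epoch
    die "Cannot stat $filename: $!" unless defined $mtime;
    return time - $mtime;
}

# Demonstration on a temporary file whose mtime is pushed one hour back
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
my $an_hour_ago = time - 3600;
utime $an_hour_ago, $an_hour_ago, $filename or die "utime failed: $!";

printf "%s is about %d seconds old\n", $filename, age_in_seconds($filename);
```

Both values are plain epoch seconds, so no string parsing is involved at any point.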
But if you really want to take the time and convert it back again - you'll need to look at how localtime(time) returns the result.
If you do:
print localtime(time);
You get:
5671624811542661
Because localtime is being evaluated in list context, it returns a list of values (which you can use without needing to parse):
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
localtime(time);
If you do it in a scalar context, it returns a string denoting the time:
print "".localtime(time);
Gives:
Thu Sep 24 16:09:33 2015
But note - that might vary somewhat depending on your current locale. That's probably why str2time is doing odd things - because it makes certain assumptions about formats that don't always apply. The big gotcha is this:
When both the month and the date are specified in the date as numbers they are always parsed assuming that the month number comes before the date. This is the usual format used in American dates.
You would probably be better off instead using Time::Piece and strftime to get a fixed format:
e.g.
use Time::Piece;
print localtime(time) -> strftime ( "%Y-%m-%d %H:%M:%S" );
Note - Time::Piece overloads localtime so you can actually use it (fairly) transparently. Of course, then you can also do:
print localtime(time) -> epoch;
And do without all the fuss of converting back and forth.
You forgot to ask localtime for its scalar (string) value instead of a list.
use Date::Parse;
my $current = str2time (scalar(localtime(time)));
print $current, "\n";
print scalar(localtime($current)),"\n";
perldoc -f localtime
Converts a time as returned by the time function to a 9-element
list with the time analyzed for the local time zone. Typically
used as follows:
# 0 1 2 3 4 5 6 7 8
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
...
In scalar context, "localtime()" returns the ctime(3) value:
$now_string = localtime; # e.g., "Thu Oct 13 04:54:34 1994"
I've managed to cobble together a script that reads through thousands of log entries and creates a summary of them. All good so far. What I also want to be able to do is for it to create a separate summary of the entries from just the last 90 days.
A single entry in the log looks like the following, with newer entries always being added to the bottom of the file :
Serial No: 10123407
Date: 14/08/15
Time: 12:58
Cycle type: 134 U
Hold time: 0180
Cycle No: 1357
Dry Time: 00 mins.
Cycle Start
12:58.35
Hold Time 0000 Secs
Cycle: Failed
User_Message 13
Ref.to User Manual
Cycle End
13:01.32
The code I am using to return the current date and the date 90 days ago is:
use POSIX qw(strftime);
use Time::Local qw(timegm);
my ($d,$m,$y) = (localtime())[3,4,5];
print OUT (strftime("%d/%m/%y - ", gmtime(timegm(0,0,0,$d,$m,$y)-90*24*60*60)));
print OUT (strftime("%d/%m/%y\n", gmtime(timegm(0,0,0,$d,$m,$y))));
I'm doing it like this because it's producing my dates in the format I want, the same as in the logs dd/mm/yy and always zero padded.
Using this I get the following output:
11/05/15 - 09/08/15
So if I can print it, how can I store the data as the variables: $day90, $month90, $year90, $day, $month and $year. If I can do that then I think I can do the logical operations necessary to decide if the log entry is within the last 90 days and then create my summary as I want it.
I don't have any preconceived ideas as to how this is done so any and all solutions will be very much appreciated.
One of the best ways to compare dates is by converting them to %Y%m%d format (or %Y-%m-%d if you want something more readable), and then you can compare them as text strings.
You can use the core module Time::Piece to do the formatting for you.
Here's an example. It defines the input format and the comparison format as $dmy_format and $ymd_format respectively.
The strings for today's date and 90 days earlier are defined and stored as state variables the first time in_range is called, and so never need to be calculated again. (You will need Perl 5 version 10 or better for the state keyword. If that's not available then just use my instead and move those definitions outside and immediately before the subroutine.)
The passed parameter is the date in DD/MM/YY format. It is parsed and reformatted as YYYY-MM-DD, and the subroutine returns the result of comparing it with the two boundary dates.
use strict;
use warnings;
use v5.10; # for 'state' variables
use Time::Piece;
use Time::Seconds 'ONE_DAY';
for my $month ( 1 .. 12 ) {
    my $date = sprintf '14/%02d/15', $month;
    printf "date %s is %s\n", $date, in_range($date) ? 'in range' : 'out of range';
}
sub in_range {
    state $ymd_format = '%Y-%m-%d';
    state $dmy_format = '%d/%m/%y';
    state $now = localtime;
    state $today = $now->strftime($ymd_format);
    state $days90 = ($now - 90 * ONE_DAY)->strftime($ymd_format);
    my $date = Time::Piece->strptime(shift, $dmy_format)->strftime($ymd_format);
    $date le $today and $date ge $days90;
}
output
date 14/01/15 is out of range
date 14/02/15 is out of range
date 14/03/15 is out of range
date 14/04/15 is out of range
date 14/05/15 is in range
date 14/06/15 is in range
date 14/07/15 is in range
date 14/08/15 is out of range
date 14/09/15 is out of range
date 14/10/15 is out of range
date 14/11/15 is out of range
date 14/12/15 is out of range
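Applied to the log format in the question, the same comparison might look like the sketch below. The helper name recent_dates and the sample lines are made up for illustration, and "today" is passed in so that the example gives the same result whenever it is run.

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds 'ONE_DAY';

# Hypothetical helper: return the dates of log entries that fall within
# the 90 days up to and including $today (a Time::Piece object).
sub recent_dates {
    my ( $today, @lines ) = @_;
    my $upper = $today->strftime('%Y-%m-%d');
    my $lower = ( $today - 90 * ONE_DAY )->strftime('%Y-%m-%d');

    my @recent;
    for my $line (@lines) {
        # Only "Date: DD/MM/YY" lines carry the entry date
        my ($dmy) = $line =~ m{^Date:\s*(\d\d/\d\d/\d\d)} or next;
        my $ymd = Time::Piece->strptime( $dmy, '%d/%m/%y' )->strftime('%Y-%m-%d');
        push @recent, $dmy if $ymd ge $lower and $ymd le $upper;
    }
    return @recent;
}

my $today = Time::Piece->strptime( '09/08/15', '%d/%m/%y' );
my @log   = ( 'Serial No: 10123407', 'Date: 14/07/15', 'Time: 12:58', 'Date: 14/01/15' );
print "$_\n" for recent_dates( $today, @log );    # prints 14/07/15
```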
You could use this to get the dates into variables: split each formatted date on / and store the parts.
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);
my ($d,$m,$y) = (localtime())[3,4,5];
my ($day90,$month90,$year90) = split(/\//,(strftime("%d/%m/%y", gmtime(timegm(0,0,0,$d,$m,$y)-90*24*60*60))));
my ($day,$month,$year)=split(/\//,(strftime("%d/%m/%y", gmtime(timegm(0,0,0,$d,$m,$y)))));
print "DATE(BEFORE 90 days): $day90 $month90 $year90 \n";
print "DATE(CURRENT): $day $month $year \n";
Output:
DATE(BEFORE 90 days): 11 05 15
DATE(CURRENT): 09 08 15
I am trying to write a Perl script that takes a date in the form October 24, 2011 and converts it to 10,24,2011.
To do this I have prepared a hash with the month name as the key and a number representing the month's position as the value.
I read the input string and use a regular expression to extract the month name from the format above.
I then replace the month name with the value stored under that key.
Here's the script I have written so far, but it's not working for me.
# @dates array will have every element in this format -> October 24, 2011.
%days=("January",01,"February",02,"March",03,"April",04,"May",05,"June",06,"July",07,"August",08,"September",09,"October",10,"November",11,"December",12);
@output = map{
    $pattern=$_;
    $pattern =~ s/(.*)\s/$days{$1};
} @dates;
foreach $output (@output)
{
    print $output."\n";
}
Here's a little explanation of what I am trying to do with this code.
@output will hold the newly formatted array, with each month name replaced by the corresponding number as defined in the hash.
The map function is used to transform the elements of the array on the fly.
A sequence of characters followed by a space is the regular expression used to extract the month name from the pattern October 24, 2011.
This will be referenced by $1.
I look up the corresponding value for $1 in the hash using $days{$1}.
I see a few problems here. The first is that there is no use strict;.
A number with a leading zero is assumed to be in octal format (i.e. base 8) so 08 is invalid. You want one of these:
%days = ("January", 1, "February", 2, ...
%days = ("January", "01", "February", "02", ...
%days = ("January" => 1, "February" => 2, ...
%days = ("January" => "01", "February" => "02", ...
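A quick sketch of what the leading zero does (08 and 09 are left out of the code because they would not compile at all):

```perl
use strict;
use warnings;

my $octal  = 010;     # leading zero means octal: this is decimal 8
my $number = 07;      # octal 7 is decimal 7, and the zero is not kept
my $string = '07';    # a quoted string keeps its leading zero

print "$octal $number $string\n";    # prints "8 7 07"
```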
You should also be declaring your variables with my:
my %days = ...
my @output = ...
You're missing the final slash on your substitution, you probably want a comma in there to match your desired output format, and .* will eat up more than you want:
$pattern =~ s/(\S*)\s/$days{$1}, /;
The block for your map needs to return the value you want in @output, but it currently returns 1 (see perldoc perlop to learn why); something like this will serve you better:
my @output = map {
    my $pattern=$_; # You don't need this, operating on $_ is fine here
    $pattern =~ s/(\S*)\s/$days{$1}, /;
    $pattern
} @dates;
If you really want the spaces removed from the output, then this should do the trick:
my @output = map {
    my $pattern=$_; # You don't need this, operating on $_ is fine here
    $pattern =~ s/(\S*)\s/$days{$1}, /;
    $pattern =~ s/\s//g;
    $pattern
} @dates;
There are more compact ways to do this map but I don't want to change too much and confuse you.
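For reference, one possible compact form is sketched here. It assumes Perl 5.14 or later for the /r flag, which makes the substitution return the changed string instead of modifying $_; the sample dates are made up.

```perl
use strict;
use warnings;
use v5.14;    # for the /r (non-destructive substitution) flag and say

# Build the month lookup: January => '01' ... December => '12'
my @names = qw(January February March April May June July
               August September October November December);
my %days;
@days{@names} = map { sprintf '%02d', $_ } 1 .. 12;

my @dates = ( 'October 24, 2011', 'March 3, 2012' );

# With /r the substitution yields the new string, so it is the block's value
my @output = map { s/^(\w+)\s+(\d+),\s*/$days{$1},$2,/r } @dates;

say for @output;    # 10,24,2011 and 03,3,2012
```

Because of /r, the elements of @dates are left unchanged.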
And, as mentioned in the comments, you might want to save yourself some trouble and have a look at DateTime and related packages.
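As a rough sketch of the module route, here is the same conversion done with the core module Time::Piece rather than DateTime (a swapped-in choice, so that nothing needs installing). The helper name reformat_date is made up for this example.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: "October 24, 2011" -> "10,24,2011"
sub reformat_date {
    my ($text) = @_;

    # %B matches the month name, %d the day, %Y the four-digit year
    my $t = Time::Piece->strptime( $text, '%B %d, %Y' );
    return $t->strftime('%m,%d,%Y');
}

print reformat_date('October 24, 2011'), "\n";
```

Letting the module parse the date means an invalid month name raises an error instead of silently producing an empty value.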
Leaving aside the fact that you pasted non-compiling code (you forgot the trailing "/", as sarnold said), your regex is wrong.
You used a GREEDY regex: .* - meaning take as many characters as possible while matching. So your regex matched October 24, instead of October.
You need to do \S+\s
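The difference is easy to see by capturing with both patterns (the variable names are just for this sketch):

```perl
use strict;
use warnings;

my $date = 'October 24, 2011';

# .* is greedy: it runs to the last whitespace character it can still match
my ($greedy) = $date =~ /(.*)\s/;

# \S+ stops at the first whitespace character, so only the month is captured
my ($word) = $date =~ /(\S+)\s/;

print "greedy: '$greedy'\n";    # greedy: 'October 24,'
print "word:   '$word'\n";      # word:   'October'
```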
Do you want to "substitute array elements with hash values," or do you want to map month names to numbers. If it's the latter, the following will convert month_name day year to month_number day year with less code:
perl -le '$d=$ARGV[0]; for (qw{Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec}) { $i++; last if $d =~ s/\b$_[^\s]*/$i/i; }; print $d' "october 24, 2011"
Here's some feedback on your code:
Your pasted code does not compile very well.
You didn't use strict and warnings.
01 to 09 need to be quoted (08 and 09 are invalid octal literals, and unquoted 01 to 07 lose their leading zero).
You do not need to reassign $_ inside your map statement.
map needs to end with the value you intend to insert, e.g.: map { s/(\w+)/$days{$1}/; $_ }
say for @output looks nicer. =)