perl -n with input from pipe and @ARGV

I've got a one-liner like this:
date +%H | perl -ne 'printf "%02d", $_ - ($ARGV[0] - 1);' 1
It says:
Can't open 1: Datei oder Verzeichnis nicht gefunden.
The error message means "File or directory not found".
I want it to take both the output from date and the commandline argument at the same time.
Essentially it should give me the current hour minus (the argument minus one). If there are better ways to achieve this, I'll happily accept them. I'd still be grateful for an explanation of why this doesn't work.
Let's assume it's after 10am now.
Param Output
1 10
2 09
3 08

Since the result might fall on yesterday or even further back in the past, it does not make much sense to print just the hour.
perl -mDateTime -e'
my $dt = DateTime->now;
$dt->subtract(hours => 1);
$dt->subtract(hours => shift @ARGV);
print $dt->hour;
' 4
Wherever possible, use a standard timestamp format such as RFC 3339, which is in wide use.
perl -mDateTime -mDateTime::Format::RFC3339 -e'
my $dt = DateTime->now;
$dt->subtract(hours => 1);
$dt->subtract(hours => shift @ARGV);
print DateTime::Format::RFC3339->new->format_datetime($dt);
' 4
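If installing DateTime is not an option, the same idea can be sketched with core modules only. This is a minimal sketch and not the method used above: hours_ago_rfc3339 is a hypothetical helper name, and it formats in UTC to sidestep the time zone and DST handling that DateTime would otherwise take care of.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: RFC 3339 timestamp (UTC) for $hours hours before $epoch.
sub hours_ago_rfc3339 {
    my ($epoch, $hours) = @_;
    return strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($epoch - 3600 * $hours));
}

# Mirror the one-liners above: one hour back, plus the hours given as argument.
my $hours = 1 + (@ARGV ? shift @ARGV : 0);
print hours_ago_rfc3339(time, $hours), "\n";
```

Working on epoch seconds means the date rolls back automatically when the subtraction crosses midnight.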
Your Perl one-liner deparses to:
LINE: while (defined($_ = <ARGV>)) {
printf '%02d', $_ - ($ARGV[0] - 1);
}
… because of -n. ARGV:
The special filehandle that iterates over command-line filenames in @ARGV.
But you have no filenames as arguments.

perl -n creates an implicit loop reading files listed as arguments or STDIN if there are no arguments; this conflicts with your use of an argument for something different. You can fix it by clearing @ARGV in a BEGIN block:
date +%H | perl -ne 'BEGIN{ $arg=shift } printf "%02d", $_ - ($arg - 1);' 1
but for this particular task, you're better off doing the date calculation entirely in perl anyway.
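Doing the calculation entirely in Perl could look like the sketch below. hour_minus is a hypothetical helper name; it applies the same formula as the one-liner and wraps the result into 0..23, so it still says nothing about which day the hour belongs to.

```perl
use strict;
use warnings;

# Hypothetical helper: hour minus ($n - 1), wrapped into the range 0..23.
sub hour_minus {
    my ($hour, $n) = @_;
    return sprintf '%02d', ($hour - ($n - 1)) % 24;
}

# Take the argument off @ARGV first; nothing is read from a pipe here.
my $n    = @ARGV ? shift @ARGV : 1;
my $hour = (localtime)[2];           # current hour, local time
print hour_minus($hour, $n), "\n";
```

Perl's % operator returns a non-negative result for a positive right operand, which is what makes the wrap past midnight work.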

Trying to input variable into url and having encoding issues

I am new to Perl and am trying to write a script that takes input from the user, fetches XML data from a website using a URL built from that input, and relays it back to the user.
But I have had some issues building a usable link from the user's input.
This is my code in full:
use strict;
use warnings;
my $row = 0;
use XML::LibXML;
print "\n\n\nOn what place do you need a weather report for? -> ";
chomp( my $ort = <> );
my $url = join('', "http://www.yr.no/place/Sweden/Västra_Götaland/",$ort,"/forecast_hour_by_hour.xml");
my $dom = XML::LibXML->load_xml(location => $url);
print "\n\nSee below the weather for ", $ort, ":\n\n";
foreach my $weatherdata ($dom->findnodes('//time')) {
if($row != 10){
my $temp = $weatherdata->findvalue('./temperature/@value');
my $value = $weatherdata->findvalue('./@from');
my $valuesub = substr $value, 11, 5;
print "At ", $valuesub, " the temperature will be: ", $temp, "C\n";
$row++;
}
}
print "\n\n";
If I enter a place I want weather info for, for example:
Mellerud
the script uses it and I get a response from the link with proper data.
However, if I enter
Åmål
the script cannot make sense of it. I now get:
Could not create file parser context for file
"http://www.yr.no/place/Sweden/V├ñstra_G├Âtaland/Åmål/forecast_hour_by_hour.xml":
No error at test4.pl line 14
If I replace ",$ort," with a literal Åmål in the code, I get the proper result.
I have been searching for different types of encoding for this, but I have not found a solution that works.
Once again I would like to point out that I am really new to this. I might miss something really simple. My apologies for that.
::EDIT 1::
Following a suggestion from @zdim, I added use open ':std', ':encoding(UTF-8)';
This changed the results, but it only produced more errors.
I am also running this in Windows CMD with administrator privileges.
According to @zdim, it runs fine on Linux with xterm for input, v5.16.
Is there a way to make it work in Windows?
The problem is that CMD.exe is limited to 8-bit codepages. The "Å" and "å" characters are mapped (in Swedish Windows) to positions in the upper 8-bit range of codepage 850 that are illegal code points in Unicode.
If you need to output non-7-bit-ASCII characters, consider running PowerShell ISE. If you set it up correctly, it can cope with any character (in output) that the font you're using supports. The big downside is that PowerShell ISE is not a console, and therefore doesn't allow input from console/keyboard using STDIN. You can work around this by supplying your input as arguments, from a pipe, in a settings file, or through graphical UI query elements.
To set up Windows PowerShell ISE to work with UTF8:
Set PowerShell to allow running local unsigned user scripts by running (in administrator elevated PowerShell):
Set-ExecutionPolicy RemoteSigned
Create or edit the file "<Documents>\WindowsPowerShell\Microsoft.PowerShellISE_profile.ps1" and add something like:
perl -w -e 'print qq!Initializing with Perl...\n!;'
[System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8;
(You need the Perl bit (or something equivalent) there to allow for the
modification of the encoding.)
In PowerShell ISE's options, set the font to Consolas.
In your perl scripts, always do:
binmode(STDOUT, ':encoding(UTF-8)');
binmode(STDERR, ':encoding(UTF-8)');
My solution to the OP's problem:
use strict;
use warnings;
my $row = 0;
use XML::LibXML;
binmode(STDOUT, ':encoding(UTF-8)');
binmode(STDERR, ':encoding(UTF-8)');
@ARGV or die "No arguments!\n";
my $ort = shift @ARGV;
print "\n\n\nGetting weather report for \"$ort\"\n";
my $url = join('', "http://www.yr.no/place/Sweden/Västra_Götaland/",$ort,"/forecast_hour_by_hour.xml");
my $dom = XML::LibXML->load_xml(location => $url);
print "\n\nSee below the weather for ", $ort, ":\n\n";
foreach my $weatherdata ($dom->findnodes('//time')) {
if($row != 10){
my $temp = $weatherdata->findvalue('./temperature/@value');
my $value = $weatherdata->findvalue('./@from');
my $valuesub = substr $value, 11, 5;
print "At ", $valuesub, " the temperature will be: ", $temp, "C\n";
$row++;
}
}
print "\n\n";
Output:
(run at around 2018-06-09T14:05 UTC; 16:05 CEST (which is Sweden's time zone)):
PS (censored)> perl -w $env:perl5lib\Tests\Amal-Test.pl "Åmål"
Getting weather report for "Åmål"
See below the weather for Åmål:
At 17:00 the temperature will be: 27C
At 18:00 the temperature will be: 26C
At 19:00 the temperature will be: 25C
At 20:00 the temperature will be: 23C
At 21:00 the temperature will be: 22C
At 22:00 the temperature will be: 21C
At 23:00 the temperature will be: 20C
At 00:00 the temperature will be: 19C
At 01:00 the temperature will be: 18C
At 02:00 the temperature will be: 17C
Another note:
Relying on data to always be in an exact position in a string might not be the best idea.
Instead of:
my $valuesub = substr $value, 11, 5;
maybe consider matching it with a regular expression instead:
if ($value =~ /T((?:[01]\d|2[0-3]):[0-5]\d):/) {
my $valuesub = $1;
print "At ", $valuesub, " the temperature will be: ", $temp, "C\n";
}
else {
warn "Malformed value: $value\n";
}
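The regular expression above can be tried in isolation with a small sketch like this one. extract_hhmm is a hypothetical helper name, and the sample strings are assumptions about what the from attribute looks like (a timestamp such as 2018-06-09T17:00:00), based on the substr offsets used in the script.

```perl
use strict;
use warnings;

# Hypothetical helper: return "HH:MM" from a timestamp, or undef if the
# value does not contain a valid THH:MM: part.
sub extract_hhmm {
    my ($value) = @_;
    return $value =~ /T((?:[01]\d|2[0-3]):[0-5]\d):/ ? $1 : undef;
}

for my $value ('2018-06-09T17:00:00', 'not a timestamp') {
    my $hhmm = extract_hhmm($value);
    if (defined $hhmm) {
        print "At $hhmm\n";
    }
    else {
        warn "Malformed value: $value\n";
    }
}
```

Unlike substr, this rejects values where the hour or minute is out of range instead of silently printing whatever sits at offset 11.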

How to find number of numerical data for each and every line in a file

Please help me count the numerical data in each line of a file,
and also find the line length. The code has to be written in Perl.
For example if I have a line such as:
INPUT:I was born on 24th october,1994.
Output:2
You could do something like this:
perl -ne 'BEGIN{my $x} $x += () = /[0-9]+/g; END{print($x . "\n")}' file
-n: causes Perl to assume the following loop around your program, which makes it iterate over filename arguments somewhat like sed -n or awk:
LINE:
while (<>) {
... # your program goes here
}
-e: may be used to enter one line of program;
() = will make /[0-9]+/g be evaluated in list context (i.e. () = /[0-9]+/g will return a list containing the sequences of one or more digits found in the default input), while $x += will make that list assignment be evaluated in scalar context, where it yields the number of elements (i.e. $x += () = /[0-9]+/g will add the number of digit sequences found in the current line to $x); END{print($x . "\n")} will print $x after the whole file has been processed. Note that this prints one total for the whole file, not a count per line.
% cat file
string 123 string 1 string string string
456 string
% perl -ne 'BEGIN{my $x} $x += () = /[0-9]+/g; END{print($x . "\n")}' file
3
%
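Since the question asks for a count per line plus the line length, the same "() =" idiom can be applied line by line. This is a minimal sketch, not part of the answer above; count_numbers is a hypothetical helper name and the sample lines are taken from the examples in this thread.

```perl
use strict;
use warnings;

# Number of digit runs in a line, using the "() =" counting idiom:
# the match runs in list context and the assignment is read in scalar context.
sub count_numbers {
    my ($line) = @_;
    my $count = () = $line =~ /[0-9]+/g;
    return $count;
}

my @lines = (
    'I was born on 24th october,1994.',
    'string 123 string 1 string string string',
);

for my $line (@lines) {
    printf "%d numbers, length %d: %s\n",
        count_numbers($line), length($line), $line;
}
```

To run it over a real file, the array could be replaced by a while loop over a filehandle with chomp applied to each line.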
I'd do something like this
#!/usr/bin/perl
use warnings;
use strict;
my $file = 'num.txt';
open my $fh, '<', $file or die "Failed to open $file: $!\n";
while (my $line = <$fh>){
chomp $line;
my @num = $line =~ /([0-9.]+)/g;
print "On this line --- " .scalar(@num) . "\n";
}
close ($fh);
The input file I tested --
This should say 1
Line 2 should say 2
I want this line to say 5 so I have added 4 other numbers like 0.02 -1 and 5.23
The output as tested ----
On this line --- 1
On this line --- 2
On this line --- 5
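The regex ([0-9.]+) matches any run of digits and dots, so it picks up decimals as well. (You could use just ([0-9]+) if you only have integers, since you are only counting the numbers and not using the actual values, but that would count 0.02 as two numbers.) Be aware that ([0-9.]+) also matches a lone "." such as a sentence-ending full stop.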
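A slightly stricter pattern avoids counting a bare dot as a number. The following is a sketch under the assumption that a "number" is a run of digits with an optional fractional part; signs and exponents are ignored, and count_decimal_numbers is a hypothetical helper name.

```perl
use strict;
use warnings;

# Count integers and decimals such as 5 or 0.02; a lone "." is not a number.
sub count_decimal_numbers {
    my ($line) = @_;
    my @num = $line =~ /[0-9]+(?:\.[0-9]+)?/g;
    return scalar @num;
}

for my $line (
    'This should say 1',
    'I want this line to say 5 so I have added 4 other numbers like 0.02 -1 and 5.23',
    'No numbers here.',
) {
    print 'On this line --- ', count_decimal_numbers($line), "\n";
}
```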
Hope it helps.

Real Time stamp in perl code

Here's my perl code snippet
if($line =~ m/^Warning: (.*)$/){
$subStepValues = {
Warning => $1,
Warning_timeStamp => `date`,
};
push @{$subsSteps->{'subStepValues'}}, $subStepValues;
}
I am piping the output of tail -f on a file into my Perl code, and I would really like to get the actual timestamp on the fly. Currently, executing date this way is somehow not working.
Any better suggestions?
How about a nice ISO timestamp?
use POSIX qw(strftime);
if ($line =~ m/^Warning: (.*)$/)
{
$subStepValues = {
Warning => $1,
Warning_timeStamp => strftime("%Y-%m-%dT%H:%M:%S", localtime),
};
push @{$subsSteps->{'subStepValues'}}, $subStepValues;
}
Here is a simple proof of concept from the command line: create an empty file, run tail -f on it, then go to another terminal and append a few lines to it with echo something >> log
schumack@daddyo2 12-18T1:57:23 338> touch log
schumack@daddyo2 12-18T1:57:26 339> tail -f log | perl -lne 'BEGIN{use POSIX qw(strftime);}chomp; printf "%s -- %s\n", strftime("%Y-%m-%dT%H:%M:%S", localtime), $_;'
2015-12-18T01:57:40 -- hello
2015-12-18T01:57:46 -- line 2
2015-12-18T01:57:50 -- line 3
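The strftime approach can also be wrapped in a small function and used when building the hash from the question. This is a minimal sketch: iso_timestamp is a hypothetical helper name and the sample warning line is made up for illustration. One practical difference from backticks is that the value returned by strftime has no trailing newline, whereas the output captured from `date` does.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: ISO 8601 style local timestamp for an epoch (default: now).
sub iso_timestamp {
    my ($epoch) = @_;
    $epoch = time unless defined $epoch;
    return strftime('%Y-%m-%dT%H:%M:%S', localtime $epoch);
}

my $line = 'Warning: disk almost full';    # made-up sample input
if ($line =~ m/^Warning: (.*)$/) {
    my $subStepValues = {
        Warning           => $1,
        Warning_timeStamp => iso_timestamp(),   # taken when the line is processed
    };
    print "$subStepValues->{Warning_timeStamp} -- $subStepValues->{Warning}\n";
}
```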

Summing a column of numbers in a text file using Perl

Ok, so I'm very new to Perl. I have a text file and in the file there are 4 columns of data(date, time, size of files, files). I need to create a small script that can open the file and get the average size of the files. I've read so much online, but I still can't figure out how to do it. This is what I have so far, but I'm not sure if I'm even close to doing this correctly.
#!/usr/bin/perl
open FILE, "files.txt";
#@array = File;
while(FILE){
#chomp;
($date, $time, $numbers, $type) = split(/ /,<FILE>);
$total += $numbers;
}
print"the total is $total\n";
This is how the data looks in the file. These are just a few of them. I need to get the numbers in the third column.
12/02/2002 12:16 AM 86016 a2p.exe
10/10/2004 11:33 AM 393 avgfsznew.pl
11/01/2003 04:42 PM 38124 c2ph.bat
Your program is reasonably close to working. With these changes it will do exactly what you want:
Always use use strict and use warnings at the start of your program, and declare all of your variables using my. That will help you by finding many simple errors that you may otherwise overlook.
Use lexical file handles, the three-parameter form of open, and always check the return status of any open call.
Declare the $total variable outside the loop. Declaring it inside the loop means it will be created and destroyed each time around the loop and it won't be able to accumulate a total.
Declare a $count variable in the same way. You will need it to calculate the average.
Using while (FILE) {...} just tests that FILE is true. You need to read from it instead, so you must use the readline operator like <FILE>.
You want the default call to split (without any parameters), which returns all the non-space fields in $_ as a list.
You need to add a variable in the assignment to allow for the AM or PM field in each line.
Here is a modification of your code that works fine:
use strict;
use warnings;
open my $fh, '<', "files.txt" or die $!;
my $total = 0;
my $count = 0;
while (<$fh>) {
my ($date, $time, $ampm, $numbers, $type) = split;
$total += $numbers;
$count += 1;
}
print "The total is $total\n";
print "The count is $count\n";
print "The average is ", $total / $count, "\n";
output
The total is 124533
The count is 3
The average is 41511
It's tempting to use Perl's awk-like auto-split option. There are 5 columns; three containing date and time information, then the size and then the name.
The first version of the script that I wrote is also the most verbose:
perl -n -a -e '$total += $F[3]; $num++; END { printf "%12.2f\n", $total / ($num + 0.0); }'
The -a (auto-split) option splits a line up on white space into the array @F. Combined with the -n option (which makes Perl run in a loop that reads the file name arguments in turn, or standard input, without printing each line), the code adds $F[3] (the fourth column, since indices count from 0) to $total, which is automagically initialized to zero on first use. It also counts the lines in $num. The END block is executed when all the input is read; it uses printf() to format the value. The + 0.0 ensures that the arithmetic is done in floating point, not integer arithmetic. This is very similar to the awk script:
awk '{ total += $4 } END { print total / NR }'
First drafts of programs are seldom optimal — or, at least, I'm not that good a programmer. Revisions help.
Perl was designed, in part, as an awk killer. There is still a program a2p distributed with Perl for converting awk scripts to Perl (and there's also s2p for converting sed scripts to Perl). And Perl does have an automatic (built-in) variable that keeps track of the number of lines read. It has several names. The tersest is $.; the mnemonic name $NR is available if you use English; in the script; so is $INPUT_LINE_NUMBER. So, using $num is not necessary. It also turns out that Perl does a floating point division anyway, so the + 0.0 part was unnecessary. This leads to the next versions:
perl -MEnglish -n -a -e '$total += $F[3]; END { printf "%12.2f\n", $total / $NR; }'
or:
perl -n -a -e '$total += $F[3]; END { printf "%12.2f\n", $total / $.; }'
You can tune the print format to suit your whims and fancies. This is essentially the script I'd use in the long term; it is fairly clear without being long-winded in any way. The script could be split over multiple lines if you desired. It is a simple enough task that the legibility of the one-liner is not a problem, IMNSHO. And the beauty of this is that you don't have to futz around with split and arrays and read loops on your own; Perl does most of that for you. (Granted, it does blow up on empty input; that fix is trivial; see below.)
Recommended version
perl -n -a -e '$total += $F[3]; END { printf "%12.2f\n", $total / $. if $.; }'
The if $. tests whether the number of lines read is zero or not; the printf and division are omitted if $. is zero so the script outputs nothing when given no input.
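For comparison, here is roughly what the one-liner does when written out as a function. This is a sketch, not an exact expansion: average_size is a hypothetical helper name, and unlike the one-liner it skips lines with fewer than four fields instead of counting them.

```perl
use strict;
use warnings;

# Hypothetical helper: average of the 4th whitespace-separated field,
# or undef when there is nothing to average.
sub average_size {
    my @lines = @_;
    my ($total, $count) = (0, 0);
    for my $line (@lines) {
        my @F = split ' ', $line;    # the same kind of split that -a performs
        next if @F < 4;              # assumption: ignore short or blank lines
        $total += $F[3];
        $count++;
    }
    return $count ? $total / $count : undef;
}

my @sample = (
    '12/02/2002 12:16 AM 86016 a2p.exe',
    '10/10/2004 11:33 AM 393 avgfsznew.pl',
    '11/01/2003 04:42 PM 38124 c2ph.bat',
);
my $average = average_size(@sample);
printf "%12.2f\n", $average if defined $average;
```

Returning undef for empty input plays the same role as the if $. guard: the caller decides what to print when there is no data.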
There is a noble (or ignoble) game called 'Code Golf' that was much played in the early days of Stack Overflow, but Code Golf questions are no longer considered good questions. The object of Code Golf is to write a program that does a particular task in as few characters as possible. You can play Code Golf with this and compress it still further if you're not too worried about the format of the output and you're using at least Perl 5.10:
perl -Mv5.10 -n -a -e '$total += $F[3]; END { say $total / $. if $.; }'
And, clearly, there are a lot of unnecessary spaces and letters in there:
perl -Mv5.10 -nae '$t+=$F[3];END{say$t/$.if$.}'
That is not, however, as clear as the recommended version.
#!/usr/bin/perl
use warnings;
use strict;
open my $file, "<", "files.txt";
my ($total, $cnt);
while(<$file>){
$total += (split(/\s+/, $_))[3];
$cnt++;
}
close $file;
print "number of files: $cnt\n";
print "total size: $total\n";
printf "avg: %.2f\n", $total/$cnt;
Or you can use awk:
awk '{t+=$4} END{print t/NR}' files.txt
Try doing this :
#!/usr/bin/perl -l
use strict; use warnings;
open my $file, '<', "my_file" or die "open error [$!]";
my ($total, $count);
while (<$file>){
chomp;
next if /^$/;
my ($date, $time, $x, $numbers, $type) = split;
$total += $numbers;
$count++;
}
print "the average is " . $total/$count . " and the total is $total";
close $file;
It is as simple as this:
perl -F -lane '$a+=$F[3];END{print "The average size is ".$a/$.}' your_file
tested below:
> cat temp
12/02/2002 12:16 AM 86016 a2p.exe
10/10/2004 11:33 AM 393 avgfsznew.pl
11/01/2003 04:42 PM 38124 c2ph.bat
Now the execution:
> perl -F -lane '$a+=$F[3];END{print "The average size is ".$a/$.}' temp
The average size is 41511
>
explanation:
-F -a tells Perl to split each line into the array @F, with whitespace (space or tab) as the default separator.
So now $F[3] holds the size of the file.
Sum up all the sizes in the 4th column until all the lines are processed.
END is executed after all the lines in the file have been processed.
So $. at the end gives the number of lines,
and $a/$. gives the average.
This solution opens the file and loops through each line of it. It then splits each line into five variables by splitting on one or more whitespace characters.
open opens the file for reading ("<"); if that fails, or die "..." raises an error.
my ($total, $cnt) are our column total and the count of files added.
while(<FILE>) { ... } loops through each line of the file using the file handle and stores the line in $_.
chomp removes the input record separator from $_. On Unix, the default separator is a newline, \n.
split(/\s+/, $_) splits the current line, held in $_, on the delimiter \s+. \s matches a whitespace character, and the + afterward means "1 or more". So we split the line on 1 or more whitespace characters.
Next we update $total and $cnt.
#!/usr/bin/perl
open FILE, "<", "files.txt" or die "Error opening file: $!";
my ($total, $cnt);
while(<FILE>){
chomp;
my ($date, $time, $am_pm, $numbers, $type) = split(/\s+/, $_);
$total += $numbers;
$cnt++;
}
close FILE;
print "the total is $total and count of $cnt\n";
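One detail worth knowing when choosing between split(/\s+/, $_) as used here and the bare split used in other answers: they differ when a line starts with whitespace. The sketch below uses a made-up indented variant of one of the sample lines to show the difference.

```perl
use strict;
use warnings;

# Assumption for illustration: the same data line, but indented by two spaces.
my $line = "  12/02/2002 12:16 AM 86016 a2p.exe\n";

my @by_regex = split /\s+/, $line;   # leading whitespace yields an empty first field
my @by_space = split ' ',  $line;    # awk-style split: leading whitespace is skipped

print 'split /\s+/ gives ', scalar(@by_regex), " fields; index 3 is '$by_regex[3]'\n";
print 'split \' \' gives ',  scalar(@by_space), " fields; index 3 is '$by_space[3]'\n";
```

With the pattern form, the size would no longer be at index 3 for such a line, so split ' ' (or split with no arguments) is the more forgiving choice for this data.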

awk to perl conversion

I have a directory full of files containing records like:
FAKE ORGANIZATION
799 S FAKE AVE
Northern Blempglorff, RI 99xxx
01/26/2011
These items are being held for you at the location shown below each one.
IF YOU ASKED THAT MATERIAL BE MAILED TO YOU, PLEASE DISREGARD THIS NOTICE.
The Waltons. The complete DAXXXX12118198
Pickup at:CHUPACABRA LOCATION 02/02/2011
GRIMLY, WILFORD
29 FAKE LANE
S. BLEMPGLORFF RI 99XXX
I need to remove all entries with the expression Pickup at:CHUPACABRA LOCATION.
The "record separator" issue:
I can't touch the input file's formatting -- it must be retained as is. Each record
is separated by roughly 40+ new lines.
Here's some awk ( this works ):
BEGIN {
RS="\n\n\n\n\n\n\n\n\n+"
FS="\n"
}
!/CHUPACABRA/{print $0}
My stab with perl:
perl -a -F\n -ne '$/ = "\n\n\n\n\n\n\n\n\n+";$\ = "\n";chomp;$regex="CHUPACABRA";print $_ if $_ !~ m/$regex/i;' data/lib51.000
Nothing is returned. I'm not sure how to specify 'field separator' in perl except at the commandline. Tried the a2p utility -- no dice. For the curious, here's what it produces:
eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z
# process any FOO=bar switches
#$FS = ' '; # set field separator
$, = ' '; # set output field separator
$\ = "\n"; # set output record separator
$/ = "\n\n\n\n\n\n\n\n\n+";
$FS = "\n";
while (<>) {
chomp; # strip record separator
if (!/CHUPACABRA/) {
print $_;
}
}
This has to run under someone's Windows box otherwise I'd stick with awk.
Thanks!
Bubnoff
EDIT ( SOLVED )
Thanks mob!
Here's a ( working ) perl script version ( adjusted a2p output ):
eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z
# process any FOO=bar switches
#$FS = ' '; # set field separator
$, = ' '; # set output field separator
$\ = "\n"; # set output record separator
$/ = "\n"x10;
$FS = "\n";
while (<>) {
chomp; # strip record separator
if (!/CHUPACABRA/) {
print $_;
}
}
Feel free to post improvements or CPAN goodies that make this more idiomatic and/or perl-ish. Thanks!
In Perl, the record separator is a literal string, not a regular expression. As the perlvar doc famously says:
Remember: the value of $/ is a string, not a regex. awk has to be better for something. :-)
Still, it looks like you can get away with $/="\n" x 10 or something like that:
perl -a -F\n -ne '$/="\n"x10;$\="\n";chomp;$regex="CHUPACABRA";
print if /\S/ && !m/$regex/i;' data/lib51.000
Note the extra /\S/ &&, which will skip empty paragraphs from input that has more than 20 consecutive newlines.
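Because $/ cannot be a regex, another option (not the approach taken above) is to read the whole input into memory and split it with a regex that mirrors the awk RS of nine or more newlines. This is a minimal sketch assuming the file is small enough to slurp; filter_records is a hypothetical helper name and the second sample record is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: split text into records on runs of 9+ newlines,
# then drop blank records and records matching $pattern (case-insensitive).
sub filter_records {
    my ($text, $pattern) = @_;
    return grep { /\S/ && !/$pattern/i } split /\n{9,}/, $text;
}

# Two sample records separated by 12 newlines.
my $text = join "\n" x 12,
    "Pickup at:CHUPACABRA LOCATION 02/02/2011\nGRIMLY, WILFORD",
    "Pickup at:OTHER LOCATION 02/03/2011\nDOE, JANE";

my @kept = filter_records($text, 'CHUPACABRA');
print "$_\n" for @kept;
```

For a real file, $text could be filled with something like: my $text = do { local $/; <> }; which reads all input at once. Note that splitting this way discards the original separators, so it does not preserve the spacing between the records that are kept.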
Also, have you considered just installing Cygwin and having awk available on your Windows machine?
There is no need for (much) conversion if you can download gawk for Windows.
Did you know that Perl comes with a program called a2p that does exactly what you described you want to do in your title?
And, if you have Perl on your machine, the documentation for this program is already there:
C> perldoc a2p
My own suggestion is to get the Llama book and learn Perl anyway. Despite what the Python people say, Perl is a great and flexible language. If you know shell, awk and grep, you'll understand many of the Perl constructs without any problems.