Perl autosplit with non-file arguments

I'm trying to create a perl script that autosplits the STDIN, and then does something to column X. But I want the column to be passed by argument to the script. However, apparently, when I invoke perl with the autosplit flag, it switches to a mode where it tries to implicitly open the "files" defined in the command line arguments, even though my arguments aren't files.
Example:
my $column = shift;
while (<STDIN>) {
    print "F $column: $F[$column]!\n";
}
Then I try to run it with argument 2 to print the 2nd column:
$ echo -e "1 22\n2 33\n3 55" | perl -a myscript.pl 2
When I turn on warnings and such, I get this error message:
Can't open 2: No such file or directory (#1)
    (S inplace) The implicit opening of a file through use of the <>
    filehandle, either implicitly under the -n or -p command-line
    switches, or explicitly, failed for the indicated reason. Usually
    this is because you don't have read permission for a file which
    you named on the command line.

    (F) You tried to call perl with the -e switch, but /dev/null (or
    your operating system's equivalent) could not be opened.
If I omit the -a, I don't get any errors and all is good, but then I need to split my lines manually. Is there any other flag I can use to make autosplit not do that? Or alternatively, can I start autosplit inside the script, so that it won't try to implicitly open the arguments as files? I can split my input manually; I just want to know what other alternatives I have.

There's not much of a "mode" here. -n adds
while (readline) {
    ...
}
around your code; -a adds a split statement:
while (readline) {
    our @F = split;
    ...
}
readline (without an explicit argument) reads from the files named in @ARGV, or from STDIN if @ARGV is empty.
-a implicitly enables -n because otherwise there would be nothing for it to do; there is no "autosplit mode", it simply adds code around your program.
Your best bet is to split manually:
while (<STDIN>) {
    my @F = split;
    print "F $column: $F[$column]!\n";
}
It's fairly short and it makes it obvious what's going on.
I don't like using the command line shortcut switches (e.g. -a, -p, etc.) in script files. In a file you have enough space to write multiple lines, properly formatted / indented. It's better to be explicit.
That said, you can technically do this:
#!perl -na
our $column;
BEGIN { $column = shift; }
print "F $column: $F[$column]!\n";
This is equivalent to:
while (readline) {
    our @F = split;
    our $column;
    BEGIN { $column = shift; }
    print "F $column: $F[$column]!\n";
}
The BEGIN block only runs once, at compile time. Provided the script is only called with one argument, this will remove it from @ARGV before the readline loop starts, which will make it read from STDIN.
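For example, assuming those four lines are saved as myscript.pl, an invocation like the one in the question (with a zero-based column index) now reads from STDIN as intended:
$ echo -e "1 22\n2 33\n3 55" | perl myscript.pl 1
F 1: 22!
F 1: 33!
F 1: 55!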

-n, -p, -l, -a and -F don't change the state of the Perl interpreter. They literally alter the source code.
$ perl -MO=Deparse -a -ne'f()'
LINE: while (defined($_ = readline ARGV)) {
    our @F = split(' ', $_, 0);
    f();
}
-e syntax OK
As you can see, there's really no such thing as an autosplit mode. It's just a normal split injected into the source code, so there's little point in asking Perl to insert the split for you when you can write it yourself. Simply add the following to your loop:
my @F = split;
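Putting it together, a minimal standalone version of the script from the question could look like this (a sketch, taking the zero-based column index as the only argument):
#!/usr/bin/perl
use strict;
use warnings;

my $column = shift;                       # zero-based column index
while (<STDIN>) {
    my @F = split;                        # manual split, doing what -a would have done
    print "F $column: $F[$column]!\n";
}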

Related

Tail command used in perl backticks

I'm trying to run a tail command from within a perl script using the usual backticks.
The section in my perl script is as follows:
$nexusTime += nexusUploadTime(`tail $log -n 5`);
So I'm trying to get the last 5 lines of this file but I'm getting the following error when the perl script finishes:
sh: line 1: -n: command not found
Even though when I run the command on the command line it is indeed successful, and I can see the last 5 lines of that particular file.
I'm not sure what is going on here. Why does it work from the command line, but through Perl it won't recognize the -n option?
Anybody have any suggestions?
$log has an extraneous trailing newline, so you are executing
tail file.log
-n 5 # Tries to execute a program named "-n"
Fix:
chomp($log);
Note that you will run into problems if $log contains shell metacharacters (such as spaces). Fix:
use String::ShellQuote qw( shell_quote );
my $tail_cmd = shell_quote('tail', '-n', '5', '--', $log);
$nexusTime += nexusUploadTime(`$tail_cmd`);
ikegami pointed out your error, but I would recommend avoiding external commands whenever possible. They aren't portable and debugging them can be a pain, among other things. You can simulate tail with pure Perl code like this:
use strict;
use warnings;
use File::ReadBackwards;
sub tail {
    my ($file, $num_lines) = @_;
    my $bw = File::ReadBackwards->new($file) or die "Can't read '$file': $!";
    my ($lines, $count) = ('', 0);
    while (defined(my $line = $bw->readline) && $num_lines > $count++) {
        $lines .= $line;    # lines are accumulated last-line-first (see the output below)
    }
    $bw->close;
    return $lines;
}
print tail('/usr/share/dict/words', 5);
Output
ZZZ
zZt
Zz
ZZ
zyzzyvas
Note that if you pass a file name containing a newline, this will fail with
Can't read 'foo
': No such file or directory at tail.pl line 10.
instead of the more cryptic
sh: line 1: -n: command not found
that you got from running the tail utility in backticks.
The answer to this question is to place the -n 5 option before the target file.
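Applied to the backticks call from the question, that looks like:
$nexusTime += nexusUploadTime(`tail -n 5 $log`);
With the options first, the stray trailing newline at the end of $log merely terminates the command instead of splitting it in two, though chomping $log is still the cleaner fix.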

Linux shell: change Perl code to linux shell, grep line by line

The following code is a Perl script that greps lines containing 'Stage ' from a host log, then matches each line against a regex and, on every match, increments a counter:
$command = 'grep \'Stage \' ' . $hostlog;
@stage_info = qx($command);
foreach (@stage_info) {
    if ( /Stage\s(\d+)\s(.*)/ ) {
        $stage_number = $stage_number + 1;
    }
}
How can I do this in a Linux shell? Based on my tests, I can't simply loop over the output word by word, since the lines contain spaces.
That is a horrible piece of Perl code you've got there. Here's why:
1. It looks like you are not using use strict; use warnings;. That is a huge mistake; it will not prevent errors, it will just hide them.
2. Using qx() to grep lines from a file is a completely redundant thing to do, as this is what Perl does best itself. "Shelling out" a process like that most often slows your program down.
3. Use some whitespace to make your code readable. This is hard to read, and looks more complicated than it is.
4. You capture strings by using parentheses in your regex, but you never use these strings.
5. Re: $stage_number = $stage_number + 1, see point 3. And also, this can be written $stage_number++. Using the ++ operator will make your code clearer, will prevent the uninitialized warnings, and save you some typing.
Here is what your code should look like:
use strict;
use warnings;

my $hostlog      = shift;    # log file name, e.g. passed on the command line
my $stage_number = 0;        # declare the counter (required under strict)
open my $fh, "<", $hostlog or die "Cannot open $hostlog for reading: $!";
while (<$fh>) {
    if (/Stage\s\d+/) {
        $stage_number++;
    }
}
You're not doing anything with the internal captures, so why bother? You could do everything with a grep:
$ stage_number=$(grep -E 'Stage\s\d+\s' "$hostlog" | wc -l)
This is using extended regular expressions. I believe the GNU version takes these without a -E parameter, and in Solaris, even the egrep command might not quite allow for this regular expression.
If there's something more you have to do, you've got to explain it in your question.
If I understand the issue correctly, you should be able to do this just fine in the shell:
while read; do
    if echo "${REPLY}" | grep -q -P "'Stage' "; then
        :    # Do what you need to do
    fi
done < test.log
Note that if your grep command supports the -P option you may be able to use the Perl regular expression as-is for the second test.
This is almost it; bash has no expression for multiple digits.
#!/bin/bash
command=( grep 'Stage ' "$hostlog" )
while read line
do
    [ "$line" != "${line/Stage [0-9]/}" ] && (( ++stage_number ))
done < <( "${command[@]}" )
On the other hand, taking the function of the Perl script into account rather than the operations it performs, the whole thing could be rewritten as
(( stage_number += ` grep -c 'Stage \d\+\s' "$hostlog" ` ))
or this
stage_number=` grep -c 'Stage \d\+\s' "$hostlog" `
if, in the original Perl, stage_number is uninitialised or is initialised to 0.

Perl one-liner: how to reference the filename passed in when -ne or -pe commandline switches are used

In Perl, it's normally easy enough to get a reference to the commandline arguments. I just use $ARGV[0] for example to get the name of a file that was passed in as the first argument.
When using a Perl one-liner, however, it seems to no longer work. For example, here I want to print the name of the file that I'm iterating through if a certain string is found within it:
perl -ne 'print $ARGV[0] if(/needle/)' haystack.txt
This doesn't work, because ARGV doesn't get populated when the -n or -p switch is used. Is there a way around this?
What you are looking for is $ARGV. Quote from perlvar:
$ARGV
Contains the name of the current file when reading from <>.
So, your one-liner would become:
perl -ne 'print $ARGV if(/needle/)' haystack.txt
Though be aware that it will print once for each match. If you want a newline added to the print, you can use the -l option.
perl -lne 'print $ARGV if(/needle/)' haystack.txt
If you want it to print only once per matching file, you can close the ARGV file handle, which makes it skip to the next file:
perl -lne 'if (/needle/) { print $ARGV; close ARGV }' haystack.txt haystack2.txt
As Peter Mortensen points out, $ARGV and $ARGV[0] are two different variables. $ARGV[0] refers to the first element of the array @ARGV, whereas $ARGV is a scalar, which is a completely different variable.
You say that @ARGV is not populated when using the -p or -n switch, which is not true. The code that runs silently is something like:
while (@ARGV) {
    $ARGV = shift @ARGV;            # arguments are removed during runtime
    open ARGV, $ARGV or die $!;
    while (defined($_ = <ARGV>)) {  # long version of: while (<>) {
        # your code goes here
    } continue {                    # when using the -p switch
        print $_;                   # it includes a print statement
    }
}
Which in essence means that using $ARGV[0] will not show the current file name, because it has already been shifted out of @ARGV and placed in $ARGV by the time your code runs.
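A quick way to see both variables at once (assuming a non-empty haystack.txt):
$ perl -lne 'print "ARGV=$ARGV, \@ARGV=@ARGV"; exit' haystack.txt
ARGV=haystack.txt, @ARGV=
With a single file argument, @ARGV is already empty inside the loop, while $ARGV holds the name of the file currently being read.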

How can I convert Perl one-liners into complete scripts?

I find a lot of Perl one-liners online. Sometimes I want to convert these one-liners into a script, because otherwise I'll forget the syntax of the one-liner.
For example, I'm using the following command (from nagios.com):
tail -f /var/log/nagios/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
I'd like to replace it with something like this:
tail -f /var/log/nagios/nagios.log | ~/bin/nagiostime.pl
However, I can't figure out the best way to quickly throw this stuff into a script. Does anyone have a quick way to throw these one-liners into a Bash or Perl script?
You can convert any Perl one-liner into a full script by passing it through the B::Deparse compiler backend that generates Perl source code:
perl -MO=Deparse -pe 's/(\d+)/localtime($1)/e'
outputs:
LINE: while (defined($_ = <ARGV>)) {
    s/(\d+)/localtime($1);/e;
}
continue {
    print $_;
}
The advantage of this approach over decoding the command line flags manually is that this is exactly the way Perl interprets your script, so there is no guesswork. B::Deparse is a core module, so there is nothing to install.
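For example (the output file name here is just an assumption, not part of the original answer), you can redirect the deparsed code straight into a script and finish it off by hand:
$ perl -MO=Deparse -pe 's/(\d+)/localtime($1)/e' > ~/bin/nagiostime.pl
$ chmod +x ~/bin/nagiostime.pl
The "-e syntax OK" message goes to STDERR, so it doesn't end up in the file; you only need to add a #!/usr/bin/perl line at the top.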
Take a look at perlrun:
-p
causes Perl to assume the following loop around your program, which makes it iterate over filename arguments somewhat like sed:
LINE:
while (<>) {
    ...             # your program goes here
} continue {
    print or die "-p destination: $!\n";
}
If a file named by an argument cannot be opened for some reason, Perl warns you about it, and moves on to the next file. Note that the lines are printed automatically. An error occurring during printing is treated as fatal. To suppress printing use the -n switch. A -p overrides a -n switch.
BEGIN and END blocks may be used to capture control before or after the implicit loop, just as in awk.
So, simply take this chunk of code, insert your code at the "# your program goes here" line, and voilà, your script is ready!
Thus, it would be:
#!/usr/bin/perl -w
use strict;    # or use 5.012 if you've got newer perls

while (<>) {
    s/(\d+)/localtime($1)/e;
} continue {
    print or die "-p destination: $!\n";
}
That one's really easy to store in a script!
#! /usr/bin/perl -p
s/(\d+)/localtime($1)/e
The -e option introduces Perl code to be executed—which you might think of as a script on the command line—so drop it and stick the code in the body. Leave -p in the shebang (#!) line.
In general, it's safest to stick to at most one "clump" of options in the shebang line. If you need more, you could always throw their equivalents inside a BEGIN {} block.
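For instance, here is a rough sketch (an approximation, not an exact equivalence) of folding the effect of -w into the body so that only -p remains on the shebang line:
#!/usr/bin/perl -p
BEGIN { $^W = 1 }    # roughly what adding -w to the shebang would do
s/(\d+)/localtime($1)/e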
Don't forget chmod +x ~/bin/nagiostime.pl
You could get a little fancier and embed the tail part too:
#! /usr/bin/perl -p
BEGIN {
    die "Usage: $0 [ nagios-log ]\n" if @ARGV > 1;
    my $log = @ARGV ? shift : "/var/log/nagios/nagios.log";
    @ARGV = ("tail -f '$log' |");
}
s/(\d+)/localtime($1)/e
This works because the code written for you by -p uses Perl's "magic" (2-argument) open that processes pipes specially.
With no arguments, it transforms nagios.log, but you can also specify a different log file, e.g.,
$ ~/bin/nagiostime.pl /tmp/other-nagios.log
Robert has the "real" answer above, but it's not very practical. The -p switch does a bit of magic, and other options have even more magic (e.g. check out the logic behind the -i flag). In practice, I'd simply just make a bash alias/function to wrap around the oneliner, rather than convert it to a script.
Alternatively, here's your oneliner as a script: :)
#!/usr/bin/bash
# takes any number of arguments: the filenames to pipe to the perl filter
tail -f "$@" | perl -pe 's/(\d+)/localtime($1)/e'
There are some good answers here if you want to keep the one-liner-turned-script around and possibly even expand upon it, but the simplest thing that could possibly work is just:
#!/usr/bin/perl -p
s/(\d+)/localtime($1)/e
Perl will recognize parameters on the hashbang line of the script, so instead of writing out the loop in full, you can just continue to do the implicit loop with -p.
But writing the loop explicitly and using -w and "use strict;" are good if you plan to use it as a starting point for writing a longer script.
#!/usr/bin/env perl
while (<>) {
    s/(\d+)/localtime($1)/e;
    print;
}
The while loop and the print are what -p does automatically for you.

Why can't I get the output of a command with system() in Perl?

When executing a command on the command-line from Perl, is there a way to store that result as a variable in Perl?
my $command = "cat $input_file | uniq -d | wc -l";
my $result = system($command);
$result always turns out to be 0.
Use "backticks":
my $command = "cat $input_file | uniq -d | wc -l";
my $result = `$command`;
And if interested in the exit code you can capture it with:
my $retcode = $?;
right after making the external call.
From perlfaq8:
Why can't I get the output of a command with system()?
You're confusing the purpose of system() and backticks (``). system() runs a command and returns exit status information (as a 16-bit value: the low 7 bits are the signal the process died from, if any, and the high 8 bits are the actual exit value). Backticks (``) run a command and return what it sent to STDOUT.
$exit_status = system("mail-users");
$output_string = `ls`;
You can use the Perl back-ticks to run a shell command, and save the result in an array.
my #results = `$command`;
To get just a single result from the shell command, you can store it in a scalar variable:
my $result = `$command`;
If you are expecting back multiple lines of output, it's easier to use an array, but if you're just expecting back one line, it's better to use scalar.
(something like that, my perl is rusty)
You can use backticks, as others have suggested. That's fine if you trust whatever variables you're using to build your command.
For more flexibility, you can open the command as a pipe and read from that as you would a file. This is particularly useful when you want to pass variables as command line arguments to the program and you don't trust their source to be free of shell escape characters, as open in recent Perl (>= 5.8) has the capacity to invoke a program from an argument list. So you can do the following:
open(FILEHANDLE, '-|', 'uniq', 'some-file.txt') or die "Cannot fork: $!\n";
while (<FILEHANDLE>) {
    # process a line from $_
}
close FILEHANDLE or die "child error: $!\n";
IPC::System::Simple provides the capture() function, a safe, portable alternative to backticks. It, and the other commands from this module, are highly recommended.
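A minimal sketch of what that looks like for the command in the question (assuming IPC::System::Simple is installed from CPAN and $input_file is set as before):
use IPC::System::Simple qw( capture );

# capture() runs the command without a shell and dies with a useful
# message if the command cannot be run or exits non-zero.
my @duplicated = capture('uniq', '-d', $input_file);
my $result     = scalar @duplicated;    # same count as piping through wc -l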
I'm assuming this is a "contrived" example, because it is equivalent to: uniq -d $input_file | wc -l.
In almost all of my experience, the only reason for putting the results into a Perl variable is to parse them later. In that case, I use the following pattern:
my $last;
my $lc = 0;
open(FN, "$input_file") or die "Cannot open $input_file: $!";
while (<FN>) {
    # any other parsing of the current line
    $lc++ if (defined $last && $last eq $_);
    $last = $_;
}
close(FN);
print "$lc\n";
This also has the added advantages:
no fork for shell, cat, uniq, and wc
faster
parse and collect the desired input