Expression in backticks with Perl `'cmd'.join ...`

I would like to send the remaining @ARGV to foo. I currently do this:
my $cmd = 'foo '.join ' ', @ARGV;
my $out = `$cmd`;
Is it possible to do it in one line? For instance with a non-existent e option:
my $out = qx/'foo'.join ' ', @ARGV/e;
In a more general case I might want to do this:
my $out = qx/'foo '.join(' ', keys %hash)/e;

The builtin readpipe function is what is at the back end of backticks/qx() calls, so you can use that directly:
my $out = readpipe('foo ' . join ' ', @ARGV);
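For completeness, a small sketch (foo is just the placeholder command from the question) that also checks the child's exit status, which lands in $? exactly as it does for backticks:
my $out = readpipe('foo ' . join ' ', @ARGV);
die "foo failed with status $?\n" if $? != 0;    # $? is the raw wait status, as with `...`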

You don't need to assemble the command prior to running it. The qx() operator (aliased by the backticks) interpolates.
perl -e 'print `echo @ARGV`' foo bar
or in your script:
my $out = `foo @ARGV`;
What "optional" says about qx and interpolation is right: beware that double interpolation might bite you, and it is prone to security issues!
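To make the risk concrete, here is a small, hypothetical sketch: an argument containing shell metacharacters is interpolated by Perl first and then parsed again by the shell.
# Hypothetical: if @ARGV contains ('out.txt; rm -rf ~'), the shell sees two commands.
my $out = `foo @ARGV`;    # Perl interpolates @ARGV, then /bin/sh re-parses the result
# The list form of system bypasses the shell, so metacharacters stay literal
# (at the cost of not capturing output):
system 'foo', @ARGV;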
Regarding your update: Try
perl -e '%h = (foo=>1,bar=>2); print `echo @{[keys %h]}`'
That constructs an anonymous array reference and immediately dereferences it. Hashes don't interpolate, but this arrayref trick lets you interpolate arbitrary Perl code that produces a list. Also, I'm pretty sure the compiler recognizes this idiom and optimizes away the reference/dereference.
But that is really ugly, nearly unreadable from my point of view. I'd rather recommend:
my @keys = keys %hash;
my $cmd = "foo @keys";
my $out = `$cmd`;
Hint: storing the command in a dedicated variable makes logging executed commands easier, which is really desirable.
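A minimal sketch of that hint, logging the command just before it runs:
my @keys = keys %hash;
my $cmd  = "foo @keys";
warn "about to run: $cmd\n";    # or write it to a log file
my $out  = `$cmd`;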

Sure
my $out = capture_this_command( 'foo', @ARGV );

sub capture_this_command {
    use Capture::Tiny qw/ capture /;
    ## local %ENV;
    ## delete @ENV{'PATH', 'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};
    ## $ENV{'PATH'} = '/bin:/usr/bin';
    my @cmd = @_;
    my( $stdout, $stderr, $exit ) = capture {
        system { $cmd[0] } @cmd;
    };
    if( $exit ){
        die "got the exit( $exit ) and stderr: $stderr\n";
    } elsif( $stderr ){
        warn "got stderr: $stderr\n";
    }
    return $stdout;
}
Update:
qx// is double-quoted: it interpolates, so everything perlintro/perlsyn/perlop say about that applies. But also remember that qx// calls your shell (run perl -V:sh to see which one you have), and shells have their own interpolation.
So you could write my $out = qx/foo @ARGV/; but it's subject to interpolation, first by Perl, then by whatever shell you're invoking.
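If you do have to go through the shell, one common mitigation (a sketch assuming the CPAN module String::ShellQuote is available) is to quote each argument before the command string is built:
use String::ShellQuote qw(shell_quote);
my $cmd = 'foo ' . shell_quote(@ARGV);    # each element is quoted for /bin/sh
my $out = qx/$cmd/;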


Perl filter with substitution

I am attempting to create a Perl script that filters data presented on STDIN, changing all occurrences of one string to another and outputting all input lines, changed and unchanged, to STDOUT. FROMSTRING and TOSTRING can be Perl-compatible regular expressions. I am unable to get matching output.
Here is an example of what I am trying to achieve.
echo "Today is Saturday" | f.pl 'a' '#'
Output: Tod#y is S#turd#y
echo io | filter.pl '([aeiou])([aeiou])' '$2$1'
Output: oi
#!/usr/bin/perl
use strict;
use warnings;
if (@ARGV != 2){
print STDERR "Usage: ./filter.pl FROMSTRING TOSTRING\n"
}
exit 1;
my $FROM = $ARGV[0];
my $TO = $ARGV[1];
my $inLine = "";
while (<STDIN>){
$inLine = $_;
$inLine =~ s/$FROM/$TO/;
print $inLine
}
exit 0;
First off, the replacement part of a s/.../.../ operation is not a regex; it works like a double-quoted string.
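A tiny illustration of that difference:
my $word = "planet";
my $s = "hello world";
$s =~ s/world/$word/;     # the replacement interpolates $word, as in "..."
$s =~ s/^(\w+)/\U$1/;     # \U and $1 also work, just as in a double-quoted string
print "$s\n";             # prints: HELLO planet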
There are a couple of issues with your code.
Your exit 1; statement appears in the middle of the main code, not in the error block. You probably want:
if (@ARGV != 2) {
print STDERR "Usage: ./filter.pl FROMSTRING TOSTRING\n";
exit 1;
}
You're missing a g flag if you want multiple substitutions to happen in the same line:
$inLine =~ s/$FROM/$TO/g;
There's no need to predeclare $inLine; it's only used in one block.
There's also no need to read a line into $_ just to copy it into $inLine.
It's common to use $names_like_this for variables and functions, not $namesLikeThis.
You can use $0 instead of hardcoding the program name in the error message.
exit 0; is redundant at the end.
The following is closer to how I'd write it:
#!/usr/bin/perl
use strict;
use warnings;

if (@ARGV != 2) {
    die "Usage: $0 FROMSTRING TOSTRING\n";
}

my ($from, $to) = @ARGV;

while (my $line = readline STDIN) {
    $line =~ s/$from/$to/g;
    print $line;
}
That said, none of this addresses your second example with '$2$1' as the replacement. The above code won't do what you want because $to is a plain string. Perl won't scan it to look for things like $1 and replace them.
When you write "foo $bar baz" in your code, it means the same thing as 'foo ' . $bar . ' baz', but this only applies to code, i.e. stuff that literally appears in your source code. The contents of $bar aren't re-scanned at runtime to expand e.g. \n or $quux. This also applies to $1 and friends, which are just normal variables.
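A short illustration of that point:
my $quux = "expanded";
my $tmpl = '$quux\n';     # single-quoted: the literal characters $ q u u x \ n
print "got: $tmpl";       # prints: got: $quux\n
# The contents of $tmpl are not scanned again at runtime,
# so neither $quux nor \n is expanded.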
So how do you get '$2$1' to work?
One way is to mess around with eval, but I don't like it because, well, it's eval: If you're not very careful, it would allow someone to execute arbitrary code by passing the right replacement "string".
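For reference, the eval route usually means the ee modifier, which evaluates the replacement twice; a rough sketch of it (and of exactly why it is dangerous with an untrusted $to):
# First /e builds the string "$2$1" (including the surrounding double quotes);
# the second /e evaluates that string as Perl code, i.e. as a double-quoted
# literal, so $1 and $2 interpolate. But a hostile $to such as
#   @{[ system 'rm -rf ~' ]}
# would be executed as well, which is why $to must be trusted.
$line =~ s{$from}{ '"' . $to . '"' }gee;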
Doing it without eval is possible and even easy with e.g. Data::Munge::replace:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Munge qw(replace);

if (@ARGV != 2) {
    die "Usage: $0 FROMSTRING TOSTRING\n";
}

my ($from, $to) = @ARGV;

while (my $line = readline STDIN) {
    print replace($line, $from, $to, 'g');
}
replace works like JavaScript's String#replace in that it expands special $ sequences.
Doing it by hand is also possible but slightly annoying because you basically have to treat $to as a template and expand all $ sequences by hand (e.g. by using another regex substitution):
# untested
$line =~ s{$from}{
    my @start = @-;
    my @stop  = @+;
    (my $r = $to) =~ s{\$([0-9]+|\$)}{
        $1 eq '$'
            ? '$'
            : substr($line, $start[$1], $stop[$1] - $start[$1])
    }eg;
    $r
}eg;
(This does not implement braced groups such as ${1}, ${2}, etc. Those are left as an exercise for the reader.)
This code is sufficiently annoying to write (and look at) that I much prefer using a module like Data::Munge for this sort of thing.
Three errors found:
a missing ; after the error message
exit 1; misplaced (it belongs inside the if block)
a missing g flag on the substitution: $inLine =~ s/$FROM/$TO/g;
Like this:
#!/usr/bin/perl
use strict;
use warnings;
if (@ARGV != 2){
print STDERR "Usage: ./filter.pl FROMSTRING TOSTRING\n";
exit 1;
}
my $FROM = $ARGV[0];
my $TO = $ARGV[1];
my $inLine = "";
while (<STDIN>){
$inLine = $_;
$inLine =~ s/$FROM/$TO/g;
print $inLine
}
exit 0;

How can I print to a variable in Perl?

I have already defined a bunch of functions which do a lot of work and have a bunch of print statements. They can be called like so to build an HTML page:
print_A();
print_B();
print_C();
Now, I want to call these functions and store the contents of these print statements into one main variable. One way is to rewrite these functions so that they return a string with their contents (instead of printing):
my $var = "";
$var = $var . store_A();
$var = $var . store_B();
$var = $var . store_C();
But I want to do this without modifying or rewriting the functions. I don't want to redefine these functions for such a minor change (there are hundreds of these functions in the program).
Is there a shorter and faster way to do this in Perl?
One way is to use select to redirect STDOUT to a scalar variable:
use warnings;
use strict;
my $out;
open my $fh, '>', \$out;
my $old_stdout = select $fh;
s1();
s2();
select $old_stdout;
close $fh;
print "start\n";
print $out;
print "end\n";
sub s1 {print "s1\n"}
sub s2 {print "s2\n"}
Prints out:
start
s1
s2
end
Depending on just what these functions do, you may be able to run them in a subprocess and capture their output:
my $pid = open(PIPE, "-|");
if (0 == $pid) {
    # Child
    print_A();
    print_B();
    print_C();
    exit(0);
}
else {
    my $var = "";
    while (<PIPE>) {
        $var .= $_;
    }
    close(PIPE);
}
You'll have to evaluate whether it's safe to move these function calls into a subprocess. If one of these functions changes the process's global state--for example, if it modifies a global variable--then that change will be confined to the child process, and won't happen in the original script process.
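For instance (a contrived sketch), if one of the functions increments a global counter, the parent never sees the change:
our $pages_rendered = 0;
sub print_A {
    $pages_rendered++;          # global side effect
    print "<p>A</p>\n";
}
# If print_A runs in the child, the increment happens in the child's copy;
# after the pipe is closed, the parent's $pages_rendered is still 0.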
You can also use IO::Scalar to tie a variable to STDOUT.
use IO::Scalar;
my $output_str;
tie *STDOUT, 'IO::Scalar', \$output_str;
print "Hel", "lo, ";
print "world!\n";
untie *STDOUT;
print "output_str is $output_str\n";
# prints
# output_str is Hello, world!
Not effectively different from @toolic's answer though.

Perl and Environment Variables

Some of the environment variables which we use in Unix are as below (just an example):
VAR1=variable1
VAR2=variable2
VAR3=variable3
# and so on
Now, I have a Perl script (let's call it test.pl) which reads a tab-delimited text file (let's call it test.txt) and pushes its contents columnwise into separate arrays. The first column of test.txt contains, for example, the following (the strings in the first column are delimited by / but I do not know how many / a string will contain or at what position the environment variable will appear):
$VAR1/$VAR2/$VAR3
$VAR3/some_string/SOME_OTHER_STRING/and_so_on/$VAR2
$VAR2/$VAR1/some_string/some_string_2/some_string_3/some_string_n/$VAR2
The extract of the script is as below:
use strict;
my $input0 = shift or die "must provide test.txt as the argument 0\n";
open(IN0,"<",$input0) || die "Cannot open $input0 for reading: $!";
my @first_column;
while (<IN0>)
{
chomp;
my @cols = split(/\t/);
my $first_col = `eval $cols[0]`; #### but this does not work
# here goes the push stmt to populate the array
### more code here
}
close(IN0);
Question: How can I access environment variables in such a situation so that the array is populated as below:
$first_column[0] = variable1/variable2/variable3
$first_column[1] = variable3/some_string/SOME_OTHER_STRING/and_so_on/variable2
$first_column[2] = variable2/variable1/some_string/some_string_2/some_string_3/some_string_n/variable2
I think you are looking for a way to process configuration files. I like Config::Std for that purpose although there are many others on CPAN.
Here is a way of processing just the contents of $cols[0] to show in an explicit way what you need to do with it:
#!/usr/bin/perl
use strict; use warnings;

# You should not type this. I am assuming the
# environment variables are defined in the environment.
# They are here for testing.
@ENV{qw(VAR1 VAR2 VAR3)} = qw(variable1 variable2 variable3);

while ( my $line = <DATA> ) {
    last unless $line =~ /\S/;
    chomp $line;
    my @components = split qr{/}, $line;
    for my $c ( @components ) {
        if ( my ($var) = $c =~ m{^\$(\w+)\z} ) {
            if ( exists $ENV{$var} ) {
                $c = $ENV{$var};
            }
        }
    }
    print join('/', @components), "\n";
}
__DATA__
$VAR1/$VAR2/$VAR3
$VAR3/some_string/SOME_OTHER_STRING/and_so_on/$VAR2
$VAR2/$VAR1/some_string/some_string_2/some_string_3/some_string_n/$VAR2
Instead of the split/join, you can use s/// to replace patterns that look like variables with the corresponding values in %ENV. For illustration, I put a second column in the __DATA__ section which is supposed to stand for a description of the path, and turned each line into a hashref. Note that I factored out the actual substitution into eval_path so you can try alternatives without messing with the main loop:
#!/usr/bin/perl
use strict; use warnings;

# You should not type this. I am assuming the
# environment variables are defined in the environment.
# They are here for testing.
@ENV{qw(VAR1 VAR2 VAR3)} = qw(variable1 variable2 variable3);

my @config;

while ( my $config = <DATA> ) {
    last unless $config =~ /\S/;
    chomp $config;
    my @cols = split /\t/, $config;
    $cols[0] = eval_path( $cols[0] );
    push @config, { $cols[1] => $cols[0] };
}

use YAML;
print Dump \@config;

sub eval_path {
    my ($path) = @_;
    $path =~ s{\$(\w+)}{ exists $ENV{$1} ? $ENV{$1} : $1 }ge;
    return $path;
}
__DATA__
$VAR1/$VAR2/$VAR3 Home sweet home
$VAR3/some_string/SOME_OTHER_STRING/and_so_on/$VAR2 Man oh man
$VAR2/$VAR1/some_string/some_string_2/some_string_3/some_string_n/$VAR2 Can't think of any other witty remarks ;-)
Output:
---
- Home sweet home: variable1/variable2/variable3
- Man oh man: variable3/some_string/SOME_OTHER_STRING/and_so_on/variable2
- Can't think of any other witty remarks ;-): variable2/variable1/some_string/some_string_2/some_string_3/some_string_n/variable2
I think you just want to do this:
my @cols = map { s/(\$(\w+))/ $ENV{$2} || $1 /ge; $_ } split /\t/;
After splitting, this takes each sequence of '$' followed by word characters and checks whether there is an environment variable for the word portion; if there isn't, the text is left as is.
The e switch on a substitution allows you to execute code to produce the replacement value.
If any environment variable may legitimately hold the value '0', you are better off with the defined-or operator, which arrived in Perl 5.10:
my @cols = map { s|(\$(\w+))| $ENV{$2} // $1 |ge; $_ } split /\t/;
(The // here is the defined-or operator, not a comment.)
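To see the difference, suppose a hypothetical environment variable COUNT is set to "0":
# $ENV{COUNT} is "0": false, but defined.
my $with_or  = $ENV{COUNT} || '$COUNT';    # "0" is false, so this falls back to the literal '$COUNT'
my $with_dor = $ENV{COUNT} // '$COUNT';    # // only checks definedness, so "0" survives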
If you want to allow for full shell expansions, one option to use the shell to do the expansion for you, perhaps via echo:
$ cat input
$FOO
bar
${FOO//cat/dog}
$ FOO=cat perl -wpe '$_ = qx"echo $_"' input
cat
bar
dog
If you cannot trust the contents of the environment variable, this introduces a security risk, as invoking qx on a string may cause the shell to invoke commands embedded in the string. As a result, this scriptlet will not run under taint mode (-T).
Perl keeps its environment variables in %ENV; in your case you can change your code like so:
my $first_col = $ENV{$cols[0]};

Perl: avoid greedy reading from stdin?

Consider the following perl script (read.pl):
my $line = <STDIN>;
print "Perl read: $line";
print "And here's what cat gets: ", `cat -`;
If this script is executed from the command line, it will get the first line of input, while cat gets everything else until the end of input (^D is pressed).
However, things are different when the input is piped from another process or read from a file:
$ echo "foo\nbar" | ./read.pl
Perl read: foo
And here's what cat gets:
Perl seems to greedily buffer the entire input somewhere, and processes called using backticks or system do not see any of the input.
The problem is that I'd like to unit test a script that mixes <STDIN> and calls to other processes. What would be the best way to do this? Can I turn off input buffering in Perl? Or can I spool the data in a way that will "mimic" a terminal?
This is not a Perl problem. It is a UNIX/shell problem. When you run a command without pipes you are in line buffering mode, but when you redirect with pipes, you are in block buffering mode. You can see this by saying:
cat /usr/share/dict/words | ./read.pl | head
This C program has the same problem:
#include <stdio.h>

int main(int argc, char** argv) {
    char line[4096];
    FILE* cat;
    fgets(line, 4096, stdin);
    printf("C got: %s\ncat got:\n", line);
    cat = popen("cat", "r");
    while (fgets(line, 4096, cat)) {
        printf("%s", line);
    }
    pclose(cat);
    return 0;
}
I have good news and bad news.
The good news is a simple modification of read.pl allows you to give it fake input:
#! /usr/bin/perl
use warnings;
use strict;
binmode STDIN, "unix" or die "$0: binmode: $!";
my $line = <STDIN>;
print "Perl read: $line";
print "And here's what cat gets: ", `cat -`;
Sample run:
$ printf "A\nB\nC\nD\n" | ./read.pl
Perl read: A
And here's what cat gets: B
C
D
The bad news is you get a single switchover: if you try to repeat the read-then-cat, the first cat will starve all subsequent reads. To see this, consider
#! /usr/bin/perl
use warnings;
use strict;
binmode STDIN, "unix" or die "$0: binmode: $!";
my $line = <STDIN>;
print "1: Perl read: $line";
print "1: And here's what cat gets: ", `cat -`;
$line = <STDIN>;
$line = "<undefined>\n" unless defined $line;
print "2: Perl read: $line";
print "2: And here's what cat gets: ", `cat -`;
and then a sample run that produces
$ printf "A\nB\nC\nD\n" | ./read.pl
1: Perl read: A
1: And here's what cat gets: B
C
D
2: Perl read: <undefined>
2: And here's what cat gets:
Today I think I've found what I needed: Perl has a module called Expect which is perfect for such situations:
#!/usr/bin/perl
use strict;
use warnings;
use Expect;
my $exp = Expect->spawn('./read.pl');
$exp->send("First Line\n");
$exp->send("Second Line\n");
$exp->send("Third Line\n");
$exp->soft_close();
Works like a charm ;)
Here's a sub-optimal way that I've found:
use IPC::Run;
my $input = "First Line\n";
my $output;
my $process = IPC::Run::start(['./read.pl'], \$input, \$output);
$process->pump() until $output =~ /Perl read:/;
$input .= "Second Line\n";
$process->finish();
print $output;
It's sub-optimal in the sense that one needs to know the "prompt" that the program will emit before waiting for more input.
Another sub-optimal solution is the following:
use IPC::Run;
my $input = "First Line\n";
my $output;
my $process = IPC::Run::start(['./read.pl'], \$input, my $timer = IPC::Run::timer(1));
$process->pump() until $timer->is_expired();
$timer->start(1);
$input .= "Second Line\n";
$process->finish();
It does not require knowledge of any prompt, but is slow because it waits at least two seconds. Also, I don't understand why the second timer is needed (finish won't return otherwise).
Does anybody know better solutions?
Finally I ended up with the following solution. Still far from optimal, but it works. Even in situations like the one described by gbacon.
use Carp qw( confess );
use IPC::Run;
use Scalar::Util;
use Time::HiRes;
# Invokes the given program with the given input and argv, and returns stdout/stderr.
#
# The first argument provided is the input for the program. It is an arrayref
# containing one or more of the following:
#
# * A scalar is simply passed to the program as stdin
#
# * An arrayref in the form [ "prompt", "input" ] causes the function to wait
# until the program prints "prompt", then spools "input" to its stdin
#
# * An arrayref in the form [ 0.3, "input" ] waits 0.3 seconds, then spools
# "input" to the program's stdin
sub capture_with_input {
    my ($program, $inputs, @argv) = @_;
    my ($stdout, $stderr);
    my $stdin = '';
    my $process = IPC::Run::start( [$program, @argv], \$stdin, \$stdout, \$stderr );
    foreach my $input (@$inputs) {
        if (ref($input) eq '') {
            $stdin .= $input;
        }
        elsif (ref($input) eq 'ARRAY') {
            (scalar @$input == 2) or
                confess "Input to capture_with_input must be of the form ['prompt', 'input'] or [timeout, 'input']!";
            my ($prompt_or_timeout, $text) = @$input;
            if (Scalar::Util::looks_like_number($prompt_or_timeout)) {
                my $start_time = [ Time::HiRes::gettimeofday ];
                $process->pump_nb() while (Time::HiRes::tv_interval($start_time) < $prompt_or_timeout);
            }
            else {
                $prompt_or_timeout = quotemeta $prompt_or_timeout;
                $process->pump until $stdout =~ m/$prompt_or_timeout/gc;
            }
            $stdin .= $text;
        }
        else {
            confess "Unknown input type passed to capture_with_input!";
        }
    }
    $process->finish();
    return ($stdout, $stderr);
}
my $input = [
"First Line\n",
["Perl read:", "Second Line\n"],
[0.5, "Third Line\n"],
];
print "Executing process...\n";
my ($stdout, $stderr) = capture_with_input('./read.pl', $input);
print "done.\n";
print "STDOUT:\n", $stdout;
print "STDERR:\n", $stderr;
Usage example (with a slightly modified read.pl to test gbacon's case):
$ time ./spool_read4.pl
Executing process...
done.
STDOUT:
Perl read: First Line
And here's what head -n1 gets: Second Line
Perl read again: Third Line
STDERR:
./spool_read4.pl 0.54s user 0.02s system 102% cpu 0.547 total
Still, I'm open to better solutions...

How can I pass command-line arguments to a Perl program?

I'm working on a Perl script. How can I pass command line parameters to it?
Example:
script.pl "string1" "string2"
Depends on what you want to do. If you want to use the two arguments as input files, you can just pass them in and then use <> to read their contents.
If they have a different meaning, you can use Getopt::Std and Getopt::Long to process them easily. Getopt::Std supports only single-character switches, and Getopt::Long is much more flexible. From Getopt::Long:
use Getopt::Long;
my $data = "file.dat";
my $length = 24;
my $verbose;
my $result = GetOptions ("length=i" => \$length,    # numeric
                         "file=s"   => \$data,      # string
                         "verbose"  => \$verbose);  # flag
Alternatively, @ARGV is a special variable that contains all the command line arguments. $ARGV[0] is the first (i.e. "string1" in your case) and $ARGV[1] is the second argument. You don't need a special module to access @ARGV.
You pass them in just like you're thinking, and in your script, you get them from the array @ARGV. Like so:
my $numArgs = $#ARGV + 1;
print "thanks, you gave me $numArgs command-line arguments.\n";
foreach my $argnum (0 .. $#ARGV) {
print "$ARGV[$argnum]\n";
}
From here.
foreach my $arg (@ARGV) {
print $arg, "\n";
}
will print each argument.
Yet another option is to use perl -s, e.g.:
#!/usr/bin/perl -s
print "value of -x: $x\n";
print "value of -name: $name\n";
Then call it like this:
% ./myprog -x -name=Jeff
value of -x: 1
value of -name: Jeff
Or see the original article for more details.
Alternatively, a sexier Perlish way:
my ($src, $dest) = @ARGV;
"Assumes" two values are passed. Extra code can verify the assumption is safe.
You can access them directly by assigning the special variable @ARGV to a list of variables.
So, for example:
my ($st, $prod, $ar, $file, $chart, $e, $max, $flag, $id) = @ARGV;
perl tmp.pl 1 2 3 4 5
If the arguments are filenames to be read from, use the diamond (<>) operator to get at their contents:
while (my $line = <>) {
process_line($line);
}
If the arguments are options/switches, use Getopt::Std or Getopt::Long, as already shown by slavy13.myopenid.com.
On the off chance that they're something else, you can access them either by walking through @ARGV explicitly or with the shift command:
while (my $arg = shift) {
print "Found argument $arg\n";
}
(Note that doing this with shift will only work if you are outside of all subs. Within a sub, it will retrieve the list of arguments passed to the sub rather than those passed to the program.)
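A quick sketch of that difference:
sub first_of {
    my $first = shift;            # inside a sub, shift defaults to @_
    return $first;
}
my $first_program_arg = shift;    # outside any sub, shift defaults to @ARGV
print "program got: $first_program_arg, sub got: ", first_of('x', 'y'), "\n";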
my $output_file;
if ((scalar(@ARGV) == 2) && ($ARGV[0] eq "-i"))
{
    chomp( $output_file = $ARGV[1] );
}
If you just want some values, you can just use the #ARGV array. But if you are looking for something more powerful in order to do some command line options processing, you should use Getopt::Long.