What's the differences between system and backticks and pipes in Perl? - perl

Perl supports three ways (that I know of) of running external programs:
system:
system PROGRAM LIST
as in:
system "abc";
backticks as in:
`abc`;
running it through a pipe as in:
open ABC, "abc|";
What are the differences between them? Here's what I know:
You can use backticks and pipes to get the output of the command easily.
that's it (more in future edits?)

system(): runs command and returns command's exit status
backticks: runs command and returns the command's output
pipes : runs command and allows you to use them as an handle
Also backticks redirects the executed program's STDOUT to a variable, and system sends it to your main program's STDOUT.

The perlipc documentation explains the various ways that you can interact with other processes from Perl, and perlfunc's open documentation explains the piped filehandles.
The system sends its output to standard output (and error)
The backticks captures the standard output and returns it (but not standard error)
The piped open allows you to capture the output and read it from a file handle, or to print to a file handle and use that as the input for the external command.
There are also modules that handle these details with the cross-platform edge cases.

system is also returning the exit value of the application (ERRORLEVEL in Windows).
Pipes are a bit more complicated to use, as reading from them and closing them adds extra code.
Finally, they have different implementation which was meant to do different things. Using pipes you're able to communicate back with the executed applications, while the other commands doesn't allow that (easily).

Getting the program's exit status is not limited to system(). When you call close(PIPE), it returns the exit status, and you can get the latest exit status for all 3 methods from $?.
Please also note that
readpipe('...')
is the same as
`...`

Related

Perl: Calling a perl script from another

I have a perl script which calls another script. I am calling it using backticks and passing argument to that script and it works fine.
`CQPerl call_script.pl $agr1 $agr2 $arg3`;
But please suggest if there is another better way to do so. How can I check if the script errored out because of the calling script or the script that was called. How do I do that check from the calling script itself?
If you wan't to do error checking, backticks may be a wrong approach. You probably want to use the system function. See the documentation for all the details of error handling, examples included.
Perl has a number of possibilites to execute other scripts / commands:
backticks / qx{} When you want to read all the output at once after the program has terminated
exec When you wan't to continue your process as another program — never returns if succesfull
system When you are only interested in the success or failure of the command
open When you want to pipe information to or from the command
do and require Execute another Perl script here. Similar to C's #include
There are modules to do a three-way open so that you have access to STDIN, STDOUT and STDERR of the program you executed. See the apropriate parts of perlipc for advanced information.
And always use the multi-argument forms of these calls to avoid shell escaping (can be annoying and very insecure).
Check the value of the perl special variable $? to determine if there was an error.

What does the '`' character do in Perl?

I was using Perl to read through each line of a file. I used a command line tool to call a service, and I noticed some interesting functionality that I can't figure out how to search for. To the variable $cmd I assigned the command that invokes the service. If I refer to $cmd later in the code it prints out the command line argument, but if I refer to it as `$cmd`, however, it gives the output from running the service.
What is the explanation for this?
It works just like backquotes in the shell, which is why it is called that. See sh(1) for details. It captures the standard output alone, and nothing else. It sets the $? variable to the 16-bit wait status word.
This is all explained in the perlop(1) manpage:
qx/STRING/
`STRING`
A string which is (possibly) interpolated and then
executed as a system command with /bin/sh or its
equivalent. Shell wildcards, pipes, and redirections
will be honored. The collected standard output of the
command is returned; standard error is unaffected. In
scalar context, it comes back as a single (potentially
multi-line) string, or undef if the command failed.
In list context, returns a list of lines (however
you’ve defined lines with $/ or
$INPUT_RECORD_SEPARATOR), or the empty list if the
command failed.
Because backticks do not affect standard error: use
shell file descriptor syntax (assuming the shell
supports this) if you care to address this. To
capture a command’s STDERR and STDOUT merged together:
$output = `cmd 2>&1`;
To capture a command’s STDOUT but discard its STDERR:
$output = `cmd 2>/dev/null`;
To capture a command’s STDERR but discard its STDOUT
(ordering is important here):
$output = `cmd 2>&1 1>/dev/null`;
To exchange a command’s STDOUT and STDERR in order to
capture the STDERR but leave its STDOUT to come out
the old STDERR:
$output = `cmd 3>&1 1>&2 2>&3 3>&-`;
To read both a command’s STDOUT and its STDERR
separately, it’s easiest to redirect them separately
to files, and then read from those files when the
program is done:
system("program args 1>program.stdout 2>program.stderr");
The STDIN filehandle used by the command is inherited
from Perl’s STDIN. For example:
open(BLAM, "blam") || die "$0: can't open blam: $!";
open (STDIN, "<&BLAM") || die "$0: can't dup BLAM: $!";
print `sort`;
will print the sorted contents of the file blam.
Using single-quote as the delimiter protects the command
from Perl’s double-quote interpolation, passing the contents on
to the shell instead:
$perl_info = qx(ps $$); # that's Perl's $$
$shell_info = qx'ps $$'; # that's the new shell's $$
How that string gets evaluated is entirely subject to
the command interpreter on your system. On most
platforms, you will have to protect shell
metacharacters if you want them treated literally.
This is in practice difficult to do, as it’s unclear
which characters need escaping, or how. See perlsec for a
clean and safe example of a manual fork and exec
to emulate backticks safely.
On some platforms (notably DOS-like ones), the shell
may not be capable of dealing with multiline commands,
so putting newlines in the string may not get you what
you want. You may be able to evaluate multiple
commands in a single line by separating them with the
command separator character, if your shell supports
that (e.g. ; on many Unix shells; & on the Windows
NT CMD.COM shell).
Beginning with v5.6.0, Perl attempts to flush all
files opened for output before starting the child
process, but this may not be supported on some
platforms (see perlport(1)). To be safe, you may need to
set $| ($AUTOFLUSH in English) or call the
autoflush method of IO::Handle on any open
handles.
Beware that some command shells may place restrictions
on the length of the command line. You must ensure
your strings don’t exceed this limit after any
necessary interpolations. See the platform-specific
release notes for more details about your particular
environment.
Using this operator can lead to programs that are
difficult to port, because the shell commands called
vary between systems, and may in fact not be present
at all. As one example, the type command under the
POSIX shell is very different from the type command
under DOS. That doesn't mean you should go out of
your way to avoid backticks when they’re the right way
to get something done. Perl was made to be a glue
language, and one of the things it glues together is
commands. Just understand what you’re getting
yourself into.
See I/O Operators for more discussion.
Here’s a simple example of using backticks to get the exit status of the first element in a pipeline:
$device = q(/dev/rmt8);
$dd_noise = q(^[0-9]+\+[0-9]+ records (in|out)$);
$status = `exec 3>&1; ((dd if=$device ibs=64k 2>&1 1>&3 3>&- 4>&-; echo $? >&4) | egrep -v "$dd_noise" 1>&2 3>&- 4>&-) 4>&1`;
EDIT
Well ok then, so maybe that wasn’t that simple an example. :) But this one is.
I’d like to recommend the Capture::Tiny CPAN module as a simpler way to manage the output from external commands that you would normally run using backquotes. It has advantages and disadvantages, but I feel that for many people, the advantages outweigh any arguable disadvantageL
The advantage is that you get to do all this without requiring deep knowledge of arcane mysteries of file-descriptor redirection the way the previous example did.
The disadvantage is it’s yet another non-core dependency — something else you have to install from CPAN.
That’s really not bad for what you get.
Here’s an example of how easy it is:
NAME
Capture::Tiny - Capture STDOUT and STDERR from Perl, XS, or external programs
SYNOPSIS
use Capture::Tiny qw/capture tee capture_merged tee_merged/;
($stdout, $stderr) = capture {
# your code here
};
($stdout, $stderr) = tee {
# your code here
};
$merged = capture_merged {
# your code here
};
$merged = tee_merged {
# your code here
};
DESCRIPTION
Capture::Tiny provides a simple, portable way to capture anything sent to STDOUT or STDERR, regardless of whether it comes from Perl, from XS code
or from an external program. Optionally, output can be teed so that it is captured while being passed through to the original handles. Yes, it
even works on Windows. Stop guessing which of a dozen capturing modules to use in any particular situation and just use this one.
There, isn’t that a whole lot easier?
The back-quote in Perl does much the same as the back-quote in shell - it runs a command and captures the standard output.
See also qx//.
I think the backtick lets you run commands and store their output in a variable:
$listing=`ls -1 /tmp/`;

How can I check (peek) STDIN for piped data in Perl without using select?

I'm trying to handle the possibility that no arguments and no piped data is passed to a Perl script. I'm assuming that if there are no arguments then input is being piped via STDIN. However if the user provides no arguments and does not pipe anything to the script, it will try to get keyboard input. My objective is to provide an error message instead.
Unfortunately, select() is not portable to some non-POSIX systems.
Is there another way to do this with maximum portability?
Perl comes with the -t file-test operator, which tells you if a particular filehandle is open to a TTY. So, you should be able to do this:
if ( -t STDIN and not #ARGV ) {
# We're talking to a terminal, but have no command line arguments.
# Complain loudly.
}
else {
# We're either reading from a file or pipe, or we have arguments in
# #ARGV to process.
}
A quick test reveals this working fine on Windows with Perl 5.10.0, and Linux with Perl 5.8.8, so it should be portable across the most common Perl environments.
As others have mentioned, select would not be a reliable choice as there may be times when you're reading from a process, but that process hasn't started writing yet.
All the best,
Paul
use POSIX 'isatty';
if ( ! #ARGV && isatty(*STDIN) ) {
die "usage: ...";
}
See: http://www.opengroup.org/onlinepubs/009695399/functions/isatty.html
Note that select wouldn't be much help anyway, since it would produce false results
if the piped info wasn't ready yet. Example:
seq 100000|grep 99999|perl -we'$rin="";vec($rin,fileno(STDIN),1)=1;print 0+select($rin,"","",.01)'

How do I run a Perl script from within a Perl script?

I've got a Perl script that needs to execute another Perl script. This second script can be executed directly on the command line, but I need to execute it from within my first program. I'll need to pass it a few parameters that would normally be passed in when it's run standalone (the first script runs periodically, and executes the second script under a certain set of system conditions).
Preliminary Google searches suggest using backticks or a system() call. Are there any other ways to run it? (I'm guessing yes, since it's Perl we're talking about :P ) Which method is preferred if I need to capture output from the invoked program (and, if possible, pipe that output as it executes to stdout as though the second program were invoked directly)?
(Edit: oh, now SO suggests some related questions. This one is close, but not exactly the same as what I'm asking. The second program will likely take an hour or more to run (lots of I/O), so I'm not sure a one-off invocation is the right fit for this.)
You can just do it.
{
local #ARGV = qw<param1 param2 param3>;
do '/home/buddy/myscript.pl';
}
Prevents the overhead of loading in another copy of perl.
The location of your current perl interpreter can be found in the special variable $^X. This is important if perl is not in your path, or if you have multiple perl versions available but which to make sure you're using the same one across the board.
When executing external commands, including other Perl programs, determining if they actually ran can be quite difficult. Inspecting $? can leave lasting mental scars, so I prefer to use IPC::System::Simple (available from the CPAN):
use strict;
use warnings;
use IPC::System::Simple qw(system capture);
# Run a command, wait until it finishes, and make sure it works.
# Output from this program goes directly to STDOUT, and it can take input
# from your STDIN if required.
system($^X, "yourscript.pl", #ARGS);
# Run a command, wait until it finishes, and make sure it works.
# The output of this command is captured into $results.
my $results = capture($^X, "yourscript.pl", #ARGS);
In both of the above examples any arguments you wish to pass to your external program go into #ARGS. The shell is also avoided in both of the above examples, which gives you a small speed advantage, and avoids any unwanted interactions involving shell meta-characters. The above code also expects your second program to return a zero exit value to indicate success; if that's not the case, you can specify an additional first argument of allowable exit values:
# Both of these commands allow an exit value of 0, 1 or 2 to be considered
# a successful execution of the command.
system( [0,1,2], $^X, "yourscript.pl", #ARGS );
# OR
capture( [0,1,2, $^X, "yourscript.pl", #ARGS );
If you have a long-running process and you want to process its data while it's being generated, then you're probably going to need a piped open, or one of the more heavyweight IPC modules from the CPAN.
Having said all that, any time you need to be calling another Perl program from Perl, you may wish to consider if using a module would be a better choice. Starting another program carries quite a few overheads, both in terms of start-up costs, and I/O costs for moving data between processes. It also significantly increases the difficulty of error handling. If you can turn your external program into a module, you may find it simplifies your overall design.
All the best,
Paul
I can think of a few ways to do this. You already mentioned the first two, so I won't go into detail on them.
backticks: $retVal = `perl somePerlScript.pl`;
system() call
eval
The eval can be accomplished by slurping the other file into a string (or a list of strings), then 'eval'ing the strings. Heres a sample:
#!/usr/bin/perl
open PERLFILE, "<somePerlScript.pl";
undef $/; # this allows me to slurp the file, ignoring newlines
my $program = <PERLFILE>;
eval $program;
4 . do: do 'somePerlScript.pl'
You already got good answers to your question, but there's always the posibility to take a different point of view: maybe you should consider refactoring the script that you want to run from the first script. Turn the functionality into a module. Use the module from the first and from the second script.
If you need to asynchronously call your external script -you just want to launch it and not wait for it to finish-, then :
# On Unix systems, either of these will execute and just carry-on
# You can't collect output that way
`myscript.pl &`;
system ('myscript.pl &');
# On Windows systems the equivalent would be
`start myscript.pl`;
system ('start myscript.pl');
# If you just want to execute another script and terminate the current one
exec ('myscript.pl');
Use backticks if you need to capture the output of the command.
Use system if you do not need to capture the output of the command.
TMTOWTDI: so there are other ways too, but those are the two easiest and most likely.
See the perlipc documentation for several options for interprocess communication.
If your first script merely sets up the environment for the second script, you may be looking for exec.
#!/usr/bin/perl
use strict;
open(OUTPUT, "date|") or die "Failed to create process: $!\n";
while (<OUTPUT>)
{
print;
}
close(OUTPUT);
print "Process exited with value " . ($? >> 8) . "\n";
This will start the process date and pipe the output of the command to the OUTPUT filehandle which you can process a line at a time. When the command is finished you can close the output filehandle and retrieve the return value of the process. Replace date with whatever you want.
I wanted to do something like this to offload non-subroutines into an external file to make editing easier. I actually made this into a subroutine. The advantage of this way is that those "my" variables in the external file get declared in the main namespace. If you use 'do' they apparently don't migrate to the main namespace. Note the presentation below doesn't include error handling
sub getcode($) {
my #list;
my $filename = shift;
open (INFILE, "< $filename");
#list = <INFILE>;
close (INFILE);
return \#list;
}
# and to use it:
my $codelist = [];
$codelist = getcode('sourcefile.pl');
eval join ("", #$codelist);

When is the right time (and the wrong time) to use backticks?

Many beginning programmers write code like this:
sub copy_file ($$) {
my $from = shift;
my $to = shift;
`cp $from $to`;
}
Is this bad, and why? Should backticks ever be used? If so, how?
A few people have already mentioned that you should only use backticks when:
You need to capture (or supress) the output.
There exists no built-in function or Perl module to do the same task, or you have a good reason not to use the module or built-in.
You sanitise your input.
You check the return value.
Unfortunately, things like checking the return value properly can be quite challenging. Did it die to a signal? Did it run to completion, but return a funny exit status? The standard ways of trying to interpret $? are just awful.
I'd recommend using the IPC::System::Simple module's capture() and system() functions rather than backticks. The capture() function works just like backticks, except that:
It provides detailed diagnostics if the command doesn't start, is killed by a signal, or returns an unexpected exit value.
It provides detailed diagnostics if passed tainted data.
It provides an easy mechanism for specifying acceptable exit values.
It allows you to call backticks without the shell, if you want to.
It provides reliable mechanisms for avoiding the shell, even if you use a single argument.
The commands also work consistently across operating systems and Perl versions, unlike Perl's built-in system() which may not check for tainted data when called with multiple arguments on older versions of Perl (eg, 5.6.0 with multiple arguments), or which may call the shell anyway under Windows.
As an example, the following code snippet will save the results of a call to perldoc into a scalar, avoids the shell, and throws an exception if the page cannot be found (since perldoc returns 1).
#!/usr/bin/perl -w
use strict;
use IPC::System::Simple qw(capture);
# Make sure we're called with command-line arguments.
#ARGV or die "Usage: $0 arguments\n";
my $documentation = capture('perldoc', #ARGV);
IPC::System::Simple is pure Perl, works on 5.6.0 and above, and doesn't have any dependencies that wouldn't normally come with your Perl distribution. (On Windows it depends upon a Win32:: module that comes with both ActiveState and Strawberry Perl).
Disclaimer: I'm the author of IPC::System::Simple, so I may show some bias.
The rule is simple: never use backticks if you can find a built-in to do the same job, or if their is a robust module on the CPAN which will do it for you. Backticks often rely on unportable code and even if you untaint the variables, you can still open yourself up to a lot of security holes.
Never use backticks with user data unless you have very tightly specified what is allowed (not what is disallowed -- you'll miss things)! This is very, very dangerous.
Backticks should be used if and only if you need to capture the output of a command. Otherwise, system() should be used. And, of course, if there's a Perl function or CPAN module that does the job, this should be used instead of either.
In either case, two things are strongly encouraged:
First, sanitize all inputs: Use Taint mode (-T) if the code is exposed to possible untrusted input. Even if it's not, make sure to handle (or prevent) funky characters like space or the three kinds of quote.
Second, check the return code to make sure the command succeeded. Here is an example of how to do so:
my $cmd = "./do_something.sh foo bar";
my $output = `$cmd`;
if ($?) {
die "Error running [$cmd]";
}
Another way to capture stdout(in addition to pid and exit code) is to use IPC::Open3 possibily negating the use of both system and backticks.
Use backticks when you want to collect the output from the command.
Otherwise system() is a better choice, especially if you don't need to invoke a shell to handle metacharacters or command parsing. You can avoid that by passing a list to system(), eg system('cp', 'foo', 'bar') (however you'd probably do better to use a module for that particular example :))
In Perl, there's always more than one way to do anything you want. The primary point of backticks is to get the standard output of the shell command into a Perl variable. (In your example, anything that the cp command prints will be returned to the caller.) The downside of using backticks in your example is you don't check the shell command's return value; cp could fail and you wouldn't notice. You can use this with the special Perl variable $?. When I want to execute a shell command, I tend to use system:
system("cp $from $to") == 0
or die "Unable to copy $from to $to!";
(Also observe that this will fail on filenames with embedded spaces, but I presume that's not the point of the question.)
Here's a contrived example of where backticks might be useful:
my $user = `whoami`;
chomp $user;
print "Hello, $user!\n";
For more complicated cases, you can also use open as a pipe:
open WHO, "who|"
or die "who failed";
while(<WHO>) {
# Do something with each line
}
close WHO;
From the "perlop" manpage:
That doesn't mean you should go out of
your way to avoid backticks when
they're the right way to get something
done. Perl was made to be a glue
language, and one of the things it
glues together is commands. Just
understand what you're getting
yourself into.
For the case you are showing using the File::Copy module is probably best. However, to answer your question, whenever I need to run a system command I typically rely on IPC::Run3. It provides a lot of functionality such as collecting the return code and the standard and error output.
Whatever you do, as well as sanitising input and checking the return value of your code, make sure you call any external programs with their explicit, full path. e.g. say
my $user = `/bin/whoami`;
or
my $result = `/bin/cp $from $to`;
Saying just "whoami" or "cp" runs the risk of accidentally running a command other than what you intended, if the user's path changes - which is a security vulnerability that a malicious attacker could attempt to exploit.
Your example's bad because there are perl builtins to do that which are portable and usually more efficient than the backtick alternative.
They should be used only when there's no Perl builtin (or module) alternative. This is both for backticks and system() calls. Backticks are intended for capturing output of the executed command.
Backticks are only supposed to be used when you want to capture output. Using them here "looks silly." It's going to clue anyone looking at your code into the fact that you aren't very familiar with Perl.
Use backticks if you want to capture output.
Use system if you want to run a command. One advantage you'll gain is the ability to check the return status.
Use modules where possible for portability. In this case, File::Copy fits the bill.
In general, it's best to use system instead of backticks because:
system encourages the caller to check the return code of the command.
system allows "indirect object" notation, which is more secure and adds flexibility.
Backticks are culturally tied to shell scripting, which might not be common among readers of the code.
Backticks use minimal syntax for what can be a heavy command.
One reason users might be temped to use backticks instead of system is to hide STDOUT from the user. This is more easily and flexibly accomplished by redirecting the STDOUT stream:
my $cmd = 'command > /dev/null';
system($cmd) == 0 or die "system $cmd failed: $?"
Further, getting rid of STDERR is easily accomplished:
my $cmd = 'command 2> error_file.txt > /dev/null';
In situations where it makes sense to use backticks, I prefer to use the qx{} in order to emphasize that there is a heavy-weight command occurring.
On the other hand, having Another Way to Do It can really help. Sometimes you just need to see what a command prints to STDOUT. Backticks, when used as in shell scripts are just the right tool for the job.
Perl has a split personality. On the one hand it is a great scripting language that can replace the use of a shell. In this kind of one-off I-watching-the-outcome use, backticks are convenient.
When used a programming language, backticks are to be avoided. This is a lack of error
checking and, if the separate program backticks execute can be avoided, efficiency is
gained.
Aside from the above, the system function should be used when the command's output is not being used.
Backticks are for amateurs. The bullet-proof solution is a "Safe Pipe Open" (see "man perlipc"). You exec your command in another process, which allows you to first futz with STDERR, setuid, etc. Advantages: it does not rely on the shell to parse #ARGV, unlike open("$cmd $args|"), which is unreliable. You can redirect STDERR and change user priviliges without changing the behavior of your main program. This is more verbose than backticks but you can wrap it in your own function like run_cmd($cmd,#args);
sub run_cmd {
my $cmd = shift #_;
my #args = #_;
my $fh; # file handle
my $pid = open($fh, '-|');
defined($pid) or die "Could not fork";
if ($pid == 0) {
open STDERR, '>/dev/null';
# setuid() if necessary
exec ($cmd, #args) or exit 1;
}
wait; # may want to time out here?
if ($? >> 8) { die "Error running $cmd: [$?]"; }
while (<$fh>) {
# Have fun with the output of $cmd
}
close $fh;
}