How do I check the exit code in Test::More? - perl

According to the Test::More documentation, it will exit with certain exit codes depending on the outcome of your tests.
My question is, how do I check these exit codes?
ps. I am trying to build a harness myself. Is there a simple harness I can use?
ps2. Test::Harness did the trick for me. In particular execute_tests function. This function returns all the statistics I want. Thank you everyone who gave useful links.

Any decent harness program (such as prove) will do that for you, there is absolutely no reason for you to do that yourself.
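As a rough illustration of letting a harness do the checking, here is a minimal sketch using TAP::Harness, the core module that prove is built on. The throwaway test file and its contents are made up for the example; only the TAP::Harness calls (new, runtests) and the aggregator methods (passed, failed, all_passed) come from the module itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use TAP::Harness;

# Write a tiny throwaway test file so the sketch is self-contained.
# (Hypothetical content: one passing and one deliberately failing test.)
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.t";
open(my $fh, '>', $file) or die "cannot write $file: $!";
print {$fh} "use Test::More tests => 2;\n",
            "ok(1, 'passes');\n",
            "ok(0, 'fails on purpose');\n";
close($fh) or die "cannot close $file: $!";

# verbosity => -3 silences the harness's own report; the failing test
# script may still print its diagnostics to STDERR.
my $harness    = TAP::Harness->new({ verbosity => -3 });
my $aggregator = $harness->runtests($file);

# The aggregator collects the statistics, so there is no need to
# inspect the test script's exit code by hand.
printf "passed: %d, failed: %d, all passed: %s\n",
    scalar $aggregator->passed,
    scalar $aggregator->failed,
    $aggregator->all_passed ? 'yes' : 'no';
```

On the command line, prove gives the same verdict through its own exit status.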

It's a process return code. If running under some Unix variant's shell, you can usually retrieve it as $? (which probably makes your question more about bash than perl).
Common idioms:
perl your_file.t
rc=$? # save it now: every [[ ... ]] test overwrites $?
if [[ $rc -gt 10 ]]; then
echo "Wow, that's a lot of failures"
elif [[ $rc -gt 0 ]]; then
echo "Tests failed"
else
echo "Success!"
fi
Alternately, if you're only interested in success/failure:
perl your_file.t || echo "Something bad happened."

If you're calling your test program from Perl, then the hard (and traditional) way involves doing dark and horrid bit-shifts to $?. You can read about that in the system function documentation if you really want to see how to do it.
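For the curious, the bit-shifting looks roughly like this. It is a minimal sketch following the layout described in the system documentation; decode_status is a hypothetical helper name, and the child process is just the running perl ($^X) exiting with 3 so that no test file is needed.

```perl
use strict;
use warnings;

# Split the wait status that system() leaves in $? into its parts.
# decode_status is a hypothetical helper, not part of any module.
sub decode_status {
    my ($status) = @_;
    return { started => 0 } if $status == -1;    # command never ran
    return {
        started   => 1,
        exit_code => $status >> 8,               # high byte: exit code
        signal    => $status & 127,              # low 7 bits: fatal signal, if any
        core_dump => ($status & 128) ? 1 : 0,    # bit 7: core was dumped
    };
}

# $^X is the path of the perl binary running this script.
system($^X, '-e', 'exit 3');
my $info = decode_status($?);
print "exit code: $info->{exit_code}\n";    # prints "exit code: 3"
```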
The nice way involves using a module which gives you a system style function that processes return values for you:
use IPC::System::Simple qw(systemx EXIT_ANY);
my $exit_value = systemx( EXIT_ANY, 'mytest.t' );
The EXIT_ANY symbol allows your script to return any exit value, which we can then capture. If you just want to make sure that your scripts are passing (ie, returning a zero exit status), and halt as soon as any fail, that's IPC::System::Simple's default behaviour:
use IPC::System::Simple qw(systemx);
systemx( 'mytest.t' ); # Run this command successfully or die.
In all the above examples, you can ask for a replacement system command rather than systemx if you're happy for the possibility of the shell getting involved. See the IPC::System::Simple documentation for more details.
There are other modules that may allow you to easily run a command and capture its exit value. TIMTOWTDI.
Having said that, all the good harnesses should check return values for you, so it's only if you're writing your own test harness that you should need to look at this yourself.
All the best,
Paul
Disclosure: I wrote IPC::System::Simple, and so may have some positive bias toward it.

Related

Is it safe using $Config{perlpath} in system/exec/open?

Assume that I have following code to open filehandle:
open my $fh, "command1 | command2 |";
I found that command1 may produce output that command2 cannot handle well, so I'm trying to insert a perl filter between command1 and command2 to deal with it:
use Config;
open my $fh, "command1 | $Config{perlpath} -ple 'blah blah' | command2 |";
My questions are:
Is it OK to use $Config{perlpath} in system call directly?
Calling own perl binary seems nuts. Are there any better solutions?
Thanks
Is it OK to use $Config{perlpath} in system call directly?
Relatively. It's about as portable as the rest of your code (which already depends on running on something unix-ish). There's some security worry, but I'd say not a very large one because someone who can affect the value of that variable already has scope to cause havoc. $^X is probably safer in that regard. You might want to try quoting it using String::ShellQuote just for safety, since you can't bypass the shell in the midst of a pipeline.
Calling own perl binary seems nuts. Are there any better solutions?
Depends on your definition of "better". There's definitely another way around, which is to run both command1 and command2 separately, process command1's output in your original perl process, hand it to command2, and read command2's output. However, you have to be careful how you do it. There are two safe ways.
The first way is easier, but takes more time and memory: run command1 first, read and process all of its output into a string, then run command2 providing the buffered output as input. The best way to do this is to use IPC::Run to handle command2's input and output (and maybe both commands, just for consistency); if you try to just print all the data to command2's input handle and then read all the output you can deadlock, but IPC::Run will interleave reads and writes if necessary behind the scenes, and give you a nice easy interface.
The second way is more complicated but closer in behavior to the original: you need some kind of async framework like IO::Async or POE, using its classes for process construction, and set up handlers to communicate between them, do your filtering, and gather the output.
Here's a tested toy example of that (gist because it's a couple screenfuls of code) that does the equivalent of ls | perl -pe '$_ = uc' | rev, except with the middle part of the pipeline running in the parent perl process. You may never use it, but I thought it was worth illustrating.

Perl: After a successful system call, "or die" command still ends script

I am using the following line to make a simple system call which works:
system ("mkdir -p Purged") or die "Failed to mkdir." ;
Executing the script does make the system call and I can find a directory called Purged, but the error message is still printed and the script dies. What is wrong with my syntax?
system returns the exit status of the command it calls. In shell, zero exit status means success. You have to invert the logic:
0 == system qw(mkdir -p Purged) or die "Failed to create the dir\n";
That would be a little confusing, wouldn't it? - Leonardo Herrera on Ikegami's answer
Yes, it is confusing that the system command inverts true and false in Perl, and creates fun logic like this:
if ( system($command) ) {
die qq(Aw... It failed);
}
else {
say qq(Hooray! It worked!);
}
But, it's understandable why the system command does this. In Unix, an exit status of zero means the program worked, and a non-zero status could give you information why your system call failed. Maybe the program you were calling doesn't exist. Maybe the program sort of worked as expected. For example, grep returns an exit code of 1 when grep works, but there were no matching lines. You might want to distinguish when grep returns zero, one, or a return code greater than one. (Yes, I know it's silly to use a system call to grep in a Perl program, but that's the first example I could think of).
To prevent casual confusion, I create a variable that holds the exit status of my system command instead of testing the output of system directly:
my $error = system($command);
if ($error) {
die qq(Aw... It failed);
}
else {
say qq(Hooray! It worked!);
}
It's completely unnecessary, and people who work with Perl should know that system reverses Perl's definition of true and false, but if I hadn't had my coffee in the morning, I may miss it as I go over someone else's code. Doing this little step just makes the program look a bit more logical.
The perldoc for system gives you code that lets you test the return value of your system command to see exactly what happened (whether there was an error, or a signal killed the command). It's nice to know if you need to interrogate your system return value to figure out what went wrong.
system returns 0 on success, so you want and rather than or.
See also: use autodie qw( system );
To add what hasn't been mentioned but what comes with it and can quietly mess things up.
The system's return of 'zero-or-not' only tells whether the command itself executed successfully, by shell or execvp; the 0 does not mean that the command succeeded in what it was doing.†
In case of non-zero return you need to unpack $? for more information;† see system for how to do this. For one, the command's actual exit code is $? >> 8, what the executed program was designed to communicate at its exit.
Altogether you may want to do something to the effect of
use Carp ();
sub run_system {
my ($cmd, @other_args) = @_; # may do more via @other_args
my ($cmdref, $sys_ret) = (ref $cmd, 0); # LIST or scalar invocation?
if ($cmdref eq 'ARRAY') {
$sys_ret = system(@$cmd);
}
elsif (not $cmdref) {
$sys_ret = system($cmd);
}
else { Carp::carp "Invocation error, got $cmdref" }
return 1 if $sys_ret == 0;
# Still here? The rest is error handling.
# (Or handling of particular non-zero returns; see text footnote)
Carp::carp "Trouble with 'system($cmd)': $?";
print "Got exit " . ($? >> 8) . " from $cmd\n";
return 0; # or Carp::croak (but then design changes)
}
This can be done merely as system == 0 or do { ... }; or such right in the code, but this way there's a little library, where we can refine error handling more easily, eventually decide to switch to a module to manage external commands, etc. Also, being a user-level facility now it makes sense that it returns true on success (we aren't changing system's design).
Additionally in this case, mkdir -p is meant to be quiet and it may not say anything when it cannot do its thing, in some cases. This is by design of course, but one should be aware of it.
† A program may return whatever it wants; nothing requires of it to return zero if it succeeded (completed without error) and it does not have to follow any conventions; what it returns is entirely up to its design.
Typical Unix programs generally follow the convention of returning non-zero only if there were problems, but some do return non-zero values merely to indicate a particular condition. See for example here and here.
One example: for at least some versions, zip returns non-zero (12) when it "has nothing to do" (nothing to update in a package for example), even though it completes successfully; so we get 3072 from system in that case (when that 12 is packed into high bits of a number which is the return code). This is documented on its man page.
A zero return, though, normally does imply a successful operation, which is the main point here.
Simplest
system ("mkdir -p Purged") && die "Failed to mkdir.";
The system command passes through the exit code of the program that it ran. It doesn't judge or interpret it. Raku, on the other hand, does interpret it and you have to fight the language to get around it. And, remember that Perl's system is basically C's system (and many things in Perl are just what C's version does).
You need to compare this to the value you expected based on your knowledge of that particular command's behavior, as shown in the system docs:
@args = ("command", "arg1", "arg2");
system(@args) == 0
or die "system @args failed: $?";
Usually, well made unix programs use 0 to indicate success and non-zero for everything else. That's a good rule of thumb, but it's a convention rather than a requirement. Some programs return non-zero values to indicate other sorts of success. system knows none of that. Not only that, many programmers have no idea what an exit code should be, and I've often had to deal with someone's reinvented idea of what that should be (and I've probably done that myself at some point).
There's been a growing trend of "trust and don't verify", which I think is the new generation of programmers not having systems and shell programming backgrounds. And, even then, there's how the world should be and how it actually is. Remember, your goal as the programmer is to know what happened and why, even if the world is messy.
Here are a few interesting links:
Well Behaved Non-Zero Exit Codes?
Non-zero exit status for clean exit
Are there any standard exit status codes in Linux?
I like this statement from Frederick the best, though:
More realistically, 0 means success or maybe failure, 1 means general failure or maybe success, 2 means general failure if 1 and 0 are both used for success, but maybe success as well.
Consider grep. Here's a file to search:
$ cat original.txt
This is the original
This is from append.txt
This is from append.txt
This is from append.txt
Search for something that exists:
$ grep original original.txt
This is the original
The exit code is 0 ($? on the shell, but Perl's $? is more complicated). That's what you expect by convention. You found something and there were no errors:
$ echo $?
0
But now search for something that doesn't exist. You don't find something but there were no errors:
$ grep foo original.txt
Now the exit code is 1, because that's what grep does:
$ echo $?
1
This is non-zero but not an error. You called grep correctly and it produced the correct and expected output.
It's not up to system to judge those exit values. It passes through that exit code and lets you interpret them.
Now, we get lazy and assume that 0 means that it worked. We do that because many people are used to seeing it and it works most of the time. But, "works most of the time" is different than "won't bite me in the ass". Compare the result of system to exactly what you will accept as success:
my $expected = 0;
system('...') == $expected or die ...
If the task's definition of success is that grep finds matching lines, that code works fine because the exit code in that case happens to be zero. But, that accidentally works because two different ideas happen to align at that one point.
But, what you accept as success is not the same thing as the exit code. Consider that grep example. So far you know that both 0 and 1 are normal operation. Both of those mean that grep did not encounter an error. That's different than calling grep incorrectly and getting a different exit value:
$ grep -X foo original.txt
grep: original.txt: Undefined error: 0
$ echo $?
2
That 2 is conventionally used to signal that you called the program incorrectly. For example, you are using switches that exist on one implementation of a tool but not on another (there are things other than Linux and GNU, though fewer of them).
You might end up with something like this:
my %acceptable = map { $_ => 1 } qw(0 1);
my $rc = system(...) >> 8; # system returns the wait status; the exit code is in the high byte
die ... unless exists $acceptable{$rc};
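A runnable sketch of that allow-list idea follows. run_allowing is a hypothetical helper name, and a child perl ($^X) exiting with 1 stands in for a grep that found nothing. Note that system returns the raw wait status, so the sketch shifts it right by 8 bits before looking the exit code up.

```perl
use strict;
use warnings;

# run_allowing is a hypothetical helper: run a command in list form
# (no shell) and die unless its exit code is one the caller accepts.
sub run_allowing {
    my ($allowed, @cmd) = @_;
    my %acceptable = map { $_ => 1 } @$allowed;

    my $status = system(@cmd);
    die "could not start '$cmd[0]': $!\n" if $status == -1;
    die "'$cmd[0]' was killed by signal " . ($status & 127) . "\n"
        if $status & 127;

    my $exit_code = $status >> 8;
    die "'$cmd[0]' exited with unexpected code $exit_code\n"
        unless $acceptable{$exit_code};
    return $exit_code;
}

# Stand-in for "grep found no match": a child perl that exits with 1.
my $code = run_allowing([0, 1], $^X, '-e', 'exit 1');
print "accepted exit code $code\n";    # prints "accepted exit code 1"
```

Keeping the accepted codes next to the call documents what "success" means for that particular command.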
And, the die can pass through that exit code (see its method for choosing a value):
$ perl -e 'system( "grep -X foo original.txt" ) == 0 or die'
grep: original.txt: Undefined error: 0
Died at -e line 1.
$ echo $?
2
Sure, that's not as pretty as the single statement, but pretty shouldn't trump correct.
But that's not even good enough. My linux version of grep says it will exit with 0 on error in certain conditions:
Normally the exit status is 0 if a line is selected, 1 if no lines were selected, and 2 if an error occurred. However, if the -q or --quiet or --silent is used and a line is selected, the exit status is 0 even if an error occurred.
Going a step further, there's a list of conventional exit codes in sysexits.h that signal very particular conditions. For example, that exit code of 2 gets the symbol EX_USAGE and indicates a problem with the way the command was formed.
When I'm writing system code, I'm hard core on proper exit codes so other programs can know what happened without parsing error output. The die called by the user without a failure in a system will probably return 255. That's not very useful for other things to figure out what went wrong.
A one-line solution:
system("if printf '' > tmp.txt; then exit 1; else exit 0; fi ;") or die("unable to clobber tmp.txt");

Using perl's `system`

I would like to run some command (e.g. command) using perl's system(). Suppose command is run from the shell like this:
command --arg1=arg1 --arg2=arg2 -arg3 -arg4
How do I use system() to run command with these arguments?
Best practices: avoid the shell, use automatic error handling - IPC::System::Simple.
require IPC::System::Simple;
use autodie qw(:all);
system qw(command --arg1=arg1 --arg2=arg2 -arg3 -arg4);
use IPC::System::Simple qw(runx);
runx [0], qw(command --arg1=arg1 --arg2=arg2 -arg3 -arg4);
# ↑ list of allowed EXIT_VALs, see documentation
Edit: a rant follows.
eugene y's answer includes a link to the documentation for system. There we can see a humongous piece of code that needs to be included every time to do system properly. eugene y's answer shows but a part of it.
Whenever we are in such a situation, we bundle up the repeated code in a module. I draw parallels to proper no-frills exception handling with Try::Tiny, however IPC::System::Simple as system done right did not see this quick adoption from the community. It seems it needs to be repeated more often.
So, use autodie! Use IPC::System::Simple! Save yourself the tedium, be assured that you use tested code.
my @args = qw(command --arg1=arg1 --arg2=arg2 -arg3 -arg4);
system(@args) == 0 or die "system @args failed: $?";
More information is in perldoc.
As with everything in Perl, there's more than one way to do it :)
The best way: pass the arguments as a list:
system("command", "--arg1=arg1","--arg2=arg2","-arg3","-arg4");
Though sometimes programs don't seem to play nice with that version (especially if they expect to be invoked from a shell). Perl will invoke the command from the shell if you do it as a single string.
system("command --arg1=arg1 --arg2=arg2 -arg3 -arg4");
But that form is slower.
my @args = ( "command", "--arg1=arg1", "--arg2=arg2", "-arg3", "-arg4" );
system(@args);

What's the difference between system, exec, and backticks in Perl?

In Perl, to run another Perl script from my script, or to run any system commands like mv, cp, pkgadd, pkgrm, pkginfo, rpm etc, we can use the following:
system()
exec()
`` (Backticks)
Are all the three the same, or are they different? Do all the three give the same result in every case? Are they used in different scenarios, like to call a Perl program we have to use system() and for others we have to use ``(backticks).
Please advise, as I am currently using system() for all the calls.
They're all different, and the docs explain how they're different. Backticks capture and return output; system returns an exit status, and exec never returns at all if it's successful.
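A small side-by-side sketch of the three, assuming only that the running perl ($^X) can be launched as a child process. The child programs are throwaway one-liners invented for the illustration.

```perl
use strict;
use warnings;

# system: the child shares our STDOUT; the return value is the wait status.
my $status      = system($^X, '-e', 'print "from system\n"; exit 2');
my $system_exit = $status >> 8;

# Backticks (qx): the child's STDOUT comes back as a string,
# and its wait status lands in $?.
my $output  = qx{"$^X" -e "print 42"};
my $qx_exit = $? >> 8;

# exec: replaces the calling process, so nothing after it runs. It is
# demonstrated inside a child perl so that this script itself survives.
system($^X, '-e', 'exec $^X, "-e", "exit 5"; die "not reached"');
my $exec_exit = $? >> 8;

print "system exit: $system_exit\n";
print "qx output: $output (exit $qx_exit)\n";
print "exec'd program exit: $exec_exit\n";
```

The exit code of 5 in the last case belongs to the program that exec started; the die after exec never runs.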
IPC::System::Simple is probably what you want.
It provides safe, portable alternatives to backticks, system() and other IPC commands.
It also allows you to avoid the shell for most of said commands, which can be helpful in some circumstances.
The best option is to use some module, either in the standard library or from CPAN, that does the job for you. It's going to be more portable, and possibly even faster for quick tasks (no forking to the system).
However, if that's not good enough for you, you can use one of those three, and no, they are not the same. Read the perldoc pages on system(), exec(), and backticks to see the difference.
Calling system is generally a mistake. For instance, instead of saying
system("mv $foo /tmp") == 0
or die "could not move $foo to /tmp";
system("cp $foo /tmp") == 0
or die "could not copy $foo to /tmp";
you should say
use File::Copy;
move $foo, "/tmp"
or die "could not move $foo to /tmp: $!";
copy $foo, "/tmp"
or die "could not copy $foo to /tmp: $!";
Look on CPAN for modules that handle other commands for you. If you find yourself writing a lot of calls to system, it may be time to drop back into a shell script instead.
Well, the more people the more answers.
My answer is to generally avoid executing external commands. If you can, use modules. There is no point in executing "cp", "mv" and a lot of other commands: modules exist that do the same job. And the benefit of using modules is that they usually work cross-platform, while your system("mv") might not.
When put in situation that I have no other way, but to call external command, I prefer to use IPC::Run. The idea is that all simplistic methods (backticks, qx, system, open with pipe) are inherently insecure, and require attention to parameters.
With IPC::Run, I run commands like I would with system( @array ), which is much more secure, and I can bind to stdin, stdout and stderr separately, using variables or callbacks, which is very cool when you have to pass data to an external program from long-running code.
Also, it has built-in handling of timeouts, which has come in handy more than once :)
If you don't want the shell getting involved (usually you don't) and if waiting for the system command is acceptable, I recommend using IPC::Run3. It is simple, flexible, and does the common task of executing a program, feeding it input and capturing its output and error streams right.

When is the right time (and the wrong time) to use backticks?

Many beginning programmers write code like this:
sub copy_file ($$) {
my $from = shift;
my $to = shift;
`cp $from $to`;
}
Is this bad, and why? Should backticks ever be used? If so, how?
A few people have already mentioned that you should only use backticks when:
You need to capture (or suppress) the output.
There exists no built-in function or Perl module to do the same task, or you have a good reason not to use the module or built-in.
You sanitise your input.
You check the return value.
Unfortunately, things like checking the return value properly can be quite challenging. Did it die to a signal? Did it run to completion, but return a funny exit status? The standard ways of trying to interpret $? are just awful.
I'd recommend using the IPC::System::Simple module's capture() and system() functions rather than backticks. The capture() function works just like backticks, except that:
It provides detailed diagnostics if the command doesn't start, is killed by a signal, or returns an unexpected exit value.
It provides detailed diagnostics if passed tainted data.
It provides an easy mechanism for specifying acceptable exit values.
It allows you to call backticks without the shell, if you want to.
It provides reliable mechanisms for avoiding the shell, even if you use a single argument.
The commands also work consistently across operating systems and Perl versions, unlike Perl's built-in system() which may not check for tainted data when called with multiple arguments on older versions of Perl (eg, 5.6.0 with multiple arguments), or which may call the shell anyway under Windows.
As an example, the following code snippet saves the results of a call to perldoc into a scalar, avoids the shell, and throws an exception if the page cannot be found (since perldoc returns 1).
#!/usr/bin/perl -w
use strict;
use IPC::System::Simple qw(capture);
# Make sure we're called with command-line arguments.
@ARGV or die "Usage: $0 arguments\n";
my $documentation = capture('perldoc', @ARGV);
IPC::System::Simple is pure Perl, works on 5.6.0 and above, and doesn't have any dependencies that wouldn't normally come with your Perl distribution. (On Windows it depends upon a Win32:: module that comes with both ActiveState and Strawberry Perl).
Disclaimer: I'm the author of IPC::System::Simple, so I may show some bias.
The rule is simple: never use backticks if you can find a built-in to do the same job, or if there is a robust module on the CPAN which will do it for you. Backticks often rely on unportable code and even if you untaint the variables, you can still open yourself up to a lot of security holes.
Never use backticks with user data unless you have very tightly specified what is allowed (not what is disallowed -- you'll miss things)! This is very, very dangerous.
Backticks should be used if and only if you need to capture the output of a command. Otherwise, system() should be used. And, of course, if there's a Perl function or CPAN module that does the job, this should be used instead of either.
In either case, two things are strongly encouraged:
First, sanitize all inputs: Use Taint mode (-T) if the code is exposed to possible untrusted input. Even if it's not, make sure to handle (or prevent) funky characters like space or the three kinds of quote.
Second, check the return code to make sure the command succeeded. Here is an example of how to do so:
my $cmd = "./do_something.sh foo bar";
my $output = `$cmd`;
if ($?) {
die "Error running [$cmd]";
}
Another way to capture stdout (in addition to the pid and exit code) is to use IPC::Open3, possibly negating the use of both system and backticks.
Use backticks when you want to collect the output from the command.
Otherwise system() is a better choice, especially if you don't need to invoke a shell to handle metacharacters or command parsing. You can avoid that by passing a list to system(), eg system('cp', 'foo', 'bar') (however you'd probably do better to use a module for that particular example :))
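To see what the list form buys you, here is a minimal sketch. The argument values are made up for the demonstration, and the child is simply the running perl ($^X) reporting how many arguments it received through its exit code.

```perl
use strict;
use warnings;

# Arguments that a shell would split or interpret.
my @args = ('name with spaces', 'a;b', '$HOME');

# List form: no shell is involved, so each element arrives in the
# child as exactly one argument, metacharacters and all.
my $status = system($^X, '-e', 'exit scalar @ARGV', @args);
my $argc   = $status >> 8;

print "child received $argc arguments\n";    # prints "child received 3 arguments"
```

Had the same values been interpolated into a single command string, the shell would have split the first on spaces and treated the semicolon as a command separator.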
In Perl, there's always more than one way to do anything you want. The primary point of backticks is to get the standard output of the shell command into a Perl variable. (In your example, anything that the cp command prints will be returned to the caller.) The downside of using backticks in your example is you don't check the shell command's return value; cp could fail and you wouldn't notice. You can check this with the special Perl variable $?. When I want to execute a shell command, I tend to use system:
system("cp $from $to") == 0
or die "Unable to copy $from to $to!";
(Also observe that this will fail on filenames with embedded spaces, but I presume that's not the point of the question.)
Here's a contrived example of where backticks might be useful:
my $user = `whoami`;
chomp $user;
print "Hello, $user!\n";
For more complicated cases, you can also use open as a pipe:
open WHO, "who|"
or die "who failed";
while(<WHO>) {
# Do something with each line
}
close WHO;
From the "perlop" manpage:
That doesn't mean you should go out of your way to avoid backticks when they're the right way to get something done. Perl was made to be a glue language, and one of the things it glues together is commands. Just understand what you're getting yourself into.
For the case you are showing using the File::Copy module is probably best. However, to answer your question, whenever I need to run a system command I typically rely on IPC::Run3. It provides a lot of functionality such as collecting the return code and the standard and error output.
Whatever you do, as well as sanitising input and checking the return value of your code, make sure you call any external programs with their explicit, full path. e.g. say
my $user = `/bin/whoami`;
or
my $result = `/bin/cp $from $to`;
Saying just "whoami" or "cp" runs the risk of accidentally running a command other than what you intended, if the user's path changes - which is a security vulnerability that a malicious attacker could attempt to exploit.
Your example's bad because there are perl builtins to do that which are portable and usually more efficient than the backtick alternative.
They should be used only when there's no Perl builtin (or module) alternative. This is both for backticks and system() calls. Backticks are intended for capturing output of the executed command.
Backticks are only supposed to be used when you want to capture output. Using them here "looks silly." It's going to clue anyone looking at your code into the fact that you aren't very familiar with Perl.
Use backticks if you want to capture output.
Use system if you want to run a command. One advantage you'll gain is the ability to check the return status.
Use modules where possible for portability. In this case, File::Copy fits the bill.
In general, it's best to use system instead of backticks because:
system encourages the caller to check the return code of the command.
system allows "indirect object" notation, which is more secure and adds flexibility.
Backticks are culturally tied to shell scripting, which might not be common among readers of the code.
Backticks use minimal syntax for what can be a heavy command.
One reason users might be tempted to use backticks instead of system is to hide STDOUT from the user. This is more easily and flexibly accomplished by redirecting the STDOUT stream:
my $cmd = 'command > /dev/null';
system($cmd) == 0 or die "system $cmd failed: $?";
Further, getting rid of STDERR is easily accomplished:
my $cmd = 'command 2> error_file.txt > /dev/null';
In situations where it makes sense to use backticks, I prefer to use the qx{} in order to emphasize that there is a heavy-weight command occurring.
On the other hand, having Another Way to Do It can really help. Sometimes you just need to see what a command prints to STDOUT. Backticks, when used as in shell scripts are just the right tool for the job.
Perl has a split personality. On the one hand it is a great scripting language that can replace the use of a shell. In this kind of one-off I-watching-the-outcome use, backticks are convenient.
When used as a programming language, backticks are to be avoided. They bring a lack of error checking and, if the separate program that backticks execute can be avoided, efficiency is gained.
Aside from the above, the system function should be used when the command's output is not being used.
Backticks are for amateurs. The bullet-proof solution is a "Safe Pipe Open" (see "man perlipc"). You exec your command in another process, which allows you to first futz with STDERR, setuid, etc. Advantages: it does not rely on the shell to parse @ARGV, unlike open("$cmd $args|"), which is unreliable. You can redirect STDERR and change user privileges without changing the behavior of your main program. This is more verbose than backticks but you can wrap it in your own function like run_cmd($cmd,@args);
sub run_cmd {
my $cmd = shift @_;
my @args = @_;
my $fh; # file handle
my $pid = open($fh, '-|'); # fork; the child's STDOUT is piped to $fh
defined($pid) or die "Could not fork: $!";
if ($pid == 0) {
open STDERR, '>', '/dev/null';
# setuid() if necessary
exec ($cmd, @args) or exit 1;
}
while (<$fh>) {
# Have fun with the output of $cmd
}
# Read first, then reap: waiting before reading can deadlock once the pipe fills up.
close $fh; # waits for the child and sets $? (may want to time out here?)
if ($? >> 8) { die "Error running $cmd: [$?]"; }
}