Perl Detecting DESTROY as result of die() - perl

I have an object which I do lots of stuff to, and then finish up by calling a TidyUp() method which displays some stats about what happened.
As part of internal testing, I wanted to detect when these objects were being created, but not tidied up before the program exited, and in that case die.
So I added a flag to the object which is set in TidyUP(), and a DESTROY sub-routine for the class that dies if this flag isn't set.
This all works fine, but it has an unfortunate side-effect. If something else goes wrong in the program, which causes die to be called, then the global destruction fires off my DESTROY, which notices that the object hasn't been tidied and dies. The problem is that I lose the original die message at this point.
So I'd like to detect whether the destruction occurred as a result of a die, or just normal program exit, and only in the latter case add my new die.
I had a look at Detecting global destruction in Perl but I don't think that helps as I want to detected what triggered global destruction.

You could set a global flag before you check whether your objects are tidied up. Then you know in which phase your program is running. You could also try to register a callback when your program dies using $SIG{__DIE__}. Checking $? is not safe since it could be set by other means. Checking whether you're in global destruction should work, too. But the cleanest solution would be to store all objects that have to be tidied up in an extra array. Then you can loop over these objects whenever you want and do away with using DESTROY.

From playing around with this problem, I see that the call stack in the DESTROY method is slightly different depending on how the program exits. Maybe that could help:
package Foo;
sub new { bless [], __PACKAGE__ }
sub DESTROY {
my ($n,$pkg,$file,$line);
while (my #c=caller($n++)) {
($pkg,$file,$line) = #c;
}
print STDERR "DESTROY called via $file line $line\n";
}
my $foo = Foo->new;
if ($ARGV[0] eq 'die') { die } # line 11
if ($ARGV[0] eq 'exit') { exit } # line 12
# else passive exit
$ perl 14255585.pl die
Died at 14255585.pl line 11.
DESTROY called via 14255585.pl line 11
$ perl 14255585.pl exit
DESTROY called via 14255585.pl line 12
$ perl 14255585.pl foo
DESTROY called via 14255585.pl line 0
If the list of exit points in your program is small and well-defined, you could just enumerate them and handle them when the program ends. Otherwise, you could do some on-the-fly static code analysis to see what the likely cause of death is.

You can hook into $SIG{__DIE__}, the global exception handler:
#!/usr/bin/perl
use Modern::Perl;
package Foo;
my $DIED = 0;
BEGIN { $SIG{__DIE__} = sub { $DIED = 1 }; }
sub new { bless [0] }
sub DESTROY { die 'untidy' unless shift->[0] or $DIED }
sub tidy_up { shift->[0] = 1 }
package main;
my $foo = new Foo;
die if #ARGV; # line 13
$foo->tidy_up();
say 'success';
$ perl test.pl
success
$ perl test.pl die
Died at test.pl line 13.
Since that exception handler is installed globally, make sure you don't override an existing error handler. Signals::XSIG helps with that.

Related

Why does Perl's IO::Pipe exception behave differently than croak or die in eval block?

I noticed in my program that an exception raised from IO::Pipe was behaving oddly, and I cannot figure out what it's doing (let alone how it's doing it). I've boiled it down to a simple example program:
use strict;
use warnings;
use Carp;
use IO::Pipe;
my($path) = shift;
my($bad) = shift || "";
eval {
if ($path =~ m{pipe}i) {
my($bin) = ($bad ? "/bin/lsddd" : "/bin/ls");
my($pipe) = IO::Pipe->new();
$pipe->reader("$bin -l .");
print "$_" while <$pipe>;
$pipe->close;
}
elsif ($path =~ m{croak}i) {
croak "CROAKED" if $bad;
}
else {
die "DIED" if $bad;
}
};
if ($#) {
my($msg) = $#;
die "Caught Exception: $msg\n";
}
die "Uh-oh\n" if $bad;
print "Made it!\n";
The example program takes two arguments, one to indicate which code path to go down inside the eval block, and the second to indicate whether or not to generate an error (anything that evaluates to false will not generate an error). All three paths behave as expected when no error is requested; they all print Made it! with no error messages.
When asking for an error and running through the croak or die paths, it also behaves as I expect: the exception is caught, reported, and the program terminates.
$ perl example.pl die foo
Caught Exception: DIED at example.pl line 23.
and
$ perl example.pl croak foo
Caught Exception: CROAKED at example.pl line 11.
eval {...} called at example.pl line 10
When I send an error down the IO::Pipe path, though, it reports an error, but the program execution continues until the outer die is reached:
$ perl example.pl pipe foo
Caught Exception: IO::Pipe: Cannot exec: No such file or directory at example.pl line 15.
Uh-oh
The first question is why -- why does the program report the "Caught Exception" message but not terminate? The second question is how do I prevent this from happening? I want the program to stop executing if the program can't be run.
There are two processes running after the eval in the case of interest. You can see this by adding a print statement before if ($#). One drops through eval and thus gets to the last die.
The reader forks when used with an argument, to open a process. That process is exec-ed in the child while the parent returns, with its pid. The code for this is in _doit internal subroutine
When this fails the child croaks with the message you get. But the parent returns regardless as it has no IPC with the child, which is expected to just disappear via exec. So the parent escapes and makes its way down the eval. That process has no $# and bypasses if ($#).
It appears that this is a hole in error handling, in the case when reader is used to open a process.
There are ways to tackle this. The $pipe is an IO::Handle and we can check it and exit that extra process if it's bad (but simple $pipe->error turns out to be the same in both cases). Or, since close is involved, we can go to $? which is indeed non-zero when error happens
# ...
$pipe->close;
exit if $? != 0;
(or rather first examine it). This is still a "fix," which may not always work. Other ways to probe the $pipe, or to find PID of the escapee, are a bit obscure (or worse, digging into class internals).
On the other hand, a simple way to collect the output and exit code from a program is to use a module for that. A nice pick is Capture::Tiny. There are others, like IPC::Run and IPC::Run3, or core but rather low-level IPC::Open3.
Given the clarifications, the normal open should also be adequate.

Under what circumstances are END blocks skipped in Perl?

I have a long-running program that used File::Temp::tempdir to create a temporary file and sometimes interrupted it via ^C.
The following program prints the name of the temporary directory it creates and the name of a file in it.
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw[tempdir];
my $dir = tempdir(CLEANUP => 1);
print "$dir\n";
print "$dir/temp.txt\n";
`touch $dir/temp.txt`;
exit;
On OS X, this creates a directory inside /var/folders
If the last line is exit; or die;, then the folder will get cleaned up and the temporary file inside it will get deleted.
However, if we replace the last line with sleep 20; and then interrupt the perl program via ^C, the temporary directory remains.
% perl maketemp.pl
/var/folders/dr/cg4fl5m11vg3jfxny3ldfplc0000gn/T/ycilyLSFs6
/var/folders/dr/cg4fl5m11vg3jfxny3ldfplc0000gn/T/ycilyLSFs6/temp.txt
^C
% stat /var/folders/dr/cg4fl5m11vg3jfxny3ldfplc0000gn/T/ycilyLSFs6/temp.txt
16777220 6589054 -rw-r--r-- 1 <name> staff 0 0 "Aug 1 20:46:27 2016" "Aug 1 20:46:27 2016" "Aug 1 20:46:27 2016" "Aug 1 20:46:27 2016" 4096 0 0
/var/folders/dr/cg4fl5m11vg3jfxny3ldfplc0000gn/T/ycilyLSFs6/temp.txt
%
using a signal handler that just calls exit; does clean up the directory. E.g.
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw[tempdir];
$SIG{INT} = sub { exit; };
my $dir = tempdir(CLEANUP => 1);
print "$dir\n";
print "$dir/temp.txt\n";
`touch $dir/temp.txt`;
sleep 20;
As does using a "trivial" signal handler
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw[tempdir];
$SIG{INT} = sub { };
my $dir = tempdir(CLEANUP => 1);
print "$dir\n";
print "$dir/temp.txt\n";
`touch $dir/temp.txt`;
sleep 20;
I tried looking through the source code (https://github.com/Perl-Toolchain-Gang/File-Temp/blob/master/lib/File/Temp.pm) to determine how tempdir is registering a cleanup action
Here's the exit handler installation
https://github.com/Perl-Toolchain-Gang/File-Temp/blob/master/lib/File/Temp.pm#L1716
which calls _deferred_unlink
https://github.com/Perl-Toolchain-Gang/File-Temp/blob/master/lib/File/Temp.pm#L948
which modified the global hashes %dirs_to_unlink and %files_to_unlink, but uses the pid $$ as a key for some reason (probably in case the Perl interpreter forks? Not sure why that's necessary though since removing a directory seems like it would be an idempotent operation.)
The actual logic to clean up the files is here, in the END block.
https://github.com/Perl-Toolchain-Gang/File-Temp/blob/master/lib/File/Temp.pm#L878
A quick experiment shows that END blocks are indeed run when perl has exited normally or abnormally.
sleep 20;
END {
print "5\n";
}
# does not print 5 when interrupted
And are run here
$SIG{INT} = sub {};
sleep 20;
END {
print "5\n";
}
# does print 5 when interrupted
So ... why does the END block get skipped after a SIGINT unless there's a signal handler, even one that seems like it should do nothing?
By default, SIGINT kills the process[1]. By kill, I mean the process is immediately terminated by the kernel. The process doesn't get to perform any cleanup.
By setting a handler for SIGINT, you override this behaviour. Instead of killing the process, the signal handler is called. It might not do anything, but its very existence prevented the process from being killed. In this situation, the program won't exit as a result of the signal unless it chooses to exit (by calling die or exit in the handler. If it does, it would get a chance to cleanup as normal.
Note that if a signal for which a handler was defined comes in during a system call, the system call exits with error EINTR in order to allow the program to safely handle the signal. This is why sleep returns as soon as SIGINT is received.
If instead you had used $SIG{INT} = 'IGNORE';, the signal would have been completely ignored. Any systems calls in progress won't be interrupted.
On my system, man 1 kill lists the default actions of signals.
Your signal handler $SIG{INT} = sub {} isn't doing nothing, it is trapping the signal and preventing the program from exiting.
But to answer your original question, END blocks, as perlmod says:
is executed as late as possible, that is, after perl has finished running the program and just before the interpreter is being exited, even if it is exiting as a result of a die() function. (But not if it's morphing into another program via exec, or being blown out of the water by a signal--you have to trap that yourself (if you can).)
That is, a fatal signal, if not trapped, circumvents Perl's global destruction and does not call END blocks.

In Perl is there a way to restart the program currently running from within itself?

I am running a program in Perl that at one point evaluates data in an if statement called from within a subroutine, e.g.
sub check_good {
if (!good) {
# exit this subroutine
# restart program
}
else {
# keep going
}
} # end sub
The problem I have is with exiting and restarting. I know that I can just use exit 0; to exit straight out, but obviously this is not correct if I want to go back to the beginning. I tried calling the subroutine which essentially starts the program, but of course once it has run it will go back to this point again.
I thought about putting it in a while loop, but this would mean putting the whole file in the loop and it would be very impractical.
I don't actually know whether this is possible, so any input would be great.
If you have not changed #ARGV, or you keep a copy of it, you could possibly do something like exec($^X, $0, #ARGV).
$^X and $0 (or $EXECUTABLE_NAME and $PROGRAM_NAME, see Brian's comment below) are the current perl interpreter and current perl script, respectively.
An alternative would be to always have two processes: A supervisor and a worker.
Refactor all your logic into a subroutine called run(or main or whatever). Whn your real logic detect that it needs to restart it should exit with a predefined non-zero exit code (like 1 for example).
Then your main script and supervisor would look like this:
if (my $worker = fork) {
# child process
run(#ARGV);
exit 0;
}
# supervisor process
waitpid $worker;
my $status = ($? >> 8);
if ($status == 1) { ... restart .. }
exit $status; # propagate exit code...
In the simple scenario where you just want to restart once, this might be a bit overkill. But if you at any point need to be able to handle other error scenarios this method might be preferable.
For example if the exit code is 255, this indicates that the main script called die(). In this case you might want to implement some decision procedure wether to restart the script, ignore the error, or escalate the issue.
There are quite a few modules on CPAN implementing such supervisors. Proc::Launcher is one of them and the manual page includes a extensive discussion of related works. (I have never used Proc::Launcher, it is mainly due to this discussion I'm linking to it)
There's nothing to stop you calling system on yourself. Something like this (clearly in need of a tidy), where I pass in a command-line argument to prevent the code calling itself forever.
#!/usr/bin/perl
use strict;
use warnings;
print "Starting...\n";
sleep 5;
if (! #ARGV) {
print "Start myself again...\n";
system("./sleep.pl secondgo");
print "...and die now\n";
exit;
} elsif ((#ARGV) && $ARGV[0] eq "secondgo") {
print "Just going to die straightaway this time\n";
exit;
}

Why do I need to localize $# before using eval?

I'm aware of the fact that $# is a global variable, still I can't figure out why I need to localize it before using eval:
For instance:
eval { SOME_FUNC_THAT_MAY_DIE(); };
if ($#) {
print "An error occured!\n";
}
The only possible thing I can think of is, if some signal handler will call die at the same time I try to read $#, what am I missing here?
The reason to say local $# before calling eval is to avoid stepping on your caller's $#. It's rude for a subroutine to alter any global variables (unless that's one of the stated purposes of the subroutine). This isn't really an issue with top-level code (not inside any subroutine).
Also, on older Perl's, any eval called during object destruction would clobber the global $# (if the object was being destroyed because an exception was being thrown from an eval block) unless $# was localized first. This was fixed in 5.14.0, but many people are still running older Perls.
The Try::Tiny module documentation gives the rationale (as well as providing an alternative):
When you run an eval block and it succeeds, $# will be cleared, potentially clobbering an error that is currently being caught.
This causes action at a distance, clearing previous errors your caller may have not yet handled.
$# must be properly localized before invoking eval in order to avoid this issue.
More specifically, $# is clobbered at the beginning of the eval, which also makes it impossible to capture the previous error before you die (for instance when making exception objects with error stacks).
You don't need to, but if you wrote code like this, localizing $# would keep the first error as it was. and if you didn't write code like this, the local $# would have no effect. better would be to handle errors before running any extra code.
eval {
die "error 1\n";
};
foo();
print "processing $#\n";
sub foo {
#local $#;
eval {
die "error 2\n";
};
}

How can I prevent my perl script from terminating if an exception is thrown in a module it uses?

I have a perl script, using standard-as-dirt Net::HTTP code, and perl 5.8.8.
I have come across an error condition in which the server returns 0 bytes of data when I call:
$_http_connection->read_response_headers;
Unfortunately, my perl script dies, because the Net::HTTP::Methods module has a "die" on line 306:
Server closed connection without sending any data back at
/usr/lib/perl5/vendor_perl/5.8.8/Net/HTTP/Methods.pm line 306
And lines 305-307 are, of course:
unless (defined $status) {
die "Server closed connection without sending any data back";
}
How can I have my script "recover gracefully" from this situation, detecting the die and subsequently going into my own error-handling code, instead of dieing itself?
I'm sure this is a common case, and probably something simple, but I have not come across it before.
Using eval to catch exceptions can occasionally be problematic, especially pre 5.14. You can use Try::Tiny.
You can use eval { } to catch die() exceptions. Use $# to inspect the thrown value:
eval {
die "foo";
};
print "the block died with $#" if $#;
See http://perldoc.perl.org/functions/eval.html for details.
Customizing the die to mean something else is simple:
sub custom_exception_handler { ... } # Define custom logic
local $SIG{__DIE__} = \&custom_exception_handler; # Won't die now
# Calls custom_exception_handler instead
The big advantage of this approach over eval is that it doesn't require calling another perl interpreter to execute the problematic code.
Of course, the custom exception handler should be adequate for the task at hand.