When you use exec() in Perl:
Note that exec will not call your END blocks, nor will it invoke DESTROY methods on your objects.
How do I force perl to call END blocks anyway? Can I do something like END(); exec($0) or whatever?
I really am trying to make the program end its current instance and start a brand new instance of itself, and am too lazy to do this correctly (using cron or putting the entire program in an infinite loop). However, my END subroutines cleanup temp files and other important things, so I need them to run between executions.
Unhelpful links to code:
https://github.com/barrycarter/bcapps/blob/master/bc-metar-db.pl
https://github.com/barrycarter/bcapps/blob/master/bc-voronoi-temperature.pl
https://github.com/barrycarter/bcapps/blob/master/bc-delaunay-temperature.pl
So you're trying to execute a program within your script? exec probably isn't what you want then. exec behaves like the C exec: what gets called replaces your current process; to keep going, you'd have to do something like a fork to preserve your current process while executing another.
But good news! That all exists in the system builtin.
Does exactly the same thing as exec LIST, except that a fork is done first and the parent process waits for the child process to exit.
Here's what it looks like:
use 5.012; # or newer
use warnings;
... # some part of my program
system($my_command, $arg1, $arg2); # forks, execs, returns.
END {
# still gets called because you never left the script.
}
If you absolutely must use an exec, you'll have to call your cleanup routine yourself before the exec. To understand more about END, see perldoc perlmod for full details. The short of it: END is one of several types of code blocks that get executed at a particular stage in the execution of the script. They are NOT subroutines, but you can execute any code you want inside them, including a call to the same cleanup subroutine. So you can do:
sub cleanup { ... } # your cleanup code
sub do_exec {
cleanup();
exec( ... );
}
END {
cleanup();
}
and then you know your cleanup code will be executed at either script exit OR when you do your exec.
To answer the narrow question of how to invoke your END blocks at arbitrary times, you can use the B::end_av function to get at the array of END blocks and call the object_2svref method on each element to obtain a code reference.
sub invoke_end_blocks_before_exec {
    use B;
    my @ENDS = B::end_av->ARRAY;
    foreach my $END (@ENDS) {
        $END->object_2svref->();
    }
}
END { print "END BLOCK 1\n" }
END { print "END BLOCK 2\n" }
...
invoke_end_blocks_before_exec();
exec("echo leave this program and never come back");
Output:
END BLOCK 2
END BLOCK 1
leave this program and never come back
I would usually prefer something less magical, though. Why not a structure like this:
sub cleanup { ... }
END { &cleanup }
if (need_to_exec()) {
    cleanup();   # same thing END was going to do anyway
    exec( ... );
}
Fork and exec
It'll leave you with a new pid, but you could do a fork/exec:
my $pid = fork();
defined $pid or die "fork failed";
exit if $pid;                  # parent exits immediately, running its END blocks
exec($0) or die "exec failed"; # child execs and will not run END blocks (the parent already did, so that's OK)
This strikes me as far less fragile than mucking with internals or trying to make sure your exec is in the final END block.
Wrap your program
Also, it is trivial to just wrap your Perl program in a shell (or Perl) script that looks something like this:
#!/bin/sh
while sleep 300; do   # 5 minutes; GNU sleep also accepts "5m"
    perl your-program.pl
done
or
#!/usr/bin/perl
while (1) {
    system("perl your-program.pl");
    sleep(5 * 60);
}
Can you put your call to exec at the end of the (final) END block? Where your current call to exec is, set a flag and then exit. At the end of the END block, check the flag, and if it's true, call exec there. This way you can exit your script without restarting if necessary, and still have the END blocks execute.
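The flag-and-exit idea above might be sketched like this (the `$restart` flag and the `maybe_restart` helper are invented names for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical flag: set it where you would otherwise call exec(),
# then exit normally so every END block still runs first.
our $restart = 0;

sub maybe_restart {
    $restart = 1;
    exit 0;    # normal exit; END blocks fire in LIFO order
}

END {
    # Keep this as the FINAL END block in the file, so all other
    # END blocks have already run before the process is replaced.
    exec($^X, $0, @ARGV) if $restart;
}
```

Because END blocks run last-in, first-out, this block must appear first in the file if you want it to run after all other END blocks.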
That said, I'd recommend not implementing this type of process-level tail recursion.
Related
As I understand these special functions inside Perl code, BEGIN and CHECK blocks run during the compilation phase while INIT and END blocks run during actual execution phase.
I can understand using these blocks inside actual Perl code (Perl libraries) but what about using them inside modules? Is that possible?
Since when we use use <Module-name> the module is compiled, so in effect BEGIN and CHECK blocks run. But how will the INIT and END blocks run since module code I don't think is run in the true sense. We only use certain functions from inside the modules.
Short: the special code blocks in packages loaded via use are processed and run (or scheduled to run) as encountered, in the same way and order as in main::, since use itself is executed as a BEGIN block.
Excellent documentation on this can be found in perlmod. From this section
A BEGIN code block is executed as soon as possible, that is, the moment it is completely defined, even before the rest of the containing file (or string) is parsed.
Since the use statements are BEGIN blocks they run as soon as encountered. From use
It is exactly equivalent to
BEGIN { require Module; Module->import( LIST ); }
So the BEGIN blocks in a package run in-line with others, as they are encountered. The END blocks in a package are then also compiled in the same order, as well as the other special blocks. As for the order of (eventual) execution
An END code block is executed as late as possible ...
and
You may have multiple END blocks within a file--they will execute in reverse order of definition; that is: last in, first out (LIFO)
The order of compilation and execution of INIT and CHECK blocks follows suit.
Here is some code to demonstrate these special code blocks used in a package.
File PackageBlocks.pm
package PackageBlocks;
use warnings;
BEGIN { print "BEGIN block in the package\n" }
INIT { print "INIT block in the package\n" }
END { print "END block in the package\n" }
1;
The main script
use warnings;
BEGIN { print "BEGIN in main script.\n" }
print "Running in the main.\n";
INIT { print "INIT in main script.\n" }
use PackageBlocks;
END { print "END in main script.\n" }
BEGIN { print "BEGIN in main script, after package is loaded.\n" }
print "After use PackageBlocks.\n";
Output
BEGIN in main script.
BEGIN block in the package
BEGIN in main script, after package is loaded.
INIT in main script.
INIT block in the package
Running in the main.
After use PackageBlocks.
END in main script.
END block in the package
The BEGIN block in the package runs in order of appearance relative to the ones in main::, and before INIT. The END blocks run at the very end, and the one in the package runs after the one in main::, since the use comes before main's END block in this example and END blocks execute last-in, first-out.
This is very easy to test for yourself.
use Module (as well as require EXPR, do EXPR, and eval EXPR) compiles the Perl code and then immediately runs it.
That is where the 1; at the end of most modules comes in: if executing the module's code after compiling it doesn't return a true value, then require will fail.
Admittedly there usually isn't much use for an INIT or an END block inside a module, because the run-time phase is so intimately tied to the compilation, and because modules are generally about defining subroutines, but the option is there if you want it.
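A quick way to see the 1; requirement in action is to write a throwaway module whose last statement is false and try to require it (BadModule is a made-up name, written to a temp directory):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a module whose last evaluated statement is false (0)...
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/BadModule.pm" or die "open: $!";
print $fh "package BadModule;\nsub hello { 'hi' }\n0;\n";
close $fh;

# ...then require fails at run time, even though the code compiled fine:
unshift @INC, $dir;
my $ok = eval { require BadModule; 1 };
print $ok ? "loaded\n" : "require failed: $@";
# require dies with "... did not return a true value"
```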
For the temp directory I need in my script, I go with:
my $work_dir = File::Temp->newdir(TEMPLATE => "/tmp/work.$$.XXXX" ) or die "Cannot create tempdir directory $!";
My hope with newdir() is to get the promise of:
By default the directory is deleted when the object goes out of scope.
Only to learn that if I hit CTRL-C, $work_dir will NOT be removed.
So I added signals:
use sigtrap qw(handler signal_handler normal-signals);
and then I simply add use File::Path 'remove_tree'; along with the handler:
sub signal_handler
{
    remove_tree $work_dir;
}
This helps if I hit CTRL-C while my script runs...
However, it does NOT clean up while using the debugger, if I CTRL-C out of the debugger! If I quit out cleanly (with the q command) then it works/cleans fine, only if I CTRL-C out of the debug session, that's when $work_dir is not being removed.
Is it possible, in any way, to have the signal handler being called automatically even within a perl debug session ?
(or any other "proper" ways how to use/install signal handlers)
Your signal handler isn't doing what you think it does, because passing an object to remove_tree doesn't work:
use strict;
use warnings;
use 5.010;
use File::Path qw(remove_tree);
use File::Temp;
my $tmpdir = File::Temp->newdir(CLEANUP => 0);
remove_tree $tmpdir;
say "$tmpdir still exists" if -d $tmpdir;
Outputs:
/tmp/lTfotn79RD still exists
The call to remove_tree in your signal handler seems to work when run outside of the debugger, but it's actually not doing anything. (You can prove this to yourself by commenting out the call to remove_tree and re-running your script.) So why does the directory get removed?
If a signal handler doesn't exit or die, execution continues wherever it left off before the signal was caught. In your case, after the signal handler finishes, the program simply runs to completion. When the program terminates, any objects that are still in scope are cleaned up by calling their DESTROY methods. File::Temp->newdir returns a File::Temp::Dir object; this object's DESTROY method is what actually removes the directory from the filesystem (it uses rmtree under the hood).
This doesn't work when you interrupt the debugger; I'm not familiar with the debugger's internals, but I'm guessing it keeps references to objects so that DESTROY isn't called, even when you step off the end of the program. If you Ctrl+C again at this point, the object is never cleaned up, and neither is the temporary directory.
One way I found to work around this is to explicitly undef the object returned by File::Temp->newdir in your signal handler:
use strict;
use warnings;
use 5.010;
use File::Temp;
use sigtrap qw(handler cleanup normal-signals);
my $tmpdir = File::Temp->newdir;
sub cleanup {
    my ($sig) = @_;
    say "Caught signal SIG$sig";
    undef $tmpdir;
    exit 0;
}
This causes the DESTROY method to be called before the program exits, so you're not relying on the debugger to clean up. This seems like kind of a hack, though; why not just quit the debugger gracefully with q?
Note that you could also pass the stringified version of $tmpdir to remove_tree like this:
remove_tree "$tmpdir";
But I wouldn't recommend this, since the documentation strongly cautions against relying on file names:
For maximum security, endeavour always to avoid ever looking at, touching, or even imputing the existence of the filename. You do not know that that filename is connected to the same file as the handle you have, and attempts to check this can only trigger more race conditions. It's far more secure to use the filehandle alone and dispense with the filename altogether.
I like to use an END block. Then any clean exit from the program, and especially a q (quit) from the debugger, will trigger the END block and, in my case, delete all my test data.
So put your cleanup code in an END block and have your signal handler call exit() instead of remove_tree:
END {
    remove_tree $work_dir;
}

sub signal_handler
{
    exit();
}
I have an object which I do lots of stuff to, and then finish up by calling a TidyUp() method which displays some stats about what happened.
As part of internal testing, I wanted to detect when these objects were being created, but not tidied up before the program exited, and in that case die.
So I added a flag to the object which is set in TidyUp(), and a DESTROY sub-routine for the class that dies if this flag isn't set.
This all works fine, but it has an unfortunate side-effect. If something else goes wrong in the program, which causes die to be called, then the global destruction fires off my DESTROY, which notices that the object hasn't been tidied and dies. The problem is that I lose the original die message at this point.
So I'd like to detect whether the destruction occurred as a result of a die, or just normal program exit, and only in the latter case add my new die.
I had a look at Detecting global destruction in Perl but I don't think that helps as I want to detected what triggered global destruction.
You could set a global flag before you check whether your objects are tidied up. Then you know in which phase your program is running. You could also try to register a callback when your program dies using $SIG{__DIE__}. Checking $? is not safe since it could be set by other means. Checking whether you're in global destruction should work, too. But the cleanest solution would be to store all objects that have to be tidied up in an extra array. Then you can loop over these objects whenever you want and do away with using DESTROY.
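The "extra array" idea from the last suggestion might look like this sketch (the TidyRegistry package, the tidied field, and Widget are all invented for illustration; Scalar::Util::weaken keeps the registry from prolonging object lifetimes):

```perl
package TidyRegistry;
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Track every object that still needs tidying; check the list at a
# well-defined point instead of relying on DESTROY ordering.
my @live;

sub register {
    push @live, $_[0];
    weaken $live[-1];   # don't keep the objects alive ourselves
}

sub untidy {
    # In scalar context this grep returns the count of untidied objects.
    return grep { defined $_ && !$_->{tidied} } @live;
}

package main;

my $obj = bless { tidied => 0 }, 'Widget';
TidyRegistry::register($obj);
printf "untidy objects: %d\n", scalar TidyRegistry::untidy();   # 1
$obj->{tidied} = 1;   # what TidyUp() would set
printf "untidy objects: %d\n", scalar TidyRegistry::untidy();   # 0
```

You can then call TidyRegistry::untidy() at any well-defined exit point and die (with your own message intact) if anything was left untidied.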
From playing around with this problem, I see that the call stack in the DESTROY method is slightly different depending on how the program exits. Maybe that could help:
package Foo;
sub new { bless [], __PACKAGE__ }
sub DESTROY {
    my ($n, $pkg, $file, $line);
    while (my @c = caller($n++)) {
        ($pkg, $file, $line) = @c;
    }
    print STDERR "DESTROY called via $file line $line\n";
}
my $foo = Foo->new;
if ($ARGV[0] eq 'die') { die } # line 11
if ($ARGV[0] eq 'exit') { exit } # line 12
# else passive exit
$ perl 14255585.pl die
Died at 14255585.pl line 11.
DESTROY called via 14255585.pl line 11
$ perl 14255585.pl exit
DESTROY called via 14255585.pl line 12
$ perl 14255585.pl foo
DESTROY called via 14255585.pl line 0
If the list of exit points in your program is small and well-defined, you could just enumerate them and handle them when the program ends. Otherwise, you could do some on-the-fly static code analysis to see what the likely cause of death is.
You can hook into $SIG{__DIE__}, the global exception handler:
#!/usr/bin/perl
use Modern::Perl;
package Foo;
my $DIED = 0;
BEGIN { $SIG{__DIE__} = sub { $DIED = 1 }; }
sub new { bless [0] }
sub DESTROY { die 'untidy' unless shift->[0] or $DIED }
sub tidy_up { shift->[0] = 1 }
package main;
my $foo = Foo->new;
die if @ARGV; # line 13
$foo->tidy_up();
say 'success';
$ perl test.pl
success
$ perl test.pl die
Died at test.pl line 13.
Since that exception handler is installed globally, make sure you don't override an existing error handler. Signals::XSIG helps with that.
I am running a program in Perl that at one point evaluates data in an if statement called from within a subroutine, e.g.
sub check_good {
    if (!good) {
        # exit this subroutine
        # restart program
    }
    else {
        # keep going
    }
} # end sub
The problem I have is with exiting and restarting. I know that I can just use exit 0; to exit straight out, but obviously this is not correct if I want to go back to the beginning. I tried calling the subroutine which essentially starts the program, but of course once it has run it will go back to this point again.
I thought about putting it in a while loop, but this would mean putting the whole file in the loop and it would be very impractical.
I don't actually know whether this is possible, so any input would be great.
If you have not changed @ARGV, or you keep a copy of it, you could possibly do something like exec($^X, $0, @ARGV).
$^X and $0 (or $EXECUTABLE_NAME and $PROGRAM_NAME, see Brian's comment below) are the current perl interpreter and current perl script, respectively.
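For instance, a minimal sketch (restart_self is a made-up helper name; the point is to snapshot @ARGV before anything consumes it):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Snapshot the command line first, before Getopt::Long or shift
# has a chance to consume @ARGV.
my @original_argv = @ARGV;

sub restart_self {
    # $^X is the running perl binary; $0 is this script's path.
    exec($^X, $0, @original_argv)
        or die "exec failed: $!";
}
```

Remember that exec never returns on success, so as discussed above, any cleanup (including END blocks) has to happen before the call.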
An alternative would be to always have two processes: A supervisor and a worker.
Refactor all your logic into a subroutine called run (or main, or whatever). When your real logic detects that it needs to restart, it should exit with a predefined non-zero exit code (1, for example).
Then your main script and supervisor would look like this:
my $worker = fork;
defined $worker or die "fork failed: $!";
if ($worker) {
    # parent: the supervisor process
    waitpid $worker, 0;
    my $status = ($? >> 8);
    if ($status == 1) { ... restart .. }
    exit $status; # propagate exit code...
}

# child: the worker process
run(@ARGV);
exit 0;
In the simple scenario where you just want to restart once, this might be a bit overkill. But if you at any point need to be able to handle other error scenarios this method might be preferable.
For example, if the exit code is 255, this indicates that the main script called die(). In this case you might want to implement some decision procedure on whether to restart the script, ignore the error, or escalate the issue.
There are quite a few modules on CPAN implementing such supervisors. Proc::Launcher is one of them, and its manual page includes an extensive discussion of related works. (I have never used Proc::Launcher; it is mainly due to this discussion that I'm linking to it.)
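Put together as a loop, the supervisor might look like this sketch (the supervise helper and the exit-code convention of 1 meaning "restart me" are assumptions mirroring the answer above; the worker logic is passed in as a coderef so the sketch is self-contained):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();

# $run stands in for the real worker logic; it returns the exit code
# the worker process should exit with.
sub supervise {
    my ($run) = @_;
    while (1) {
        my $pid = fork;
        defined $pid or die "fork failed: $!";
        if ($pid == 0) {
            # child: run the worker and exit with its code.
            # POSIX::_exit avoids running the supervisor's END blocks
            # in the child.
            POSIX::_exit($run->());
        }
        waitpid $pid, 0;             # parent: wait and inspect status
        my $status = $? >> 8;
        return $status if $status != 1;   # 1 means "restart me"
    }
}
```

Any other exit code (0 for success, 255 for die, and so on) falls through to the caller, which can then propagate it or escalate.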
There's nothing to stop you calling system on yourself. Something like this (clearly in need of a tidy), where I pass in a command-line argument to prevent the code calling itself forever.
#!/usr/bin/perl
use strict;
use warnings;
print "Starting...\n";
sleep 5;
if (!@ARGV) {
    print "Start myself again...\n";
    system("./sleep.pl secondgo");
    print "...and die now\n";
    exit;
} elsif (@ARGV && $ARGV[0] eq "secondgo") {
    print "Just going to die straightaway this time\n";
    exit;
}
How can I set a code that must execute before a Perl script stops?
In "how to run piece of code just before the exit of perl script" I read about END blocks, but an END block only executes if the script ends normally. However, I want the code to be executed even if, for example, the user aborts the program with ^C.
Trap the termination signals and re-route them to something; the simplest would be:
$SIG{TERM} = $SIG{INT} = $SIG{QUIT} = $SIG{HUP} = sub { die; };
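Spelled out with strict and a demonstration END block (a sketch; the print is only there to show what runs):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fatal signals become die(), so the process unwinds normally and
# END blocks (and object DESTROY methods) still get a chance to run.
$SIG{TERM} = $SIG{INT} = $SIG{QUIT} = $SIG{HUP} = sub {
    my ($sig) = @_;
    die "caught SIG$sig\n";
};

END {
    # Runs on normal exit AND after die() from the handler above.
    print STDERR "cleaning up\n";
}
```

Note that die inside a signal handler propagates like any other exception, so if it isn't caught by an eval somewhere, the program terminates through the normal exit path, which is exactly what lets END run.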