Which (untrapped) signals will cause a Perl program to stop executing?

What signals will cause a Perl program to stop running if their %SIG entries are not explicitly set?

The answer is platform dependent. To see the default behavior of each signal on your own system, download the Signals::XSIG module (you don't need to install it) and run the program spike/analyze_default_signal_behavior.pl (with no arguments). Or just download and run the script from here.
Note that some signals cannot be trapped by your program even if you do install a %SIG handler. This is also system dependent but usually includes at least SIGKILL and SIGSTOP.
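A rough way to see similar information from the shell, without the module, is to start a throwaway child, send it a signal, and look at how wait reports its exit. This is only a sketch: probe_signal is a hypothetical helper name, and by default it probes sleep rather than perl, so any defaults that perl itself changes will not show up unless you pass a perl command as the target. Also note that a non-interactive shell starts background commands with INT and QUIT ignored, so those two will look harmless here even though they are not.

```shell
# probe_signal SIG [COMMAND...]: report what an untrapped SIG does to a child.
# Hypothetical helper; the default target is "sleep 2".
probe_signal() {
    sig=$1
    shift
    [ "$#" -gt 0 ] || set -- sleep 2
    "$@" &
    child=$!
    sleep 1                                      # let the child start
    kill -s "$sig" "$child"
    kill -s CONT "$child" 2>/dev/null || true    # resume it if the signal only stopped it
    status=0
    wait "$child" || status=$?
    if [ "$status" -gt 128 ]; then
        # Shells report death by signal N as exit status 128 + N.
        echo "${sig}: terminated by signal $((status - 128))"
    else
        echo "${sig}: survived (exit status $status)"
    fi
}

probe_signal TERM
probe_signal WINCH
probe_signal STOP
```

To probe perl itself, pass it as the target, for example probe_signal FPE perl -e 'sleep 2'. Stop signals show up as "survived" because the helper sends CONT right afterwards.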

It is easier to talk about the ones that won't stop your program. On my machine (RHEL), every signal except FPE (floating point exception), CHLD (child status change), CONT (continue process), URG (urgent condition on socket), and WINCH (window size change) causes the Perl program to stop executing.
Four of the signals don't cause the program to exit, but temporarily suspend its execution: STOP (stop, unblockable), TSTP (terminal stop), TTIN (background read from tty), and TTOU (background write to tty). The program will start running again if it receives CONT.

From man kill on Debian,
Name    Num  Action   Description
0       0    n/a      exit code indicates if a signal may be sent
ALRM    14   exit
HUP     1    exit
INT     2    exit
KILL    9    exit     cannot be blocked
PIPE    13   exit
POLL         exit
PROF         exit
TERM    15   exit
USR1         exit
USR2         exit
VTALRM       exit
STKFLT       exit     might not be implemented
PWR          ignore   might exit on some systems
WINCH        ignore
CHLD         ignore
URG          ignore
TSTP         stop     might interact with the shell
TTIN         stop     might interact with the shell
TTOU         stop     might interact with the shell
STOP         stop     cannot be blocked
CONT         restart  continue if stopped, otherwise ignore
ABRT    6    core
FPE     8    core
ILL     4    core
QUIT    3    core
SEGV    11   core
TRAP    5    core
SYS          core     might not be implemented
EMT          core     might not be implemented
BUS          core     core dump might fail
XCPU         core     core dump might fail
XFSZ         core     core dump might fail

Related

Handle signal from 'kill -3' or taskkill in Win32Console application

I have a Win32 console application (built from Visual Studio as Win32 console project) which does some log file (.txt) processing. I have a separate perl program (legacy program) which now needs to start this Win32 console application and then stop when done.
The Perl program starts an instance of the Win32 console app using the Win32::Process APIs. It can kill the console app when done with either "kill -x pid" or Win32::Process::Kill. The problem is that the console app needs to know if it's being killed/terminated so that it can flush its log handling. The console app has already registered a handler via the SetConsoleCtrlHandler API, but the handler doesn't get called when the app is killed from the Perl program by, say, kill -2/3 pid.
What do I change in the Perl program or in the Win32 console app so that it can know when it's being terminated?
Thanks!
Signal handling in Windows is a little quirky if you're used to Unix. I have done a lot of investigation into this, and wrote up my findings here (starting at line 261).
Short answer: Windows processes can set $SIG{INT}, $SIG{QUIT}, or $SIG{BREAK}. All other signal handlers are ignored. Signal them from your separate app with the builtin kill:
kill 'INT', $the_win32_logger_pid;
kill 'QUIT', $the_win32_logger_pid;
kill 'BREAK', $the_win32_logger_pid;

Trapping signals cleanly in Perl

I have a simple Perl script that simply prints a line of text to stdout. What I want to accomplish is that while this script runs, if I (or someone else) issues a signal to that process to stop, I want it to trap that signal and exit cleanly. The code I have looks like the following
#!/usr/bin/perl -w
$| = 1;
use sigtrap 'handler' => \&sigtrap, 'HUP', 'INT', 'ABRT', 'QUIT', 'TERM';
while (1) {
    print "Working...\n";
    sleep(2);
}
sub sigtrap() {
    print "Caught a signal\n";
    exit(1);
}
While this works well when I actually hit ctrl-c from the command line, if I issue a
kill -9 <pid>
It just dies. How do I get it to execute something before exiting? My general idea is to use this framework to capture when this script dies on a server due to a server reboot for maintenance or failure.
Thanks much in advance
Signal #9 (SIGKILL) can not be trapped. That's how Unix is designed.
But the system does not send that signal when shutting down for maintenance, at least not if your daemon behaves correctly. It will normally send the TERM signal (or, more exactly, whatever your daemon's handling script in /etc/init.d sends). Only processes that do not shut down correctly before a timeout will receive SIGKILL.
So your aim should be to handle the TERM signal correctly and to write the wrapper script in /etc/init.d that will be called when the system changes runlevel.
Update: You can use the Daemon::Control module for the init script.
You're sending two very different signals to your process. Pressing Ctrl-C in a console usually sends the process an INT signal, which, judging by your code, is caught and serviced. kill -9, though, explicitly sends signal number 9, which is called KILL. This is one of the signals whose handling cannot be redefined; delivery of this signal always ends the process immediately, and that is done by the kernel itself.
As far as I know you can't capture kill -9. Try kill <pid> instead.
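The difference is easy to reproduce. The sketch below uses a shell trap rather than %SIG, purely to keep the demo self-contained; the same rule (TERM can be handled, KILL cannot) applies to a Perl handler. The worker function is a made-up stand-in for the daemon.

```shell
# Hypothetical stand-in for a daemon that handles TERM and exits cleanly.
worker() {
    trap 'echo "caught TERM, cleaning up"; exit 0' TERM
    while :; do
        sleep 1
    done
}

# TERM reaches the trap handler, so the worker exits with status 0.
worker &
pid=$!
sleep 1                      # give the worker time to install its trap
kill -s TERM "$pid"
term_status=0
wait "$pid" || term_status=$?

# KILL never reaches the handler; the shell reports 128 + the signal number.
worker &
pid=$!
sleep 1
kill -s KILL "$pid"
kill_status=0
wait "$pid" || kill_status=$?

echo "TERM -> $term_status, KILL -> $kill_status"
```

The handler may run up to a second late, because the shell finishes the current sleep before running the trap.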

How can I attach a debugger to a running Perl process?

I have a running Perl process that’s stuck, I’d like to poke inside with a debugger to see what’s wrong. I can’t restart the process. Can I attach the debugger to the running process? I know I can do gdb -p, but gdb does not help me. I’ve tried Enbugger, but failed:
$ perl -e 'while (1) {}'&
[1] 86836
$ gdb -p 86836
…
Attaching to process 86836.
Reading symbols for shared libraries . done
Reading symbols for shared libraries ............................. done
Reading symbols for shared libraries + done
0x000000010c1694c6 in Perl_pp_stub ()
(gdb) call (void*)Perl_eval_pv("require Enbugger;Enbugger->stop;",0)
perl(86836) malloc: *** error for object 0x3: pointer being realloc'd was not allocated
*** set a breakpoint in malloc_error_break to debug
Program received signal SIGABRT, Aborted.
0x00007fff8269d82a in __kill ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (Perl_eval_pv) will be abandoned.
(gdb)
Am I doing it wrong? Are there other options?
P.S. If you think you could benefit from a debugger attached to a running process yourself, you can insert a debugger back door triggered by SIGUSR1:
use Enbugger::OnError 'USR1';
Then you can simply kill -USR1 pid and your process will jump into the debugger.
First, please use a DEBUGGING perl, if you want to inspect it with gdb.
Please define "stuck". Busy or non-busy waiting (high or low CPU), eating memory or not?
With while 1 it is busy waiting. I usually get busy waiting (endless cycles) on HV corruption in Perl_hfree_next_entry() since 5.15. Non-busy waiting is usually waiting on a blocking IO read.
I get the correct:
0x00007fba15ab35c1 in Perl_runops_debug () at dump.c:2266
2266 } while ((PL_op = PL_op->op_ppaddr(aTHX)));
and can inspect everything, much more than with a simple perl debugger. With a non-threaded perl you have to type less.
(gdb) p Perl_op_dump(PL_op)
and so on.
If you have to do it with perl: entering the Enbugger runloop from inside the pp_stub function is not a good idea; you should be in the main runloop in dump.c. Set a breakpoint at the line shown above.
"error for object 0x3" on eval sounds like internal corruption in the context, so you should look at the cx and stack pointers. It probably happened because you started it in a bad context.
I've never used gdb, but maybe you could get something useful out of strace?
strace -f -s512 -p <PID>
http://metacpan.org/pod/App::Stacktrace
“perl-stacktrace prints Perl stack traces of Perl threads for a given Perl process. For each Perl frame, the full file name and line number are printed.”

How to use a shell script to supervise a program?

I've searched around but haven't quite found what I'm looking for. In a nutshell, I have created a bash script that runs in an infinite while loop, sleeping and checking whether a process is running. The only problem is that even if the process is running, the script says it is not and opens another instance.
I know I should check by process name and not process id, since another process could jump in and take the id. However all perl programs are named Perl5.10.0 on my system, and I intend on having multiple instances of the same perl program open.
The following "if" always returns false, what am I doing wrong here???
while true; do
    if [ ps -p $pid ]; then
        echo "Program running fine"
        sleep 10
    else
        echo "Program being restarted\n"
        perl program_name.pl &
        sleep 5
        read -r pid < "${filename}_pid.txt"
    fi
done
Get rid of the square brackets. It should be:
if ps -p $pid; then
The square brackets are syntactic sugar for the test command. This is an entirely different beast and does not invoke ps at all:
if test ps -p $pid; then
In fact that yields "-bash: [: -p: binary operator expected" when I run it.
Aside from the syntax error already pointed out, this is a lousy way to ensure that a process stays alive.
First, you should find out why your program is dying in the first place; this script doesn't fix a bug, it tries to hide one.
Secondly, if it is so important that a program remain running, why do you expect that your (at least once already) buggy shell script will do the job? Use a system facility that is specifically designed to restart server processes. If you say what platform you are using and the nature of your server process, I can offer more concrete advice.
added in response to comment:
Sure, there are engineering exigencies, but as the OP noted in the question, there is still a bug in this attempt at a solution:
"I know I should check by process name and not process id, since another process could jump in and take the id."
So now you are left with a PID tracking script, not a process "nanny". Although the chances are small, the script as it now stands has a ten second window in which:
the "monitored" process fails,
I start up my week long emacs process, which grabs the same PID, and
the nanny script continues on blissfully unaware that its dependent has failed.
The script isn't merely buggy, it is invalid because it presumes that PIDs are stable identifiers of a process. There are ways that this could be better handled even at the shell script level. The simplest is to never detach the execution of perl from the script since the script is doing nothing other than watching the subprocess. For example:
while true ; do
    if perl program_name.pl ; then
        echo "program_name terminated normally, restarting"
    else
        echo "oops program_name died again, restarting"
    fi
done
Which is not only shorter and simpler, but it actually blocks for the condition that you are really interested in: the run-state of the perl program. The original script repeatedly checks a bad proxy indication of the run state condition (the PID) and so can get it wrong. And, since the whole purpose of this nanny script is to handle faults, it would be bad if it were faulty itself by design.
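One possible variant of that loop, offered as a sketch rather than a recommendation: stop once the command exits normally, pause between restarts so a command that crashes instantly does not spin the CPU, and give up after a fixed number of attempts. The supervise function and the RESTART_DELAY variable are names invented for this example.

```shell
# supervise MAX COMMAND...: rerun COMMAND until it succeeds or MAX attempts fail.
# Hypothetical helper; RESTART_DELAY (seconds, default 1) is also made up here.
supervise() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            echo "command exited normally after $attempt attempt(s)"
            return 0
        else
            # Inside the else branch, $? still holds the command's exit status.
            echo "attempt $attempt failed (status $?), restarting" >&2
        fi
        attempt=$((attempt + 1))
        sleep "${RESTART_DELAY:-1}"
    done
    echo "giving up after $max attempts" >&2
    return 1
}

# Example: supervise 5 perl program_name.pl
```

Like the loop above, this blocks on the child itself instead of polling a PID, so there is no window in which a recycled PID can fool it.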
I totally agree that fiddling with the PID is nearly always a bad idea. The while true ; do ... done script is quite good; however, for production systems there are a couple of process supervisors which do exactly this and much more, e.g.
enable you to send signals to the supervised process (without knowing its PID)
check how long a service has been up or down
capture its output and write it to a log file
Examples of such process supervisors are daemontools or runit. For a more elaborate discussion and examples see Init scripts considered harmful. Don't be disturbed by the title: traditional init scripts suffer from exactly the same problem as you do (they start a daemon, keep its PID in a file and then leave the daemon alone).
I agree that you should find out why your program is dying in the first place. However, an ever-running shell script is probably not a good idea. What if this supervising shell script dies? (And yes, get rid of the square brackets around ps -p $pid. You want the exit status of the ps -p $pid command. The square brackets are a replacement for the test command.)
There are two possible solutions:
Use cron to run your "supervising" shell script to see if the process you're supervising is still running, and if it isn't, restart it. The supervised process can write its PID to a file. Your supervising program can then cat this file and get the PID to check.
If the program you're supervising provides a service on a particular port, make it an inetd service. This way, it isn't running at all until there is a request on that port. If you set it up correctly, it will terminate when not needed and restart when needed. This takes fewer resources, and the OS will handle everything for you.
That's what kill -0 $pid is for. It returns success if a process with pid $pid exists.
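A minimal sketch of that check, with is_alive as a hypothetical helper name. Two caveats: kill -0 also fails when the process exists but belongs to a user you are not allowed to signal, and it cannot tell whether the PID has since been reused by an unrelated process.

```shell
# is_alive PID: succeed if a signal could be delivered to PID right now.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

sleep 30 &
pid=$!
if is_alive "$pid"; then
    echo "process $pid is running"
fi

kill "$pid"
wait "$pid" 2>/dev/null || true   # reap the child; its status is not needed here
if ! is_alive "$pid"; then
    echo "process $pid is gone"
fi
```

In the supervising loop from the question, the condition would then read if is_alive "$pid"; then.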

Emacs shell-command restart on crash

Is it possible to detect when a long running process started with shell-command crashes, so it can automatically be restarted? Without manually checking its buffer and restarting by hand.
I wouldn't handle this from Emacs at all. Instead, I'd write a wrapper script around my original long-running process that restarts the process if it dies in a particular way. For example, if your program dies by getting the SIGABRT signal, the wrapper script might look like this:
#!/bin/bash
while true
do
    your-original-command --switch some args
    if [ $? -ne 134 ]; then break; fi
    echo "Program crashed; restarting"
done
I got the value 134 for the SIGABRT signal by doing this:
perl -e 'kill ABRT => $$'; echo $?
This is all assuming some kind of Unix-y system.
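The 134 is 128 plus the signal number (6 for ABRT on Linux), which is how the shell reports a command that died from a signal. A small sketch of decoding such a status follows; describe_status is a made-up helper name. It assumes the command did not deliberately exit with a status above 128, which the shell cannot distinguish from death by signal.

```shell
# describe_status STATUS: explain an exit status as the wrapper script sees it.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        signum=$((status - 128))
        # kill -l NUMBER prints the signal name without the SIG prefix.
        echo "killed by signal $signum ($(kill -l "$signum"))"
    else
        echo "exited with status $status"
    fi
}

# A child shell that sends itself ABRT, standing in for the crashing program.
status=0
sh -c 'kill -s ABRT $$' || status=$?
describe_status "$status"
```

A wrapper could use the same arithmetic to restart on any signal death rather than hard-coding 134.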