I have a script which is used to redeploy a couple programs in a custom server environment, (ie: not an established standard container which has code hotswapping). To do this, it takes down the server processes, but these take some time to fully close all their connections. These aren't child processes of the perlscript. They run for hundreds of days at a time normally, so I'd rather not have to wrap the server processes in perlscripts just so I can fork them to shut them down elegantly months or years later.
So currently to wait on them to die during redeployment, I'm parsing the output of ps -ef, grabbing the pid field, killing that pid, waiting 60 seconds, (which seems a reasonable time with these processes), rechecking the ps -ef to make sure they're dead, etc. Go on with copies, chmods, etc.
This solution feels lame/clunky to me. I've google'd all over and have not seen anything on this particular topic; there's a pile of material about waiting on forked children, and waitpid would be perfect if only it operated in this way.
From reading How to wait for exit of non-children processes (which is c specific)I'm guessing there's really not much else I can do, apart from reading /proc/pid instead, but I thought maybe there'd be a perl-specific solution out there somewhere. Any ideas?
You can use kill 0, $pid (returns 1 on success and 0 on failure) instead of rechecking ps -ef, but that has the possible gotcha that the pid may have been reused.
If you already have ps-parsing code, it's probably not worth it to switch, but there's Proc::ProcessTable.
Other than that, no ideas.
In Unix \ Linux only the parent process gets a signal when a process exits parent process - This is an OS feature, and not language specific.
Other solutions will be equivalent to yours - checking the process table for the existence of the process (although the specific method may vary - like using ps or directly querying the kernel)
Related
I have a Perl script that spawns some children. They all take quite a while to run, creating directories and files along the way. I often notice things I'd like to change before the children die of natural causes, so then I have to shut everything down (which involves a few grep and kill calls) and delete any files the children created. It's not really a big deal, but a bit of a pain in the neck. I'd like to create a setup where the children are all monitored so when I start up the parent again, old children still running are reported.
My best idea so far is keeping a log file of children I've run with their PIDs, checking it and updating it at the start of the parent script. Kill and report any children still running, and die to let the user clean up directories and files by hand before a fresh run.
So my question is this: How might I go about adding children to the log file? Is there a way to set up a trigger which could automatically take care of it, or am I stuck having to remember to do it any place in the code where a new process starts?
p.s. I'm certainly open to suggestions of better ways to accomplish this!
How about signal handlers? The parent can track the pids of the jobs it has launched (and if you want, the pids that have been reaped with a SIGCHLD handler). When you want to terminate everything prematurely, signal the parent to kill off all the children and clean up after them. If the child processes are also Perl scripts, you could put signal handlers in the children and see if they will clean up after themselves.
I have a Perl script that forks.
Each fork runs an external program, parses the output, and converts the output to a Storable file.
The Storable files are then read in by the parent and the total data from each of the children are analyzed before proceeding onto a repeat of the previous fork or else the parent stops.
What exactly happens when I issue a ^C while some of the children are still running the external program? The parent perl script was called in the foreground and, I presume, remained in the foreground despite the forking.
Is the SIGINT passed to all children, that is, the parent, the parent's children, and the external program called by the children??
UPDATE:
I should add, it appears that when I issue the SIGINIT, the external program called by the children of my script seem to acknowledge the signal and terminate. But the children, or perhaps the parent program, carry on. This is all unclear to me.
UPDATE 2:
With respect to tchrist's comment, the external program is called with Perl's system() command.
In fact, tchrist's comment also seems to contain the explanation I was looking for. After some more debugging, based on the behavior of my program, it appears that, indeed, SIGINT is being passed from the parent to all children and from all children to all of their children (the external program).
Thus, what appears to be happening, based on tchrist's comment, is that CTRL-C is killing the external program which causes the children to move out of the system() command - and nothing more.
Although I had my children check the exit status of what was called in system(), I was assuming that a CTRL-C would kill everything from the parent down, rather than lead to the creation of more rounds of processing, which is what was happening!!!
SOLUTION (to my problem):
I need to just create a signal handler for SIGINT in the parent. The signal handler would then send SIGTERM to each of the children (which I presume would also send a SIGTERM to the children's children), and then cause the parent to exit gracefully. Although this somewhat obvious solution likely would have fixed things, I wanted to understand my misconception about the behavior of SIGINT with respect to forking in Perl.
Perl’s builtin system function works just like the C system(3) function from the standard C library as far as signals are concerned. If you are using Perl’s version of system() or pipe open or backticks, then the parent — the one calling system rather than the one called by it — will IGNORE any SIGINT and SIGQUIT while the children are running. If you’ve you rolled your own using some variant of the fork?wait:exec trio, then you must think about these matters yourself.
Consider what happens when you use system("vi somefile") and hit ^C during a long search in vi: only vi takes a (nonfatal) SIGINT; the parent ignores it. This is correct behavior. That’s why C works this way, and that’s why Perl works this way.
The thing you have to remember is that just because a ^C sends a SIGINT to all processes in the foregrounded process group (even those of differing effective UID or GID), that does not mean that it causes all those processes to exit. A ^C is only a SIGINT, meant to interrupt a process, not a SIGKILL, meant to terminate with no questions asked.
There are many sorts of program that it would be wrong to just kill off with no warning; an editor is just one such example. A mailer might be another. Be exceedingly careful about this.
Many sorts of programs selectively ignore, trap, or block (means delay delivery of) various sorts of signals. Only the default behavior of SIGINT is to cause the process to exit. You can find out whether this happened, and in fact which signal caused it to happen (amongst other things), with this sort of code on traditional operating systems:
if ($wait_status = system("whatever")) {
$sig_killed = $wait_status & 127;
$did_coredump = $wait_status & 128;
$exit_status = $wait_status >> 8;
# now do something based on that...
}
Note carefully that a ^C’d vi, for example, will not have a wait status word indicating it died from an untrapped SIGINT, since there wasn’t one: it caught it.
Sometimes your kids will go and have kids of their own behind your back. Messy but true. I have therefore been known, on occasion, to genocide all progeny known and unknown this way:
# scope to temporize (save+restore) any previous value of $SIG{HUP}
{
local $SIG{HUP} = "IGNORE";
kill HUP => -$$; # the killpg(getpid(), SIGHUP) syscall
}
That of course doesn’t work with SIGKILL or SIGSTOP, which are not amenable to being IGNOREd like that.
Another matter you might want to be careful of is that before the 5.8 release, signal handling in Perl has not historically been a reliably safe operation. It is now, but this is a version-dependent issue. If you haven’t yet done so, then you should definitely read up on deferred signals in the perlipc manpage, and perhaps also on the PERL_SIGNALS envariable in the perlrun manpage.
When you hit ^C in a terminal window (or any terminal, for that matter), it will send a SIGINT to the foreground process GROUP in that terminal. Now when you start a program from the command line, its generally in its own process group, and that becomes the foreground process group. By default, when you fork a child, it will be in the same process group as the parent, so by default, the parent (top level program invoked from the command line), all its children, children's children, etc, as well as any external programs invoked by any of these (which are all just children) will all be in that same process group, so will all receive the SIGINT signal.
However, if any of those children or programs call setpgrp or setpgid or setsid, or any other call that causes the process to be a in a new progress group, those processes (and any children they start after leaving the foreground process group) will NOT receive the SIGINT.
In addition, when a process receives a SIGINT, it might not terminate -- it might be ignoring the signal, or it might catch it and do something completely different. The default signal handler for SIGINT terminates the process, but that's just the default which can be overridden.
edit
From your update, it sounds like everything is remaining in the same process group, so the signals are being delivered to everyone, as the grandchildren (the external programs) are exiting, but the children are catching and ignoring the SIGINTs. By tchrist's comment, it sounds like this is the default behavior for perl.
If you kill the parent process, the children (and external programs running) will still run until they terminate one way or another. This can be avoided if the parent has a signal handler that catches the SIGINT, and then kills the process group (often parent's pid). That way, once the parent gets the SIGINT, it will kill all of it's children.
So to answer your question, it all depends on the implementation of the signal handler. Based on your update, it seems that the parent indeed kills it's children and then instead of terminating itself, it goes back to doing something else (think of it as a reset instead of a complete shutdown/start combination).
I have this Perl script for monitoring a folder in Linux.
To continuously check for any updates to the directory, I have a while loop that sleeps for 5 minutes in-between successive loops :
while(1) {
...
sleep 300;
}
Nobody on my other question suggested using cron for scheduling instead of a for loop.
This while construct, without any break looks ugly to me as compared to submitting a cronjob using crontab :
0 */5 * * * ./myscript > /dev/null 2>&1
Is cron the right choice? Are there any advantages of using the while loop construct?
Are there any better ways of doing this except the loop and cron?
Also, I'm using a 2.6.9 kernel build.
The only reasons I have ever used the while solution is if either I needed my code to be run more than once a minute or if it needed to respond immediately to an external event, neither of which appear to be the case here.
My thinking is usually along the lines of: cron has been tested by millions and millions of people over decades so it's at least as reliable as the code I've just strung together.
Even in situations where I've used while, I've still had a cron job to restart my script in case of failure.
My advice would be to simply use cron. That's what it's designed for. And, as an aside, I rarely redirect the output to /dev/null, that makes it too hard to debug. Usually I simply redirect to a file in the /tmp file system so that I can see what's going on.
You can append as long as you have an automated clean-up procedure and you can even write to a more private location if you're worried about anyone seeing stuff in the output.
The bottom line, though, is that a rare failure can't be analysed if you're throwing away the output. If you consider your job to be bug-free then, by all means, throw the output away but I rarely consider my scripts bug-free, just in case.
Why don't you make the build process that puts the build into the directory do the notification? (See SO 3691739 for where that comes from!)
Having cron run the program is perfectly acceptable - and simpler than a permanent loop with a sleep, though not by much.
Against a cron solution, since the process is a simple one-shot, you can't tell what has changed since the last time it was run - there is no state. (Or, more accurately, if you provide state - via a file, probably - you are making life much more complex than running a single script that keeps its state internally.)
Also, stopping the notification service is less obvious. If there's a single process hanging around, you kill it and the notifications stop. If the notifications are run by cron, then you have to know that they're run out of a crontab, know whose crontab it is, and edit that entry in order to stop it.
You should also consider persuading your company to upgrade to a version of Linux where the inotify mechanism is available.
If you go for the loop instead of cron and want your job run at regular intervals, sleep(300) tends to drift. (consider the execution time of the rest of your script)
I suggest using a construct like this:
use constant DELAY => 300;
my $next=time();
while (1){
$next+=DELAY;
...;
sleep ($next-time());
};
Yet another alternative is the 'anacron' utility.
if you don't want to use cron.
this http://upstart.ubuntu.com/ can be used to babysit processes.
or you can use watch whichever is easier.
I have a Perl program that needs to run about half a dozen programs at the same time in the background and wait for them all to finish before continuing. It is also very important that the exit status of each can be captured.
Is there a common idiom for doing this in Perl? I'm currently thinking of using threads.
Don't use threads. Threads suck. The proper way is to fork multiple processes and wait for them to finish. If you use wait or waitpid, the exit status of the process in question will be available in $?.
See the perldocs for fork, wait, and waitpid, and also the examples in this SO thread.
If all you need is to just manage a pool of subprocesses that doesn't exceed a certain size, check out the excellent Parallel::ForkManager.
Normally you would fork + exec (on unix based systems, this is traditional)
The fork call will duplicate the current process, and if you needed to you could then call exec in one of the children to do something different. It sounds like you just want to fork and call a different method/function in your child process.
If you want something more complex, check cpan for POE - that lets you manage all sorts of complex scenarios.
Useful links:
"spawning multiple child processes" on PerlMonks
Google "perl cookbook forking server" too - only allowed to post one link unless I log in.
I'm writing a Perl script that runs 4 simultaneous, identical processes with different input parameters (see background here - the rest of my question will make much more sense after reading that).
I am making a system() call to a program that generates data (XFOIL, again see above link). My single-core version of this program looks like this:
eval{
local $SIG{ALRM} = sub{die "TIMEOUT"};
alarm 250;
system("xfoil <command_list >xfoil_output");
alarm 0;
};
if ($#){
# read the output log and run timeout stuff...
system('killall xfoil') # Kill the hung XFOIL. now it's a zombie.
}
Essentially, XFOIL should take only about 100 seconds to run - so after 250 seconds the program is hanging (presumably waiting for user input that it's never going to get).
The problem now is, if I do a killall in the multi-core version of my program, I'm going to kill 3 other instances of XFOIL, and those processes are generating data. So I need to kill only the hung instance, and this requires getting a PID.
I don't know very much about forks and such. From what I can tell so far, I would run an exec('xfoil') inside the child process that I fork. But the PID of the exec() will be different than the PID of the child process (or is it? It's a separate process so I'd assume it is, but again I've no experience with this..), so this still doesn't help when I want to forcefully kill the process since I won't have the PID anyway. How do I go about doing this?
Thanks a ton for your help!
If you want the PID, fork the process yourself instead of using system. The system command is mostly designed as a "fire and forget" tool. If you want to interact with the process, use something else. See, for instance, the perlipc documentation.
I think you've already looked at Parallel::ForkManager based on answers to your question How can I make my Perl script use multiple cores for child processes?