Perl: When does "system" actually return? - perl

In a CGI webpage, I have a button which when pressed submits a form and a subroutine is called which has:
sub run {
&emailDebug("Started " .localtime);
system("(/tools/script1.pl &) ; (/tools/script2.pl &)");
&emailDebug("Ended " .localtime);
}
Both scripts start at 11:08:05 (hence the fancy command); I take timestamps in the scripts themselves and send them in an email to myself.
The second finished at 11:08:22 and the first at 11:08:36, but the emails sent from the code above both show 11:08:06.
Most interestingly, the page keeps loading for about 30 seconds, as long as the longer of the two scripts runs.
I don't mind the page loading, but I don't understand why it behaves like this. As the page is loading, clearly the subroutine run itself doesn't return, but both emails are sent at almost the same time.

system returns as soon as the command you call lets it. In this particular example it should return immediately, but this may vary depending on the shell. To be extra sure, you might split the call in two:
system("/tools/script1.pl &");
system("/tools/script2.pl &");
Also, depending on your web server configuration, the server may notice that the forked children still have STDOUT open and wait for all of them to end before serving the response, even if your main script has already ended. Redirect the spawned scripts' STDOUT/STDERR to /dev/null, close(STDOUT) and close(STDERR) in the main script, or consult your web server/framework documentation on how to flush the response when you're done with output.
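A minimal sketch of the redirection approach, assuming a POSIX shell. The helper name spawn_detached is made up for illustration, and sleep 2 stands in for /tools/script1.pl:

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper (not from the original post): start a command in the
# background with stdin, stdout and stderr detached, so that nothing keeps
# the web server's pipe open after the CGI script itself has finished.
sub spawn_detached {
    my ($cmd) = @_;
    return system("$cmd </dev/null >/dev/null 2>&1 &");
}

my $t0      = time;
my $status  = spawn_detached("sleep 2");   # stand-in for /tools/script1.pl
my $elapsed = time - $t0;

printf "system returned %d after %.2f seconds\n", $status, $elapsed;
```

If the page still hangs with the output redirected like this, the delay is probably coming from somewhere other than the inherited file handles.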

You are executing the scripts as background processes, so system returns immediately. If you want to wait for completion, remove the &s and execute them sequentially:
system("(/tools/script1.pl) ; (/tools/script2.pl)");

What do you mean by those print statements above? Your code shows no print calls.
Your run subroutine is returning as soon as the two child processes are started. You could make things clearer by making two calls to system and removing the parentheses in the command lines. Because of the trailing ampersand, both calls to system will return immediately, leaving the children to run independently.
Something else must be delaying the return of the page after the call to run, and presumably it is somehow detecting when the work of the child processes is complete.
By the way, you should never use an ampersand on a subroutine name when you are making a simple call. It hasn't been necessary since we got Perl 5 eighteen years ago.

Related

How can I have one perl script call another and get the return results?

How can I have one perl script call another perl script and get the return results?
I have perl Script B, which does a lot of database work, prints out nothing, and simply exits with a 0 or a 3.
So I would like perl Script A call Script B and get its results. But when I call:
my $result = system("perl importOrig.pl filename=$filename");
or
my $result = system("/usr/bin/perl /var/www/cgi-bin/importOrig.pl filename=$filename");
I get back a -1, and Script B is never called.
I have debugged Script B, and when called manually there are no glitches.
So obviously I am making an error in my call above, and not sure what it is.
There are many things to consider.
Zeroth, there's the perlipc docs for InterProcess Communication. What's the value in the error variable $!?
First, use $^X, which is the path to the perl you are executing. Since subprocesses inherit your environment, you want to use the same perl so it doesn't confuse itself with PERL5LIB and so on.
system("$^X /var/www/cgi-bin/importOrig.pl filename=$filename")
Second, CGI programs tend to expect particular environment variables to be set, such as REQUEST_METHOD. Calling them as normal command-line programs often leaves out those things. Try running the program from the command line to see how it complains. Check that it gets the environment it wants. You might also check the permissions of the program to see if you (or whatever user runs the calling program) are allowed to read it (or its directory, etc). You say there are no glitches, so maybe that's not your particular problem. But, do the two environments match in all the ways they should?
Third, consider making the second program a modulino. You could run it normally as a script from the command line, but you could also load it as a Perl library and use its features directly. This obviates all the IPC stuff. You could even fork so that stuff runs concurrently.
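To see what the -1 means, it can help to decode the return value of system and print $!. A rough sketch, with a one-liner standing in for importOrig.pl that exits with 3 (the list form of system is my own choice here, to keep the shell away from $filename):

```perl
use strict;
use warnings;

# Stand-in for Script B: a one-liner that exits with status 3. In the real
# case this would be something like
#   ($^X, '/var/www/cgi-bin/importOrig.pl', "filename=$filename")
my @cmd = ($^X, '-e', 'exit 3');

my $rc = system(@cmd);   # list form: no shell is involved
my $exit_code;

if ($rc == -1) {
    # The child could not be started (or could not be waited for); $! says why.
    print "failed to execute: $!\n";
}
elsif ($rc & 127) {
    printf "child died from signal %d\n", $rc & 127;
}
else {
    $exit_code = $rc >> 8;   # the 0 or 3 that Script B exits with
    print "child exited with status $exit_code\n";
}
```

Note that the raw return value is the wait status, so Script B's exit code of 3 comes back as 768 until it is shifted right by 8.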

Is it possible to detect which pipe threw a SIGPIPE?

I'm trying to deal with a server that works as follows:
It has a parent process
It creates a "helper" child process to handle some special tasks
It opens the child process with a pipe; and uses the pipe to issue commands to the child.
It also spawns off many other child processes (the server's main goal is to execute various commands).
I would like to be able to detect when the write to the pipe to the child process fails; and issue a special notification.
Ordinarily, I would achieve that by creating a custom $SIG{PIPE} handler in the parent process.
However, what I'm concerned with is the fact that some of the processes the parent launches to execute commands might have their own pipes open to them; and if the write to THOSE pipes fails, I'd like to simply ignore the SIGPIPE.
Q1. Is there a way for me to tell from within SIGPIPE handler, which of the open pipes threw the signal? (I know every child's PID, so PID would be fine... or if there's a way to do it via file descriptor #s?).
Q2. Could I solve the problem using local $SIG{PIPE} somehow? My assumption is that I would need to:
Set helper-process-specific local $SIG{PIPE} right before writing to that pipe
do print $HELPER_PIPE (this happens in only one subroutine)
Reset $SIG{PIPE} to DEFAULT or IGNORE
Ensure that these 3 actions are within their own block scope.
The write syscall returns the error EPIPE in the same case when a SIGPIPE is triggered, assuming that the SIGPIPE doesn't succeed in killing the process. So your best bet is to set $SIG{PIPE} = 'IGNORE' (to avoid dying from the signal), to use $fh->autoflush (to avoid PerlIO buffering, ensuring that you're notified of any I/O errors immediately), and to check the return value of print whenever you call it. If print returns false and $!{EPIPE} is set, then you've tried to write to a closed pipe. If print returns false and $!{EPIPE} isn't set, you have some other issue to deal with.
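Portably you can't tell. However, you might find your OS supports the SA_SIGINFO flag for signal handlers, and if you can get that information up to Perl somehow, the siginfo_t structure has an si_fd field for the file descriptor number. Be aware that si_fd is documented for SIGPOLL/SIGIO, so whether it is filled in for SIGPIPE depends on the OS.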
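The IGNORE + autoflush + check-the-print approach might look roughly like this. It is only a sketch: the pipe is created locally and its read end closed to simulate a helper that has gone away, and the variable names are made up:

```perl
use strict;
use warnings;
use IO::Handle;

$SIG{PIPE} = 'IGNORE';            # don't die from SIGPIPE; rely on EPIPE instead

# Demo pipe standing in for the helper's pipe; closing the read end
# simulates a helper process that has exited.
pipe(my $reader, my $helper_pipe) or die "pipe failed: $!";
$helper_pipe->autoflush(1);       # make print report write errors immediately
close $reader;

my $helper_gone = 0;
unless (print {$helper_pipe} "some command\n") {
    if ($!{EPIPE}) {
        $helper_gone = 1;         # the special notification would go here
        print "helper pipe is closed\n";
    }
    else {
        print "other write error: $!\n";
    }
}
```

Because the error is reported by the print on a specific handle, you know which pipe failed without needing anything from the signal handler, and writes to the other children's pipes can simply skip the check.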

back ticks not working in perl

I'm stuck with a problem on our live server.
We have a Perl script which runs almost 15 to 18 hours a day and creates 100+ subprocesses every day. In one place it runs a command (a product command which we normally run on the command line of a Solaris box) with backticks inside the Perl code.
It looks like the backticks command gets skipped or fails randomly.
For example, if I need to run it for 50 customers, 2 or 3 fail at random.
I do not see any evidence anywhere that the command was triggered.
Since it's a live server, we can't make many code changes until we are sure about the problem.
here is the code..
my $comm = "inventory -noX customer1"; #sample command i have given here
my $newLogFile = "To capture command output here we have path where the file gets created"; # placeholder for the log file path
my $piddy = `$comm 2>&1 > $newLogFile`;
Is it happening because of the backticks? I am really not sure :(.
I have also tried various kinds of analysis (memory, CPU, disk space, adding librtld_db.so to LD_LIBRARY_PATH, etc.) but no luck. Also, the perl is 64-bit. What else can I try? :(
I suspect you are not checking for errors (and perl doesn't make that easy to do correctly for backticks).
Consider using IPC::System::Simple's capture in place of your backticks/qx.
As its doc says, "If there's an error, it will die with a detailed description of what went wrong."
It shouldn't fail just because of backticks; however, because it spawns a new process, that process may occasionally fail due to system conditions (e.g. system load). Backticks are really a "fire and forget" method and should never be used for anything critical in a production environment. As previously suggested, there are far more detailed ways to manage spawning external processes.
If the command's output is being lost due to buffering, you might try turning off buffering, but keep an eye on it for performance degradation (it's usually not significant).
Buffering can be turned off for an entire script by adding this near the top:
$|=1;
When calling external commands, I use system from IPC::System::Simple or open3 from IPC::Open3.
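If installing IPC::System::Simple on the live box is not an option, the error checking can be done by hand with core Perl by inspecting $? after the backticks. A rough sketch; run_and_check is a made-up helper name, and the sh -c command is only a stand-in for the inventory command:

```perl
use strict;
use warnings;

# Hypothetical helper: run a command with backticks and report how it ended.
# Returns (output, undef) on success, or (output, error description) otherwise.
sub run_and_check {
    my ($cmd) = @_;
    my $out    = `$cmd 2>&1`;     # capture stdout and stderr together
    my $status = $?;

    return (undef, "failed to start: $!") if $status == -1;
    return ($out, sprintf("killed by signal %d", $status & 127)) if $status & 127;
    return ($out, sprintf("exited with status %d", $status >> 8)) if $status >> 8;
    return ($out, undef);
}

# Stand-in command that prints something and then fails with exit status 2.
my ($out, $err) = run_and_check(q{sh -c 'echo hello; exit 2'});
print "output: $out";
print "error:  $err\n" if defined $err;
```

Logging the error description for every customer should at least show whether the 2 or 3 failures are a non-zero exit, a signal, or a failure to start the process at all.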

What happens to a SIGINT (^C) when sent to a perl script containing children?

I have a Perl script that forks.
Each fork runs an external program, parses the output, and converts the output to a Storable file.
The Storable files are then read in by the parent and the total data from each of the children are analyzed before proceeding onto a repeat of the previous fork or else the parent stops.
What exactly happens when I issue a ^C while some of the children are still running the external program? The parent perl script was called in the foreground and, I presume, remained in the foreground despite the forking.
Is the SIGINT passed to all children, that is, the parent, the parent's children, and the external program called by the children??
UPDATE:
I should add, it appears that when I issue the SIGINT, the external programs called by the children of my script seem to acknowledge the signal and terminate. But the children, or perhaps the parent program, carry on. This is all unclear to me.
UPDATE 2:
With respect to tchrist's comment, the external program is called with Perl's system() command.
In fact, tchrist's comment also seems to contain the explanation I was looking for. After some more debugging, based on the behavior of my program, it appears that, indeed, SIGINT is being passed from the parent to all children and from all children to all of their children (the external program).
Thus, what appears to be happening, based on tchrist's comment, is that CTRL-C is killing the external program which causes the children to move out of the system() command - and nothing more.
Although I had my children check the exit status of what was called in system(), I was assuming that a CTRL-C would kill everything from the parent down, rather than lead to the creation of more rounds of processing, which is what was happening!!!
SOLUTION (to my problem):
I need to just create a signal handler for SIGINT in the parent. The signal handler would then send SIGTERM to each of the children (which I presume would also send a SIGTERM to the children's children), and then cause the parent to exit gracefully. Although this somewhat obvious solution likely would have fixed things, I wanted to understand my misconception about the behavior of SIGINT with respect to forking in Perl.
Perl’s builtin system function works just like the C system(3) function from the standard C library as far as signals are concerned. If you are using Perl’s version of system() or pipe open or backticks, then the parent (the one calling system rather than the one called by it) will IGNORE any SIGINT and SIGQUIT while the children are running. If you’ve rolled your own using some variant of the fork?wait:exec trio, then you must think about these matters yourself.
Consider what happens when you use system("vi somefile") and hit ^C during a long search in vi: only vi takes a (nonfatal) SIGINT; the parent ignores it. This is correct behavior. That’s why C works this way, and that’s why Perl works this way.
The thing you have to remember is that just because a ^C sends a SIGINT to all processes in the foregrounded process group (even those of differing effective UID or GID), that does not mean that it causes all those processes to exit. A ^C is only a SIGINT, meant to interrupt a process, not a SIGKILL, meant to terminate with no questions asked.
There are many sorts of program that it would be wrong to just kill off with no warning; an editor is just one such example. A mailer might be another. Be exceedingly careful about this.
Many sorts of programs selectively ignore, trap, or block (means delay delivery of) various sorts of signals. Only the default behavior of SIGINT is to cause the process to exit. You can find out whether this happened, and in fact which signal caused it to happen (amongst other things), with this sort of code on traditional operating systems:
if ($wait_status = system("whatever")) {
$sig_killed = $wait_status & 127;
$did_coredump = $wait_status & 128;
$exit_status = $wait_status >> 8;
# now do something based on that...
}
Note carefully that a ^C’d vi, for example, will not have a wait status word indicating it died from an untrapped SIGINT, since there wasn’t one: it caught it.
Sometimes your kids will go and have kids of their own behind your back. Messy but true. I have therefore been known, on occasion, to genocide all progeny known and unknown this way:
# scope to temporize (save+restore) any previous value of $SIG{HUP}
{
local $SIG{HUP} = "IGNORE";
kill HUP => -$$; # the killpg(getpid(), SIGHUP) syscall
}
That of course doesn’t work with SIGKILL or SIGSTOP, which are not amenable to being IGNOREd like that.
Another matter you might want to be careful of is that before the 5.8 release, signal handling in Perl has not historically been a reliably safe operation. It is now, but this is a version-dependent issue. If you haven’t yet done so, then you should definitely read up on deferred signals in the perlipc manpage, and perhaps also on the PERL_SIGNALS envariable in the perlrun manpage.
When you hit ^C in a terminal window (or any terminal, for that matter), it will send a SIGINT to the foreground process GROUP in that terminal. Now when you start a program from the command line, it's generally in its own process group, and that becomes the foreground process group. By default, when you fork a child, it will be in the same process group as the parent, so by default, the parent (top level program invoked from the command line), all its children, children's children, etc, as well as any external programs invoked by any of these (which are all just children) will all be in that same process group, so will all receive the SIGINT signal.
However, if any of those children or programs call setpgrp or setpgid or setsid, or any other call that causes the process to be in a new process group, those processes (and any children they start after leaving the foreground process group) will NOT receive the SIGINT.
In addition, when a process receives a SIGINT, it might not terminate -- it might be ignoring the signal, or it might catch it and do something completely different. The default signal handler for SIGINT terminates the process, but that's just the default which can be overridden.
edit
From your update, it sounds like everything is remaining in the same process group, so the signals are being delivered to everyone, as the grandchildren (the external programs) are exiting, but the children are catching and ignoring the SIGINTs. By tchrist's comment, it sounds like this is the default behavior for perl.
If you kill the parent process, the children (and external programs running) will still run until they terminate one way or another. This can be avoided if the parent has a signal handler that catches the SIGINT, and then kills the process group (often the parent's pid). That way, once the parent gets the SIGINT, it will kill all of its children.
So to answer your question, it all depends on the implementation of the signal handler. Based on your update, it seems that the parent indeed kills its children and then instead of terminating itself, it goes back to doing something else (think of it as a reset instead of a complete shutdown/start combination).
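A handler along the lines of the asker's SOLUTION could be sketched like this. It is only an illustration: the children just sleep instead of running an external program, the ^C is simulated by the parent signalling itself, and the parent tracks child PIDs in a hash rather than signalling the whole process group:

```perl
use strict;
use warnings;
use POSIX ();

# Fork two children that stand in for the real workers.
my %kids;
for (1 .. 2) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        sleep 30;              # stand-in for system("external program")
        POSIX::_exit(0);
    }
    $kids{$pid} = 1;
}

# Installed after the forks, so only the parent has this handler.
my $interrupted = 0;
$SIG{INT} = sub {
    $interrupted = 1;
    kill TERM => keys %kids;   # pass the bad news on to every known child
};

kill INT => $$;                # simulate ^C for the demo

# Reap the children and record which signal (if any) ended each one.
my %reaped;
while ((my $pid = wait()) > 0) {
    $reaped{$pid} = $? & 127;
}

print "interrupted, stopping instead of starting another round\n" if $interrupted;
```

The main loop would then check $interrupted before launching the next round of forks, which is the part that was missing when ^C only killed the external programs.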

In Perl, how can I do a non-blocking system call?

In Perl, without using the Thread library, what is the simplest way to spawn off a system call so that it is non-blocking? Can you do this while avoiding fork() as well?
EDIT
Clarification. I want to avoid an explicit and messy call to fork.
Do you mean like this?
system('my_command_which_will_not_block &');
As Chris Kloberdanz points out, this will call fork() implicitly -- there's really no other way for perl to do it; especially if you want the perl interpreter to continue running while the command executes.
The & character in the command is a shell metacharacter -- perl sees this and passes the argument to system() to the shell (/bin/sh on Unix-like systems) for execution, rather than running it directly with an execvp() call. & tells the shell to fork again, run the command in the background, and exit immediately, returning control to perl while the command continues to execute.
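For comparison, this is roughly what the shell does on your behalf when it sees the &, written out with an explicit fork and exec. It is a sketch only, with sleep 1 standing in for the real command and a polling loop standing in for whatever the script would do in the meantime:

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: replace this process with the command (stand-in: sleep 1).
    exec 'sleep', '1' or POSIX::_exit(127);
}

# Parent: carries on immediately and polls for the child without blocking.
my $polls = 0;
while (waitpid($pid, WNOHANG) == 0) {
    $polls++;                            # other work would go here
    select(undef, undef, undef, 0.1);    # short pause between polls
}
my $exit_code = $? >> 8;

print "child finished with status $exit_code after $polls polls\n";
```

The extra lines buy you the child's PID and exit status, which the system("... &") form throws away.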
The post above says "there's no other way for perl to do it", which is not true.
Since you mentioned file deletion, take a look at IO::AIO. This performs the system calls in another thread (POSIX thread, not Perl pseudothread); you schedule the request with aio_rmtree and when that's done, the module will call a function in your program. In the meantime, your program can do anything else it wants to.
Doing things in another POSIX thread is actually a generally useful technique. (A special hacked version of) Coro uses it to preempt coroutines (time slicing), and EV::Loop::Async uses it to deliver event notifications even when Perl is doing something other than waiting for events.