IO::Select loop is somehow keeping child from exiting? - perl

I have a Perl IO::Select loop for running a command (with no stdin) and saving its output into a variety of logs.
It has worked great for a number of years, but suddenly it's misbehaving with one particular command. The command runs to completion and exits properly if it's run from the command line, but when I run it from this Perl code, then after it's done (as seen from the logs) the child process won't exit; it just sits in a hung state. I don't have strace rights on the systems where this is happening, so I can't see exactly what is stuck.
The code is essentially:
# Start process
open(local *nostdin, '<', '/dev/null') or die $!;
my $pid = open3('<&nostdin',
                my $outH = gensym(),
                my $errH = gensym(),
                $command
               );
# Setup select
my $sel = IO::Select->new();
$sel->add($outH);
$sel->add($errH);
# Select loop
while (my @ready = $sel->can_read()) {
    foreach my $handle (@ready) {
        my $bytesRead = sysread($handle, my $buf='', 1024);
        if ($bytesRead <= 0) {
            warn("Error reading command out $!\n Command: $command\n") if $bytesRead;
            $sel->remove($handle);
            next;
        }
        if ($handle == $outH) {
            syswrite(SYSOUT, $buf) if $stdoutOpen;
            syswrite(SYSLOG, $buf) if $logOpen;
        }
        if ($handle == $errH) {
            syswrite(SYSERR, $buf) if $stderrOpen;
            syswrite(SYSLOG, $buf) if $logOpen;
        }
    }
}
Is there anything I could be doing wrong? And even if there is a mistake, how can the Perl I/O loop keep the child process from exiting? It seems that once the child exits it would close all its I/O and we'd finish this loop (as has always been the case with this code).
I've done some tests using waitpid and some extra output, and it looks like $outH is closing, but $errH never has any data and never closes, and yet I can see from waitpid that the child has exited.
So I open two pipes, for child out and child err. The child uses the output pipe and never the error pipe. Then the child closes the output pipe and exits, but somehow this exit isn't closing the error pipe. How is this happening? Is this something a child could even do on purpose?
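For what it's worth, one way a command can behave like this is by spawning a background helper that inherits its file descriptors: the helper keeps the write end of the stderr pipe open even after the original child exits. A minimal, hypothetical reproduction (not the actual command, just a shell one-liner that mimics the symptom):
use strict;
use warnings;
use IPC::Open3;
use IO::Select;
use Symbol 'gensym';

# The backgrounded sleep inherits fd 2 (the stderr pipe) but has its
# stdout redirected away, so stdout sees EOF while stderr never does.
my $command = 'echo work done; sleep 60 >/dev/null &';

open(local *nostdin, '<', '/dev/null') or die $!;
my $pid = open3('<&nostdin', my $outH = gensym(), my $errH = gensym(), $command);

my $sel = IO::Select->new($outH, $errH);
while (my @ready = $sel->can_read()) {
    for my $handle (@ready) {
        my $n = sysread($handle, my $buf = '', 1024);
        if (!$n) { $sel->remove($handle); next; }   # EOF or error
        print $buf;
    }
}
# We only get here after the grandchild (sleep) exits and releases the pipe.
waitpid($pid, 0);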

Related

want to get the die error message from a command executed via open in Perl

I am trying to fork off a command like the one below:
my $h = IO::Handle->new;
$self->{-handle} = $h;
die "IO::Handle->new failed." unless defined $h;
$self->{-pid} = open $h, $self->{-command} . ' 2>&1 |';
$self->fileevent($h, 'readable' => [\&_read_cmd_op, $self]);
sub _read_cmd_op {
    my ($self) = @_;
    my $h = $self->{-handle};
    if ( sysread $h, $_, 4096 ) {
        my $t = $self->Subwidget('text');
        $t->insert('end', $_);
        $t->yview('end');
    } else {
        $self->{-finish} = 1;
    }
}
Now the problem is that $self->{-command} invokes another Perl script, and I want to know if that script dies.
Note that $self->{-pid} still exists even if the command dies.
The above code is in a Perl/Tk app, where the $self->{-command} output is captured in a text widget.
Somehow I don't get the die message even in the text widget.
I see it on stdout.
Two questions:
How can I get the command's output/error in the text widget?
How can I know that the command fired via IO::Handle died?
$self->{-pid} is just the pid of the forked process, not some magic object which goes away if the command exits.
I cannot reproduce the problem of not getting the die() message. If the snippet above is called with 'perl -e "die 123"', then "123" appears in the text widget (at least on a Unix system).
For getting the exit code you can use something like the following.
} else {
    $mw->fileevent($h, 'readable', '');
    my $pid = waitpid($self->{-pid}, 0);
    warn "pid $pid finished";
    warn "retcode is " . ($? >> 8);
    $self->{-finish} = 1;
}
The fileevent call with the empty callback stops further selects on this filehandle. With the waitpid call you wait for the termination of the child process. Once this happens, the exit code is available in the $? variable, like after a normal system() call. So for a non-zero exit code you know that the command died or exited with a false value.

Open3 outputting to std error only and not asynchronously

I am relatively new to perl programming and I am trying to figure out how open3 works. Here is the code.
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open3;

my $dir = "/home/vman/Documents/Dev/perl_scripts/Provenance/temp";
my $fileHandle;
my $bashPid;

print "Starting main program\n";

my $pid = fork();
if ($pid)   # Parent process
{
    print("Start transfer.\n");
    $bashPid = $pid;
    # Attaching an strace to the executed command which happens in the child process
    open3(\*WRITE, \*READ, \*ERROR, "strace", "-f", "-F", "-e", "trace=open,execve", "-p", $bashPid, "-s", "2097152", "-q");
    while (<READ>)
    {
        print("Here1\n");
        print("$_");
    }
    while (<ERROR>)
    {
        print("$_");
    }
    print("Finish transfer.\n");
}
elsif ($pid == 0)   # Child process
{
    if (scalar(@ARGV == 0))
    {
        exit;
    }
    my $args = join(' ', @ARGV);
    exec($args);
}
else
{
    die("Could not fork.");
}
close(READ);
close(WRITE);
close(ERROR);
waitpid($bashPid, 0);
print "End of main program\n";
I want to run an strace on a bash process, then capture all the output while it is being produced. Then I will take that output and parse it to see which files are being changed by which process, and I will save those changes in a MySQL database. For now all I am trying to do is attach an strace to an existing bash process and get the output of that strace printed in the terminal, just to make sure that it is reading the output asynchronously.
One of the problems is that I am getting the output through the ERROR filehandle. I am a little confused as to why this is happening. Am I using the correct order for open3, and if there is an error, why is the correct output even making it to stderr?
The second problem I have is that I am getting the output only when exec ends, which is no good since it needs to be done while exec is running. I thought open3 ran asynchronously.
As suggested, this is what I did and it works perfectly.
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Run3;

my $bashPid;

print "Starting main program\n";

my $pid = fork();
if ($pid)   # Parent process
{
    print("Start transfer.\n");
    $bashPid = $pid;
    # Attaching an strace to the executed command which happens in the child process
    my $command = "strace -fFe trace=open,execve -p $bashPid -s 2097152 -q";
    run3($command, \*STDIN, \*STDOUT, \*STDERR);
    if ($?)
    {
        die "something went horribly wrong";
    }
    while (<STDERR>)
    {
        print($_);
    }
    print("Finish transfer.\n");
}
elsif ($pid == 0)   # Child process
{
    if (scalar(@ARGV == 0))
    {
        exit;
    }
    my $args = join(' ', @ARGV);
    exec($args);
}
else
{
    die("Could not fork.");
}
close(STDIN);
close(STDOUT);
close(STDERR);
waitpid($bashPid, 0);
print "End of main program\n";
One of the problems is that I am getting the output through the ERROR filehandle
Correct. strace writes to STDERR.
The second problem I have is that I am getting the output only when exec ends which is no good since it needs to be done while exec is running. I thought open3 runs asynchronously.
That's because you only start reading from the child's STDERR after the child closes its STDOUT when it ends.
In fact, you're lucky you haven't deadlocked yet. By reading one handle at a time as you are currently doing, you'll deadlock as soon as strace produces enough output to fill the pipe.
You need to read from both the child's STDOUT and STDERR as the data comes in. You could do this with the help of select, by polling non-blocking handles, or with threads. None of those options is as simple as ditching open3 and using a higher-level module that handles this for you. The simpler IPC::Run3 and the fully featured IPC::Run are both good choices.
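For instance, a minimal IPC::Run sketch (a sketch only, assuming IPC::Run is installed; the strace arguments are the ones from the question) that streams both output channels as they arrive:
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Run qw(run);

# Assumption: the pid to trace is passed on the command line.
my $bashPid = $ARGV[0] or die "usage: $0 <pid>\n";

my @cmd = ('strace', '-f', '-e', 'trace=open,execve',
           '-p', $bashPid, '-s', '2097152', '-q');

# IPC::Run calls the code refs with chunks of output as they are read,
# so the trace can be processed while strace is still running.
run(\@cmd,
    '>',  sub { print "out: $_[0]" },   # child's stdout (usually empty for strace)
    '2>', sub { print "err: $_[0]" },   # strace writes its trace to stderr
) or die "strace exited with status " . ($? >> 8);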

close multiple output pipes in perl without blocking on each one

I have a Perl script which sends a lot of output to multiple subprocesses. I need to be able to close my end of all the pipes and then wait for the subprocesses to finish their work. So far I've only succeeded at closing each pipe and waiting for each subprocess to finish one by one.
More concretely, I'm doing something like this:
for (my $i = 0; $i < 24; $i++) {
    my $fh;
    open $fh, "|externalprogram $i";
    $fhs{$i} = $fh;
}
#...now I can write output to the pipes
while (moreworktodo()) {
    my ($whichone, $data) = do_some_work();
    print { $fhs{$whichone} } $data;
}
#Now I just need to wait for all the subprocesses to finish. However, they
#need to do a lot of work that can only begin when they've finished reading input. So I need to close my end of the pipe to indicate I'm finished.
for (my $i = 0; $i < 24; $i++) {
    my $file = $fhs{$i};
    close $file; #unfortunately, this blocks until process $i finishes
                 #meanwhile all the other processes are waiting for EOF
                 #on their STDIN before they can proceed. So I end up waiting
                 #for 24 processes to finish one-at-a-time instead of all at once
}
One way to get all the subprocesses to finish promptly (closing their stdin) is simply to let my script exit without closing the (pipe) filehandles at all, but that's no good because the script is part of a larger job that needs the subprocesses' work to actually be done before proceeding.
What is a simple way to close each subprocess's stdin (so that they can all finish working) and then wait for all of them to finish before proceeding? I've tried forking off a child to close each pipe, but that doesn't seem to work -- only the parent's "close" actually closes the stdin of the subprocess and waits for the subprocess to finish.
I would create the pipes myself and not use open(P, "|external-program").
Then you can close the pipe and not wait for the child process to exit.
Example of opening a pipe to a child process yourself:
sub spawn {
    my ($cmd) = @_;
    pipe(my $rp, my $wp) or die "pipe failed: $!";
    my $pid = fork();
    die "fork: $!" unless defined($pid);
    if ($pid) {
        # parent
        close($rp);
        return ($wp, $pid);
    } else {
        # child
        close($wp);
        open(STDIN, "<&", $rp);
        exec($cmd) or die "exec: $!";
    }
}
sub main {
    $| = 1;
    my ($wp, $pid) = spawn("./child");
    for (1..10) {
        print {$wp} "sending $_\n";
    }
    close($wp);
    print "done\n";
}
main();
Here's a sample child program to test that close() is NOT waiting for the child to exit:
# file: ./child
while (<STDIN>) {
    print "got: $_";
    sleep(2);
}
The last piece of the puzzle is to asynchronously wait for the child processes to exit.
This can be done with a $SIG{CHLD} handler, or, alternatively, here is a simplistic "join_children" function:
my @child_ids = (1..24);  # or whatever ids you want to use
my %pipe;                 # hash map from child_id -> pipe handle

sub join_children {
    for my $id (@child_ids) {
        close( $pipe{$id} );
    }
    my $count = scalar(@child_ids);
    while ($count > 0) {
        wait;
        $count--;
    }
}
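And if you'd rather not block in join_children at all, the $SIG{CHLD} route mentioned above could look roughly like this (a sketch only, reusing the same @child_ids and %pipe as above):
use POSIX ':sys_wait_h';

my $live_children = scalar @child_ids;

# Reap children asynchronously as they exit instead of blocking in wait().
$SIG{CHLD} = sub {
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $live_children--;
    }
};

# Close our write ends so every child sees EOF at the same time...
close($pipe{$_}) for @child_ids;

# ...then keep doing other work, checking occasionally whether they're all gone.
while ($live_children > 0) {
    sleep 1;   # or do something useful here
}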

Initiating Non-waiting Background Process in Perl

I have a Perl script that needs to kick off another process in the background and exit without waiting for the other script to finish. There are lots of threads on StackOverflow that cover how to wait in Perl or how to not wait for other programming languages, but I can't seem to find the right answer for Perl.
I've read up quite a bit and thought I was doing the right things but none of my attempts seem to be working correctly. Here are all the variations I've tried so far to no avail:
system(qq|perl /util/script.pl $id|);
system(qq|perl /util/script.pl $id &|);
exec(qq|perl /util/script.pl $id|);
exec(qq|perl /util/script.pl $id &|);
With each of these the parent process continues to wait for the child to finish before exiting. Please let me know what I'm doing wrong and the right way to fork the background process.
Thanks in advance for your help!
Full code to help with debugging. Note that the API->Function() calls are object-oriented modules our code base uses for specific functions such as database interactions, etc.:
sub Add {
    my $self = shift;
    my $domain = shift;
    if (!$self->IsValid($domain)) { return; }
    my $SQL = qq| insert into domains set added=NOW(),domain=| . API->Database->Quote($domain);
    my $sth = API->DatabaseQuery($SQL);
    $SQL = qq| select last_insert_id() |;
    $sth = API->DatabaseQuery($SQL);
    my $id = $sth->fetchrow_array();
    my $command = qq|perl /home/siteuser/util/new.pl $id &|;
    system($command);
    return {'id'=>$id,'domain'=>$domain};
}
The first one is designed to work that way - system executes the command and waits for it to end.
The last two are also designed that way - exec specifically is designed to never return. It basically replaces the parent process with the child process.
However, the second one should do the trick: the command launched by the system call is a shell, which is given your string to execute. Since the string ends with "&", the shell will launch your command as a background process and finish its own execution right after that launch.
Can you please post more code illustrating how #2 didn't work?
Also, see what happens if you try backticks or qx:
my $output = qx|perl /util/script.pl $id &|;
print $output;
Also, as a way of reducing unknowns, can you please run the following and tell me what prints:
my $output = qx|(echo "AAAAAAA"; /bin/date; sleep 5; /bin/date; echo "BBBBBBB") &|;
print $output;
Are you calling fork() before calling system or exec?
my $pid = fork();
if (defined($pid) && $pid == 0) {
    # background process
    my $exit_code = system( $command );
    exit $exit_code >> 8;
}

my $pid = fork();
if (defined($pid) && $pid == 0) {
    # background process
    exec( $command );
    # doesn't^H^H^H^H^H^H shouldn't return
}
You need to disassociate the child from the parent.
See perldoc -q daemon. Or Proc::Daemon
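For reference, a minimal sketch of the Proc::Daemon route (assuming the module is installed; the script path and $id are the ones from the question):
use strict;
use warnings;
use Proc::Daemon;

my $id = shift @ARGV;   # assumption: the id comes from the surrounding code

# Init() double-forks and detaches the worker from the parent; the parent
# gets the daemon's pid back immediately and can carry on (or exit).
my $daemon = Proc::Daemon->new(
    work_dir     => '/',
    exec_command => "perl /util/script.pl $id",
);
my $worker_pid = $daemon->Init;
print "started background worker $worker_pid\n";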
Using fork is a good way to background processes:
my $pid = fork;
die "fork failed" unless defined $pid;
if ($pid == 0) {
    # child process goes here
    do '/util/script.pl';
    exit;
}
# parent process continues here

How can I prevent the parent from blocking when writing to a child?

Recently I had a problem using (pipe |-) when I wanted to communicate between two processes.
Basically, the child process couldn't process its STDIN as fast as it was filled up by the parent. This caused the parent to wait until STDIN was free again, and made it run slowly.
How big can the STDIN pipe buffer be, and is it possible to change it? If yes, what is the best-practice size?
Here is some code sample to show what I mean:
if ($child_pid = open($child, "|-"))
{
    $child->autoflush(1);
    # PARENT process
    while (1)
    {
        # Read packet from socket save in $packet
        process_packet($packet);
        # forward packet to child
        print $child $packet;
    }
}
else
{
    die "Cannot fork: $!" unless defined $child_pid;
    # CHILD process
    my $line;
    while ($line = <STDIN>)
    {
        chomp $line;
        another_process_packet($line);
    }
}
In this sample, another_process_packet is slower than process_packet. The reason I wrote the code like this is that I want to use the same data that comes from the socket, and actually read it only once.
Thanks in advance.
You can of course buffer in the parent process, and only write to the child when the child's fd is writable (i.e., writing won't block). You can do this yourself with the right args to syswrite, or use an event loop:
use AnyEvent;
use AnyEvent::Handle;
# make child, assume you write to it via $fh
my $done = AnyEvent->condvar;
my $h = AnyEvent::Handle->new( fh => $fh );
while( you do stuff ){
    my $data = ...;
    $h->push_write($data); # this will never block
}
$h->on_drain(sub { $done->send });
$done->wait; # now you block, waiting for all writes to actually complete
Edit: This used to be untested, but I tested it, and it works. (I used perl -ne "sleep 1; print $_" as the slow child.) Writes proceed during the while loop, if possible, but never block the loop. At the end, you actually block until all the writes have completed.
My test scripts are on gist.github: http://gist.github.com/126488
You can see how the child blocks the blocking loop, but how it doesn't block the non-blocking loop. Obvious when you put it that way ;)
(Finally, as a general rule of thumb; if you are interacting with the network or with other processes, you should probably be using an event loop.)
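If pulling in an event loop feels like overkill, the do-it-yourself variant hinted at above (check writability with select and let syswrite take whatever fits) might look roughly like this; a sketch only, assuming $child is the "|-" pipe handle from the question:
use IO::Handle;
use IO::Select;

# Buffer outgoing data in $pending and flush only as much as the pipe
# will accept, so the parent never blocks on a full pipe.
my $pending = '';
my $wsel    = IO::Select->new($child);
$child->blocking(0);

sub queue_packet {
    my ($packet) = @_;
    $pending .= $packet;
    flush_pending(0);                  # opportunistic, non-blocking flush
}

sub flush_pending {
    my ($timeout) = @_;
    while (length $pending and $wsel->can_write($timeout)) {
        my $n = syswrite($child, $pending);
        last unless defined $n;        # EAGAIN and friends: try again later
        substr($pending, 0, $n, '');   # drop the bytes that were written
    }
}

# Before exiting (or closing $child), call flush_pending(undef) to block
# until everything queued has actually been handed to the child.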
The size is set in the kernel. You can either recompile the kernel with a higher limit or use an intermediary buffer process.
The process handle has a method named 'blocking'. Just set blocking to 0, and the parent process will not block.
if ($child_pid = open($child, "|-"))
{
    $child->blocking(0); # Key to the solution.
    $child->autoflush(1);
    # PARENT process
    while (1)
    {
        # Read packet from socket save in $packet
        process_packet($packet);
        # forward packet to child
        print $child $packet;
    }
}
else
{
    die "Cannot fork: $!" unless defined $child_pid;
    # CHILD process
    my $line;
    while ($line = <STDIN>)
    {
        chomp $line;
        another_process_packet($line);
    }
}