Redirecting STDOUT in child process - perl

I have a parent process which spawns multiple child processes via fork. I want the log files written by the parent and child processes to be separate. The problem is that the child process's STDOUT gets redirected into the parent log file as well as the child log file. I am not sure what I need to change to keep child log messages out of the parent log file. I also don't understand the purpose of creating the OUT and ERR file handles in the setEnvironment function below; this is existing code, so I kept it as it is. In the parent and child processes I set the variable $g_LOGFILE to a different file name so that separate log files are created, and I call setEnvironment in both. I tried closing STDOUT, STDERR and STDIN in the child process before calling setEnvironment, but that did not work properly.
sub setEnvironment()
{
    unless ( open(OUT, ">&STDOUT") )
    {
        print "Cannot redirect STDOUT";
        return 2;
    }
    unless ( open(ERR, ">&STDERR") )
    {
        print "Cannot redirect STDERR";
        return 2;
    }
    unless ( open(STDOUT, "|tee -ai $g_LOGPATH/$g_LOGFILE") )
    {
        print "Cannot open log file $g_LOGPATH/$g_LOGFILE";
        return 2;
    }
    unless ( open(STDERR, ">&STDOUT") )
    {
        print "Cannot redirect STDERR";
        return 2;
    }
    STDOUT->autoflush(1);
}
####################### Main Program ######################################
$g_LOGFILE = "parent.log";
while ($file = readdir(DIR))
{
    my $pid = fork;
    if ( $pid ) {
        setEnvironment();
        #parent process code goes here
        printf "%s\n", "parent";
        next;
    }
    $g_LOGFILE = "child.log";
    setEnvironment();
    #child code goes here
    printf "%s\n", "child";
    exit;
}
wait for @pids

OK, I tested this code a little. Here is my sample code. In my code there is a similar (not exact) problem: all messages are double-written to the child's log file.
So my answers to your questions:

The problem is that the child process's STDOUT gets redirected into the parent log file as well as the child log file.

This is because when you open a file with a pipe (open(STDOUT, "|tee ...")), the underlying result is that your process fork()s to create a child process which then execs into the program you run (tee). That fork (for tee) inherits the STDOUT of the master process, so tee writes into the parent's logfile. So I think you must stop reusing the master process's STDOUT handle. Or, the second way: remove the use of tee; that is the simplest fix.

I also don't understand the purpose of creating the OUT and ERR file handles in the setEnvironment function below.

This seems to be someone's workaround for the problem above. You can run grep -rE '\bERR\b' . to search the code for whether it is used or not. Probably someone wanted to save the real STDOUT and STDERR for further use.
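For illustration, saving the real handles for further use might look like this (a minimal sketch of that idea, not code from the question):

open(my $real_out, '>&', \*STDOUT) or die "cannot dup STDOUT: $!";   # duplicate the terminal handle
open(my $real_err, '>&', \*STDERR) or die "cannot dup STDERR: $!";
open(STDOUT, '>>', "$g_LOGPATH/$g_LOGFILE") or die "cannot open log: $!";
print "this line goes to the log file\n";
print {$real_out} "this line still reaches the terminal\n";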

It would appear that the intent of the original code is as follows:
1. when the script is started from, say, a terminal, provide aggregate parent and child output to the terminal
2. additionally, provide a copy of the parent output in parent.log, and a copy of the child output in child.log
Note that @Unk's answer is correct as far as 2. goes, and has fewer moving parts than any code using tee, but it fails to achieve 1.
If it is important to achieve both 1. and 2. above, then take your original code and simply add the following at the top of your setEnvironment method:
sub setEnvironment()
{
    if ( fileno OUT )
    {
        unless ( open(STDOUT, ">&OUT") )
        {
            print "Cannot restore STDOUT";
            return 2;
        }
        unless ( open(STDERR, ">&ERR") )
        {
            print "Cannot restore STDERR";
            return 2;
        }
    }
    else
    {
        unless ( open(OUT, ">&STDOUT") )
        {
            print "Cannot redirect STDOUT";
            return 2;
        }
        unless ( open(ERR, ">&STDERR") )
        {
            print "Cannot redirect STDERR";
            return 2;
        }
    }
    unless ( open(STDOUT, "|tee -ai $g_LOGPATH/$g_LOGFILE") )
    ...
Incidentally, do not forget to also add $pid to @pids if your actual code does not do that already:
...
    my $pid = fork;
    if ( $pid ) {
        push @pids, $pid;
...
Why and how does this work? We simply want to temporarily restore the original STDOUT immediately before rewiring it into tee, so that tee inherits it as its standard output and actually writes directly to the original STDOUT (e.g. your terminal) instead of writing, in the case of the forked children, through the parent's tee (which is where the child's STDOUT normally pointed before this change, by virtue of inheriting from the parent process, and which is what injected those child lines into parent.log).
So in answer to one of your questions, whoever wrote the code to set OUT and ERR must have had exactly the above in mind. (I cannot help but wonder whether the difference in indentation in your original code indicates that someone removed, in the past, code similar to what you now have to add back.)
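Schematically, the handle flow on each call to setEnvironment is now (a condensed restatement of the code above, comments mine):

open(OUT, '>&STDOUT');                           # first call only: OUT saves the real STDOUT
open(STDOUT, '>&OUT');                           # later calls: point STDOUT back at the terminal
open(STDOUT, "|tee -ai $g_LOGPATH/$g_LOGFILE");  # the new tee inherits the terminal, not an old tee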
Here's what you now get at the end of the day:
$ rm -f parent.log child.log
$ perl test.pl
child
parent
child
parent
parent
child
parent
child
parent
$ cat parent.log
parent
parent
parent
parent
parent
$ cat child.log
child
child
child
child
child

You can always redirect STDOUT to a log file by closing it first and then reopening it:
close STDOUT;
open STDOUT, ">", $logfile;
A small downside is that once STDOUT is redirected, you will not see any output on the terminal during script execution.
If you want the parent and child processes to have different log files, just perform this redirection to different log files in both after fork(), something like this:
print "Starting, about to fork...\n";
if (fork()) {
print "In master process\n";
close STDOUT;
open STDOUT, ">", "master.log";
print "Master to log\n";
} else {
print "In slave process\n";
close STDOUT;
open STDOUT, ">", "slave.log";
print "Slave to log\n";
}
I have tested that this works as expected on Linux and Windows.
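For completeness, a slightly hardened variant of the same idea (the error checks and the waitpid call are my additions, not part of the original):

print "Starting, about to fork...\n";
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid) {
    open STDOUT, ">", "master.log" or die "master.log: $!";   # reopening STDOUT implicitly closes it
    print "Master to log\n";
    waitpid $pid, 0;   # reap the child so it does not linger as a zombie
} else {
    open STDOUT, ">", "slave.log" or die "slave.log: $!";
    print "Slave to log\n";
    exit 0;
}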

#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use Capture::Tiny qw/capture_stdout/;

my $child_log  = 'child.log';
my $parent_log = 'parent.log';

my $stdout = capture_stdout {
    if (fork()) {
        # this branch runs in the original process, which writes the child log here
        my $stdout = capture_stdout {
            print "child\n";
        };
        open my $fh, '>', $child_log;
        print $fh $stdout;
        close $fh;
        exit;
    }
    print "parent\n";
};
open my $fh, '>', $parent_log;
print $fh $stdout;
close $fh;
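Note (my observation, not part of the answer): the true branch of if (fork()) runs in the original process, so here the original process writes child.log while the forked child goes on to write parent.log; the files still come out right. The conventional arrangement would be:

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # forked child: capture its own output and write the child log
    my $out = capture_stdout { print "child\n"; };
    open my $fh, '>', $child_log or die $!;
    print $fh $out;
    close $fh;
    exit;
}
# parent: capture its own output, write the parent log, then reap the child
my $out = capture_stdout { print "parent\n"; };
open my $fh, '>', $parent_log or die $!;
print $fh $out;
close $fh;
waitpid $pid, 0;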

All the other answers are correct (PSIalt's in particular) - I'm merely hoping that I can answer with corrected code that is identifiably close to that in the question. The key things to notice:
"|tee -ai..."
The tee command prints its standard input to its standard output while also appending it to the given file. As PSIalt says, removing it is the easiest way to ensure each process's output goes only to the correct file.
setEnvironment() inside the loop for the parent
The original code keeps redirecting STDOUT back into the tee'd file, recapturing STDOUT each time around the loop. Given my code below, if you moved setEnvironment() above '#parent process code goes here', you would see all but one of the 'Real STDOUT' and 'Real STDERR' lines actually appearing in parent.log.
Options
The ideal is to remove any reliance on redirecting STDOUT / STDERR for logging. I would add a dedicated log($level, $msg) function and start moving all code over to it. Initially it's OK if it is simply a façade for the existing behaviour - you can switch the implementation out once you reach an appropriate threshold of covered code.
If it's a basic script and doesn't produce stupidly large logs, why not just print everything to STDOUT with some prefix you can grep for (e.g. 'PARENT:' / 'CHILD:')?
It's a bit outside the scope of the question, but consider using a more structured approach to logging. I would consider using a CPAN logging module, e.g. Log::Log4perl. This way, the parent and children can simply request the correct log category rather than mess around with file handles; a minimal sketch follows the list below. Additional advantages:
- Standardise output
- Allow reconfiguration on the fly: change the logging level from ERROR to DEBUG on a running-but-misbehaving system
- Easily redirect output: no change is needed to your code to rearrange log files, rotate files, redirect to a socket / database instead, etc.
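A minimal sketch of what that could look like (assuming Log::Log4perl is installed; the categories, appender names and layout here are illustrative, not from the original code):

use Log::Log4perl;

# Two categories, each bound to its own file appender.
Log::Log4perl->init( \<<'CONF' );
log4perl.logger.parent               = INFO, ParentLog
log4perl.logger.child                = INFO, ChildLog
log4perl.appender.ParentLog          = Log::Log4perl::Appender::File
log4perl.appender.ParentLog.filename = parent.log
log4perl.appender.ParentLog.layout   = Log::Log4perl::Layout::SimpleLayout
log4perl.appender.ChildLog           = Log::Log4perl::Appender::File
log4perl.appender.ChildLog.filename  = child.log
log4perl.appender.ChildLog.layout    = Log::Log4perl::Layout::SimpleLayout
CONF

my $pid = fork;
die "fork failed: $!" unless defined $pid;
my $log = Log::Log4perl->get_logger( $pid ? 'parent' : 'child' );
$log->info('goes to the right file, no file-handle juggling needed');

That aside, here is the corrected code, kept as close to the original as possible: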
use strict;
use warnings;
use IO::Handle;   # for STDOUT->autoflush on older perls

our $g_LOGPATH = '.';
our $g_LOGFILE = "parent.log";
our @pids;

setEnvironment();

for ( 1 .. 5 ) {
    my $pid = fork;
    if ($pid) {
        #parent process code goes here
        printf "%s\n", "parent";
        print OUT "Real STDOUT\n";
        print ERR "Real STDERR\n";
        push @pids, $pid;
        next;
    }
    $g_LOGFILE = "child.log";
    setEnvironment();
    #child code goes here
    printf "%s\n", "child";
    exit;
}
wait for @pids;

sub setEnvironment {
    unless ( open( OUT, ">&STDOUT" ) ) {
        print "Cannot redirect STDOUT";
        return 2;
    }
    unless ( open( ERR, ">&STDERR" ) ) {
        print "Cannot redirect STDERR";
        return 2;
    }
    unless ( open( STDOUT, '>>', "$g_LOGPATH/$g_LOGFILE" ) ) {
        print "Cannot open log file $g_LOGPATH/$g_LOGFILE";
        return 2;
    }
    unless ( open( STDERR, ">&STDOUT" ) ) {
        print "Cannot redirect STDERR";
        return 2;
    }
    STDOUT->autoflush(1);
}
child.log:
child
child
child
child
child
parent.log:
parent
parent
parent
parent
parent
STDOUT taken from terminal:
Real STDOUT (x5 lines)
STDERR taken from terminal:
Real STDERR (x5 lines)

Related

Perl: closing child process fails when file handle to child/pipe is aliased

I've read all the perl documentation on open, close and IPC and read a number of related forum threads, but still have been unable to figure out what I'm doing wrong. To start with I created this simple forked process to prove that I did the open handling correctly:
#!/usr/bin/perl
use IO::Handle;

my $pid = open(CHILD, "|-");
if ($pid == 0) {
    # child: keeps reading until the parent closes its end of the pipe
    while (<STDIN>) {
        chomp $_;
        print "child: $_\n";
    }
}
else {
    # parent:
    for my $line (1..3) {
        print CHILD "$line\n";
    }
    my $success = close CHILD;
    print "parent: success: $success, \$!: $!, \$?: $?\n";
}
The output, as expected, is:
child: 1
child: 2
child: 3
parent: success: 1, $!: Illegal seek, $?: 0
The success value is true, so I just ignore the 'Illegal seek' as it's supposedly not relevant. However, when I alias STDOUT to the pipe (to avoid needing to print explicitly to CHILD) and then later revert STDOUT back to its original value, as follows, close CHILD no longer works (even though STDOUT no longer points to it):
#!/usr/bin/perl
use IO::Handle;

my $pid = open(CHILD, "|-");
if ($pid == 0) {
    # child: keeps reading until the parent closes its end of the pipe
    while (<STDIN>) {
        chomp $_;
        print "child: $_\n";
    }
}
else {
    # parent:
    open(ORIG_STDOUT, ">&STDOUT") or die "Unable to dup STDOUT: $!";
    open(STDOUT, ">&=CHILD") or die "Unable to alias CHILD: $!";
    for my $line (1..3) {
        print "$line\n";
    }
    open(STDOUT, ">&=ORIG_STDOUT") or die "Unable to alias ORIG_STDOUT: $!";
    my $success = close CHILD;
    print "parent: success: $success, \$!: $!, \$?: $?\n";
}
The output is this:
child: 1
child: 2
child: 3
parent: success: , $!: Illegal seek, $?: -1
The success value is no longer true, which means that either there was an error closing the pipe or CHILD returned a non-zero value. The -1 value of $? indicates that the wait system call failed for some reason. I'm not sure what the "Illegal seek" means, but the fact that $! is non-zero further indicates that close CHILD did not return false merely because the CHILD process returned a non-zero code; rather, there was some kind of error in the closing procedure. What am I missing here?
There are at least two problems. First, your call to open is actually failing in both programs, or rather, the failure becomes visible once you append a check clause to the first call:
#!/usr/bin/perl
use IO::Handle;

my $pid = open(CHILD, "|-") or die("A descriptive message");
The program immediately crashes upon execution. However, since the pipe in the open-mode argument implies a call to fork, there are supposed to be two separate processes' worth of total output. Understandably, you assumed the call succeeded, since code after that point was being executed.
The reason the call to open failed, however, is multi-faceted and interesting. You're passing in an invalid file descriptor, but something really interesting would happen if the child process could live long enough to try to duplicate the file descriptor. Using ">&=" instead of simply ">&" in the call to open changes the underlying call from dup to fdopen; the first returns a file descriptor and the second a file handle (in C: int vs FILE*).
TL;DR: You're not duplicating the file descriptor because you said you didn't want to duplicate the file descriptor, and the seek error was the logical consequence of attempting to read 1 + n bytes past the 0 total bytes of the null handle you received from the call to open.
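The distinction can be demonstrated in isolation (an illustrative sketch, not code from the question):

open(my $dup,   '>&',  \*STDOUT) or die $!;  # like dup(2): a new, independent descriptor
open(my $alias, '>&=', \*STDOUT) or die $!;  # like fdopen(3): the same descriptor, merely aliased
close($alias);  # closes the one underlying descriptor that STDOUT also uses
close($dup);    # closes only the duplicate; the descriptor behind $alias/STDOUT is untouched by this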

IPC communication between 2 processes with Perl

Let's say we have 'Child' and 'Parent' processes and some subroutines defined:
my $pid = fork;
die "fork failed: $!" unless defined($pid);

local $SIG{USR1} = sub {
    kill KILL => $pid;
    $SIG{USR1} = 'IGNORE';
    kill USR1 => $$;
};
and we divide the work between them. Is it possible to do the following?
if ($pid == 0) {
    sub1();
    # switch to Parent process to execute sub4()
    sub2();
    # switch to Parent process to execute sub5()
    sub3();
}
else {
    sub4();
    # send message to child process so it executes sub2
    sub5();
    # send message to child process so it executes sub3
}
If yes, can you point out how, or where I can look for the solution? Maybe a short example would suffice. :)
Thank you.
There is a whole page in the docs about inter-process communication: perlipc.
To answer your question: yes, there is a way to do what you want. The problem is, exactly what it is... depends on your use case. I can't tell what you're trying to accomplish; what do you mean by 'switch to parent', for example?
But generally the simplest (in my opinion) is using pipes:
#!/usr/bin/env perl
use strict;
use warnings;

pipe( my $reader, my $writer );

my $pid = fork();   # you should probably test for undef for fork failure.
if ( $pid == 0 ) {
    ## in child:
    close($writer);
    while ( my $line = <$reader> ) {
        print "Child got $line\n";
    }
}
else {
    ## in parent:
    close($reader);
    print {$writer} "Parent says hello!\n";
    sleep 5;
}
Note: you may want to check your fork return codes - 0 means we're in the child, a positive number means we're in the parent, and undef means the fork failed.
Also: your pipe will buffer - this might trip you up in some cases. It'll run to the end just fine, but you may not get IO when you think you should.
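If buffering bites, enabling autoflush on the writing end is the usual fix (a small sketch):

use IO::Handle;          # provides the autoflush method on lexical handles
$writer->autoflush(1);   # flush every print immediately instead of at buffer boundaries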
You can open pipes the other way around, for child->parent comms. Be slightly cautious when you multi-fork, though, because an active pipe is inherited by every child of the fork - but it's not a broadcast.
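A sketch of the reverse direction (child-to-parent), with the same structure as above:

pipe( my $reader, my $writer ) or die "pipe failed: $!";
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ( $pid == 0 ) {
    ## in child:
    close($reader);
    print {$writer} "Child says hello!\n";
    exit 0;
}
## in parent:
close($writer);
print "Parent got: ", scalar <$reader>;
waitpid( $pid, 0 );   # reap the child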

want to get the die error message from a command executed via open in Perl

I am trying to fork off a command like the one below:
my $h = IO::Handle->new;
$self->{-handle} = $h;
die "IO::Handle->new failed." unless defined $h;
$self->{-pid} = open $h, $self->{-command} . ' 2>&1 |';
$self->fileevent($h, 'readable' => [\&_read_cmd_op, $self]);

sub _read_cmd_op {
    my $h = $self->{-handle};
    if ( sysread $h, $_, 4096 ) {
        my $t = $self->Subwidget('text');
        $t->insert('end', $_);
        $t->yview('end');
    } else {
        $self->{-finish} = 1;
    }
}
Now the problem is that $self->{-command} invokes another Perl script, and if that script dies I want to know.
Note that $self->{-pid} still exists even if the command dies.
The above code is in a Perl/Tk app, where the output of $self->{-command} is captured in a text widget.
Somehow I don't get the die message even in the text widget; I see it on stdout.
Two questions:
1. How can I get the command's output/error into the text widget?
2. How can I know that the command fired via IO::Handle died?
$self->{-pid} is just the pid of the forked process, not some magic object which goes away if the command exits.
I cannot reproduce the problem of not getting the die() message. If the snippet above is called with 'perl -e "die 123"', then "123" appears in the text widget (at least on a Unix system).
For getting the exit code you can use something like the following:
} else {
    $mw->fileevent($h, 'readable', '');
    my $pid = waitpid($self->{-pid}, 0);
    warn "pid $pid finished";
    warn "retcode is " . ($? >> 8);
    $self->{-finish} = 1;
}
The fileevent call with the empty callback stops further selects on this filehandle. With the waitpid call you wait for the termination of the child process. Once this happens, the exit code is available in the $? variable, like after a normal system() call. So for a non-zero exit code you know that the command died or exited with a false value.
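For reference, the standard way to unpack $? after waitpid (or after system):

my $exit_code = $? >> 8;    # the value the child passed to exit()
my $signal    = $? & 127;   # non-zero if the child was killed by a signal
my $dumped    = $? & 128;   # true if the child produced a core dump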

close multiple output pipes in perl without blocking on each one

I have a Perl script which sends a lot of output to multiple subprocesses. I need to be able to close my end of all the pipes and then wait for the subprocesses to finish their work. So far I've only managed to close each pipe and wait for each subprocess to finish one by one.
More concretely, I'm doing something like this:
for ($i=0; $i<24; $i++) {
    my $fh;
    open $fh, "|externalprogram $i";
    $fhs{$i} = $fh;
}

#...now I can write output to the pipes
while (moreworktodo()) {
    my ($whichone, $data) = do_some_work();
    print { $fhs{$whichone} } $data;
}

#Now I just need to wait for all the subprocesses to finish. However, they
#need to do a lot of work that can only begin when they've finished reading
#input. So I need to close my end of the pipe to indicate I'm finished.
for ($i=0; $i<24; $i++) {
    my $file = $fhs{$i};
    close $file;   #unfortunately, this blocks until process $i finishes;
                   #meanwhile all the other processes are waiting for EOF
                   #on their STDIN before they can proceed. So I end up waiting
                   #for 24 processes to finish one-at-a-time instead of all at once
}
One way to get all the subprocesses to finish promptly (closing their stdin) is simply to let my script exit without closing the (pipe) filehandles at all, but that's no good because the script is part of a larger job that needs the subprocesses' work to actually be done before proceeding.
What is a simple way to close each subprocess's stdin (so that they can all finish working) and then wait for all of them to finish before proceeding? I've tried forking off a child to close each pipe, but that doesn't seem to work: only the parent's close actually closes the stdin of the subprocess and waits for the subprocess to finish.
I would create the pipes myself and not use open(P, "|external-program").
Then you can close the pipe without waiting for the child process to exit.
Example of opening a pipe to a child process yourself:
sub spawn {
    my ($cmd) = @_;
    pipe(my $rp, my $wp) or die "pipe failed: $!";
    my $pid = fork();
    die "fork: $!" unless defined($pid);
    if ($pid) {
        # parent
        close($rp);
        return ($wp, $pid);
    } else {
        # child
        close($wp);
        open(STDIN, "<&", $rp);
        exec($cmd) or die "exec: $!";
    }
}

sub main {
    $| = 1;
    my ($wp, $pid) = spawn("./child");
    for (1..10) {
        print {$wp} "sending $_\n";
    }
    close($wp);
    print "done\n";
}

main();
Here's a sample child program to test that close() does NOT wait for the child to exit:
# file: ./child
while (<STDIN>) {
    print "got: $_";
    sleep(2);
}
The last piece of the puzzle is to asynchronously wait for the child processes to exit.
This can be done with a $SIG{CHLD} handler, or, alternatively, here is a simplistic "join_children" function:
my @child_ids = (1..24);   # or whatever ids you want to use
my %pipe;                  # hash map from child_id -> pipe handle

sub join_children {
    for my $id (@child_ids) {
        close( $pipe{$id} );
    }
    my $count = scalar(@child_ids);
    while ($count > 0) {
        wait;
        $count--;
    }
}
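For completeness, here is a simplistic sketch of the $SIG{CHLD} route mentioned above (my assumption of how it could look, using a crude polling wait):

use POSIX ':sys_wait_h';   # for WNOHANG

my $live = scalar(@child_ids);
$SIG{CHLD} = sub {
    # reap every child that has exited, decrementing the live count for each
    $live-- while waitpid(-1, WNOHANG) > 0;
};
close( $pipe{$_} ) for @child_ids;   # all children see EOF at once and start working
sleep 1 while $live > 0;             # crude wait; a real program might block in select instead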

perl, no child process w/ "open"

Hi, I have this problem where the Perl script spits back "No child process found at ...".
My script calls several different types of forks, so I tried implementing the perldoc waitpid() approach to be able to handle fork & exec & system & qx.
$SIG{CHLD} = sub {
    # don't change $! and $? outside the handler
    local ($!, $?);
    my $pid = waitpid(-1, WNOHANG) > 0;
    return if $pid == -1;
    return unless defined $children{$pid};
    delete $children{$pid};
};

my $pid = fork();
die "cannot fork" unless defined $pid;
if ($pid == 0) {
    # ...
    exit 0;
} else {
    $children{$pid} = 1;
    # ...
    exec($command);
}
There is no problem with the execution of this part of the code; however, the "No child process" error occurs when I try to close the filehandle below. Can someone explain why this is happening? I really want to understand this problem in more depth. Do I end up reaping the child process forked by the open call, so that close no longer knows how to handle the file handle? Or maybe I'm 100% off. Any solutions would be appreciated.
open(RESULTS, "-|", "find $dir/ -maxdepth 1 -name RESULTS -print0 | xargs -0 cat ") or die $!;
while (<RESULTS>) {
    if ($_ =~ /match/) {
        print $_;
    }
}
close RESULTS;
close on a handle* opened in this way calls waitpid to reap the child created by open. However, your signal handler managed to reap the child before close did, so close could not find the child, and close returned an error.
You could fix this by changing your signal handler to reap only the children you've created using fork (below), or you could ignore that error from close.
$SIG{CHLD} = sub {
    local ($!, $?, $^E);
    for my $pid (keys(%children)) {
        if (waitpid($pid, WNOHANG) > 0) {
            delete $children{$pid};
        }
    }
};
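The 'ignore that error' route could look like this (a sketch; ECHILD comes from the core Errno module):

use Errno qw(ECHILD);

unless (close RESULTS) {
    # the handler already reaped the child, so close's internal waitpid
    # fails with ECHILD; any other failure is still worth dying over
    die "close failed: $!" unless $! == ECHILD;
}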
* — The proper term is "file handle". It is named such since it allows you to hold onto a file. It's not a "handler", as it performs no action.