I don't know if I accidentally deleted something or put in a typo somewhere, but all of a sudden some of my code stopped working. For some reason no lines are ever read from $in.
use Win32::Job;
use IO::Handle;

STDOUT->autoflush;

pipe my $in, my $out;

my $job = Win32::Job->new;

sub flush_pipe {
    while (defined(my $line = <$in>)) {
        chomp($line);
        print($line);
    }
}

my $pid = $job->spawn("cmd", "cmd /C \"ipconfig\"",
    {
        stdout => $out
    }
);

flush_pipe();
Edit:
Through trial and error I eventually found out I have to close the $out filehandle before flushing the pipe.
A pipe is unidirectional. Each of the processes it connects can either read or write.
After pipe you have two filehandles, and both the parent and the child see both of them. If the child is going to write and the parent to read, as in your code, then the child must first close the handle it won't use ($in) and the parent must close its unused one, $out. Otherwise the reader never sees end-of-file and you get deadlocks.
The spawn from the module starts a child process (or, rather, a Windows approximation of it) and redirects its STDOUT to the writing end of the pipe, $out.
Some very basic code that should cover this
use strict;
use warnings;
use feature 'say';

pipe my $in, my $out;

my $pid = fork // die "Can't fork: $!";

if ($pid == 0) {            # child
    close $in;
    print $out "hi ";       # can't read this yet (no newline) ...
    sleep 1;
    say $out "from child";  # now the other end can read it
    close $out;
    exit;
}

# parent
close $out;
say while <$in>;
close $in;
wait;
When you want prints to become available to the reader right away (up to buffering outside your code), send a newline. Close the unused ends of the pipe in each process before doing anything else.
I can't write code on Windows now, but in your code the parent must close $out (after spawn).
The term "flush" here can refer to the code in the writer or to Perl's clearing of IO buffers; the code in your flush_pipe() merely reads the pipe, so I'd rename it to read_pipe or similar.
Also posted on PerlMonks.
I have this very simple Perl script on my Linux server. What I would like to be able to do is:
Call the script from a browser on a separate machine
Have the script initiate a fork
Have the parent send an HTTP response (freeing up the browser)
Immediately end the parent
Allow the child to do its job, heavy complex database work, which could take a minute or two
Have the child end itself with no output whatsoever
When I call this script from a browser, the browser does not receive the response that was sent until the child is complete.
Yes, it works when called from the command line.
Is what I want to do possible?
P.S. I even tried it with ProcSimple, but I get the same hang-up.
#!/usr/bin/perl
local $SIG{CHLD} = "IGNORE";

use lib '/var/www/cgi-bin';
use CGI;

my $q = new CGI;

if (!defined($pid = fork())) {
    die "Cannot fork a child: $!";
} elsif ($pid == 0) {
    print $q->header();
    print "i am the child\n";
    sleep(10);
    print "child is done\n";
    exit;
} else {
    print $q->header();
    print "I am the parent\n";
    print "parent is done\n";
    exit 0;
}
exit 0;
In general you must detach the child process from its parent to allow the parent to exit cleanly -- otherwise the parent can't assume that it won't need to handle more input/output.
} elsif ($pid == 0) {
    close STDIN;
    close STDERR;
    close STDOUT;    # or redirect
    do_long_running_task();
    exit;
}
In your example, the child process is making print statements until it exits. Where do those prints go if the parent process has been killed and closed its I/O handles?
One way for a parent process to start another process that will go on its own is to "double fork." The child itself forks and then exits right away, so its child (the grandchild) is adopted by init and can't become a zombie.
This may help here, since it does seem that there is blocking because file descriptors are shared between the parent and the child, as brought up in the comments. If the child exited quickly that might be enough, but since you need a process for a long-running job, fork twice:
use warnings;
use strict;
use feature 'say';

my $pid = fork // die "Can't fork: $!";

if ($pid == 0) {
    say "\tChild. Fork";

    my $ch_pid = fork // die "Can't fork from child: $!";

    if ($ch_pid == 0) {
        # grandchild, run the long job
        sleep 10;
        say "\t\tgrandkid done";
        exit;
    }

    say "\tChild, which just forked, exiting right away.";
    exit;
}

say "Parent, and done";
I am not sure how to simulate your setup to test whether this helps, but since you say that the child produces "no output whatsoever" it may be enough. It should be worth trying since it's simpler than daemonizing the process (which I'd expect to do the trick).
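For reference, a bare-bones sketch of daemonizing the long job after the HTTP response has been sent (this assumes a Unix-like system; POSIX::setsid is in core):

use strict;
use warnings;
use POSIX qw(setsid);

# send the HTTP response first, then detach the long job
sub daemonize {
    defined(my $pid = fork) or die "Can't fork: $!";
    exit if $pid;                    # original process exits, freeing the browser
    setsid() or die "Can't start a new session: $!";
    defined($pid = fork) or die "Can't fork: $!";
    exit if $pid;                    # intermediate child exits; grandchild is adopted by init
    chdir '/';
    open STDIN,  '<', '/dev/null' or die "Can't read /dev/null: $!";
    open STDOUT, '>', '/dev/null' or die "Can't write /dev/null: $!";
    open STDERR, '>&', \*STDOUT   or die "Can't dup STDOUT: $!";
}

daemonize();
# the heavy database work runs here, fully detached from the web request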
Similarly to @mob's post, here's how my web apps do it:
# fork long task
if (my $pid = fork) {
    # parent: return with http response to web client
} else {
    # child: suppress further IO to ensure termination of http connection to client
    open STDOUT, '>', "/dev/null";
    open STDIN,  '<', "/dev/null";
    open STDERR, '>', "/dev/null";
}
# Child carries on from here.
Sometimes the (child) long process prints to a semaphore or status file that the web client may watch to see when the long process is complete.
I don't remember which Perl adept suggested this years ago, but it's served reliably in many situations, and seems very clear from the "re-visit it years later - what was I doing?" perspective...
Note that if /dev/null doesn't work outside of UNIX/Linux, then @mob's use of close might be more universal.
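A small sketch of that status-file idea; the path and the work routine here are made up for illustration:

use strict;
use warnings;
use IO::Handle;

sub do_heavy_database_work { sleep 60 }    # stand-in for the real minute-or-two job

# hypothetical status file the web client can poll
my $status_file = "/tmp/longjob.$$.status";

open my $status, '>', $status_file or die "Can't write $status_file: $!";
$status->autoflush(1);                     # make each status line visible immediately

print {$status} "running\n";
do_heavy_database_work();
print {$status} "done\n";
close $status;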
I'm new to the fork concept. I'm trying a simple script with a fork loop, expecting it to execute twice, but I end up with three occurrences. Is it because I did not exit the child process? Can you clarify, please?
Script:
#!/usr/bin/perl
open(my $fh, '>', 'report.txt');
print $fh "before invoke fork \n";
close $fh;

@list = (1, 2);

foreach $a (@list) {
    if ($pid = fork) {
        open(my $fh, '>>', 'report.txt');
        print $fh "Parent process! $a $pid\n";
        close $fh;
    } else {
        open(my $fh, '>>', 'report.txt');
        print $fh "Child process $a $pid\n";
        close $fh;
    }
}
output:
before invoke fork
My second report generated by perl 1 1808
My second report generated by perl 2 1809
My thrid report generated by perl 2 0
My thrid report generated by perl 1 0
My second report generated by perl 2 1810
My thrid report generated by perl 2 0
It's because at the point where you fork(), both processes continue from exactly the same place.
So 'child 1' will continue to run, continue through the loop, and run 'iteration 2'.
So the parent will fork twice, child 1 will fork once, and child 2, because it's created on the last loop iteration, will not fork at all.
I would suggest you consider Parallel::ForkManager, because it simplifies this a lot. Also, turn on use strict and use warnings, and don't use single-letter variables, especially not $a (it has a special meaning in sort).
Here's some example code of how to use Parallel::ForkManager.
However, to address your problem: putting a last in the child branch will cause it to bail out of the loop, as in the sketch below.
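A minimal sketch of that fix, keeping the question's two-element list and report.txt logging (with strict turned on and $a replaced by a named variable):

use strict;
use warnings;

my @list = (1, 2);
my @child_pids;
my $is_child;

foreach my $item (@list) {
    my $pid = fork // die "Cannot fork: $!";
    if ($pid) {
        push @child_pids, $pid;       # parent: remember the child, keep looping
    } else {
        $is_child = 1;
        open my $fh, '>>', 'report.txt' or die "Cannot open report.txt: $!";
        print $fh "Child process $item $$\n";
        close $fh;
        last;                         # child: leave the loop so it doesn't fork again
    }
}

exit if $is_child;                    # children stop here
waitpid $_, 0 for @child_pids;        # parent reaps both children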
You are not waiting for the child processes in your loop. There is a good explanation of fork in loops here:
http://perlmaven.com/fork
There is a manual too.
http://perldoc.perl.org/perlfork.html
Try to be careful about all the possibilities of a fork; handling the failure cases matters! A simple, good, and recommended way to implement it:
use strict;
use warnings;
use Carp qw/confess/;

my $pid = fork;

if (!defined($pid)) {
    confess "\nWhy you hates me, O.S.? Why?\nANSWER: $!\n\n";
}
elsif ($pid == 0) {
    # Child process. Do something and exit.
}
else {
    # Parent process
    waitpid $pid, 0;
}
There are other related features that are not described in this example. To read more about them, please check out the manual linked above.
Check out Parallel::ForkManager too!
Is there a simple way in Perl to send STDOUT or STDERR to multiple places without forking, using File::Tee, or opening a pipe to /usr/bin/tee?
Surely there is a way to do this in pure Perl without writing 20+ lines of code, right? What am I missing? Similar questions have been asked, both here on SO and elsewhere, but none of the answers satisfy the requirements that I not have to
fork
use File::Tee / IO::Tee / some other module+dependencies whose code footprint is 1000x larger than my actual script
open a pipe to the actual tee command
I can see the use of a Core module as a tradeoff here, but really is that needed?
It looks like I can simply do this:
BEGIN {
    open my $log, '>>', 'error.log' or die $!;
    $SIG{__WARN__} = sub { print $log @_ and print STDERR @_ };
    $SIG{__DIE__}  = sub { warn @_ and exit 1 };
}
This simply and effectively sends most error messages both to the original STDERR and to a log file (apparently stuff trapped in an eval doesn't show up, I'm told). So there are downsides to this, mentioned in the comments. But as mentioned in the original question, the need was specific. This isn't meant for reuse. It's for a simple, small script that will never be more than 100 lines long.
If you are looking for a way to do this that isn't a "hack", the following was adapted from http://grokbase.com/t/perl/beginners/096pcz62bk/redirecting-stderr-with-io-tee
use IO::Tee;
open my $save_stderr, '>&STDERR' or die $!;
close STDERR;
open my $error_log, '>>', 'error.log' or die $!;
*STDERR = IO::Tee->new( $save_stderr, $error_log ) or die $!;
I am using IPC::Open3 for the suggestion given by Hans Lub here.
My issue is that the open3 call works correctly for the first time, but subsequent invocations return the warning:
Use of uninitialized value in numeric ne (!=) at /usr/lib/perl5/5.8.8/IPC/Open3.pm line 215.
The code sample I am using looks like this:
use IPC::Open3;
my $pid;
# dup the old standard output and error
open(OLDOUT, ">&STDOUT") or die "Can't dup STDOUT: $!\n";
open(OLDERR, ">&STDERR") or die "Can't dup STDERR: $!\n";
my $transcript_file = "transcript.temp";
# reopen stdout and stderr
open (STDOUT, "|tee -i $transcript_file") or die "Can't reopen STDOUT: $!\n";
open (STDERR, ">&STDOUT") or die "Can't reopen STDERR: $!\n";
# print statements now write to log
print "Logging important info: blah!\n";
print STDERR "OOPS!\n";
#eval { $pid = open3("\*STDIN", "\*OLDOUT", "\*OLDERR", "ls"); }; # Tried this, but doesnt seem to help. Output does not appear on STDOUT.
eval { $pid = open3(">&STDIN", ">&OLDOUT", ">&OLDERR", "ls"); }; #This works correctly
waitpid( $pid, 0 );
eval { $pid = open3(">&STDIN", ">&OLDOUT", ">&OLDERR", "ls"); }; #First warning
waitpid( $pid, 0 );
eval { $pid = open3(">&STDIN", ">&OLDOUT", ">&OLDERR", "ls"); }; #Second warning
waitpid( $pid, 0 );
I apologize if I look to be trying to get others to solve my problems, but I just can't seem to get around this, and looking inside Perl modules is beyond my current understanding.
It doesn't make sense to give the same STDIN to multiple parallel processes. open3 thus assumes the handle you tell it to use isn't used by anything else, so it closes it.
It looks like your children aren't using the STDIN you provide them, so you should provide a handle to /dev/null.
open(local *CHILD_STDIN, '<', '/dev/null') or die $!;
$pid = open3('<&CHILD_STDIN', '>&STDOUT', '>&STDERR', @cmd);
I think the problem is the way open3 uses the file handles that you pass. If you use, say, >&STDOUT then the file handle is duped, the dupe is passed to the child process, and the parent's copy is closed. That means the second time you do the same thing you are duping a closed file handle, which doesn't have the effect you want.
The only way around this that I can see is to dupe the file handles separately and pass the dupes to the child process. It won't matter that the parent's copy of the dupes is closed because it still has the original STDOUT etc. Unfortunately it adds another three statements to each open3 call, so you would probably want to wrap the whole thing in a subroutine, like this.
my_open3('ls');
my_open3('ls');
my_open3('ls');

sub my_open3 {
    my @cmd = @_;
    my $pid;

    open IN_COPY,  '<&', STDIN  or die "Couldn't dup STDIN: $!";
    open OUT_COPY, '>&', STDOUT or die "Couldn't dup STDOUT: $!";
    open ERR_COPY, '>&', STDERR or die "Couldn't dup STDERR: $!";

    eval {
        $pid = open3('<&IN_COPY', '>&OUT_COPY', '>&ERR_COPY', @cmd);
    };

    waitpid $pid, 0;
}
This isn't the nicest of solutions, so if anyone can see anything better then please chime in. The only alternative I can see is to let the parent keep its own standard IO handles and use completely new ones to communicate with the child process each time. Then the parent would have to mess with IO::Select to copy the child's output to its own STDOUT and STDERR (sketched below).
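A rough sketch of that alternative, assuming the parent simply relays everything it reads (the handle names are illustrative):

use strict;
use warnings;
use IPC::Open3;
use IO::Select;
use Symbol 'gensym';

my ($chld_in, $chld_out, $chld_err);
$chld_err = gensym;                   # stderr needs its own, non-autovivified handle
my $pid = open3($chld_in, $chld_out, $chld_err, 'ls');
close $chld_in;                       # ls reads nothing from stdin

my $sel = IO::Select->new($chld_out, $chld_err);
while ($sel->count) {
    for my $fh ($sel->can_read) {
        if (sysread $fh, my $buf, 4096) {
            # relay child output to the parent's own handles
            print { $fh == $chld_err ? *STDERR : *STDOUT } $buf;
        }
        else {
            $sel->remove($fh);        # EOF on this handle
        }
    }
}
waitpid $pid, 0;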
As nwellnhof says, if the child doesn't use its STDIN (as is the case with the ls command) then you can just pass undef as the first parameter. That saves duplicating one of three standard handles.
I have some code that appends to some files in nested for loops. After exiting the for loops, I want to append .end to all the files.
foreach my $file (@SPICE_FILES)
{
    open(FILE1, ">>$file") or die "[ERROR $0] cannot append to file : $file\n";
    print FILE1 "\n.end\n";
    close FILE1;
}
I noticed in some strange cases that the ".end" is appended into the middle of the files!
How do I resolve this?
Since I do not yet have the comment-privilege I'll have to write this as an 'answer'.
Do you use any dodgy modules?
I have run into issues where (obviously) broken Perl modules have done something to the output buffering. For me, placing
$| = 1;
in the code has helped. The above statement turns on autoflush (i.e. turns off Perl's output buffering) for the currently selected filehandle, usually STDOUT. It might have other effects too, but I have not seen anything negative come out of it.
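For a specific handle, such as the report file in the question, the same effect comes from IO::Handle's autoflush method rather than the global $|:

use strict;
use warnings;
use IO::Handle;

open my $fh, '>>', 'report.txt' or die "Can't open report.txt: $!";
$fh->autoflush(1);               # flush after every print on this handle
print $fh "before invoke fork\n";
close $fh;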
I guess you've got data buffered in some previously opened file descriptors. Try closing them before re-opening:
open my $fd, ">>", $file or die "Can't open $file: $!";
print $fd, $data;
close $fd or die "Can't close: $!";
Better yet, you can push those filehandles onto an array (or hash) and write to them during cleanup:
push @handles, $fd;
# later
print $_ "\n.end\n" for @handles;
Here's a case to reproduce the "impossible" append in the middle:
#!/usr/bin/perl -w
use strict;
my $file = "file";
open my $fd, ">>", $file;
print $fd "begin"; # no \n -- write buffered
open my $fd2, ">>", $file;
print $fd2 "\nend\n";
close $fd2; # file flushed on close
# program ends here -- $fd finally closed
# you're left with "end\nbegin"
It’s not possible to append something to the middle of the file. The O_APPEND flag guarantees that each write(2) syscall will place its contents at the old EOF and update the st_size field by incrementing it by however many bytes you just wrote.
Therefore if you find that your own data is not showing up at the end when you go to look at it, then another agent has written more data to it afterwards.
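As an illustration of that guarantee, an unbuffered syswrite on a handle opened with O_APPEND always lands at the current end of file, with nothing left sitting in a Perl buffer:

use strict;
use warnings;
use Fcntl qw(O_WRONLY O_APPEND O_CREAT);

# every write(2) on an O_APPEND descriptor goes to the current EOF
sysopen my $fh, 'report.txt', O_WRONLY | O_APPEND | O_CREAT
    or die "Can't open report.txt: $!";
syswrite $fh, "\n.end\n";        # bypasses Perl's buffering entirely
close $fh;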