Problems while making a multiprocessing task in Perl

I'm trying to write a basic multiprocessing task, and this is what I have. First of all, I don't know the right way to make this program non-blocking: when I wait for a child's response (with waitpid), the other processes also have to wait in the queue. But what happens if some child processes die earlier (I mean, the processes die out of order)? I searched and found that I can get the PID of the process that just died by using waitpid(-1, WNOHANG). I always got a warning that WNOHANG is not a number. After I added the sys_wait_h library the warning went away, but the script never waits for the PID. What might be the error?
#!/usr/bin/perl
#use POSIX ":sys_wait_h"; #if I use this library, I dont get the error, but it wont wait for the return of the child
use warnings;
main(@ARGV);
sub main{
my $num = 3;
for(1..$num){
my $pid = fork();
if ($pid) {
print "Im going to wait (Im the parent); my child is: $pid\n";
push(@childs, $pid);
}
elsif ($pid == 0) {
my $slp = 5 * $_;
print "$_ : Im going to execute my code (Im a child) and Im going to wait like $slp seconds\n";
sleep $slp;
print "$_ : I finished my sleep\n";
exit(0);
}
else {
die "couldn’t fork: $!\n";
}
}
foreach (@childs) {
print "Im waiting for: $_\n";
my $ret = waitpid(-1, WNOHANG);
#waitpid($_, 0);
print "Ive just finish waiting for: $_; the return: $ret \n";
}
}
Thanks in advance, bye!

If you use WNOHANG, waitpid() will not block if no children have terminated; it returns 0 right away while children are still running. That's the point of WNOHANG: it ensures that waitpid() returns quickly. In your case, it looks like you just want to use wait() instead of waitpid().
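As a minimal sketch of that suggestion (the sub name run_children and the %running bookkeeping are illustrative, not from the question), the parent can call the blocking wait() in a loop and handle each child in whatever order it happens to exit:

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # unbuffered STDOUT, so children never re-flush the parent's output

# Fork $num children that sleep for different times, then reap them in
# whatever order they finish. Returns the PIDs in reaping order.
sub run_children {
    my ($num) = @_;
    my %running;    # pid => child number

    for my $n (1 .. $num) {
        my $pid = fork();
        die "couldn't fork: $!\n" unless defined $pid;
        if ($pid == 0) {
            sleep($num - $n);    # child: higher-numbered children finish first
            exit(0);
        }
        $running{$pid} = $n;
    }

    my @reaped;
    while (%running) {
        my $pid = wait();            # blocks until any one child exits
        last if $pid == -1;          # -1 means there are no children left
        next unless exists $running{$pid};
        my $status = $? >> 8;        # exit code is in the high byte of $?
        print "child $running{$pid} (pid $pid) exited with status $status\n";
        delete $running{$pid};
        push @reaped, $pid;
    }
    return @reaped;
}

my @order = run_children(3);
print "reaped ", scalar(@order), " children\n";
```

Because wait() returns the PID of whichever child ended, the parent does not need to know the exit order in advance, and no child is left waiting behind another.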

I find that POE handles all of this stuff for me quite nicely. It's asynchronous (non-blocking) control of all sorts of things, including external processes. You don't have to deal with all the low level stuff because POE does it for you.

Related

Understanding how Perl fork works

What would be the right way to fork processes that each one of them runs a different subroutine sub1,sub2,...,subN. After reading a lot of previous thread and material, I feel that I understand the logic but a bit confused on how to write in the cleanest way possible (readability is important to me).
Consider 4 subs. Each one of them gets different arguments. It feels like the most efficient way would be to create 4 forks, each of which runs a different sub. The code will look something like this:
my $forks = 0;
foreach my $i (1..4) {
if ($i == 1) {
my $pid = fork();
if ($pid == 0) {
$forks++;
run1();
exit;
}
} elsif ($i == 2) {
my $pid = fork();
if ($pid == 0) {
$forks++;
run2();
exit;
}
} elsif ($i == 3) {
my $pid = fork();
if ($pid == 0) {
$forks++;
run3();
exit;
}
} elsif ($i == 4) {
my $pid = fork();
if ($pid == 0) {
$forks++;
run4();
exit;
}
}
}
for (1 .. $forks) {
my $pid = wait();
print "Parent saw $pid exiting\n";
}
print "done\n";
Some points:
This will work only if all of the forks were successful. But I would like to run the subs even if a fork failed (even though it will not be parallel). In that case, I guess we need to take the subs out of the if and exit only if the $pid wasn't 0. Something like:
my $pid = fork();
run1();
$forks++ if ($pid == 0);
exit if ($pid == 0);
But it still feels not right.
Is using exit the right way to end the child process? If the processes were ended with exit, should I still use wait? Will it prevent zombies?
Maybe the most interesting question: What will I do if we have 15 function calls? I would like to somehow create 15 forks but I can't create 15 if-else statements - the code will not be readable that way. At first, I thought that it is possible to insert those function calls into an array (somehow) and loop over that array. But after some research, I didn't find a way that it is possible.
If possible, I prefer not to use any additional modules like Parallel::ForkManager.
Is there a clean and simple way to solve it?
There are a few questions to clear up here.
A basic example
use warnings;
use strict;
use feature 'say';
my @coderefs;
for my $i (1..4) {
push @coderefs, sub {
my @args = @_;
say "Sub #$i with args: @args";
};
}
my @procs;
for my $i (0 .. $#coderefs) {
my $pid = fork // do {
warn "Can't fork: $!";
# retry, or record which subs failed so to run later
next;
};
if ($pid == 0) {
$coderefs[$i]->("In $$: $i");
exit;
}
push @procs, $pid;
#sleep 1;
}
say "Started: @procs";
for my $pid (@procs) {
my $goner = waitpid $pid, 0;
say "$goner exited with $?";
}
We generate anonymous subroutines and store those code references in an array, then go through that array and start that many processes, running a sub in each. After that the parent waitpids on these in the order in which they were started, but normally you'll want to reap as they exit; see docs listed below.
A child process always exits, or you'd have multiple processes executing all of the rest of the code in the program. Once a child process exits the kernel will notify the parent, and the parent can "pick up" that notification ("reap" the exit status of the child process) via wait/waitpid, or use a signal handler to handle/ignore it.
If the parent never does this after the child has exited, the OS stays stuck with the information about the (exited) child process in the process table; that's a zombie. So you do need to wait, so that the OS gets done with the child process (and you check up on how it went). Or, you can indicate in a signal handler that you don't care about the child's exit.† Modern systems reap would-be zombies but not always and you cannot rely on that; clean up after yourself.
Note, you'll need to be reading perlipc, fork, wait and waitpid, perlvar ... and yet other resources that'll come up while working on all this. It will take a little playing and some trial and error. Once you get it all down you may want to start using modules, at least for some types of tasks.
† To ignore the SIGCHLD (default)
$SIG{CHLD} = 'IGNORE';
Or you can run code there (but it is well advised to keep it minimal)
$SIG{CHLD} = sub { ... };
These signal "dispositions" are inherited in fork-ed processes (but not via execve).
See the docs listed above, and the basics of %SIG variable in perlvar. Also see man(7) signal. All this is generally *nix business.
This is a global variable, affecting all code in the interpreter. In order to limit the change to the nearest scope use local
local $SIG{CHLD} = ...
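A hedged sketch of what such a localized handler might contain, following the reaping idiom described in perlipc: loop over waitpid(-1, WNOHANG) until nothing more is pending, since one signal can stand for several exited children. The names spawn_and_collect and %exit_status are made up for this illustration.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # exports WNOHANG

$| = 1;    # unbuffered STDOUT, so children never re-flush the parent's output

my %exit_status;    # pid => exit code, filled in by the signal handler

# Fork one child per exit code given, let the CHLD handler reap them,
# and return the recorded exit codes in the order the children were started.
sub spawn_and_collect {
    my (@codes) = @_;
    %exit_status = ();

    # Limited to this sub by "local"; restored when the sub returns.
    local $SIG{CHLD} = sub {
        local ($!, $?);    # don't clobber the main program's error variables
        while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
            $exit_status{$pid} = $? >> 8;
        }
    };

    my @pids;
    for my $code (@codes) {
        my $pid = fork() // die "Can't fork: $!";
        exit($code) if $pid == 0;    # child: exit straight away
        push @pids, $pid;
    }

    # The parent is free to do other work here; this sketch just naps
    # until the handler has recorded every child.
    while (keys(%exit_status) < @pids) {
        select(undef, undef, undef, 0.1);
    }
    return map { $exit_status{$_} } @pids;
}

my @codes = spawn_and_collect(0, 3, 7);
print "exit codes in start order: @codes\n";
```

The handler only records the status and returns, which keeps it as small as the advice above recommends; anything heavier is better done in the main flow.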

IPC communication between 2 processes with Perl

Let's say we have a 'Child' and 'Parent' process defined and subroutines
my $pid = fork;
die "fork failed: $!" unless defined($pid);
local $SIG{USR1} = sub {
kill KILL => $pid;
$SIG{USR1} = 'IGNORE';
kill USR1 => $$;
};
and we divide them, is it possible to do the following?
if($pid == 0){
sub1();
#switch to Parent process to execute sub4()
sub2();
#switch to Parent process to execute sub5()
sub3();
}
else
{
sub4();
#send message to child process so it executes sub2
sub5();
#send message to child process so it executes sub3
}
If yes, can you point how, or where can I look for the solution? Maybe a short example would suffice. :)
Thank you.
There is a whole page in the docs about inter process communication: perlipc
To answer your question: yes, there is a way to do what you want. The problem is that exactly what it is depends on your use case. I can't tell what you're trying to accomplish. What do you mean by 'switch to parent', for example?
But generally the simplest (in my opinion) is using pipes:
#!/usr/bin/env perl
use strict;
use warnings;
pipe ( my $reader, my $writer );
my $pid = fork(); #you should probably test for undef for fork failure.
if ( $pid == 0 ) {
## in child:
close ( $writer );
while ( my $line = <$reader> ) {
print "Child got $line\n";
}
}
else {
##in parent:
close ( $reader );
print {$writer} "Parent says hello!\n";
sleep 5;
}
Note: you may want to check your fork return codes - 0 means we're in the child - a number means we're in the parent, and undef means the fork failed.
Also: Your pipe will buffer - this might trip you over in some cases. It'll run to the end just fine, but you may not get IO when you think you should.
You can open pipes the other way around - for child->parent comms. Be slightly cautious when you multi-fork though, because an active pipe is inherited by every child of the fork - but it's not a broadcast.
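If "switching" means taking turns, one possible sketch uses two pipes, one per direction, with autoflush turned on so the buffering mentioned above does not delay the hand-over. The sub name alternate and the message strings are assumptions for illustration; the comments mark where the question's sub1() to sub5() would run.

```perl
use strict;
use warnings;
use IO::Handle;    # autoflush() on lexical filehandles

$| = 1;    # unbuffered STDOUT, so the child never re-flushes the parent's output

# Child and parent take turns: sub1, sub4, sub2, sub5, sub3.
# Returns the parent's log of the order in which the steps finished.
sub alternate {
    pipe(my $from_parent, my $to_child)  or die "pipe failed: $!";
    pipe(my $from_child,  my $to_parent) or die "pipe failed: $!";
    $_->autoflush(1) for $to_child, $to_parent;

    my $pid = fork() // die "fork failed: $!";
    if ($pid == 0) {
        close $to_child;
        close $from_child;
        for my $step (qw(sub1 sub2 sub3)) {
            # ... the real sub1()/sub2()/sub3() would run here ...
            print {$to_parent} "$step finished\n";      # hand over to the parent
            last if $step eq 'sub3';
            defined(my $go = <$from_parent>) or last;   # wait for the parent's turn to end
        }
        exit 0;
    }

    close $from_parent;
    close $to_parent;
    my @log;
    for my $step (qw(sub4 sub5)) {
        chomp(my $msg = <$from_child>);    # wait for the child's turn to end
        push @log, "child: $msg";
        # ... the real sub4()/sub5() would run here ...
        push @log, "parent: $step finished";
        print {$to_child} "go\n";          # hand back to the child
    }
    chomp(my $last = <$from_child>);
    push @log, "child: $last";
    waitpid($pid, 0);
    return @log;
}

print "$_\n" for alternate();
```

Each process blocks on a read while the other one works, so only one of them is active at a time even though both exist for the whole run.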

Call Several Other Scripts Async

I know there a lot of ways to do this, but because there are so many I don't know which one to choose.
What I want to accomplish:
1. Start several child scripts
2. Be able to check if they are running
3. Be able to kill them
4. I DON'T need to capture their output, and their output does not need to be displayed.
Each of these scripts is in their own file.
I haven't done scripting in a while and I'm stuck in an OOP mindset, so forgive me if I say something ridiculous.
use Parallel::ForkManager qw( );
use constant MAX_SIMUL_CHILDREN => 10;
my $pm = Parallel::ForkManager->new(MAX_SIMUL_CHILDREN);
for my $cmd (@cmds) {
$pm->start()
and next;
open(STDOUT, '>', '/dev/null')
or die($!);
exec($cmd)
or die($!);
$pm->finish(); # Never reached, but that's ok.
}
$pm->wait_all_children();
Adding the following before the loop will log the PID of the children.
$pm->run_on_start(sub {
my ($pid, $ident) = @_;
print("Child $pid started.\n");
});
$pm->run_on_finish(sub {
my ($pid, $exit_code, $ident, $exit_signal) = @_;
if ($exit_signal) { print("Child $pid killed by signal $exit_signal.\n"); }
elsif ($exit_code) { print("Child $pid exited with error $exit_code.\n"); }
else { print("Child $pid completed successfully.\n"); }
});
$ident is the value passed to $pm->start(). It can be used to give a "name" to a process.
Perl and parallel don't go well together, but here are a few thoughts :
fork() a few times, and manage each child independently
Perl allows you to open filehandles to processes: open my $fh, '-|', 'command_to_run.sh'. You could use this and poll those handles
Fork them to the background and store their process IDs
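The "fork and store the process IDs" idea could look roughly like the sketch below, which covers the three requirements from the question: start, check, kill. The helper names start_child, is_running and stop_child are hypothetical, and `sleep 30` stands in for one of the real scripts.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # exports WNOHANG; POSIX::_exit is also used below

$| = 1;    # unbuffered STDOUT, so children never re-flush the parent's output

# Start a command in the background with its output discarded; returns the PID.
sub start_child {
    my (@cmd) = @_;
    my $pid = fork() // die "Can't fork: $!";
    if ($pid == 0) {
        open(STDOUT, '>', '/dev/null') or die "Can't redirect STDOUT: $!";
        open(STDERR, '>', '/dev/null') or die "Can't redirect STDERR: $!";
        exec(@cmd) or POSIX::_exit(127);    # only reached if exec itself fails
    }
    return $pid;
}

# True while the child is still running. With WNOHANG, waitpid returns 0 if
# the child has not exited yet, and reaps it (returning its PID) once it has.
sub is_running {
    my ($pid) = @_;
    return waitpid($pid, WNOHANG) == 0;
}

# Ask the child to terminate, then reap it. Returns the wait status ($?).
sub stop_child {
    my ($pid) = @_;
    kill 'TERM', $pid;
    waitpid($pid, 0);
    return $?;
}

my $pid = start_child('sleep', '30');    # stand-in for one of the scripts
print "child $pid is running\n" if is_running($pid);
my $status = stop_child($pid);
print "child $pid ended by signal ", $status & 127, "\n";
```

Note that is_running reaps a child that has already finished, so in this sketch stop_child should only be called on a PID that is_running still reports as alive.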

Reading from a file descriptor in a non-blocking way with Perl

Let's say I have this:
pipe(READ,WRITE);
$pid = fork();
if ($pid == 0) {
close(READ);
# do something that may be blocking
print WRITE "done";
close(WRITE);
exit(0);
} else {
close(WRITE);
$resp = <READ>;
close(READ);
# do other stuff
}
In this situation, it's possible for the child to hang indefinitely. Is there a way I can read from READ for a certain amount of time (ie, a timeout) and if I don't get anything, I proceed in the parent with the assumption that the child is hanging?
Typically, in C or Perl, you use select() to test whether any input is available. You can specify a timeout of 0 if you like, though I used 1 second in the example below:
use IO::Select;
pipe(READ,WRITE);
$s = IO::Select->new();
$s->add(\*READ);
$pid = fork();
if ($pid == 0) {
close(READ);
# do something that may be blocking
for $i (0..2) {
print "child - $i\n";
sleep 1;
}
print WRITE "donechild";
close(WRITE);
print "child - end\n";
exit(0);
} else {
print "parent - $pid\n";
close(WRITE);
for $i (0..10) {
print "parent - $i\n";
# 1 second wait (timeout) here. Can be 0.
print "parent - ", (@r=$s->can_read(1))?"yes":"no", "\n";
last if @r;
}
$resp = <READ>;
print "parent - read: $resp\n";
close(READ);
# do other stuff
}
Is there a way I can read from READ for a certain amount of time (ie, a timeout) and if I don't get anything, I proceed in the parent with the assumption that the child is hanging?
When you fork, you are working with two entirely separate processes. You're running two separate copies of your program. Your code cannot switch back and forth between the parent and child in your program. Your program is either the parent or the child.
You can use alarm in the parent to send a SIGALRM to your parent process. If I remember correctly, you set your $SIG{ALRM} subroutine, start your alarm, do your read, and then set alarm back to zero to shut it off. The whole thing needs to be wrapped in an eval.
I did this once a long time ago. For some reason, I remember that the standard system read didn't work. You have to use sysread. See Perl Signal Processing for more help.
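A rough sketch of that alarm approach, assuming a 1 second timeout and a child that sleeps for 5 seconds to simulate a hang; the helper name read_with_timeout is made up for this example.

```perl
use strict;
use warnings;

$| = 1;    # unbuffered STDOUT, so the child never re-flushes the parent's output

# Read up to 1024 bytes from $fh, giving up after $timeout seconds.
# Returns the data read, or undef if the timeout expired first.
sub read_with_timeout {
    my ($fh, $timeout) = @_;
    my $buf;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };   # "\n" keeps the message exact
        alarm $timeout;
        my $n = sysread($fh, $buf, 1024);             # unbuffered read, as advised above
        alarm 0;                                      # read finished: cancel the alarm
        die "read failed: $!\n" unless defined $n;
        1;
    };
    alarm 0;    # make sure no alarm is left pending on any exit path
    return $buf  if $ok;
    return undef if $@ eq "timeout\n";
    die $@;     # some other error: pass it on
}

pipe(my $reader, my $writer) or die "pipe failed: $!";
my $pid = fork() // die "fork failed: $!";
if ($pid == 0) {
    close $reader;
    sleep 5;    # simulate a child that hangs
    print {$writer} "done";
    exit 0;
}

close $writer;
my $resp = read_with_timeout($reader, 1);
if (defined $resp) {
    print "parent read: $resp\n";
}
else {
    print "child did not answer in time; killing it\n";
    kill 'KILL', $pid;
}
waitpid($pid, 0);
```

The eval confines the die from the signal handler to the read, and the second `alarm 0` ensures a stray alarm cannot fire later in unrelated code.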

perl - child process signaling parent

I have written the following piece of code to test signaling between child and parent. Ideally, when the child gives a SIGINT to parent the parent should come back in the new iteration and wait for user input. This I have observed in perl 5.8, but in perl 5.6.1(which I am asked to use) the parent is actually "killed". There is no next iteration.
my $parent_pid = $$;
$pid = fork();
if($pid == 0)
{
print "child started\n";
kill 2, $parent_pid;
}
else
{
while(1)
{
eval
{
$SIG{INT} = sub{die "GOTCHA";};
print 'inside parent'."\n";
$a = <>;
};
if($@)
{
print "got the signal!!!!\n$@\n";
next;
}
}
}
Could someone please give a workaround for this problem, or some other way to signal the parent so that it enters the new iteration?
The failure on 5.6.X might be because of the way Perl used to handle signals, which was fixed with 'Safe Signal Handling' in Perl 5.8.0. In either case, you are using a Perl which is practically archaeological and you should argue strongly to your masters that you should be using at least Perl 5.12, and ideally 5.14.
This is likely to be a race condition, caused by the child sending the SIGINT before the parent was ready for it. Remember that after you fork() you will have two independent processes, each might proceed at whatever pace it likes.
It's best in your case to set up the SIGINT handler before the fork() call, so you know it's definitely in place before the child tries to kill() its parent.
(with some minor corrections):
$SIG{INT} = sub { die "GOTCHA" };
my $parent_pid = $$;
defined( my $pid = fork() ) or die "Cannot fork() - $!";
if($pid == 0)
{
print "child started\n";
kill INT => $parent_pid;
}
else
{
while(1)
{
eval
{
print "inside parent\n";
<>;
};
if($@)
{
print "got the signal!!!!\n$@\n";
next;
}
}
}