How can I manage a fork pool in Perl?

I'm setting something up to SSH out to several servers in 'batches'. I basically want to maintain 5 connections at a time, and when one finishes open up another (following an array of server IPs).
I'm wondering, for something like this, should I be using fork()? If so, what logic can I use to ensure that I maintain 5 children at a time?

Forking (or threading) is what you want, but you should look at CPAN for modules that will provide most of what you need to prevent you from reinventing the wheel and going through the learning pains of what you need to do.
For example, Parallel::ForkManager looks like it's EXACTLY what you want.
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new($MAX_PROCESSES);

foreach my $data (@all_data) {
    # Forks and returns the pid for the child:
    my $pid = $pm->start and next;

    ... do some work with $data in the child process ...

    $pm->finish; # Terminates the child process
}

There are several modules that solve exactly this problem. See Parallel::ForkManager, Forks::Super, or Proc::Queue, for example.
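For completeness, Proc::Queue takes a slightly different approach: as I understand its documentation, you give it a size at import time and fork itself then blocks once that many children are running, so the calling code barely changes. A rough, unverified sketch of that idea (the server list and ssh command are placeholders):

use Proc::Queue size => 5;        # assumption: cap of 5 concurrent children
use POSIX ':sys_wait_h';

my @servers = qw(10.0.0.1 10.0.0.2 10.0.0.3);   # placeholder IPs

for my $ip (@servers) {
    my $pid = fork;                   # blocks while 5 children are running
    if (defined $pid and $pid == 0) {
        exec 'ssh', $ip, 'uptime';    # placeholder remote command
    }
    1 while waitpid(-1, WNOHANG) > 0; # reap any children that have finished
}
1 while wait != -1;                   # wait for the stragglers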

use Net::OpenSSH::Parallel;

my $pssh = Net::OpenSSH::Parallel->new(connections => 5);
for my $ip (@ips) {
    $pssh->add_host($ip);
}

$pssh->push('*', command => 'do this');
$pssh->push('*', command => 'do that');
$pssh->push('*', scp_get => 'foo', 'bar-%HOST%');
$pssh->push('*', scp_put => 'doz', 'there');

$pssh->run;

My personal forking(!) favourite is Proc::Fork
General overview from pod:
use Proc::Fork;

run_fork {
    child {
        # child code goes here.
    }
    parent {
        my $child_pid = shift;
        # parent code goes here.
        waitpid $child_pid, 0;
    }
    retry {
        my $attempts = shift;
        # what to do if fork() fails:
        # return true to try again, false to abort
        return if $attempts > 5;
        sleep 1, return 1;
    }
    error {
        # Error-handling code goes here
        # (fork() failed and the retry block returned false)
    }
};
And to limit the maximum number of processes running for something like SSH batches, this should do the trick:
use strict;
use warnings;
use 5.010;
use POSIX qw(:sys_wait_h);
use Proc::Fork;

my $max = 5;
my %pids;

my @ssh_files = (
    sub { system 'scp file0001 baz@foo:/somedir/.' },
    ...
    sub { system 'scp file9999 baz@foo:/somedir/.' },
);

while (my $proc = shift @ssh_files) {
    # max limit reached
    while ($max == keys %pids) {
        # loop thru pid list until a child is released
        for my $pid (keys %pids) {
            if (my $kid = waitpid($pid, WNOHANG)) {
                delete $pids{$kid};
                last;
            }
        }
    }

    run_fork {
        parent {
            my $child = shift;
            $pids{$child} = 1;
        }
        child {
            $proc->();
            exit;
        }
    }
}

Related

Can Net::LDAP and Parallel::Forkmanager work together?

I need to query different LDAP servers in perl via Net::LDAP. I have something that works well. However, in an attempt to speed up things, I tried to query the different servers in parallel, using Parallel::Forkmanager - and things do not work when I do that.
I get the following types of errors:
decode error 02<=>30 0 8 at /Users/myname/perl5/perlbrew/perls/perl-5.18.2/lib/site_perl/5.18.2/Convert/ASN1/_decode.pm line 113, <> line 18.
decode error 43<=>30 0 8 at /Users/myname/perl5/perlbrew/perls/perl-5.18.2/lib/site_perl/5.18.2/Convert/ASN1/_decode.pm line 113, <> line 25.
at the line that gets the search response:
$mesg = $ldap->search( base => $dn, filter => '(CN=*)');
I am puzzled.
Telling it in other words, why does this fail:
use Net::LDAP;
use Parallel::ForkManager;
use Try::Tiny;

...; # bind LDAP servers

while (<>) {
    chop;
    my $dn = $_;
    foreach my $ldap (@servers) {
        my $pid;
        $pid = $pm->start and next; # do the fork
        print $dn, $pid;
        my $mesg;
        try {
            $mesg = $ldap->search( base => $dn, filter => '(CN=*)' );
        } catch {
            ...;
        };
        $pm->finish;
    }
}
while this:
use Net::LDAP;
use Try::Tiny;

...; # bind LDAP servers

while (<>) {
    chop;
    my $dn = $_;
    foreach my $ldap (@servers) {
        print $dn;
        my $mesg;
        try {
            $mesg = $ldap->search( base => $dn, filter => '(CN=*)' );
        } catch {
            ...;
        };
    }
}
works perfectly?
Whilst forking doesn't have quite the same thread-safety problems as threading, there are still a few places with gotchas. I think this is what's biting you: your Net::LDAP objects are created in the parent process, but then (effectively) cloned into each child when you fork.
Which means that in your code there's a very real possibility that, if the list of names comes in fast enough, a new fork will try to reuse an existing Net::LDAP connection before a previous one has finished with it.
The easy way of preventing this is to call wait_all_children to ensure all your parallel LDAP queries are finished before the next one starts.
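Concretely, that just means waiting at the bottom of the while loop. A minimal sketch built on the question's own variables ($pm, @servers, $dn):

while (<>) {
    chomp;
    my $dn = $_;
    foreach my $ldap (@servers) {
        my $pid = $pm->start and next;   # child from here on
        my $mesg = $ldap->search( base => $dn, filter => '(CN=*)' );
        $pm->finish;
    }
    # Wait here so no two forks ever use the same Net::LDAP handle at once.
    $pm->wait_all_children;
}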
If you put your LDAP bind within the ForkManager loop, do you still have the same problem? I appreciate that's a potential overhead, as you'll be binding on each iteration, but if that addresses it, I'd suggest it's because Net::LDAP is sharing the same file descriptors between forks.
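A rough sketch of that suggestion, connecting inside the child so each fork gets its own socket; $host, $binddn and $password are placeholders, and $dn comes from the surrounding loop as in the question:

foreach my $host (@server_hostnames) {
    my $pid = $pm->start and next;                 # child from here on
    my $ldap = Net::LDAP->new($host) or die "$@";  # fresh connection per child
    $ldap->bind($binddn, password => $password);
    my $mesg = $ldap->search( base => $dn, filter => '(CN=*)' );
    $ldap->unbind;
    $pm->finish;
}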
The next best solution would be to adopt a 'worker' model, where you've got a bunch of workers, each with their own LDAP connection to do the querying. That's easier with threading than forking; it goes a bit like this:
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my @list_of_servers = qw( servername servenama anotherserver );   # placeholder server list

sub server_worker {
    my ( $hostname, $q_ref ) = @_;
    ## connect LDAP to $hostname
    while ( my $dn = $$q_ref->dequeue ) {
        ## query $dn
    }
}

my @queues;
foreach my $server (@list_of_servers) {
    my $server_q = Thread::Queue->new();
    push( @queues, $server_q );
    threads->create( \&server_worker, $server, \$server_q );
}

while ( my $dn = <STDIN> ) {
    chomp($dn);
    foreach my $q (@queues) {
        $q->enqueue($dn);
    }
}

foreach my $q ( @queues ) {
    $q->end;
}

foreach my $thr ( threads->list ) {
    $thr->join();
}
Doing something similar with forking should work:
#!/usr/bin/perl
use strict;
use warnings;
use IO::Pipe;
use Parallel::ForkManager;
use Net::LDAP;

my @list_of_servers = qw ( servername servenama anotherserver );
my $pm = Parallel::ForkManager -> new ( scalar @list_of_servers );
my %pipe_for;

foreach my $server ( @list_of_servers ) {
    my $pipe = IO::Pipe -> new();
    my $pid  = $pm -> start;
    if ( $pid ) {
        print "$$: parent\n";
        $pipe -> writer -> autoflush;
        $pipe_for{$server} = $pipe;
    }
    else {
        print "$$ child connecting to $server\n";
        $pipe -> reader -> autoflush;
        close ( STDIN ); # because this is a child.
        # Net::LDAP setup
        while ( my $item = <$pipe> ) {
            chomp ( $item );
            # ldap_search $item;
        }
    }
    $pm -> finish;
}
And then send stuff:
for my $number ( 1..10 ) {
    foreach my $pipe ( values %pipe_for ) {
        print {$pipe} "test$number\n";
    }
}
$pm -> wait_all_children();
Edit: Note - autoflush is important, otherwise the IO is buffered and it doesn't look like it's working. I'm pretty sure closing STDIN in the child is a good idea, but perhaps not vitally necessary if the children don't use it.

Get the value of a process executed in a child back to parent

I'm looking for a solution which allows me to return the values of a process executed in a child back to the parent process. Currently I try this, but have no idea where to hook the return value:
use Proc::ProcessTable;
use POSIX qw(:signal_h :errno_h :sys_wait_h);

$SIG{CHLD} = \&REAPER;

for my $count (1..10) { # start a few demo children
    if (fork () == 0) {
        &startChild;
        exit 0;
    }
}

do {
    print "Working\n";
    sleep 1;
} while (chkChildProcess());

sub startChild {
    print "Starting Child $$\n";
    system("date"); # ==> Need to get the output of "date" back to parent
    sleep 2 + rand 7;
    print "End Child $$\n";
}

sub chkChildProcess {
    for my $p (@{ Proc::ProcessTable->new->table }) {
        if ($p->ppid == $$) {
            $curPID{$$} = $p->pid;
            return 1;
        }
    }
    return undef;
}

sub REAPER {
    my $pid;
    $pid = waitpid(-1, &WNOHANG);
    if ($pid == -1) {
        # no child waiting. Ignore it.
    } elsif (WIFEXITED($?)) {
        print "Process $pid exited.\n";
    } else {
        print "False alarm on $pid.\n";
    }
    $SIG{CHLD} = \&REAPER; # in case of unreliable signals
}
Any help would be great.
The bg_eval and bg_qx methods of Forks::Super were made to solve this problem.
use Forks::Super 'bg_eval';

my @result;
for my $count (1 .. 10) {
    $result[$count] = bg_eval {
        my $date = `date`;
        sleep 2 + rand 7;
        return $date;
    };
}
print "$result[$_]\n" for 1..10;
The block after bg_eval is run asynchronously in a background process. When the background process is finished, the variable $result[$count] will be populated with the result.
When you print $result[$_], one of two things will happen. If the background process associated with that variable is finished, it will contain the process's return value. If the background process is not finished, it will wait for the process to finish and then make the return value available in that variable.
It looks like you may want to use Parallel::ForkManager, returning the value from the child via the data_structure_reference parameter of the finish method to a run_on_finish callback in the parent.
To capture the output, the easiest way is to use IPC::System::Simple's capture or capturex.
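A minimal sketch combining both of those suggestions: capturex grabs the command output in the child, and finish's data-structure reference hands it back to a run_on_finish callback in the parent. The loop mirrors the question's ten demo children; the hash key "date" is just my placeholder:

use strict;
use warnings;
use IPC::System::Simple qw(capturex);
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(10);

# Collect each child's result in the parent as it exits.
$pm->run_on_finish(sub {
    my ($pid, $exit, $ident, $signal, $core, $data) = @_;
    print "child $pid said: $data->{date}" if $data;
});

for my $count (1 .. 10) {
    $pm->start and next;                # parent continues the loop
    my $date = capturex('date');        # capture the command's stdout
    sleep 2 + int rand 7;
    $pm->finish(0, { date => $date });  # pass the output back to the parent
}
$pm->wait_all_children;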
You could use threads with threads::shared instead of fork: create a shared variable, lock it, and write into it. Keep in mind that locking is really slow!
See also this post on perlmonks on why locking the variable is necessary.
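A minimal sketch of that approach, assuming the goal is still to collect `date` output from each of ten workers; the shared array and the lock around each write are the points being illustrated:

use strict;
use warnings;
use threads;
use threads::shared;

my @result : shared;   # one slot per worker, visible to all threads

my @thr = map {
    my $i = $_;
    threads->create(sub {
        my $out = `date`;      # the command whose output we want back
        {
            lock(@result);     # serialise writes to the shared array
            $result[$i] = $out;
        }
    });
} 0 .. 9;

$_->join for @thr;
print for @result;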

Limiting processes with Parallel::ForkManager

I am trying to use Parallel::ForkManager to control some child processes. I would like to limit the number of processes running concurrently to 10. In total I need to run 20.
I know I could set the process limit at 10 in the first line at the object declaration, but I am also using the $pm object to run child processes that do something different (the current function is much more memory intensive so needs to be limited).
The code I have currently does not work: the run_on_finish callback is never called, so the remaining 10 children never get forked. I don't understand why this is the case; I'd have thought the child would still call the finish code on exit and decrement the count, but the "if" statement seems to stop this. Could someone explain why this is the case?
Thanks for any help!
# Parallel declarations
my $pm = Parallel::ForkManager->new(30);
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data_str_ref) = @_;
    --$active_jobs;
});

my $total_jobs  = 0;
my $active_jobs = 0;

while ( $total_jobs < 20 ) {
    sleep 300 and next if $active_jobs > 10;
    my $pid = $pm->start and ++$active_jobs and ++$total_jobs and next;
    my $return = module::function(%args);
    $pm->finish(0, { index => $total_jobs, return => $return });
}
print STDERR "Submitted all jobs, now waiting for children to exit.\n";
$pm->wait_all_children();
I'm going to call the jobs that are limited to 10 "type 2" jobs.
This is how I'd do it with P::FM:
use strict;
use warnings;

use List::Util            qw( shuffle );
use Parallel::ForkManager qw( );
use POSIX                 qw( WNOHANG );
use Time::HiRes           qw( sleep );

use constant MAX_WORKERS       => 30;
use constant MAX_TYPE2_WORKERS => 10;

sub is_type2_job { $_[0]{type} == 2 }

my @jobs = shuffle(
    ( map { { type => 1, data => $_ } } 0..19 ),
    ( map { { type => 2, data => $_ } } 0..19 ),
);

my $pm = Parallel::ForkManager->new(MAX_WORKERS);

my $type2_count = 0;
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $job) = @_;
    --$type2_count if is_type2_job($job);
    print "Finished: $pid, $job->{type}, $job->{data}, $job->{result}\n";
});

my @postponed_jobs;
while (@postponed_jobs || @jobs) {
    my $job;
    if (@postponed_jobs && $type2_count < MAX_TYPE2_WORKERS) {
        $job = shift(@postponed_jobs);
    }
    elsif (@jobs) {
        $job = shift(@jobs);
        if ($type2_count >= MAX_TYPE2_WORKERS && is_type2_job($job)) {
            push @postponed_jobs, $job;
            redo;
        }
    }
    # elsif (@postponed_jobs) {
    #     # Already max type 2 jobs being processed,
    #     # but there are idle workers.
    #     $job = shift(@postponed_jobs);
    # }
    else {
        local $SIG{CHLD} = sub { };
        select(undef, undef, undef, 0.300);
        $pm->wait_one_child(WNOHANG);
        redo;
    }

    ++$type2_count if is_type2_job($job);

    my $pid = $pm->start and next;
    $job->{result} = $job->{data} + 100; # Or whatever.
    $pm->finish(0, $job);
}
$pm->wait_all_children();
But this is broken. The code that picks the next job should be done in the middle of start (i.e. after it waits for children to finish, but before it forks), not before start. This could cause jobs to be run out of order. This isn't the first time I've wished P::FM had a pre-fork callback. Maybe you could ask the maintainer for one.

perl process queue

I have a Perl script which forks a number of sub-processes. I'd like to have some kind of functionality like xargs --max-procs=4 --max-args=1 or make -j 4, where Perl will keep a given number of processes running until it runs out of work.
It's easy to say fork four processes and wait for them all to complete, and then fork another four, but I'd like to keep four or n processes running at the same time, forking a new process as soon as one completes.
Is there a simple way in Perl to implement such a process pool?
Forks::Super can handle this requirement.
use Forks::Super MAX_PROC => 5, ON_BUSY => 'block';   # or ON_BUSY => 'queue'
Calls to fork() can block until the number of active subprocesses falls below 5, or you can pass additional parameters to the fork call and the tasks to perform can queue up:
fork { sub => sub { ... task to run in subprocess ... } }
When one subprocess finishes, another job on the queue will start up.
(I am the author of this module).
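To make that concrete, here is a small sketch (my own, not taken verbatim from the module docs) of a four-wide pool in the spirit of the question; the ten jobs are placeholders:

use strict;
use warnings;
use Forks::Super MAX_PROC => 4, ON_BUSY => 'queue';

# Ten placeholder jobs; each runs in its own subprocess, at most 4 at a time.
for my $n (1 .. 10) {
    fork { sub => sub {
        print "job $n starting in $$\n";
        sleep 2;                 # stand-in for real work
        print "job $n done\n";
    } };
}
waitall;   # block until every queued and running job has finished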
Check out Parallel::ForkManager -- it does much of what you describe. You can set a maximum number of processes, and the callback function could start a new child as soon as one finishes (as long as there is work to do).
While I would almost always use a CPAN module, or write something with the fantastic AnyEvent modules, I think it's important to understand how these things work under the hood. Here's an example that has no dependencies other than perl. The same approach could also be written in C without too much trouble.
#!/usr/bin/env perl
use strict;

## run a function in a forked process
sub background (&) {
    my $code = shift;
    my $pid = fork;
    if (!defined $pid) {
        die "can't fork: $!";
    } elsif ($pid == 0) {
        $code->();
        exit;
    } else {
        return $pid;
    }
}

my @work = ('sleep 30') x 8;
my %pids = ();

for (1..4) {
    my $w = shift @work;
    my $pid = background {
        exec $w;
    };
    $pids{$pid} = $w;
}

while ((my $pid = waitpid(-1, 0)) > 0) {
    if ($?) {
        if ($? & 127) {
            warn "child died with signal " . ($? & 127);
        } else {
            warn "child exited with value " . ($? >> 8);
        }
        ## redo work that died or got killed
        my $npid = background {
            exec $pids{$pid};
        };
        $pids{$npid} = delete $pids{$pid};
    } else {
        delete $pids{$pid};
        ## send more work if there is any
        if (my $w = shift @work) {
            my $pid = background {
                exec $w;
            };
            $pids{$pid} = $w;
        }
    }
}

Retrying an operation after an exception: Please criticize my code

My Perl application uses resources that become temporarily unavailable at times, causing exceptions via die. Most notably, it accesses SQLite databases that are shared with multiple threads and other applications, through DBIx::Class. Whenever such an exception occurs, the operation should be retried until a timeout has been reached.
I prefer concise code, so I quickly got fed up with repeatedly typing 7 extra lines for each such operation:
use Time::HiRes 'sleep';
use Carp;

# [...]

for (0..150) {
    sleep 0.1 if $_;
    eval {
        # database access
    };
    next if $@ =~ /database is locked/;
}
croak $@ if $@;
... so I put them into a (DB access-specific) function:
sub _retry {
    my ( $timeout, $func ) = @_;
    for (0..$timeout*10) {
        sleep 0.1 if $_;
        eval { $func->(); };
        next if $@ =~ /database is locked/;
    }
    croak $@ if $@;
}
which I call like this:
my @thingies;
_retry 15, sub {
    $schema->txn_do(
        sub {
            @thingies = $thingie_rs->search(
                { state => 0, job_id => $job->job_id },
                { rows  => $self->{batchsize} } );
            if (@thingies) {
                for my $thingie (@thingies) {
                    $thingie->update( { state => 1 } );
                }
            }
        } );
};
Is there a better way to implement this? Am I re-inventing the wheel? Is
there code on CPAN that I should use?
I'd probably be inclined to write retry like this:
sub _retry {
    my ( $retrys, $func ) = @_;

    attempt: {
        my $result;

        # if it works, return the result
        return $result if eval { $result = $func->(); 1 };

        # nah, it failed, if failure reason is not a lock, croak
        croak $@ unless $@ =~ /database is locked/;

        # if we have 0 remaining retrys, stop trying.
        last attempt if $retrys < 1;

        # sleep for 0.1 seconds, and then try again.
        sleep 0.1;
        $retrys--;
        redo attempt;
    }
    croak "Attempts Exceeded $@";
}
It doesn't work identically to your existing code, but has a few advantages.
I got rid of the *10 thing; like another poster, I couldn't discern its purpose.
This function is able to return the value of whatever $func() does to its caller.
Semantically, the code is more akin to what it is you are doing, at least to my deluded mind.
_retry 0, sub { }; will still execute once, but never retry, unlike your present version, which will never execute the sub.
More suggested ( but slightly less rational ) abstractions:
sub do_update {
    my %params = @_;
    my @result;

    $params{schema}->txn_do( sub {
        @result = $params{rs}->search( @{ $params{search} } );
        return unless (@result);
        for my $result_item (@result) {
            $result_item->update( @{ $params{update} } );
        }
    } );
    return \@result;
}

my $data = _retry 15, sub {
    do_update(
        schema => $schema,
        rs     => $thingy_rs,
        search => [ { state => 0, job_id => $job->job_id }, { rows => $self->{batchsize} } ],
        update => [ { state => 1 } ],
    );
};
These might also be handy additions to your code. ( Untested )
The only real problem I see is the lack of a last statement. This is how I would write it:
sub _retry {
    my ($timeout, $func) = @_;
    for my $try (0 .. $timeout*10) {
        sleep 0.1 if $try;
        eval { $func->(); 1 } or do {
            next if $@ =~ /database is locked/; # ignore this error
            croak $@;                           # but raise any other error
        };
        last;
    }
}
I might use 'return' instead of 'last' (in the code as amended by Chas Owens), but the net effect is the same. I am also not clear why you multiply the first parameter of your retry function by 10.
IMNSHO, it is far better to (re)factor common skeletal code into a function as you have done than to continually write the same code fragment over and over. There's too much danger that:
You have to change the logic - in far too many places
You forget to edit the logic correctly at some point
These are standard arguments in favour of using functions or equivalent abstractions over inline code.
In other words - good job on creating the function. And it is useful that Perl allows you to create the functions on the fly (thanks, Larry)!
The Attempt module by Mark Fowler seems to be pretty close to what I described above. Now, it would be handy if one could specify some sort of exception filter.
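As a sketch of what such a filter might look like on top of the _retry idea above (the qr// argument and the _retry_if name are my own additions, not part of Attempt):

use Carp;
use Time::HiRes 'sleep';

# Like _retry, but the caller supplies a regex that decides which
# exceptions are retryable; anything else is re-thrown immediately.
sub _retry_if {
    my ( $tries, $filter, $func ) = @_;
    for my $try ( 1 .. $tries ) {
        my $result;
        return $result if eval { $result = $func->(); 1 };
        croak $@ unless $@ =~ $filter;   # not a retryable error
        sleep 0.1 if $try < $tries;      # brief pause before the next attempt
    }
    croak "Attempts exceeded: $@";
}

# Usage, mirroring the database-lock case from the question:
# my $rows = _retry_if 150, qr/database is locked/, sub { ... DB access ... };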