How to launch multiple fire-and-forget PHP scripts with Perl?

I currently have a Perl script which I am trying to use to launch three (or more) PHP scripts, each with a set of arguments provided from a database:
$sql = "SELECT id,url,added,lasttotal,lastsnapshot,speed,nextsnapshot FROM urls WHERE DATE(NOW()) > DATE(nextsnapshot) LIMIT 0,3";
$sth = $dbh->prepare($sql);
$sth->execute or print "SQL Error: $DBI::errstr\n";
my ($urlID, $url, $added,$lastTotal,$lastSnapshot,$lastSpeed,$nextsnapshot);
$sth->bind_col(1, \$urlID);
$sth->bind_col(2, \$url);
$sth->bind_col(3, \$added);
$sth->bind_col(4, \$lastTotal);
$sth->bind_col(5, \$lastSnapshot);
$sth->bind_col(6, \$lastSpeed);
$sth->bind_col(7, \$nextsnapshot);
while ($sth->fetch) {
$myexec = "php /usr/www/users/blah/blah/launch_snapshot.php '$url' $urlID '$added' $lastTotal '$lastSnapshot' $lastSpeed".' /dev/null 2>&1 &';
exec ($myexec) or print "\n Couldn't exec $myexec: $!";
}
I don't care about any results from the PHP scripts; I just need to start them all at once, or with a very small delay.
The fetch works properly and returns three unique sets of values. However, it never seems to get past launching the first PHP script. I don't get any error messages.
Any help would be most appreciated.

You could use fork or just system for that.
Using fork:
while ($sth->fetch) {
    my $pid = fork();
    if ($pid) { # Parent
        waitpid($pid, 0);
    } elsif ($pid == 0) { # A child
        $myexec = "...";
        exec($myexec) or print "\n Couldn't exec $myexec: $!";
        exit(0); # Important!
    } else {
        die "couldn't fork: $!\n";
    }
}
Using system:
while ($sth->fetch) {
    $myexec = "...";
    system($myexec);
}
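A note on fire-and-forget, plus a hedged sketch (this only builds on the answer above; the "..." stands for the same $myexec string as in the question): because $myexec ends in '> /dev/null 2>&1 &', the shell backgrounds the PHP process, so system() returns almost immediately and the loop keeps going. If you would rather not rely on the shell, you can fork and simply not wait:
$SIG{CHLD} = 'IGNORE';             # let the OS reap finished children (no zombies)
while ($sth->fetch) {
    my $myexec = "...";            # same command string as in the question
    my $pid = fork();
    die "couldn't fork: $!\n" unless defined $pid;
    next if $pid;                  # parent: move straight on to the next row
    exec($myexec) or die "Couldn't exec $myexec: $!";   # child: replaces itself
}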

From perldoc -f exec
exec LIST
exec PROGRAM LIST
The "exec" function executes a system command and never
returns-- use "system" instead of "exec" if you want it to
return. It fails and returns false only if the command does
not exist and it is executed directly instead of via your
system's command shell (see below).
You want system (or fork), not exec.

Related

Perl how to properly handle System Commands (including Timeout with Kill & capture of RC/STDERR/STDOUT)

From a Perl script I want to execute various system commands and process the output in my script.
The script will be run automatically, so I want to make sure that no commands are hanging etc.
I'm open to any kind of feedback.
My requirements for the command execution:
Timeout -> if the command runs longer than XX seconds, it should kill its process(es)
If the command returns its information sooner, it should not have to wait for the end of the timeout
I want to capture the exit status, STDERR, and STDOUT in the script.
Here is an example I worked out from another Stack Overflow question: Kill a hung child process
What's not working for me at the moment:
I cannot capture the exit status of the executed command
I cannot capture the STDERR of the executed command
Code:
my $cmd = "sleep 15"; # other tests i use -> "echo bla" and "alkjdsf"
my $TIMEOUT = 10;
my $pid = open my $proc, '-|', "$cmd";
if (fork() == 0) {
my $poor_mans_alarm = "sleep 1,kill 0,$pid ||exit for 1..$TIMEOUT;kill 9,$pid";
# run poor man's alarm in a background process
exec($^X, '-e', "$poor_mans_alarm");
}
my $process_output = "";
while (<$proc>) {
$process_output .= $_;
}
If you either have a trick for this code or recommend a completely different solution, let me know.
Thanks and cheers
Addition:
I got a working example with IPC::Open3, but future readers should check out IPC::Run, which has timeout functionality included, as mentioned by James Green.
Working example with IPC::Open3:
my $pid = open3(\*WRITE, \*READ, \*ERROR, "$command");
if (fork() == 0) {
    my $poor_mans_alarm = "sleep 1,kill 0,$pid ||exit for 1..10;kill 9,$pid";
    # run poor man's alarm in a background process
    exec($^X, '-e', "$poor_mans_alarm");
}
# get all the STDOUT and STDERR from the child
while (<READ>) {
    $output .= $_;
}
while (<ERROR>) {
    $output .= $_;
}
waitpid($pid, 0);
if ($?) {
    $rc = $? >> 8;
    if ($rc != 1) {
        print "Some error $?\n";
    }
}
It looks like IPC::Run provides pretty much everything you're after, including timeouts and capture of both STDOUT and STDERR. Docs are at https://metacpan.org/pod/IPC::Run including some usage examples.
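For readers who want to see roughly what that looks like, here is a minimal, untested sketch with IPC::Run; the command and the 10-second limit mirror the question, everything else is illustrative:
use strict;
use warnings;
use IPC::Run qw(run timeout);

my @cmd = ('sleep', '15');          # stand-in for the real command
my ($in, $out, $err) = ('', '', '');

# run() returns true on success; an expired timer throws an exception,
# so wrap the call in eval to detect the timeout.
my $ok = eval { run(\@cmd, \$in, \$out, \$err, timeout(10)); 1 };
if (!$ok) {
    print "command timed out (or failed to start): $@\n";
} else {
    printf "exit status: %d\nSTDOUT: %s\nSTDERR: %s\n", $? >> 8, $out, $err;
}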

want to get the die error message from a command executed via open in Perl

I am trying to fork off a command like the one below:
my $h = IO::Handle->new;
$self->{-handle} = $h;
die "IO::Handle->new failed." unless defined $h;
$self->{-pid} = open $h, $self->{-command} . ' 2>&1 |';
$self->fileevent($h, 'readable' => [\&_read_cmd_op, $self]);

sub _read_cmd_op {
    my $h = $self->{-handle};
    if ( sysread $h, $_, 4096 ) {
        my $t = $self->Subwidget('text');
        $t->insert('end', $_);
        $t->yview('end');
    } else {
        $self->{-finish} = 1;
    }
}
Now the problem is that $self->{-command} invokes another Perl script, and I want to know if that script dies.
Note that $self->{-pid} still exists even if the command dies.
The above code is in a Perl/Tk app, where the output of $self->{-command} is captured in a text widget.
Somehow I don't get the die message, even in the text widget.
I see it on STDOUT.
Two questions:
How can I get the command's output and errors into the text widget?
How can I know that the command fired via IO::Handle died?
$self->{-pid} is just the pid of the forked process, not some magic object which goes away if the command exits.
I cannot reproduce the problem of not getting the die() message. If the snippet above is called with 'perl -e "die 123"', then "123" appears in the text widget (at least on a Unix system).
For getting the exit code you can use something like the following.
} else {
    $mw->fileevent($h, 'readable', '');
    my $pid = waitpid($self->{-pid}, 0);
    warn "pid $pid finished";
    warn "retcode is " . ($? >> 8);
    $self->{-finish} = 1;
}
The fileevent call with the empty callback stops further selects on this filehandle. With the waitpid call you wait for the termination of the child process. Once this happens, the exit code is available in the $? variable, like after a normal system() call. So for a non-zero exit code you know that the command died or exited with a false value.
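For reference, here is the usual way to decode $? after the waitpid call; this is the standard idiom from perldoc -f system, and the warn messages are just illustrative:
my $pid = waitpid($self->{-pid}, 0);
if ($? == -1) {
    warn "waitpid/exec failed: $!";
} elsif ($? & 127) {
    # the child was killed by a signal
    warn sprintf "child died with signal %d%s", ($? & 127), ($? & 128) ? ', with coredump' : '';
} else {
    # normal exit; the high byte holds the exit code
    warn sprintf "child exited with value %d", $? >> 8;
}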

Open3 outputting to std error only and not asynchronously

I am relatively new to Perl programming and I am trying to figure out how open3 works. Here is the code.
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open3;

my $dir = "/home/vman/Documents/Dev/perl_scripts/Provenance/temp";
my $fileHandle;
my $bashPid;

print "Starting main program\n";

my $pid = fork();
if ($pid) #Parent process2
{
    print("Start transfer.\n");
    $bashPid = $pid;
    #Attaching an strace to the executed command which happens in the child process
    open3(\*WRITE, \*READ, \*ERROR, "strace", "-f", "-F", "-e", "trace=open,execve", "-p", $bashPid, "-s", "2097152", "-q");
    while (<READ>)
    {
        print("Here1\n");
        print("$_");
    }
    while (<ERROR>)
    {
        print("$_");
    }
    print("Finish transfer.\n");
}
elsif ($pid == 0)
{
    if (scalar(@ARGV == 0))
    {
        exit;
    }
    my $args = join(' ', @ARGV);
    exec($args);
}
else
{
    die("Could not fork.");
}

close(READ);
close(WRITE);
close(ERROR);

waitpid($bashPid, 0);
print "End of main program\n";
I want to run an strace on a bash process, then capture all the output while it is being outputted. Then I will take that output and parse it to see what files are being changed by which process and I will save those changes in a mysql database. For now all I am trying to do is attach an strace onto an existing bash process and get the output of that strace printed within the bash terminal that is running just to make sure that it is asynchronously reading the output.
One of the problems is that I am getting the output through the ERROR filehandle. I am a little confused about why this is happening. Am I using the correct argument order for open3, and if there is an error, why is the correct output even making it to stderr?
The second problem I have is that I am getting the output only when exec ends, which is no good since it needs to happen while exec is running. I thought open3 ran asynchronously.
As suggested, this is what I did, and it works perfectly.
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Run3;

my $bashPid;

print "Starting main program\n";

my $pid = fork();
if ($pid) #Parent process
{
    print("Start transfer.\n");
    $bashPid = $pid;
    #Attaching an strace to the executed command which happens in the child process
    my $command = "strace -fFe trace=open,execve -p $bashPid -s 2097152 -q";
    run3($command, \*STDIN, \*STDOUT, \*STDERR);
    if ($?)
    {
        die "something went horribly wrong";
    }
    while (<STDERR>)
    {
        print($_);
    }
    print("Finish transfer.\n");
}
elsif ($pid == 0) #child process
{
    if (scalar(@ARGV == 0))
    {
        exit;
    }
    my $args = join(' ', @ARGV);
    exec($args);
}
else
{
    die("Could not fork.");
}

close(STDIN);
close(STDOUT);
close(STDERR);

waitpid($bashPid, 0);
print "End of main program\n";
One of the problems is that I am getting the output through the ERROR filehandle
Correct. strace writes to STDERR.
The second problem I have is that I am getting the output only when exec ends, which is no good since it needs to happen while exec is running. I thought open3 ran asynchronously.
That's because you only start reading from the child's STDERR after the child closes its STDOUT when it ends.
In fact, you're lucky you haven't deadlocked yet. By reading them one at a time as you are currently doing, you'll deadlock when strace has produced enough output to fill the pipe.
You need to read from both the child's STDOUT and STDERR as the output comes in. You could do this with the help of select, by polling non-blocking handles, or with threads. None of those options is as simple as ditching open3 and using a higher-level module that handles this for you. The simpler IPC::Run3 and the fully featured IPC::Run are good choices.
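If seeing the output while the child is still running matters (as it does in this question), note that IPC::Run also accepts code references for the output slots, which are called with chunks of output as they arrive. A rough, untested sketch, with the strace arguments borrowed from the question and the target pid taken from the command line:
use strict;
use warnings;
use IPC::Run qw(run);

my $bashPid = shift @ARGV or die "usage: $0 <pid>\n";
my @cmd = ('strace', '-f', '-e', 'trace=open,execve', '-p', $bashPid, '-s', '2097152', '-q');
my $in = '';

# The subs below are called as data arrives, so both streams are drained
# continuously and there is no pipe left to fill up and deadlock on.
run \@cmd,
    '<',  \$in,
    '>',  sub { print "OUT: $_[0]" },
    '2>', sub { print "ERR: $_[0]" };     # strace writes its trace here

print "strace exited with status ", $? >> 8, "\n";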

Redirecting STDOUT in child process

I have a parent process which spawns multiple child processes via fork. I want the log files written by the parent and the child processes to be separate. The problem is that the child process's STDOUT gets redirected into the parent log file as well as the child log file. I'm not sure what I need to change to keep child log messages out of the parent log file.
Also, I don't understand the purpose of creating the OUT and ERR file handles in the setEnvironment function below. This is existing code, so I kept it as it is. In the parent process and the child process I set the variable $g_LOGFILE to a different file name so that separate log files are created, and I call setEnvironment in both the parent and the child. I tried closing STDOUT, STDERR, and STDIN in the child process and then calling setEnvironment, but it was not working properly.
sub setEnvironment()
{
    unless ( open(OUT, ">&STDOUT") )
    {
        print "Cannot redirect STDOUT";
        return 2;
    }
    unless ( open(ERR, ">&STDERR") )
    {
        print "Cannot redirect STDERR";
        return 2;
    }
    unless ( open(STDOUT, "|tee -ai $g_LOGPATH/$g_LOGFILE") )
    {
        print "Cannot open log file $g_LOGPATH/$g_LOGFILE";
        return 2;
    }
    unless ( open(STDERR, ">&STDOUT") )
    {
        print "Cannot redirect STDERR";
        return 2;
    }
    STDOUT->autoflush(1);
}
####################### Main Program ######################################
$g_LOGFILE = "parent.log";
while ($file = readdir(DIR))
{
    my $pid = fork;
    if ( $pid ) {
        setEnvironment();
        #parent process code goes here
        printf "%s\n", "parent";
        next;
    }
    $g_LOGFILE = "child.log";
    setEnvironment();
    #child code goes here
    printf "%s\n", "child";
    exit;
}
wait for @pids;
OK, I tested this code a little with a sample of my own. In my sample there is a similar (not identical) problem: all messages are written twice to the child's log file.
So, my answers to your questions:
The problem is that the child process's STDOUT gets redirected into the parent log file as well as the child log file.
This is because when you open a handle with a pipe (open(STDOUT, "|tee ...")), under the hood your process fork()s to create a child process and then execs the program you run (tee). That forked tee inherits the STDOUT of the master process, so tee will write into the parent's log file. So I think you must stop reusing the STDOUT handle of the master process. Or, the second and simplest way: remove the use of tee.
Also, I don't understand the purpose of creating the OUT and ERR file handles in the setEnvironment function below.
This seems to be someone's workaround for the problem above. You can run grep -rE '\bERR\b' . to search the code to see whether it is actually used. Probably someone wanted to save the real STDOUT and STDERR for further use.
It would appear that the intent of the original code is as follows:
1. when the script is started from, say, a terminal, provide aggregate parent and child output to the terminal
2. additionally, provide a copy of the parent output in parent.log, and a copy of the child output in child.log
Note that @Unk's answer is correct as far as 2. goes, and has fewer moving parts than any code using tee, but fails to achieve 1.
If it is important to achieve both 1. and 2. above, then take your original code and simply add the following at the top of your setEnvironment method:
sub setEnvironment()
{
    if ( fileno OUT )
    {
        unless ( open(STDOUT, ">&OUT") )
        {
            print "Cannot restore STDOUT";
            return 2;
        }
        unless ( open(STDERR, ">&ERR") )
        {
            print "Cannot restore STDERR";
            return 2;
        }
    }
    else
    {
        unless ( open(OUT, ">&STDOUT") )
        {
            print "Cannot redirect STDOUT";
            return 2;
        }
        unless ( open(ERR, ">&STDERR") )
        {
            print "Cannot redirect STDERR";
            return 2;
        }
    }
    unless ( open(STDOUT, "|tee -ai $g_LOGPATH/$g_LOGFILE") )
    ...
Incidentally, do not forget to also add $pid to @pids if your actual code does not do that already:
    ...
    my $pid = fork;
    if ( $pid ) {
        push @pids, $pid;
        ...
Why and how does this work? We simply want to temporarily restore the original STDOUT immediately before rewiring it into tee, so that tee inherits it as its standard output and writes directly to the original STDOUT (e.g. your terminal), instead of writing (in the case of the forked children) through the parent's tee. That is where the child's STDOUT normally pointed before this change, by virtue of inheriting from the parent process, and that is what injected those child lines into parent.log.
So in answer to one of your questions, whoever wrote the code to set OUT and ERR must have had exactly the above in mind. (I cannot help but wonder whether or not the difference in indentation in your original code is indicative of someone having removed, in the past, code similar to the one you have to add back now.)
Here's what you now get at the end of the day:
$ rm -f parent.log child.log
$ perl test.pl
child
parent
child
parent
parent
child
parent
child
parent
$ cat parent.log
parent
parent
parent
parent
parent
$ cat child.log
child
child
child
child
child
You can always redirect STDOUT to a log file by closing it first and then reopening it:
close STDOUT;
open STDOUT, ">", $logfile;
Small downside to this is that once STDOUT is redirected, you will not see any output on terminal during script execution.
If you want the parent and child processes to have different log files, just perform this redirection to different log files in both after fork(), something like this:
print "Starting, about to fork...\n";
if (fork()) {
print "In master process\n";
close STDOUT;
open STDOUT, ">", "master.log";
print "Master to log\n";
} else {
print "In slave process\n";
close STDOUT;
open STDOUT, ">", "slave.log";
print "Slave to log\n";
}
I have tested that this works as expected on Linux and Windows.
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use Capture::Tiny qw/capture_stdout/;

my $child_log  = 'child.log';
my $parent_log = 'parent.log';

my $stdout = capture_stdout {
    if (fork()) {
        my $stdout = capture_stdout {
            print "child\n";
        };
        open my $fh, '>', $child_log;
        print $fh $stdout;
        close $fh;
        exit;
    }
    print "parent\n";
};
open my $fh, '>', $parent_log;
print $fh $stdout;
close $fh;
All the other answers are correct (PSIalt's in particular) - I'm merely hoping that I can answer with corrected code that is identifiably close to that in the question. The key things to notice:
"|tee -ai..."
The tee command prints its standard input to its standard output while also printing to the given file. As PSIalt says, removing it is the easiest way to ensure each process's output goes only to the correct file.
setEnvironment() inside loop for parent
The original code constantly redirects STDOUT back to the tee'd file, and therefore keeps recapturing STDOUT. Given my code below, if you moved setEnvironment() above '#parent process code goes here', you would see all but one of the 'Real STDOUT' and 'Real STDERR' lines actually appearing in parent.log.
Options
The ideal is to remove any reliance on redirecting STDOUT / STDERR for logging. I would have a dedicated log($level, $msg) function and start to move all code to using it (a rough sketch appears at the end of this answer). Initially it's OK if it is simply a façade for the existing behaviour: you can swap it out once you reach an appropriate threshold of code covered.
If it's a basic script and doesn't produce stupidly large logs, why not just print everything to STDOUT with some prefix you can grep for (e.g. 'PARENT:' / 'CHILD:')?
It's a bit outside the scope of the question, but consider using a more structured approach to logging. I would consider using a CPAN logging module, e.g. Log::Log4perl. This way, the parent and children can simply request the correct log category, rather than mess around with file handles. Additional advantages:
Standardise output
Allow reconfiguration on the fly - change logging level from ERROR to DEBUG on a running-but-misbehaving system
Easily redirect output - no change is needed to your code to rearrange log files, rotate files, redirect to a socket / database instead, etc...
use strict;
use warnings;

our $g_LOGPATH = '.';
our $g_LOGFILE = "parent.log";
our @pids;

setEnvironment();

for ( 1 .. 5 ) {
    my $pid = fork;
    if ($pid) {
        #parent process code goes here
        printf "%s\n", "parent";
        print OUT "Real STDOUT\n";
        print ERR "Real STDERR\n";
        push @pids, $pid;
        next;
    }
    $g_LOGFILE = "child.log";
    setEnvironment();
    #child code goes here
    printf "%s\n", "child";
    exit;
}

wait for @pids;

sub setEnvironment {
    unless ( open( OUT, ">&STDOUT" ) ) {
        print "Cannot redirect STDOUT";
        return 2;
    }
    unless ( open( ERR, ">&STDERR" ) ) {
        print "Cannot redirect STDERR";
        return 2;
    }
    unless ( open( STDOUT, '>>', "$g_LOGPATH/$g_LOGFILE" ) ) {
        print "Cannot open log file $g_LOGPATH/$g_LOGFILE";
        return 2;
    }
    unless ( open( STDERR, ">&STDOUT" ) ) {
        print "Cannot redirect STDERR";
        return 2;
    }
    STDOUT->autoflush(1);
}
child.log:
child
child
child
child
child
parent.log:
parent
parent
parent
parent
parent
STDOUT taken from terminal:
Real STDOUT (x5 lines)
STDERR taken from terminal:
Real STDERR (x5 lines)
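Returning to the first option above: purely as an illustration (none of these names exist in the original code, and this is only a sketch), a log($level, $msg) façade over the current print-to-STDOUT behaviour could start out as simply as the following and be swapped for Log::Log4perl later. It is named log_msg here because log is a Perl builtin:
# Hypothetical façade: keep today's behaviour (print to wherever STDOUT
# currently points) behind one function, so callers never touch handles.
my %LEVELS    = ( DEBUG => 0, INFO => 1, ERROR => 2 );
my $MIN_LEVEL = 'INFO';

sub log_msg {
    my ($level, $msg) = @_;
    return if $LEVELS{$level} < $LEVELS{$MIN_LEVEL};
    printf "%s [%d] %s: %s\n", scalar localtime, $$, $level, $msg;
}

# usage:
log_msg(INFO  => 'parent starting');
log_msg(DEBUG => 'this one is filtered out at the INFO threshold');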

Perl system call throws strange error

I just wrote a Perl script that restarts a list of services on a Linux server. It's intended to run as a cron job. When I execute the script, though, I keep getting this error:
root@www:~/scripts# ./ws_restart.pl
* Stopping web server apache2 [ OK ]
sh: Syntax error: "(" unexpected
* Stopping MySQL database server mysqld [ OK ]
sh: Syntax error: "(" unexpected
The call that is being used to do this is:
system("/etc/init.d/apache2 stop");
system("/etc/init.d/mysql stop");
I can paste the entire script code if needed, but I figured that this is the source of the problem and just need to know how to stop it.
Any ideas?
Here's the entire script:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my $old_pids      = {};
my $post_stop_ids = {};
my @services = qw/apache2 mysql solr/;
my $app_dir  = '/home/grip/apps/eventfinder';

# collect existing pids then kill services
foreach my $service (@services) {
    # gather up existing pids
    $old_pids->{$service} = [ get_pids_by_process($service) ];
    # issue stop command to each service
    set_service_state($service, 'stop');
    # attempt to regather same ids
    $post_stop_ids->{$service} = [ get_pids_by_process($service) ];
    # kill any rogue ids left over
    kill_rogue_procs($post_stop_ids->{$service});
    # give each kill time to finish
    sleep(5);
}

# attempt to restart killed services
foreach my $service (@services) {
    # issue start command to each service
    set_service_state($service, 'start');
    # Let's give each service enough time to crawl outta bed.
    # I know how much I hate waking up
    sleep(5);
}

# wait for it!...wait for it! :P
# Pad an extra 5 seconds to give solr enough time to come up before we reindex
sleep(5);

# start the reindexing process of solr
system("cd $app_dir ; RAILS_ENV=production rake reindex_active");

# call it a day...phew!
exit 0;

sub kill_rogue_procs {
    my @ids = shift;
    # check if we still have any rogue processes that failed to die
    # if so, kill them now.
    if (scalar @ids) {
        foreach my $pid (@ids) {
            system("kill $pid");
        }
    }
}

sub set_service_state {
    my ($proc, $state) = @_;
    if ($proc eq 'apache2') {
        system("/etc/init.d/apache2 $state");
    } elsif ($proc eq 'mysql') {
        system("/etc/init.d/mysql $state");
    } elsif ($proc eq 'solr') {
        system("cd $app_dir ; RAILS_ENV=production rake sunspot:solr:$state");
    }
}

sub get_pids_by_process {
    my $proc = shift;
    my @proc_ids = ();
    open(PSAE, "/bin/ps -ae | grep $proc |") || die("Couldn't run command");
    while (<PSAE>) {
        push @proc_ids, $_ =~ /(\d{1,5})/;
    }
    close PSAE;
    return @proc_ids;
}
Actually, I'd be more suspicious of what's in @ids in kill_rogue_procs. It's the result of a ps followed by a grep, so it might have bogus values if ps doesn't return any results or if the pid isn't 5 digits long.
This is wrong:
sub kill_rogue_procs {
    my @ids = shift;
    # check if we still have any rogue processes that failed to die
    # if so, kill them now.
    if (scalar @ids) {
From what you're passing to this sub, @ids will always contain a single array reference, so (scalar @ids) will always be true. It also means you end up passing something like the following to sh:
kill ARRAY(0x91b0768)
You want something like (if the arrayref is empty, there's nothing to loop over anyway):
my $ids = shift;
...
for my $pid (@$ids) {
    kill SIGTERM => $pid;
Or instead of the loop:
kill SIGTERM => @$ids;
Also, there is no need to call system to kill a process.
To this, I'd add the last line, so you don't grep the grep process itself:
sub get_pids_by_process {
    my $proc = shift;
    $proc =~ s/^(.)/[$1]/;
As sh is raising the errors, I'm pretty sure one of the parameters to system is being expanded to something unexpected. I'd print all parameters just prior to passing them to system for a quick debug.
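A tiny, purely illustrative sketch of that kind of debugging aid (run_cmd is a made-up name, not part of the original script):
# Show exactly what string the shell will see before running it.
sub run_cmd {
    my ($cmd) = @_;
    warn "about to run: >>$cmd<<\n";    # stray quotes, parens or ARRAY(0x...) show up here
    my $rc = system($cmd);
    warn sprintf "system returned %d (exit status %d)\n", $rc, $? >> 8;
    return $rc;
}

# e.g. temporarily replace system("kill $pid") with run_cmd("kill $pid")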