How can I run Perl system commands in the background? - perl

#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use IPC::System::Simple qw(system);
system( 'xterm', '-geometry', '80x25-5-5', '-bg', 'green', '&' );
say "Hello";
say "World";
I tried this to run the xterm-command in the background, but it doesn't work:
No absolute path found for shell: &
What would be the right way to make it work?

Perl's system function has two modes:
taking a single string and passing it to the command shell to allow special characters to be processed
taking a list of strings, exec'ing the first and passing the remaining strings as arguments
In the first form you have to be careful to escape characters that might have a special meaning to the shell. The second form is generally safer since arguments are passed directly to the program being exec'd without the shell being involved.
In your case you seem to be mixing the two forms. The & character only has the meaning of "start this program in the background" if it is passed to the shell. In your program, the ampersand is being passed as the 5th argument to the xterm command.
As Jakob Kruse said, the simple answer is to use the single-string form of system. If any of the arguments come from an untrusted source, you'll have to use quoting or escaping to make them safe.
If you prefer to use the multi-argument form then you'll need to call fork() and then probably use exec() rather than system().
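A minimal sketch of that fork-and-exec route, assuming a Unix-like system. The helper name spawn_background is hypothetical, and a short sleep stands in for the xterm command so the sketch runs anywhere; substitute the argument list from the question (without the '&').

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: start a command in the background without a shell.
# Returns the child's PID to the parent immediately.
sub spawn_background {
    my @cmd = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: the list form of exec bypasses the shell, so no quoting is needed.
        exec { $cmd[0] } @cmd;
        # Only reached if exec itself failed.
        warn "exec '$cmd[0]' failed: $!\n";
        POSIX::_exit(127);
    }
    return $pid;
}

# Stand-in command; e.g. ('xterm', '-geometry', '80x25-5-5', '-bg', 'green')
my $pid = spawn_background('sleep', '1');
print "Hello\n";
print "World\n";

# Reap the child at some point to avoid leaving a zombie.
waitpid($pid, 0);
```

The parent prints both lines straight away because fork returns as soon as the child exists; only the final waitpid blocks.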

Note that the list form of system is specifically there to not treat characters such as & as shell meta-characters.
From perlfaq8's answer to How do I start a process in the background?
(contributed by brian d foy)
There's not a single way to run code in the background so you don't have to wait for it to finish before your program moves on to other tasks. Process management depends on your particular operating system, and many of the techniques are in perlipc.
Several CPAN modules may be able to help, including IPC::Open2 or IPC::Open3, IPC::Run, Parallel::Jobs, Parallel::ForkManager, POE, Proc::Background, and Win32::Process. There are many other modules you might use, so check those namespaces for other options too.
If you are on a Unix-like system, you might be able to get away with a system call where you put an & on the end of the command:
system("cmd &")
You can also try using fork, as described in perlfunc (although this is the same thing that many of the modules will do for you).
STDIN, STDOUT, and STDERR are shared
Both the main process and the backgrounded one (the "child" process) share the same STDIN, STDOUT and STDERR filehandles. If both try to access them at once, strange things can happen. You may want to close or reopen these for the child. You can get around this with opening a pipe (see open in perlfunc) but on some systems this means that the child process cannot outlive the parent.
Signals
You'll have to catch the SIGCHLD signal, and possibly SIGPIPE too. SIGCHLD is sent when the backgrounded process finishes. SIGPIPE is sent when you write to a filehandle whose child process has closed (an untrapped SIGPIPE can cause your program to silently die). This is not an issue with system("cmd&").
Zombies
You have to be prepared to "reap" the child process when it finishes.
$SIG{CHLD} = sub { wait };
$SIG{CHLD} = 'IGNORE';
You can also use a double fork. You immediately wait() for your first child, and the init daemon will wait() for your grandchild once it exits.
unless ($pid = fork) {
    unless (fork) {
        exec "what you really wanna do";
        die "exec failed!";
    }
    exit 0;
}
waitpid($pid, 0);
See Signals in perlipc for other examples of code to do this. Zombies are not an issue with system("prog &").
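A sketch of a non-blocking reaper along the lines of the examples in perlipc, assuming a Unix-like system. The %exit_status hash and the three short-lived children are illustrative only.

```perl
use strict;
use warnings;
use POSIX ':sys_wait_h';

my %exit_status;   # hypothetical bookkeeping: PID => exit code

# Reap every child that has finished; one signal may stand for several exits.
$SIG{CHLD} = sub {
    local ($!, $?);
    while ((my $kid = waitpid(-1, WNOHANG)) > 0) {
        $exit_status{$kid} = $? >> 8;
    }
};

my @pids;
for my $code (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) { exit $code }    # child: finish immediately with its own code
    push @pids, $pid;
}

# The parent is free to do other work; here it just idles until all are reaped.
sleep 1 while keys(%exit_status) < @pids;
print "child $_ exited with $exit_status{$_}\n" for @pids;
```

Looping on waitpid with WNOHANG matters because several children can exit before the handler runs once, and a single wait would then leave zombies behind.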

Have you tried?
system('xterm -geometry 80x25-5-5 -bg green &');
http://www.rocketaware.com/perl/perlfaq8/How_do_I_start_a_process_in_the_.htm

This is not purely a Perl matter; the same problem exists in C and other languages.
First understand what the system command does:
It forks.
In the child process, it calls exec.
The parent process waits for the forked child process to finish.
It does not matter whether you pass multiple arguments or one argument. The difference is that with multiple arguments the command is executed directly, while with one argument the command is wrapped by the shell and finally executed as:
/bin/sh -c your_command_with_redirections_and_ampersand
When you pass a command as some_command par1 par2 &, an sh or bash process sits between the Perl interpreter and the command as a wrapper. Because of the trailing &, that shell starts some_command in the background and exits without waiting for it. Your script waits only for the shell, and no additional waitpid is needed, because Perl's system function does it for you.
When you want to implement this mechanism directly in your script, you should:
Use the fork function. See example: http://users.telenet.be/bartl/classicperl/fork/all.html
In the child branch (the else of the if), use the exec function. Its usage is similar to system; see the manual. Note that exec replaces the child process's program, content, and data with the executed command.
In the parent branch (the if, where fork returns non-zero), use waitpid with the PID returned by fork.
This is why you can run the process in the background. I hope this is simple.
The simplest example:
if (my $pid = fork) { # fork returns 0 (false) in the child; the process splits in two here
    # Parent ($pid is the process ID of the child)
    # Do whatever you want here, asynchronously with the executed command
    waitpid($pid, 0); # Wait until the child ends
    # If you don't want to wait, don't. If your process ends first, the child is re-parented
    # from your script to the init process, which will reap the child when it finishes.
    # Alternatively, you can handle the SIGCHLD signal in your script
}
else {
    # Child (fork also returns undef on failure, which this simple example does not check)
    exec('some_command arg1 arg2'); # or exec('some_command', 'arg1', 'arg2');
    # exit is not needed, because exec completely replaces the process content
}

Related

Perl: Move to next item in loop before it is done executing

I have a perl script that is like so:
foreach my $addr ('http://site1.com', ...., 'http://site2.com') {
my $script = `curl -m 15 $addr`;
*do stuff with $script*
}
The -m sets a timeout of 15 seconds. Is there a way to make it so that if a user pushes a key, it stops the current execution and moves on to the next item in the foreach? I know next; can move to the next item, but I am unsure how to link this to a key being pushed and how to do it while the curl script is running.
Edit: So based on the answers it seems difficult to do it while curl is running. Would it be possible to push a key while curl is running and have it skip to the next item in the loop as soon as the curl script returns (or times out after 15sec)?
The problem you've got with this is that when you run curl, perl hands over control and waits for completion. It blocks until it's 'done'.
So it's not as easy to do this as it might seem.
As another poster alludes to - you can use a variety of parallel processing options. I would suggest the easiest is to move away from using 'any' key, and require a ctrl-c.
So you'd then do:
foreach my $addr ('http://site1.com', ...., 'http://site2.com') {
    my $pid = open ( my $curl_fh, "-|", "curl -m 15 $addr" );
    $SIG{'INT'} = sub { print "Aborting fetch of $addr\n"; kill 'TERM', $pid };
    while ( <$curl_fh> ) {
        print;
    }
    #might want to set it to something else.
    #'DEFAULT' means 'ctrl-c' will abort the whole program.
    #'IGNORE' means exactly what it says on the tin.
    #important to change it though, as it has a specific pid it'll kill,
    #and that might cause problems.
    $SIG{'INT'} = 'DEFAULT';
}
What this does is configure SIGINT (e.g. ctrl-c) so it doesn't kill your program, but does kill the sub-process.
If you wanted to look at other options, I'd offer:
Multithreading, spawn a thread to 'do' the curl fetching in the background and use Thread::Queue to pass results back and forth. (Thread::Queue supports nonblocking checks).
Forking - fork a sub process to do the curl, and use your 'main' process to send a signal if a key is pressed.
IO::Select such that you're not making blocking reads on your process.
Basically you have two options:
1. Use threads
Create a new thread, call desired system function there. Wait for output. In another thread, check for user input. On input, you can kill the child process. When child process has finished, you can ignore user input.
Such a solution seems to be rather complex, with a lot of synchronization needed, probably with using signals. Risky.
2. Use non-blocking IO
Please read this thread. It explains how to make non-blocking IO reads from either a file or a pipe. You'd like to make a non-blocking read from pipe (created with open), then non-blocking read from STDIN, loop.
Seems like a way to go, but, alas, rather complex as well.
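A rough sketch of the non-blocking approach using the core IO::Select module, assuming a Unix-like system. A small Perl one-liner stands in for curl, and the key-press check is left as a comment because it depends on how the terminal is set up.

```perl
use strict;
use warnings;
use IO::Select;

# Stand-in for "curl -m 15 $addr": a child that prints three lines with pauses.
my @cmd = ($^X, '-e',
    '$| = 1; for (1 .. 3) { print "line $_\n"; select(undef, undef, undef, 0.2) }');

my $pid = open(my $fh, '-|', @cmd) or die "cannot start child: $!";
my $sel = IO::Select->new($fh);

my $output     = '';
my $idle_polls = 0;
while ($sel->count) {
    # Wait at most 0.1 s for data instead of blocking in <$fh>.
    my @ready = $sel->can_read(0.1);
    if (@ready) {
        my $n = sysread($fh, my $buf, 4096);
        die "read failed: $!" unless defined $n;
        if ($n == 0) { $sel->remove($fh); next }   # EOF: child closed its end
        $output .= $buf;
    }
    else {
        # Nothing to read yet: this is where a key press could be checked,
        # followed by kill 'TERM', $pid to abandon the fetch.
        $idle_polls++;
    }
}
close $fh;    # also waits for the child
print $output;
```

sysread is used rather than <$fh> because buffered reads and select do not mix reliably.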

How can I make fork in Perl in different scripts?

I have a process in Perl that creates another one with the system command, I leave it on memory and I pass some variables like this:
my $var1 = "Hello";
my $var2 = "World";
system "./another_process.pl $var1 $var2 &";
But the system command only returns the result, and I need to get the PID. I want to do something like fork. What should I do? How can I do something like fork but across different scripts?
Thanks in advance!
Perl has a fork function.
See perldoc perlfaq8 - How do I start a process in the background?
(contributed by brian d foy)
There's not a single way to run code in the background so you don't have to wait for it to finish before your program moves on to other tasks. Process management depends on your particular operating system, and many of the techniques are in perlipc.
Several CPAN modules may be able to help, including IPC::Open2 or IPC::Open3, IPC::Run, Parallel::Jobs, Parallel::ForkManager, POE, Proc::Background, and Win32::Process. There are many other modules you might use, so check those namespaces for other options too.
If you are on a Unix-like system, you might be able to get away with a system call where you put an & on the end of the command:
system("cmd &")
You can also try using fork, as described in perlfunc (although this is the same thing that many of the modules will do for you).
STDIN, STDOUT, and STDERR are shared
Both the main process and the backgrounded one (the "child" process) share the same STDIN, STDOUT and STDERR filehandles. If both try to access them at once, strange things can happen. You may want to close or reopen these for the child. You can get around this with opening a pipe (see open) but on some systems this means that the child process cannot outlive the parent.
Signals
You'll have to catch the SIGCHLD signal, and possibly SIGPIPE too. SIGCHLD is sent when the backgrounded process finishes. SIGPIPE is sent when you write to a filehandle whose child process has closed (an untrapped SIGPIPE can cause your program to silently die). This is not an issue with system("cmd&").
Zombies
You have to be prepared to "reap" the child process when it finishes.
$SIG{CHLD} = sub { wait };
$SIG{CHLD} = 'IGNORE';
You can also use a double fork. You immediately wait() for your first child, and the init daemon will wait() for your grandchild once it exits.
unless ($pid = fork) {
    unless (fork) {
        exec "what you really wanna do";
        die "exec failed!";
    }
    exit 0;
}
waitpid($pid, 0);
See Signals in perlipc for other examples of code to do this. Zombies are not an issue with system("prog &").
It's true that you can use fork/exec, but I think it will be much easier to simply use the pipe form of open. Not only is the return value the pid you are looking for, you can be connected to either the stdin or stdout of the process, depending on how you open. For instance:
open my $handle, "foo|";
will return the pid of foo and connect you to its stdout, so that if you read from <$handle> you get a line of output from foo. Using "|foo" instead will allow you to write to foo's stdin.
You can also use open2 and open3 to do both simultaneously, though that comes with some major caveats, as you can run into unexpected issues due to I/O buffering.
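A small sketch of the piped-open idea, assuming a Unix-like system. A Perl one-liner stands in for ./another_process.pl, and the list form of open is used so the two variables are passed without going through the shell.

```perl
use strict;
use warnings;

my ($var1, $var2) = ('Hello', 'World');

# Stand-in for ./another_process.pl: a one-liner that echoes its arguments.
my @cmd = ($^X, '-e', 'print "got @ARGV\n"', $var1, $var2);

# List-form piped open: no shell involved, and the return value is the child's PID.
my $pid = open(my $from_child, '-|', @cmd)
    or die "cannot start child: $!";
print "child PID is $pid\n";

my $line = <$from_child>;     # read one line of the child's stdout
print "child said: $line";

close $from_child;            # waits for the child and sets $?
print "child exit code: ", $? >> 8, "\n";
```

Unlike system "... &", the parent here holds both the PID and a handle on the child's output, at the cost of having to read or redirect that output.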
Use fork and exec.
If you need to get the PID of a perl script you can use the $$ variable. You can put it in your another_process.pl and have it write the pid to a file. Can you be clearer about what you mean by "like fork"? You can always use the fork/exec combination.

Problem with piped filehandle in perl

I am trying to run bp_genbank2gff3.pl (bioperl package) from another perl script that
gets a genbank as its argument.
This does not work (no output files are generated):
my $command = "bp_genbank2gff3.pl -y -o /tmp $ARGV[0]";
open( my $command_out, "-|", $command );
close $command_out;
but this does
open( my $command_out, "-|", $command );
sleep 3; # why do I need to sleep?
close $command_out;
Why?
I thought that close is supposed to block until the command is done:
Closing any piped filehandle causes
the parent process to wait for the
child to finish...
(see http://perldoc.perl.org/functions/open.html).
Edit
I added this as last line:
say "ret=$ret, \$?=$?, \$!=$!";
and in both cases the printout is:
ret=, $?=13, $!=
(which means close failed in both cases, right?)
$? = 13 means your child process was terminated by a SIGPIPE signal. Your external program (bp_genbank2gff3.pl) tried to write some output to a pipe to your perl program. But the perl program closed its end of the pipe so your OS sent a SIGPIPE to the external program.
By sleeping for 3 seconds, you are letting your program run for 3 seconds before the OS kills it, so this will let your program get something done. Note that pipes have a limited capacity, though, so if your parent perl script is not reading from the pipe and if the external program is writing a lot to standard output, the external program's write operations will eventually block and you may not really get 3 seconds of effort from your external program.
The workaround is to read the output from the external program, even if you are just going to throw it away.
open( my $command_out, "-|", $command );
my @ignore_me = <$command_out>;
close $command_out;
Update: If you really don't care about the command's output, you can avoid SIGPIPE issues by redirecting the output to /dev/null:
open my $command_out, "-|", "$command > /dev/null";
close $command_out; # succeeds, no SIGPIPE
Of course if you are going to go to that much trouble to ignore the output, you might as well just use system.
Additional info: As the OP says, closing a piped filehandle causes the parent to wait for the child to finish (by using waitpid or something similar). But before it starts waiting, it closes its end of the pipe. In this case, that end is the read end of the pipe that the child process is writing its standard output to. The next time the child tries to write something to standard output, the OS detects that the read end of that pipe is closed and sends a SIGPIPE to the child process, killing it and quickly letting the close statement in the parent finish.
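The behaviour described above can be reproduced with a short experiment, assuming a Unix-like system where SIGPIPE is signal 13. A Perl one-liner that writes roughly 1 MB stands in for bp_genbank2gff3.pl; the low 7 bits of $? hold the signal number and the high byte holds the exit code.

```perl
use strict;
use warnings;

# Make sure the child starts with the default SIGPIPE behaviour (terminate).
$SIG{PIPE} = 'DEFAULT';

# Stand-in for a chatty external program: writes about 1 MB to standard output.
my @cmd = ($^X, '-e', 'print "x" x 1024, "\n" for 1 .. 1000');

# Case 1: close without reading. The child is killed by SIGPIPE.
open(my $unread, '-|', @cmd) or die "cannot start child: $!";
close $unread;
my $status_unread = $?;
printf "unread:  signal=%d exit=%d\n", $status_unread & 127, $status_unread >> 8;

# Case 2: drain the output first. The child runs to completion.
open(my $drained, '-|', @cmd) or die "cannot start child: $!";
1 while <$drained>;
close $drained;
my $status_drained = $?;
printf "drained: signal=%d exit=%d\n", $status_drained & 127, $status_drained >> 8;
```

The output is deliberately larger than a typical pipe buffer, so the child cannot finish writing before the parent closes the read end in the first case.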
I'm not sure what you're trying to do but system is probably better in this case...

How can I determine if the script is being executed within a system or qx call in Perl?

In Perl, is it possible to determine if a script is being executed within another script (presumably via system or qx)?
$ cat foo.pl
print "foo";
print "\n" if not $in_qx; # or the like.
I realize this is not applicable if the script was being run via exec.
I know for certain that system runs the process as a fork and I know fork can return a value that is variable depending on whether you are in the parent or the child process. Not certain about qx.
Regardless, I'm not certain how to figure out if I'm in a forked process without actually performing a fork.
All processes are forked from another process (except init). You can sort of tell if the program was run from open, qx//, open2, or open3 by using the isatty function from POSIX, but there is no good way to determine if you are being run by system without looking at the process tree, and even then it can get murky (for instance system "nohup", "./foo.pl" will not have the calling perl process as its parent).
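A sketch of the terminal check, using Perl's built-in -t file test, which asks the same question as POSIX's isatty. The one-liner stands in for foo.pl, and the piped open imitates what qx// does to the child's STDOUT. As noted above, this says nothing about a plain system call, where the child normally inherits the caller's terminal.

```perl
use strict;
use warnings;

# Stand-in for foo.pl: prints the newline only when STDOUT is a terminal.
my $foo_pl = 'print "foo"; print "\n" if -t STDOUT;';

# Run it the way qx// would: with STDOUT connected to a pipe.
open(my $fh, '-|', $^X, '-e', $foo_pl) or die "cannot start child: $!";
my $captured = do { local $/; <$fh> };
close $fh;

print "captured: [$captured]\n";
```

Run this way the child sees a pipe rather than a terminal, so the captured text carries no trailing newline.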
You could check "who's your daddy", using "getppid" (get parent id). Then check if your parent id is a perl script with pgrep or similar.
Do you control the caller? The simplest thing to do would be to pass an argument, e.g. --isforked.

How do I run a Perl script from within a Perl script?

I've got a Perl script that needs to execute another Perl script. This second script can be executed directly on the command line, but I need to execute it from within my first program. I'll need to pass it a few parameters that would normally be passed in when it's run standalone (the first script runs periodically, and executes the second script under a certain set of system conditions).
Preliminary Google searches suggest using backticks or a system() call. Are there any other ways to run it? (I'm guessing yes, since it's Perl we're talking about :P ) Which method is preferred if I need to capture output from the invoked program (and, if possible, pipe that output as it executes to stdout as though the second program were invoked directly)?
(Edit: oh, now SO suggests some related questions. This one is close, but not exactly the same as what I'm asking. The second program will likely take an hour or more to run (lots of I/O), so I'm not sure a one-off invocation is the right fit for this.)
You can just do it.
{
    local @ARGV = qw<param1 param2 param3>;
    do '/home/buddy/myscript.pl';
}
Prevents the overhead of loading in another copy of perl.
The location of your current perl interpreter can be found in the special variable $^X. This is important if perl is not in your path, or if you have multiple perl versions available but wish to make sure you're using the same one across the board.
When executing external commands, including other Perl programs, determining if they actually ran can be quite difficult. Inspecting $? can leave lasting mental scars, so I prefer to use IPC::System::Simple (available from the CPAN):
use strict;
use warnings;
use IPC::System::Simple qw(system capture);
# Run a command, wait until it finishes, and make sure it works.
# Output from this program goes directly to STDOUT, and it can take input
# from your STDIN if required.
system($^X, "yourscript.pl", @ARGS);
# Run a command, wait until it finishes, and make sure it works.
# The output of this command is captured into $results.
my $results = capture($^X, "yourscript.pl", @ARGS);
In both of the above examples any arguments you wish to pass to your external program go into @ARGS. The shell is also avoided in both of the above examples, which gives you a small speed advantage, and avoids any unwanted interactions involving shell meta-characters. The above code also expects your second program to return a zero exit value to indicate success; if that's not the case, you can specify an additional first argument of allowable exit values:
# Both of these commands allow an exit value of 0, 1 or 2 to be considered
# a successful execution of the command.
system( [0,1,2], $^X, "yourscript.pl", @ARGS );
# OR
capture( [0,1,2], $^X, "yourscript.pl", @ARGS );
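For comparison, here is a sketch of the same check done with only core Perl, following the $? decoding described in perlfunc's entry for system. The helper name run_checked is hypothetical, and a one-liner stands in for yourscript.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command without the shell and decode $?.
# Returns the exit code, or dies if the command could not run or was killed.
sub run_checked {
    my @cmd = @_;
    system { $cmd[0] } @cmd;
    if ($? == -1) {
        die "failed to execute '$cmd[0]': $!\n";
    }
    elsif ($? & 127) {
        die sprintf("'%s' died from signal %d\n", $cmd[0], $? & 127);
    }
    return $? >> 8;
}

# Stand-in for: run_checked($^X, "yourscript.pl", @ARGS)
my $exit = run_checked($^X, '-e', 'exit 2');
print "exit code: $exit\n";
```

This is roughly the bookkeeping that IPC::System::Simple takes off your hands; the caller still has to decide which exit codes count as success.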
If you have a long-running process and you want to process its data while it's being generated, then you're probably going to need a piped open, or one of the more heavyweight IPC modules from the CPAN.
Having said all that, any time you need to be calling another Perl program from Perl, you may wish to consider if using a module would be a better choice. Starting another program carries quite a few overheads, both in terms of start-up costs, and I/O costs for moving data between processes. It also significantly increases the difficulty of error handling. If you can turn your external program into a module, you may find it simplifies your overall design.
All the best,
Paul
I can think of a few ways to do this. You already mentioned the first two, so I won't go into detail on them.
backticks: $retVal = `perl somePerlScript.pl`;
system() call
eval
The eval can be accomplished by slurping the other file into a string (or a list of strings), then eval'ing the string. Here's a sample:
#!/usr/bin/perl
open my $perlfile, '<', 'somePerlScript.pl' or die "Cannot open somePerlScript.pl: $!";
my $program = do { local $/; <$perlfile> }; # slurp the whole file into one string
close $perlfile;
eval $program;
die $@ if $@;
do: do 'somePerlScript.pl'
You already got good answers to your question, but there's always the possibility of taking a different point of view: maybe you should consider refactoring the script that you want to run from the first script. Turn the functionality into a module. Use the module from the first and from the second script.
If you need to call your external script asynchronously (you just want to launch it and not wait for it to finish), then:
# On Unix systems, either of these will execute and just carry on
# You can't collect output that way
# (backticks only return at once if the script's output is redirected,
# because they read until the pipe is closed)
`myscript.pl &`;
system ('myscript.pl &');
# On Windows systems the equivalent would be
`start myscript.pl`;
system ('start myscript.pl');
# If you just want to execute another script and terminate the current one
exec ('myscript.pl');
Use backticks if you need to capture the output of the command.
Use system if you do not need to capture the output of the command.
TMTOWTDI: so there are other ways too, but those are the two easiest and most likely.
See the perlipc documentation for several options for interprocess communication.
If your first script merely sets up the environment for the second script, you may be looking for exec.
#!/usr/bin/perl
use strict;
open(OUTPUT, "date|") or die "Failed to create process: $!\n";
while (<OUTPUT>)
{
print;
}
close(OUTPUT);
print "Process exited with value " . ($? >> 8) . "\n";
This will start the process date and pipe the output of the command to the OUTPUT filehandle which you can process a line at a time. When the command is finished you can close the output filehandle and retrieve the return value of the process. Replace date with whatever you want.
I wanted to do something like this to offload non-subroutines into an external file to make editing easier. I actually made this into a subroutine. The advantage of this way is that those "my" variables in the external file get declared in the main namespace. If you use 'do' they apparently don't migrate to the main namespace. Note the presentation below doesn't include error handling
sub getcode($) {
    my @list;
    my $filename = shift;
    open (INFILE, "< $filename");
    @list = <INFILE>;
    close (INFILE);
    return \@list;
}
# and to use it:
my $codelist = [];
$codelist = getcode('sourcefile.pl');
eval join ("", @$codelist);