I have a process in Perl that creates another one with the system command. I leave it running in memory and pass it some variables like this:
my $var1 = "Hello";
my $var1 = "World";
system "./another_process.pl $var1 $var2 &";
But the system command only returns the exit status, and I need to get the PID. I want to do something like fork. What should I do? How can I do something like fork but across different scripts?
Thanks in advance!
Perl has a fork function.
See perldoc perlfaq8 - How do I start a process in the background?
(contributed by brian d foy)
There's not a single way to run code in the background so you don't have to wait for it to finish before your program moves on to other tasks. Process management depends on your particular operating system, and many of the techniques are in perlipc.
Several CPAN modules may be able to help, including IPC::Open2 or IPC::Open3, IPC::Run, Parallel::Jobs, Parallel::ForkManager, POE, Proc::Background, and Win32::Process. There are many other modules you might use, so check those namespaces for other options too.
If you are on a Unix-like system, you might be able to get away with a system call where you put an & on the end of the command:
system("cmd &")
You can also try using fork, as described in perlfunc (although this is the same thing that many of the modules will do for you).
STDIN, STDOUT, and STDERR are shared
Both the main process and the backgrounded one (the "child" process) share the same STDIN, STDOUT and STDERR filehandles. If both try to access them at once, strange things can happen. You may want to close or reopen these for the child. You can get around this with opening a pipe (see open) but on some systems this means that the child process cannot outlive the parent.
Signals
You'll have to catch the SIGCHLD signal, and possibly SIGPIPE too. SIGCHLD is sent when the backgrounded process finishes. SIGPIPE is sent when you write to a filehandle whose child process has closed (an untrapped SIGPIPE can cause your program to silently die). This is not an issue with system("cmd&").
Zombies
You have to be prepared to "reap" the child process when it finishes.
$SIG{CHLD} = sub { wait };
$SIG{CHLD} = 'IGNORE';
You can also use a double fork. You immediately wait() for your first child, and the init daemon will wait() for your grandchild once it exits.
unless ($pid = fork) {
    unless (fork) {
        exec "what you really wanna do";
        die "exec failed!";
    }
    exit 0;
}
waitpid($pid, 0);
See Signals in perlipc for other examples of code to do this. Zombies are not an issue with system("prog &").
It's true that you can use fork/exec, but I think it will be much easier to simply use the pipe form of open. Not only is the return value the pid you are looking for, but you are also connected to either the stdin or stdout of the process, depending on how you open it. For instance:
open my $handle, "foo|";
will return the pid of foo and connect you to its stdout, so that reading from <$handle> gives you a line of output from foo. Using "|foo" instead will allow you to write to foo's stdin.
You can also use open2 and open3 to do both simultaneously, though those come with some major caveats, as you can run into unexpected issues due to IO buffering.
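Applied to the original question, the pipe-open approach might look like this (a sketch; the script name and variables are the ones from the question):
use strict;
use warnings;

my $var1 = "Hello";
my $var2 = "World";

# The return value of a piped open is the child's PID.
my $pid = open( my $out, "-|", "./another_process.pl", $var1, $var2 )
    or die "Cannot start another_process.pl: $!";

print "child PID is $pid\n";

while ( my $line = <$out> ) {
    print "child says: $line";
}

close $out;    # blocks until the child finishes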
Use fork and exec.
If you need to get the PID of a Perl script, you can use the $$ variable. You can put it in your another_process.pl and have it output the pid to a file. Can you be more specific about what you mean by "like fork"? You can always use the fork/exec combination.
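A tiny sketch of that $$ suggestion (the pid-file path here is only an example): inside another_process.pl, the child records its own PID where the parent can read it:
# Inside another_process.pl: $$ is this process's own PID.
open my $fh, '>', '/tmp/another_process.pid' or die "Cannot write pid file: $!";
print {$fh} "$$\n";
close $fh;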
Related
I have to start a pdf viewer from a Perl script. The viewer should
become detached from the parent process and the terminal that the parent process was run from. If I close the parent or the terminal the
viewer should still be kept running. I considered three approaches (using evince as the pdf viewer command):
Using system and sh:
system 'evince test.pdf &';
Using fork():
$SIG{CHLD} = "IGNORE"; #reap children as they complete
my $pid = fork();
if ( $pid == 0 ) {
exec 'evince', 'test.pdf';
}
Using Proc::Daemon:
use Proc::Daemon;
my $daemon = Proc::Daemon->new(
    work_dir     => '/tmp/evince',
    child_STDOUT => '>>stdout.txt',
    child_STDERR => '>>stderr.txt',
);
my $pid = $daemon->Init();
if ( $pid == 0 ) {
    exec 'evince', 'test.pdf';
}
What would be the difference between these approaches? Which approach would you recommend?
system 'evince test.pdf &';
In my experience, this is likely to really be:
system 'evince $pdf_file &';
If $pdf_file is user input, then we get shell-injection bugs, such as passing in a pdf name of $(rm -rf /) or even just ;rm -rf /. And what if the name has a space in it? Well, you can avoid all that if you quote it, right?
system 'evince "$pdf_file" &';
Well, no, now all I have to do is give you a filename of ";rm -rf "/. And what if my pdf has a double quote in its name? You could use single quotes, but the same problem comes up if the filename has single quotes in it, and the shell injection isn't really any harder. You could come up with an elaborate shellify function that properly quotes a string all so that the shell can unquote it and get back to the original entry ... but that seems like so much more work than your other options, neither of which suffers from these problems.
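By contrast, here is a one-line sketch of the shell-free style (which is what options 2 and 3 in the question build on): pass the filename as a separate list element, so no shell ever parses it. Note that, unlike the & version, this blocks until evince exits, which is why options 2 and 3 add fork or Proc::Daemon on top.
system( 'evince', $pdf_file );    # no shell involved; $pdf_file is passed verbatim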
$SIG{CHLD} = "IGNORE"; #reap children as they complete
my $pid = fork();
if ( $pid == 0 ) {
exec 'evince', 'test.pdf';
}
Setting a global $SIG{CHLD} is nice and easy ... unless you need to handle other children as they die. So only you can tell whether that's acceptable or not. And, again in my experience, not even always then. I've been bitten by this one - though rarely. I had this mixed in with an application that, elsewhere, used AnyEvent, and managed to break AE's subprocess handling. (The same would likely hold true if you mixed this with any event system, I just happened to be using AE.)
Also, this is missing the stdout and stderr redirects - and stdin redirect. That's easy enough to add - inside your if, before the exec, just close and reopen the filehandles as you need, e.g.:
close STDOUT; open STDOUT, '>', '/dev/null';
close STDERR; open STDERR, '>', '/dev/null';
close STDIN; open STDIN, '<', '/dev/null';
No big deal. However, Proc::Daemon does set up a few more things for you to ensure signals don't reach from one to the other process, in either direction. This depends on how severe you need to get.
For most of my purposes, I've found #2 to be sufficient. I've only reached for Proc::Daemon on a few projects, but that's where a) I have full control over the module installation, and b) it really matters. Starting a pdf viewer wouldn't normally be such a case.
I avoid #1 at all costs - I have had some fairly significant bites with shell injection, and now try to avoid the shell at all times.
I have a perl script that is like so:
foreach my $addr ('http://site1.com', ...., 'http://site2.com') {
    my $script = `curl -m 15 $addr`;
    # do stuff with $script
}
The -m sets a timeout of 15 seconds. Is there a way to make it so that if a user presses a key, it stops the current execution and moves on to the next item in the foreach? I know next; can move to the next item, but I am unsure how to link this to the key being pressed, and how to do it while the curl command is running.
Edit: Based on the answers it seems difficult to do this while curl is running. Would it be possible to press a key while curl is running and have it skip to the next item in the loop as soon as the curl command returns (or times out after 15 sec)?
The problem you've got with this is that when you run curl, Perl hands over control and waits for completion. It blocks until it's 'done'.
So it's not as easy to do this as it might seem.
As another poster alludes to - you can use a variety of parallel processing options. I would suggest the easiest is to move away from using 'any' key, and require a ctrl-c.
So you'd then do:
foreach my $addr ('http://site1.com', ...., 'http://site2.com') {
    my $pid = open( my $curl_fh, "-|", "curl -m 15 $addr" );
    $SIG{'INT'} = sub { print "Aborting fetch of $addr\n"; kill 'INT', $pid; };
    while ( <$curl_fh> ) {
        print;
    }
    # Might want to set it to something else.
    # 'DEFAULT' means 'ctrl-c' will abort the whole program.
    # 'IGNORE' means exactly what it says on the tin.
    # Important to change it though, as the handler above kills a specific pid,
    # and that might cause problems.
    $SIG{'INT'} = 'DEFAULT';
}
What this does is configure SIGINT (e.g. ctrl-c) so it doesn't kill your program, but does kill the sub-process.
If you wanted to look at other options, I'd offer:
Multithreading: spawn a thread to do the curl fetching in the background and use Thread::Queue to pass results back and forth (Thread::Queue supports non-blocking checks).
Forking: fork a sub-process to do the curl, and use your 'main' process to send a signal if a key is pressed.
IO::Select, so that you're not making blocking reads on your process.
Basically you have two options:
1. Use threads
Create a new thread and call the desired system function there. Wait for output. In another thread, check for user input. On input, you can kill the child process. When the child process has finished, you can ignore further user input.
Such a solution seems rather complex, with a lot of synchronization needed, probably involving signals. Risky.
2. Use non-blocking IO
Please read this thread. It explains how to make non-blocking IO reads from either a file or a pipe. You'd make a non-blocking read from the pipe (created with open), then a non-blocking read from STDIN, in a loop.
Seems like a way to go, but, alas, rather complex as well.
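As a sketch of that non-blocking approach, assuming a Unix-like system (so STDIN can be watched with select) and that pressing Enter is an acceptable stand-in for "any key":
use strict;
use warnings;
use IO::Select;

foreach my $addr ('http://site1.com', 'http://site2.com') {
    # Piped open returns the child's pid, which we can kill on a keypress.
    my $pid = open( my $curl_fh, "-|", "curl", "-s", "-m", "15", $addr )
        or die "Cannot run curl: $!";

    my $sel    = IO::Select->new( $curl_fh, \*STDIN );
    my $output = '';

  FETCH: while (1) {
        for my $fh ( $sel->can_read(0.25) ) {
            if ( fileno($fh) == fileno($curl_fh) ) {
                # Read whatever curl has produced so far.
                my $n = sysread( $curl_fh, my $buf, 4096 );
                last FETCH if !$n;          # curl finished (or failed)
                $output .= $buf;
            }
            else {
                # A key was pressed: abort this fetch and move to the next URL.
                my $discard = <STDIN>;
                kill 'TERM', $pid;
                last FETCH;
            }
        }
    }
    close $curl_fh;
    # ... do stuff with $output ...
}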
I am trying to run bp_genbank2gff3.pl (bioperl package) from another perl script that
gets a genbank as its argument.
This does not work (no output files are generated):
my $command = "bp_genbank2gff3.pl -y -o /tmp $ARGV[0]";
open( my $command_out, "-|", $command );
close $command_out;
but this does
open( my $command_out, "-|", $command );
sleep 3; # why do I need to sleep?
close $command_out;
Why?
I thought that close is supposed to block until the command is done:
Closing any piped filehandle causes
the parent process to wait for the
child to finish...
(see http://perldoc.perl.org/functions/open.html).
Edit
I added this as the last line:
say "ret=$ret, \$?=$?, \$!=$!";
and in both cases the printout is:
ret=, $?=13, $!=
(which means close failed in both cases, right?)
$? = 13 means your child process was terminated by a SIGPIPE signal. Your external program (bp_genbank2gff3.pl) tried to write some output to a pipe to your perl program. But the perl program closed its end of the pipe so your OS sent a SIGPIPE to the external program.
By sleeping for 3 seconds, you are letting your program run for 3 seconds before the OS kills it, so this will let your program get something done. Note that pipes have a limited capacity, though, so if your parent perl script is not reading from the pipe and if the external program is writing a lot to standard output, the external program's write operations will eventually block and you may not really get 3 seconds of effort from your external program.
The workaround is to read the output from the external program, even if you are just going to throw it away.
open( my $command_out, "-|", $command );
my @ignore_me = <$command_out>;
close $command_out;
Update: If you really don't care about the command's output, you can avoid SIGPIPE issues by redirecting the output to /dev/null:
open my $command_out, "-|", "$command > /dev/null";
close $command_out; # succeeds, no SIGPIPE
Of course if you are going to go to that much trouble to ignore the output, you might as well just use system.
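For instance, something like this has the same effect as the redirect above, with no pipe to manage:
system("$command > /dev/null");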
Additional info: As the OP says, closing a piped filehandle causes the parent to wait for the child to finish (by using waitpid or something similar). But before it starts waiting, it closes its end of the pipe. In this case, that end is the read end of the pipe that the child process is writing its standard output to. The next time the child tries to write something to standard output, the OS detects that the read end of that pipe is closed and sends a SIGPIPE to the child process, killing it and quickly letting the close statement in the parent finish.
I'm not sure what you're trying to do but system is probably better in this case...
#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use IPC::System::Simple qw(system);
system( 'xterm', '-geometry', '80x25-5-5', '-bg', 'green', '&' );
say "Hello";
say "World";
I tried this to run the xterm command in the background, but it doesn't work:
No absolute path found for shell: &
What would be the right way to make it work?
Perl's system function has two modes:
taking a single string and passing it to the command shell to allow special characters to be processed
taking a list of strings, exec'ing the first and passing the remaining strings as arguments
In the first form you have to be careful to escape characters that might have a special meaning to the shell. The second form is generally safer since arguments are passed directly to the program being exec'd without the shell being involved.
In your case you seem to be mixing the two forms. The & character only has the meaning of "start this program in the background" if it is passed to the shell. In your program, the ampersand is being passed as the 5th argument to the xterm command.
As Jakob Kruse said the simple answer is to use the single string form of system. If any of the arguments came from an untrusted source you'd have to use quoting or escaping to make them safe.
If you prefer to use the multi-argument form then you'll need to call fork() and then probably use exec() rather than system().
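A minimal sketch of that fork/exec approach, using the xterm arguments from the question (exec's list form, like system's, never involves a shell):
use strict;
use warnings;

$SIG{CHLD} = 'IGNORE';    # reap the backgrounded child automatically

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child: replace this process with xterm; no shell, so no '&' is needed.
    exec 'xterm', '-geometry', '80x25-5-5', '-bg', 'green'
        or die "exec failed: $!";
}

# Parent: carries on immediately; $pid is xterm's process id.
print "Hello\n";
print "World\n";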
Note that the list form of system is specifically there to not treat characters such as & as shell meta-characters.
From perlfaq8's answer to How do I start a process in the background?
(contributed by brian d foy)
There's not a single way to run code in the background so you don't have to wait for it to finish before your program moves on to other tasks. Process management depends on your particular operating system, and many of the techniques are in perlipc.
Several CPAN modules may be able to help, including IPC::Open2 or IPC::Open3, IPC::Run, Parallel::Jobs, Parallel::ForkManager, POE, Proc::Background, and Win32::Process. There are many other modules you might use, so check those namespaces for other options too.
If you are on a Unix-like system, you might be able to get away with a system call where you put an & on the end of the command:
system("cmd &")
You can also try using fork, as described in perlfunc (although this is the same thing that many of the modules will do for you).
STDIN, STDOUT, and STDERR are shared
Both the main process and the backgrounded one (the "child" process) share the same STDIN, STDOUT and STDERR filehandles. If both try to access them at once, strange things can happen. You may want to close or reopen these for the child. You can get around this with opening a pipe (see open in perlfunc) but on some systems this means that the child process cannot outlive the parent.
Signals
You'll have to catch the SIGCHLD signal, and possibly SIGPIPE too. SIGCHLD is sent when the backgrounded process finishes. SIGPIPE is sent when you write to a filehandle whose child process has closed (an untrapped SIGPIPE can cause your program to silently die). This is not an issue with system("cmd&").
Zombies
You have to be prepared to "reap" the child process when it finishes.
$SIG{CHLD} = sub { wait };
$SIG{CHLD} = 'IGNORE';
You can also use a double fork. You immediately wait() for your first child, and the init daemon will wait() for your grandchild once it exits.
unless ($pid = fork) {
    unless (fork) {
        exec "what you really wanna do";
        die "exec failed!";
    }
    exit 0;
}
waitpid($pid, 0);
See Signals in perlipc for other examples of code to do this. Zombies are not an issue with system("prog &").
Have you tried?
system('xterm -geometry 80x25-5-5 -bg green &');
http://www.rocketaware.com/perl/perlfaq8/How_do_I_start_a_process_in_the_.htm
This is not specific to Perl; the same issue exists in C and other languages.
First understand what the system command does:
It forks.
In the child process, it calls exec.
The parent process waits for the forked child process to finish.
It does not matter whether you pass multiple arguments or a single argument. The difference is that with multiple arguments the command is executed directly, while with one argument the command is wrapped by the shell and finally executed as:
/bin/sh -c your_command_with_redirections_and_ampersand
When you pass a command as some_command par1 par2 &, there is an sh or bash process between the Perl interpreter and the command, acting as a wrapper. It is that shell which interprets the trailing &, starts some_command in the background and exits; your script waits only for the shell interpreter, and no additional waitpid is needed, because Perl's system function does that for you.
When you want to implement this mechanism directly in your script, you should:
Use the fork function. See this example: http://users.telenet.be/bartl/classicperl/fork/all.html
In the child branch (of the if), use the exec function. Its use is similar to system; see the manual. Note that exec replaces the child process's program, code and data with the executed command.
In the parent branch (where fork returns a non-zero pid), use waitpid with the pid returned by the fork function.
That is how you can run the process in the background. I hope this is simple.
The simplest example:
if ( my $pid = fork ) {    # fork returns 0 (false) in the child; at this point execution has split in two
    # Parent ($pid is the process id of the child).
    # Do whatever you want here, asynchronously with the executed command.
    waitpid( $pid, 0 );    # wait until the child ends
    # If you don't want to wait, don't. When your process ends, the child will be
    # reparented from your script to the init process, which will reap it when it finishes.
    # Alternatively, you can handle the SIGCHLD signal in your script.
}
else {
    # Child
    exec('some_command arg1 arg2');    # or exec('some_command', 'arg1', 'arg2');
    # exit is not needed, because exec completely replaces the process image
}
I have a POE Perl program forking children.
The children it is forking do logging and interactive telnets into remote devices.
POE uses STDOUT to return output back up to the parent process but for some reason it's getting lost (not going to screen or to any files).
I'm theorising that this is because STDOUT is being redirected somewhere - I need to ascertain where.
I have used (-t STDOUT) to identify that the STDOUT of the child is not a TTY.
I have also reset the STDOUT of the child to be that of the parent before the child was called - but this method seems to bypass POE's event handlers and just dumps the output to the parent's STDOUT.
Q) How do I identify what the current STDOUT points at, so I can find where my data is going?
Thanks
Simon
Would fileno help in this situation? If a child is closing and reopening STDOUT, then fileno(STDOUT) would have a different value in the child than in the parent.
$ perldoc -f fileno
fileno FILEHANDLE
Returns the file descriptor for a filehandle, or undefined if
the filehandle is not open. This is mainly useful for
constructing bitmaps for "select" and low-level POSIX tty-
handling operations. If FILEHANDLE is an expression, the value
is taken as an indirect filehandle, generally its name.
You can use this to find out whether two handles refer to the
same underlying descriptor:
if (fileno(THIS) == fileno(THAT)) {
print "THIS and THAT are dups\n";
}
(Filehandles connected to memory objects via new features of
"open" may return undefined even though they are open.)
If your forked children are Perl programs too you can "select STDOUT" and set $| to mark it unbuffered right before any logging happens.
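In other words, a sketch of that suggestion, placed at the top of the child's code before any logging happens:
select(STDOUT);    # make STDOUT the currently selected output handle
$| = 1;            # disable buffering on it so log lines appear immediately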
This was down to the POE::Filter::Reference StdOut handler being sent output from the child process that was not in the format it was expecting.
Removed the filter - I could then see what it was being sent, and this enabled me to rectify the issue.
The issue was that the child process was spewing the contents of its subprocesses' STDOUT back along the socket connection to the StdOut handler.
Simon
Are you certain this isn't a buffering issue? I'm not familiar with POE so I don't know how you'd investigate or correct that, but I suspect it's worth checking, at least.