Can I pass @ARGV and options to a perl script invoked via do? - perl

I have a script which processes @ARGV and options (via Getopt::Long::Descriptive).
Now, I would also like to call that same script from another perl program, and pass variables to it.
One solution is to use system, and build the argument list accordingly, like so:
system("perl", "my_script.pl", qw/--foo bar --baz 2/);
My question is: can I obtain the same result by calling the script via do?
I'm trying to do this because the script would run inside a Minion job queue so I would avoid spawning off a perl instance every time - that often causes out-of-memory issues.

rajashekar's answer is correct, but I would also add that you can set the args for the child script in a local block, so that the child doesn't corrupt @ARGV in the parent.
{
    local @ARGV = ("--foo", "bar", "--baz", $ARGV[3]);
    do 'my_script.pl';
}
# previous @ARGV restored at end of block
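One caveat worth adding: do does not throw on failure, so it's wise to check $@ and $! yourself. A minimal sketch of that, following perldoc -f do:

{
    local @ARGV = ("--foo", "bar", "--baz", 2);
    my $rv = do 'my_script.pl';
    # do returns undef and sets $@ (compile/runtime error) or $! (file
    # could not be read) on failure
    warn "couldn't parse my_script.pl: $@" if $@;
    warn "couldn't read my_script.pl: $!"  if !defined $rv && $!;
}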

You can set @ARGV yourself and then evaluate the script file.
This will work:
@ARGV = qw(--foo --bar);
do $script_file;
Since @ARGV is a global variable, it is available inside do too. But if you want to access any local variables you have defined, then you need to use:
eval `cat $script_file`;
EDIT: I had mistakenly assumed that @ARGV would not be accessible within do.
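To illustrate the difference (child.pl here is a hypothetical script that reads the parent's variables): an eval'd string is compiled in the current lexical scope, while do compiles the file in a scope of its own.

my $verbose = 1;   # a lexical (my) variable in the parent

# Code inside child.pl can see $verbose here, because the eval'd
# string is compiled in the current lexical scope:
eval `cat child.pl`;
die $@ if $@;

# do also sees the global @ARGV, but not the parent's lexicals:
do 'child.pl';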

Related

Calling one Perl script from another, passing a CGI parameter

I have two Perl scripts. Let's call them one.pl and two.pl.
one.pl processes some data and needs to call two.pl, passing a variable.
In one.pl I can
require "two.pl"
to call the second script.
That works fine, but I want to pass a CGI variable to two.pl.
Is there any way to do this without rewriting two.pl as a Perl module?
For example what I want is:
one.pl
...
require "two.pl $number";
two.pl
use CGI;
my $cgi = new CGI;
my $number = $cgi->param('number');
...
EDIT: two.pl should only ever be called once
If one.pl is not in a CGI environment
If your one.pl is run from the shell, you can set @ARGV before the call to require. This abuses CGI.pm's debugging mode, which reads parameters from the command line. Each argument needs to be the param name, an equals sign, and the value.
{
    my $number = 5;
    local @ARGV = ("number=$number");
    require "two.pl";
}
The key=value format is important. The local keyword makes sure that @ARGV is only set inside the block, so any other arguments to your script are not permanently lost, merely invisible to two.pl.
If one.pl is in a CGI environment
If the param is already there, you don't have to do anything.
Else, see above.
Note that for both of these you can only ever require a script once. That's the idea of require: Perl keeps track of what it has already loaded (in %INC). If you are in an environment like mod_perl, or a modern Perl application that runs persistently, you should use do "two.pl" instead, which will execute the file every time. But that might break other things if two.pl is not designed to be run multiple times in the same process.
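A quick sketch of the difference between the two:

require "two.pl";        # runs two.pl and records it in %INC
require "two.pl";        # no-op: %INC says it is already loaded

do "two.pl";             # runs the file again unconditionally

delete $INC{"two.pl"};   # alternatively, make require forget it...
require "two.pl";        # ...so this executes the file once more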
Your best bet is to refactor the code inside two.pl into a module, and use that in both scripts.

How to call a script with flags and get its global vars?

I'm looking for something like require "peacefullscript.pl -whiteflag"; the problem is that this doesn't work. What I want is to call a script with some flags and get the global variables it creates into the current script.
How can I do this easily?
{
    local @ARGV = ("-whiteflag");
    do "peacefullscript.pl";
}
The local keyword temporarily replaces @ARGV for the duration of the block and restores the original afterwards, so your actual command line arguments are not overwritten.
@ARGV = ("-whiteflag");
require "peacefullscript.pl";

How can I run Perl system commands in the background?

#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use IPC::System::Simple qw(system);
system( 'xterm', '-geometry', '80x25-5-5', '-bg', 'green', '&' );
say "Hello";
say "World";
I tried this to run the xterm-command in the background, but it doesn't work:
No absolute path found for shell: &
What would be the right way to make it work?
Perl's system function has two modes:
taking a single string and passing it to the command shell to allow special characters to be processed
taking a list of strings, exec'ing the first and passing the remaining strings as arguments
In the first form you have to be careful to escape characters that might have a special meaning to the shell. The second form is generally safer since arguments are passed directly to the program being exec'd without the shell being involved.
In your case you seem to be mixing the two forms. The & character only has the meaning of "start this program in the background" if it is passed to the shell. In your program, the ampersand is being passed as the 5th argument to the xterm command.
As Jakob Kruse said, the simple answer is to use the single-string form of system. If any of the arguments came from an untrusted source, you'd have to use quoting or escaping to make them safe.
If you prefer to use the multi-argument form then you'll need to call fork() and then probably use exec() rather than system().
Note that the list form of system is specifically there to not treat characters such as & as shell meta-characters.
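To make the two modes concrete, here is a small sketch (ls and the glob are just illustrative):

# Single string: passed to the shell, so the glob is expanded there.
system('ls -l *.txt');

# List form: no shell involved; ls receives the literal string '*.txt'
# as an argument and will likely report that no such file exists.
system('ls', '-l', '*.txt');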
From perlfaq8's answer to How do I start a process in the background?
(contributed by brian d foy)
There's not a single way to run code in the background so you don't have to wait for it to finish before your program moves on to other tasks. Process management depends on your particular operating system, and many of the techniques are in perlipc.
Several CPAN modules may be able to help, including IPC::Open2 or IPC::Open3, IPC::Run, Parallel::Jobs, Parallel::ForkManager, POE, Proc::Background, and Win32::Process. There are many other modules you might use, so check those namespaces for other options too.
If you are on a Unix-like system, you might be able to get away with a system call where you put an & on the end of the command:
system("cmd &")
You can also try using fork, as described in perlfunc (although this is the same thing that many of the modules will do for you).
STDIN, STDOUT, and STDERR are shared
Both the main process and the backgrounded one (the "child" process) share the same STDIN, STDOUT and STDERR filehandles. If both try to access them at once, strange things can happen. You may want to close or reopen these for the child. You can get around this with opening a pipe (see open in perlfunc) but on some systems this means that the child process cannot outlive the parent.
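One common way to avoid the shared-handle problem is to repoint the child's standard handles before exec. A hedged sketch (the log path and command are placeholders):

use POSIX qw(setsid);

my $pid = fork;
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {                            # child
    setsid();                               # start a new session, detached from the terminal
    open STDIN,  '<',  '/dev/null'      or die "reopen STDIN: $!";
    open STDOUT, '>',  '/tmp/child.log' or die "reopen STDOUT: $!";
    open STDERR, '>&', \*STDOUT         or die "reopen STDERR: $!";
    exec 'some_command' or die "exec failed: $!";
}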
Signals
You'll have to catch the SIGCHLD signal, and possibly SIGPIPE too. SIGCHLD is sent when the backgrounded process finishes. SIGPIPE is sent when you write to a filehandle whose child process has closed (an untrapped SIGPIPE can cause your program to silently die). This is not an issue with system("cmd&").
Zombies
You have to be prepared to "reap" the child process when it finishes.
$SIG{CHLD} = sub { wait };   # reap each child as it exits
$SIG{CHLD} = 'IGNORE';       # or tell the system you don't care about exit statuses
You can also use a double fork. You immediately wait() for your first child, and the init daemon will wait() for your grandchild once it exits.
unless ($pid = fork) {
    unless (fork) {
        exec "what you really wanna do";
        die "exec failed!";
    }
    exit 0;
}
waitpid($pid, 0);
See Signals in perlipc for other examples of code to do this. Zombies are not an issue with system("prog &").
Have you tried?
system('xterm -geometry 80x25-5-5 -bg green &');
http://www.rocketaware.com/perl/perlfaq8/How_do_I_start_a_process_in_the_.htm
This is not specific to Perl; the same issue exists in C and other languages.
First understand what the system command does:
Forks.
In the child process, calls exec.
The parent process waits for the forked child process to finish.
It does not matter whether you pass multiple arguments or one argument. The difference is that with multiple arguments the command is executed directly, while with one argument the command is wrapped by the shell and finally executed as:
/bin/sh -c your_command_with_redirections_and_ampersand
When you pass a command as some_command par1 par2 &, a sh or bash process sits between the Perl interpreter and the command as a wrapper. The trailing & tells that shell to put some_command in the background and exit immediately rather than wait for it. Your script waits only for the short-lived shell interpreter, and no additional waitpid is needed, because Perl's system function does that for you.
When you want to implement this mechanism directly in your script, you should:
Use the fork function. See example: http://users.telenet.be/bartl/classicperl/fork/all.html
Under the child condition (where fork returned 0), use the exec function. Its usage is similar to system (see the manual). Note that exec replaces the child process's program image with the executed command.
Under the parent condition (where fork returned the non-zero child pid), use waitpid with the pid returned by the fork function.
This is why you can run the process in the background. I hope this is simple.
The simplest example:
if (my $pid = fork) {   # fork returns 0 (false) in the child; execution splits here
    # parent ($pid is the process id of the child)
    # do whatever you want here, asynchronously with the executed command
    waitpid($pid, 0);   # wait until the child ends
    # If you don't want to wait, don't: when your process ends, the child is
    # reparented to the init process, which will reap it when it finishes.
    # Alternatively, you can handle the SIGCHLD signal in your script.
}
else {
    # child
    exec('some_command arg1 arg2');   # or exec('some_command', 'arg1', 'arg2');
    # no exit needed: exec completely replaces the process image
}

How can I determine if the script is being executed within a system or qx call in Perl?

In Perl, is it possible to determine if a script is being executed within another script (presumably via system or qx)?
$ cat foo.pl
print "foo";
print "\n" if not $in_qx; # or the like.
I realize this is not applicable if the script was being run via exec.
I know for certain that system runs the process as a fork, and I know fork returns a different value depending on whether you are in the parent or the child process. Not certain about qx.
Regardless, I'm not certain how to figure out if I'm in a forked process without actually performing a fork.
All processes are forked from another process (except init). You can sort of tell if the program was run from open, qx//, open2, or open3 by using the isatty function from POSIX, but there is no good way to determine if you are being run by system without looking at the process tree, and even then it can get murky (for instance system "nohup", "./foo.pl" will not have the calling perl process as its parent).
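For the isatty test, Perl's built-in -t operator is the easiest way in; it reports whether a filehandle is attached to a terminal, which is only a heuristic for how you were invoked:

if (-t STDOUT) {
    print "STDOUT is a terminal: probably run directly\n";
} else {
    print "STDOUT is not a terminal: likely qx//, open, or a pipe\n";
}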
You could check "who's your daddy", using "getppid" (get parent id). Then check if your parent id is a perl script with pgrep or similar.
Do you control the caller? The simplest thing to do would be to pass an argument, e.g. --isforked.
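A minimal sketch of the flag approach (--isforked is just an example name, foo.pl as in the question):

# in the calling script:
system($^X, 'foo.pl', '--isforked');

# in foo.pl:
my $in_qx = grep { $_ eq '--isforked' } @ARGV;
print "foo";
print "\n" unless $in_qx;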

How do I run a Perl script from within a Perl script?

I've got a Perl script that needs to execute another Perl script. This second script can be executed directly on the command line, but I need to execute it from within my first program. I'll need to pass it a few parameters that would normally be passed in when it's run standalone (the first script runs periodically, and executes the second script under a certain set of system conditions).
Preliminary Google searches suggest using backticks or a system() call. Are there any other ways to run it? (I'm guessing yes, since it's Perl we're talking about :P ) Which method is preferred if I need to capture output from the invoked program (and, if possible, pipe that output as it executes to stdout as though the second program were invoked directly)?
(Edit: oh, now SO suggests some related questions. This one is close, but not exactly the same as what I'm asking. The second program will likely take an hour or more to run (lots of I/O), so I'm not sure a one-off invocation is the right fit for this.)
You can just do it.
{
local #ARGV = qw<param1 param2 param3>;
do '/home/buddy/myscript.pl';
}
This avoids the overhead of loading another copy of perl.
The location of your current perl interpreter can be found in the special variable $^X. This is important if perl is not in your path, or if you have multiple perl versions available but want to make sure you're using the same one across the board.
When executing external commands, including other Perl programs, determining if they actually ran can be quite difficult. Inspecting $? can leave lasting mental scars, so I prefer to use IPC::System::Simple (available from the CPAN):
use strict;
use warnings;
use IPC::System::Simple qw(system capture);
# Run a command, wait until it finishes, and make sure it works.
# Output from this program goes directly to STDOUT, and it can take input
# from your STDIN if required.
system($^X, "yourscript.pl", @ARGS);
# Run a command, wait until it finishes, and make sure it works.
# The output of this command is captured into $results.
my $results = capture($^X, "yourscript.pl", @ARGS);
In both of the above examples, any arguments you wish to pass to your external program go into @ARGS. The shell is also avoided in both examples, which gives you a small speed advantage and avoids any unwanted interactions involving shell meta-characters. The above code also expects your second program to return a zero exit value to indicate success; if that's not the case, you can specify an additional first argument of allowable exit values:
# Both of these commands allow an exit value of 0, 1 or 2 to be considered
# a successful execution of the command.
system( [0,1,2], $^X, "yourscript.pl", @ARGS );
# OR
capture( [0,1,2], $^X, "yourscript.pl", @ARGS );
If you have a long-running process and you want to process its data while it's being generated, then you're probably going to need a piped open, or one of the more heavyweight IPC modules from the CPAN.
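A minimal piped-open sketch for that situation (script name and arguments are placeholders):

my @args = ('--foo', 'bar');                # whatever the script expects
open my $fh, '-|', $^X, 'yourscript.pl', @args
    or die "Can't start yourscript.pl: $!";
while (my $line = <$fh>) {
    print $line;                            # handle each line as it arrives
}
close $fh or warn "child exited with status $?";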
Having said all that, any time you need to be calling another Perl program from Perl, you may wish to consider if using a module would be a better choice. Starting another program carries quite a few overheads, both in terms of start-up costs, and I/O costs for moving data between processes. It also significantly increases the difficulty of error handling. If you can turn your external program into a module, you may find it simplifies your overall design.
All the best,
Paul
I can think of a few ways to do this. You already mentioned the first two, so I won't go into detail on them.
1. backticks: $retVal = `perl somePerlScript.pl`;
2. system() call
3. eval
The eval can be accomplished by slurping the other file into a string (or a list of strings), then eval'ing the string. Here's a sample:
#!/usr/bin/perl
open PERLFILE, "<", "somePerlScript.pl" or die "Can't open somePerlScript.pl: $!";
undef $/;   # slurp mode: read the whole file in one go instead of line by line
my $program = <PERLFILE>;
close PERLFILE;
eval $program;
4. do: do 'somePerlScript.pl'
You already got good answers to your question, but there's always the possibility of taking a different point of view: maybe you should consider refactoring the script that you want to run from the first script. Turn the functionality into a module, and use that module from both the first and the second script.
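A hedged sketch of what that refactoring could look like (MyTask and run_task are made-up names):

# MyTask.pm -- the functionality both scripts need
package MyTask;
use strict;
use warnings;
use Exporter 'import';
our @EXPORT_OK = qw(run_task);

sub run_task {
    my (%opts) = @_;
    # ... code that used to live in the standalone script ...
    return "ran with foo=$opts{foo}";
}

1;

Both scripts can then use MyTask qw(run_task); and call run_task(foo => 'bar'); directly, with no extra process at all.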
If you need to call your external script asynchronously (you just want to launch it and not wait for it to finish), then:
# On Unix systems, either of these will execute and just carry-on
# You can't collect output that way
`myscript.pl &`;
system ('myscript.pl &');
# On Windows systems the equivalent would be
`start myscript.pl`;
system ('start myscript.pl');
# If you just want to execute another script and terminate the current one
exec ('myscript.pl');
Use backticks if you need to capture the output of the command.
Use system if you do not need to capture the output of the command.
TMTOWTDI: so there are other ways too, but those are the two easiest and most likely.
See the perlipc documentation for several options for interprocess communication.
If your first script merely sets up the environment for the second script, you may be looking for exec.
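A minimal sketch of that pattern (second.pl and SOME_VAR are placeholders):

$ENV{SOME_VAR} = 'value';     # environment prepared by the first script
# exec never returns on success: second.pl replaces the current script.
exec($^X, 'second.pl', @ARGV) or die "exec failed: $!";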
#!/usr/bin/perl
use strict;
open(OUTPUT, "date|") or die "Failed to create process: $!\n";
while (<OUTPUT>)
{
print;
}
close(OUTPUT);
print "Process exited with value " . ($? >> 8) . "\n";
This will start the process date and pipe the output of the command to the OUTPUT filehandle which you can process a line at a time. When the command is finished you can close the output filehandle and retrieve the return value of the process. Replace date with whatever you want.
I wanted to do something like this to offload non-subroutine code into an external file, to make editing easier. I actually made this into a subroutine. The advantage of this approach is that those "my" variables in the external file get declared in the main namespace. If you use 'do', they apparently don't migrate to the main namespace. Note that the code below doesn't include error handling.
sub getcode($) {
    my @list;
    my $filename = shift;
    open (INFILE, "< $filename");
    @list = <INFILE>;
    close (INFILE);
    return \@list;
}
# and to use it:
my $codelist = getcode('sourcefile.pl');
eval join ("", @$codelist);