How can I pass arguments for a subroutine from the command line - perl

First off, my background is primarily in Python and I am relatively new at using Perl.
I am using tcsh for passing options into my .pl file.
In my code I have:
if( scalar @ARGV != 1)
{
help();
exit;
}
# Loading configuration file and parameters
our %configuration = load_configuration($ARGV[0]);
# all the scripts must be run for every configuration
my @allConfs = ("original");
sub Collimator
{
my $z = -25.0;
my %detector = init_det();
$detector{"pos"} = "0*cm 0.0*cm $z*cm"
}
foreach my $conf ( @allConfs )
{
$configuration{"variation"} = $conf ;
#other geometries ....
Collimator();
}
I want to add something that allows me to change the parameters of the subroutine in the .pl file from the command line. Currently, to generate the geometry I pass the following command into the tcsh CLI: perl file.pl config.dat. What I want is to be able to pass in something like this: perl file.pl config.dat -20.0.
I'm thinking that I need to add something to the effect of:
if($ARGV[1]){
$z = ARGV[1]}
However, I am not sure how to properly implement this. Is this something that I would specify within the subroutine itself or outside of it with the loading of the configuration file?

Use a library to handle command-line arguments. Getopt::Long is excellent:
use warnings;
use strict;
use Getopt::Long;
my ($config_file, @AllConfs);
my ($live, $dry_run) = (1, 0);
my $z;
GetOptions(
'config-file=s' => \$config_file,
'options-conf=s' => \@AllConfs,
'geom|g=f' => \$z,
'live!' => \$live,
'dry-run' => sub { $dry_run = 1; $live = 0 },
# etc
);
# Loading configuration file and parameters
# (Does it really need to be an "our" variable?)
our %configuration = load_configuration($config_file);
my @data_files = @ARGV; # unnamed arguments, perhaps submitted as well
Options may be abbreviated as long as they are unambiguous, so with my somewhat random sample above the program can be invoked, for example, as
prog.pl --conf filename -o original -o bare -g -25.0 data-file(s)
(Or whatever the options for @AllConfs are.) Providing g explicitly as another name for geom makes it a proper name, not an abbreviation (it doesn't "compete" with others).
Note that one can use either -- or just -, and choose shorter or longer names for convenience or clarity. There is a lot more to the module than this little sample; see the docs.
Once the (named) options have been processed, the rest of the command line is available in @ARGV, so one can mix and match arguments that way. Unnamed arguments are often used for file names. (The module offers a way to deal with those in some capacity as well.)
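Applied to the question's script, the parsed value can simply be handed to the subroutine as an argument. Here is a minimal sketch, assuming a hypothetical --z option and the -25.0 default from the question. The names parse_z and collimator_pos are illustrative helpers, not part of the original script, and GetOptionsFromArray is used only so that the parsing can be tried without touching @ARGV:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: returns the config file name and the z value.
sub parse_z {
    my @args = @_;
    my $z = -25.0;                        # default taken from the question
    GetOptionsFromArray(\@args, 'z=f' => \$z)
        or die "Usage: $0 [--z NUMBER] config.dat\n";
    my ($config_file) = @args;            # what is left after option parsing
    return ($config_file, $z);
}

# The subroutine takes z as a parameter instead of hard-coding it.
sub collimator_pos {
    my ($z) = @_;
    return "0*cm 0.0*cm $z*cm";
}

my ($config_file, $z) = parse_z('config.dat', '--z', '-20.0');
print collimator_pos($z), "\n";
```

In the real script the call would be parse_z(@ARGV), and Collimator($z) would receive the value the same way collimator_pos does here.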

if ( @ARGV != 2 ) {
help();
}
my ( $path, $z ) = @ARGV;

Perl receives the command-line input as a list, so you can get the values through @ARGV, or individually as $ARGV[0] and $ARGV[1]
die if ($#ARGV != 1);
#the first argument
print "$ARGV[0]\n";
#the second argument
print "$ARGV[1]\n"

Sub references as arguments in perl

I have a Perl function and I don't know what it does.
my: what does min in Perl?
@ARGV: what does it mean?
sub getArgs
{
my $argCnt=0;
my %argH;
for my $arg (@ARGV)
{
if ($arg =~ /^-/) # insert this entry and the next in the hash table
{
$argH{$ARGV[$argCnt]} = $ARGV[$argCnt+1];
}
$argCnt++;
}
return %argH;}
Code like that makes David sad...
Here's a reformatted version of the code doing the indentations correctly. That makes it so much easier to read. I can easily tell where my if and loops start and end:
sub getArgs {
my $argCnt = 0;
my %argH;
for my $arg ( @ARGV ) {
if ( $arg =~ /^-/ ) { # insert this entry and the next in the hash table
$argH{ $ARGV[$argCnt] } = $ARGV[$argCnt+1];
}
$argCnt++;
}
return %argH;
}
@ARGV is what is passed to the program. It is an array of all the arguments passed. For example, I have a program foo.pl, and I call it like this:
foo.pl one two three four five
In this case, @ARGV is set to the list of values ("one", "two", "three", "four", "five"). The name comes from a similar variable found in the C programming language.
The author is attempting to parse these arguments. For example:
foo.pl -this that -the other
would result in:
$arg{"-this"} = "that";
$arg{"-the"} = "other";
I don't see min. Do you mean my?
This is a wee bit of a complex discussion which would normally involve package variables vs. lexically scoped variables, and how Perl stores variables. To make things easier, I'm going to give you a sort-of correct, but technically wrong answer: if you use the strict pragma, and you should, you have to declare your variables with my before they can be used. For example, here's a simple two-line program that's wrong. Can you see the error?
$name = "Bob";
print "Hello $Name, how are you?\n";
Note that when I set $name to "Bob", $name is spelled with a lowercase n, but I used $Name (uppercase N) in my print statement. As it stands now, Perl will print "Hello , how are you?" without a care that I've used the wrong variable name. If it's hard to spot an error like this in a two-line program, imagine what it would be like in a 1000-line program.
By using strict and forcing me to declare variables with my, Perl can catch that error:
use strict;
use warnings; # Another Pragma that should always be used
my $name = "Bob";
print "Hello $Name, how are you doing\n";
Now, when I run the program, I get the following error:
Global symbol "$Name" requires explicit package name at (line # of print statement)
This means that $Name isn't defined, and Perl points to where that error is.
When you declare variables like this, they are in scope within the block where they're declared. A block could be the code contained in a set of curly braces, or a while, if, or for statement. If you declare a variable with my outside of these, it's defined to the end of the file.
Thus, by using my, the variables are only defined inside this subroutine. And, the $arg variable is only defined in the for loop.
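To make the scoping rule concrete, here is a small sketch of my own (the names scope_demo and $label are made up for illustration). The $label declared inside the loop is a separate variable from the one declared at file level:

```perl
use strict;
use warnings;

my $label = "outer";               # visible to the end of the file

sub scope_demo {
    my @seen;
    for my $i (1 .. 2) {           # $i exists only inside this loop
        my $label = "inner $i";    # a new variable that hides the outer one
        push @seen, $label;
    }
    push @seen, $label;            # the outer $label again
    return @seen;
}

print join(", ", scope_demo()), "\n";   # prints: inner 1, inner 2, outer
```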
One more thing:
The person who wrote this should have used the Getopt::Long module. There's a major bug in their code:
For example:
foo.pl -this that -one -two
In this case, my hash looks like this:
$args{'-this'} = "that";
$args{'-one'} = "-two";
$args{'-two'} = undef;
If I did this:
if ( defined $args{'-two'} ) {
...
}
I would not execute the if statement.
Also:
foo.pl -this=that -one -two
would also fail.
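For comparison, here is a sketch of how Getopt::Long handles those same command lines. It assumes that -this takes a string value and that -one and -two are plain flags, which is my reading of the example rather than something the original code states. parse_flags is a hypothetical wrapper:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical wrapper: parse a list of arguments into a hash of options.
sub parse_flags {
    my @args = @_;
    my %opt;
    GetOptionsFromArray(\@args, \%opt, 'this=s', 'one', 'two')
        or die "Bad options\n";
    return %opt;
}

# Both spellings from the examples above give the same result.
for my $cmdline ([qw(-this that -one -two)], [qw(-this=that -one -two)]) {
    my %opt = parse_flags(@$cmdline);
    print join(", ", map { "$_ => $opt{$_}" } sort keys %opt), "\n";
}
```

Here -two is set to a true value instead of being swallowed as the value of -one.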
@ARGV is a special variable (refer to perldoc perlvar):
@ARGV
The array @ARGV contains the command-line arguments intended for the
script. $#ARGV is generally the number of arguments minus one, because
$ARGV[0] is the first argument, not the program's command name itself.
See $0 for the command name.
Perl documentation is also available from your command line:
perldoc -v @ARGV

How to make recursive calls using Perl, awk or sed?

If a .cpp or .h file has #includes (e.g. #include "ready.h"), I need to make a text file that has these filenames on it. Since ready.h may have its own #includes, the calls have to be made recursively. Not sure how to do this.
The solution of @OneSolitaryNoob will likely work all right, but has an issue: for each recursion, it starts another process, which is quite wasteful. We can use subroutines to do that more efficiently. Assuming that all header files are in the working directory:
sub collect_recursive_includes {
# Unpack parameter from subroutine
my ($filename, $seen) = @_;
# Open the file to lexically scoped filehandle
# In your script, you'll probably have to transform $filename to correct path
open my $fh, "<", $filename or do {
# On failure: Print a warning, and return. I.e. go on with next include
warn "Can't open $filename: $!";
return;
};
# Loop through each line, recursing as needed
LINE: while(<$fh>) {
if (/^\s*#include\s+"([^"]+)"/) {
my $include = $1;
# you should probably normalize $include before testing if you've seen it
next LINE if $seen->{$include}; # skip seen includes
$seen->{$include} = 1;
collect_recursive_includes($include, $seen);
}
}
}
This subroutine remembers what files it has already seen, and avoids recursing there again—each file is visited one time only.
At the top level, you need to provide a hashref as second argument, that will hold all filenames as keys after the sub has run:
my %seen = ( $start_filename => 1 );
collect_recursive_includes($start_filename, \%seen);
my @files = sort keys %seen;
# output @files, e.g. print "$_\n" for @files;
I hinted in the code comments that you'll probably have to normalize the filenames. E.g. consider a starting filename ./foo/bar/baz.h, which points to qux.h. Then the actual filename we want to recurse to is ./foo/bar/qux.h, not ./qux.h. The Cwd module can help you find your current location, and transform relative paths to absolute ones. The File::Spec module is a lot more complex, but has good support for platform-independent filename and path manipulation.
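One possible way to do that normalization is sketched below. resolve_include is a hypothetical helper, and using dirname from File::Basename is my choice rather than something the answer prescribes. It resolves an include name relative to the directory of the file that contains the #include line:

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;

# Hypothetical helper: resolve an include name relative to the file
# that contains the #include line.
sub resolve_include {
    my ($including_file, $include) = @_;
    # An absolute include path needs no adjustment.
    return $include if File::Spec->file_name_is_absolute($include);
    # Otherwise join it onto the directory of the including file.
    return File::Spec->catfile(dirname($including_file), $include);
}

print resolve_include('./foo/bar/baz.h', 'qux.h'), "\n";
```

Inside collect_recursive_includes, the result of resolve_include($filename, $1) could then serve both as the key in %$seen and as the name passed to the recursive call.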
In Perl, recursion is straightforward:
sub factorial
{
my $n = shift;
if($n <= 1)
{ return 1; }
else
{ return $n * factorial($n - 1); }
}
print factorial 7; # prints 7 * 6 * 5 * 4 * 3 * 2 * 1
Offhand, I can think of only two things that require care:
In Perl, variables are global by default, and therefore static by default. Since you don't want one function-call's variables to trample another's, you need to be sure to localize your variables, e.g. by using my.
There are some limitations with prototypes and recursion. If you want to use prototypes (e.g. sub factorial($) instead of just sub factorial), then you need to provide the prototype before the function definition, so that it can be used within the function body. (Alternatively, you can use & when you call the function recursively; that will prevent the prototype from being applied.)
Not totally clear what you want the display to look like, but the basic would be a script called follow_includes.pl:
#!/usr/bin/perl -w
while(<>) {
if(/\#include "(\S+)\"/) {
print STDOUT $1 . "\n";
system("./follow_includes.pl $1");
}
}
Run it like:
% follow_includes.pl somefile.cpp
And if you want to hide any duplicate includes, run it like:
% follow_includes.pl somefile.cpp | sort -u
Usually you'd want this in some sort of tree-print.

Perl - How to create commands that users can input in console?

I'm just starting in Perl and I'm quite enjoying it. I'm writing some basic functions, but what I really want to be able to do is to use those functions intelligently using console commands. For example, say I have a function adding two numbers. I'd want to be able to type in console "add 2, 4" and read the first word, then pass the two numbers as parameters in an "add" function. Essentially, I'm asking for help in creating some basic scripting using Perl ^^'.
I have some vague ideas about how I might do this in VB, but Perl, I have no idea where I'd start, or what functions would be useful to me. Is there something like VB.net's "Split" function where you can break down the contents of a scalar into an array? Is there a simple way to analyse one word at a time in a scalar, or iterate through a scalar until you hit a separator, for example?
I hope you can help, any suggestions are appreciated! Bear in mind, I'm no expert, I started Perl all of a few weeks ago, and I've only been doing VB.net half a year.
Thank you!
Edit: If you're not sure what to suggest and you know any simple/intuitive resources that might be of help, that would also be appreciated.
It's rather easy to make a script which dispatches to a command by name. Here is a simple example:
#!/usr/bin/env perl
use strict;
use warnings;
# take the command name off the @ARGV stack
my $command_name = shift;
# get a reference to the subroutine by name
my $command = __PACKAGE__->can($command_name) || die "Unknown command: $command_name\n";
# execute the command, using the rest of @ARGV as arguments
# and print the return with a trailing newline
print $command->(@ARGV);
print "\n";
sub add {
my ($x, $y) = @_;
return $x + $y;
}
sub subtract {
my ($x, $y) = @_;
return $x - $y;
}
This script (say it's named myscript.pl) can be called like
$ ./myscript.pl add 2 3
or
$ ./myscript.pl subtract 2 3
Once you have played with that for a while, you might want to take it further and use a framework for this kind of thing. There are several available, like App::Cmd or you can take the logic shown above and modularize as you see fit.
You want to parse command-line arguments. A space serves as the delimiter, so just run ./add.pl 2 3. Something like this:
$num1=$ARGV[0];
$num2=$ARGV[1];
print $num1 + $num2;
will print 5
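As for the question about a Split-like function: Perl has a built-in split. Here is a rough sketch, under my own assumptions, of handling a typed line such as "add 2, 4". run_line and the %commands table are hypothetical names, and arguments are assumed to be separated by commas and/or whitespace:

```perl
use strict;
use warnings;

# Table of known commands (illustrative only).
my %commands = (
    add      => sub { $_[0] + $_[1] },
    subtract => sub { $_[0] - $_[1] },
);

# Hypothetical helper: split a line into a command word and its arguments,
# then call the matching subroutine.
sub run_line {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;                     # trim both ends
    my ($name, $rest) = split /\s+/, $line, 2;   # first word, then the remainder
    die "Empty command\n" unless defined $name && length $name;
    my @args = defined $rest ? split(/\s*,\s*|\s+/, $rest) : ();
    my $code = $commands{$name} or die "Unknown command: $name\n";
    return $code->(@args);
}

print run_line('add 2, 4'), "\n";   # prints 6
```

To read commands interactively, the same helper could be called on each line of <STDIN> after chomp.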
Here is a short implementation of a simple scripting language.
Each statement is exactly one line long, and has the following structure:
Statement = [<Var> =] <Command> [<Arg> ...]
# This is a regular grammar, so we don't need a complicated parser.
Tokens are separated by whitespace. A command may take any number of arguments. These can be the contents of a variable $var, a string "foo", or a number (int or float).
As these are Perl scalars, there is no visible difference between strings and numbers.
Here is the preamble of the script:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
strict and warnings are essential when learning Perl, else too much weird stuff would be possible. The use 5.010 line sets a minimum version; it also enables the say builtin (like print, but it appends a newline).
Now we declare two global variables: The %env hash (table or dict) associates variable names with their values. %functions holds our builtin functions. The values are anonymous functions.
my %env;
my %functions = (
add => sub { $_[0] + $_[1] },
mul => sub { $_[0] * $_[1] },
say => sub { say $_[0] },
bye => sub { exit 0 },
);
Now comes our read-eval-loop (we don't print by default). The readline operator <> will read from the file specified as the first command line argument, or from STDIN if no filename is provided.
while (<>) {
next if /^\s*\#/; # jump comment lines
# parse the line. We get a destination $var, a $command, and any number of @args
my ($var, $command, @args) = parse($_);
# Execute the anonymous sub specified by $command with the @args
my $value = $functions{ $command }->(@args);
# Store the return value if a destination $var was specified
$env{ $var } = $value if defined $var;
}
That was fairly trivial. Now comes some parsing code. Perl "binds" regexes to strings with the =~ operator. Regexes may look like /foo/ or m/foo/. The /x flag allows us to include whitespace in our regex that doesn't match actual whitespace. The /g flag matches globally. This also enables the \G assertion, which matches where the last successful match ended. The /c flag is important for this m//gc style of parsing: it lets us consume one match at a time, and prevents the position of the regex engine in our string from being reset when a match fails.
sub parse {
my ($line) = @_; # get the $line, which is an argument
my ($var, $command, @args); # declare variables to be filled
# Test if this statement has a variable declaration
if ($line =~ m/\G\s* \$(\w+) \s*=\s* /xgc) {
$var = $1; # assign first capture if successful
}
# Parse the function of this statement.
if ($line =~ m/\G\s* (\w+) \s*/xgc) {
$command = $1;
# Test if the specified function exists in our %functions
if (not exists $functions{$command}) {
die "The command $command is not known\n";
}
} else {
die "Command required\n"; # Throw fatal exception on parse error.
}
# As long as our matches haven't consumed the whole string...
while (pos($line) < length($line)) {
# Try to match variables
if ($line =~ m/\G \$(\w+) \s*/xgc) {
die "The variable $1 does not exist\n" if not exists $env{$1};
push @args, $env{$1};
}
# Try to match strings
elsif ($line =~ m/\G "([^"]+)" \s*/xgc) {
push @args, $1;
}
# Try to match ints or floats
elsif ($line =~ m/\G (\d+ (?:\.\d+)? ) \s*/xgc) {
push @args, 0+$1;
}
# Throw error if nothing matched
else {
die "Didn't understand that line\n";
}
}
# return our -- now filled -- vars.
return $var, $command, @args;
}
Perl arrays can be handled like a linked list: shift removes and returns the first element (pop does the same to the last element). push adds an element to the end, unshift to the beginning.
Our little programming language can execute simple programs like:
#!my_little_language
$a = mul 2 20
$b = add 0 2
$answer = add $a $b
say $answer
bye
If (1) our Perl script is saved as my_little_language, set to be executable, and is in the system PATH, and (2) the above file in our little language is saved as meaning_of_life.mll, and also set to be executable, then
$ ./meaning_of_life.mll
should be able to run it.
Output is obviously 42. Note that our language doesn't yet have string manipulation or simple assignment to variables. Also, it would be nice to be able to call functions with the return value of other functions directly. This requires some sort of parens, or precedence mechanism. Also, the language requires better error reporting for batch processing (which it already supports).

Can I call Getopts multiple times in perl?

I am a noob to perl, so please try to be patient with this question of mine.
It seems that if I make multiple calls to Perl's Getopt::Long::GetOptions function, the second call is completely ignored.
Is this normal? (Why?)
What are the alternatives to this process?
(Actually, I've written a module where I make a GetOptions call, and the script using my module tries to do that too, but it seems that the script does not get the required options.)
Thanks,
Neeraj
Getopt::Long alters @ARGV while it works; that's how it can leave non-switch values behind in @ARGV when it is done processing the switches. So, when you make your second call, there's nothing left in @ARGV to parse and nothing useful happens.
However, there is GetOptionsFromArray:
By default, GetOptions parses the options that are present in the global array @ARGV. A special entry GetOptionsFromArray can be used to parse options from an arbitrary array.
So you could use GetOptionsFromArray on a copy of @ARGV (or some other array) if you need to parse the list multiple times.
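A small sketch of that idea follows, with a made-up argument list standing in for @ARGV. I also turn on the pass_through configuration here (my addition) so that each pass quietly skips the options it does not declare:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Made-up arguments standing in for @ARGV.
my @original = ('--verbose', '--name', 'demo', 'file.txt');

# Let each pass ignore options it does not know about.
Getopt::Long::Configure('pass_through');

# First pass, on a copy: the copy is consumed, @original is untouched.
my @copy1 = @original;
my $verbose;
GetOptionsFromArray(\@copy1, 'verbose' => \$verbose);

# Second pass, on another copy, still sees every option.
my @copy2 = @original;
my $name;
GetOptionsFromArray(\@copy2, 'name=s' => \$name);

print "verbose=$verbose name=$name original=@original\n";
```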
I've run GetOptions from Getopt::Long multiple times in a single program. What I have is a .optrc file that contains command-line options that can be overridden by the command line, much the same way .cvsrc and .exrc work.
In order to do that, I run GetOptions on the .optrc file and then on what's in @ARGV. In older versions of Getopt::Long, I had to save @ARGV, toss .optrc into @ARGV, process it with GetOptions, and then restore @ARGV and run GetOptions on that. Newer versions of Getopt::Long allow you to specify the array instead of using just @ARGV.
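Roughly, that two-step parse could look like the sketch below. The option names (level, color) and the helper parse_with_defaults are invented for illustration, the rc-file contents are passed in as a string rather than read from a real .optrc, and splitting them with shellwords from Text::ParseWords is my assumption about a reasonable way to tokenize them:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);
use Text::ParseWords qw(shellwords);

# Hypothetical helper: parse defaults from rc-file text first, then let
# the command-line arguments override them.
sub parse_with_defaults {
    my ($rc_text, @cli) = @_;
    my %opt;
    my @spec = ('level=i', 'color!');       # invented option names
    my @rc_args = shellwords($rc_text);      # split like a shell would
    GetOptionsFromArray(\@rc_args, \%opt, @spec)
        or die "Bad option in rc text\n";
    GetOptionsFromArray(\@cli, \%opt, @spec)
        or die "Bad command-line option\n";
    return %opt;
}

my %opt = parse_with_defaults('--level 1 --color', '--level', '3');
print "level=$opt{level} color=$opt{color}\n";
```

Because both passes write into the same hash, whatever the second pass sets simply replaces the value from the first.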
Making copies of @ARGV will keep you busy parsing again and again the same set of options. If this is what you want, fine. But.
Suppose you have a set of modules you are using in your program, each of which can recognize only a subset of your @ARGV. What you want to do is to call GetOptions from each of these modules, consume each of the options the module is able to recognize, and leave the rest of the options in @ARGV to be processed by the other modules.
You can configure Getopt::Long to do this by calling
Getopt::Long::Configure qw/pass_through/;
But see perldoc Getopt::Long for the side effects of the various configuration options!
Example: a script (o1.pl) able to recognize a few options, and two modules (o1::p1 and o1::p2) which must get to read their own options
o1.pl:
#!/usr/bin/perl
use Getopt::Long;
use o1::p1;
use o1::p2;
# now o1::p1 and o1::p2 already consumed recognizable options
#no need to configure pass_through since main:: will get to see only its options
#Getopt::Long::Configure(qw/pass_through/);
my %main_options = ( 'define' => {}, );
print "main: \@ARGV = " . join (', ', @ARGV) . "\n";
GetOptions(\%main_options, "main-vi=i","verbose",'define=s');
use Data::Dumper;
print "main_options: ", Dumper(\%main_options);
print "p1 options: ", Dumper(\%o1::p1::options);
print "p2 options: ", Dumper(\%o1::p2::options);
exit 0;
o1::p1 source (in o1/p1.pm):
package o1::p1;
use Getopt::Long;
Getopt::Long::Configure qw/pass_through/;
%options = ();
print "package p1: \@ARGV = " . join (', ', @ARGV) . "\n";
GetOptions(\%options, "p1-v1=s", "p1-v2=i");
1;
o1::p2 source (in o1/p2.pm):
package o1::p2;
use Getopt::Long;
Getopt::Long::Configure 'pass_through';
%options = ();
print "package p2: \@ARGV=". join (', ', @ARGV). "\n";
GetOptions(\%options, "p2-v1=s", "p2-v2=i");
1;
running o1.pl with:
perl o1.pl --main-vi=1 --verbose --define a=ss --p1-v1=k1 --p1-v2=42 --define b=yy --p2-v1=k2 --p2-v2=66
will give you the following (expected) output (p1 consumed its options, then p2 did it, then main was left with what it knows about):
package p1: @ARGV = --main-vi=1, --verbose, --define, a=ss, --p1-v1=k1, --p1-v2=42, --define, b=yy, --p2-v1=k2, --p2-v2=66
package p2: @ARGV=--main-vi=1, --verbose, --define, a=ss, --define, b=yy, --p2-v1=k2, --p2-v2=66
main: @ARGV = --main-vi=1, --verbose, --define, a=ss, --define, b=yy
main_options: $VAR1 = {
'verbose' => 1,
'define' => {
'a' => 'ss',
'b' => 'yy'
},
'main-vi' => 1
};
p1 options: $VAR1 = {
'p1-v1' => 'k1',
'p1-v2' => 42
};
p2 options: $VAR1 = {
'p2-v1' => 'k2',
'p2-v2' => 66
};
Isn't this the sort of thing the keyword "local" is supposed to be for?
{
local @ARGV = @ARGV;
our $opt_h;
&getopts('h');
&printUsage if $opt_h;
}
# Now that the local version of @ARGV has gone out of scope, the original version of @ARGV is restored.
while (@ARGV){
my $arg = shift @ARGV;

How do I pass parameters to the File::Find subroutine that processes each file?

Using File::Find, how can I pass parameters to the function that processes each file?
I have a Perl script that traverses directories in order to convert some 3-channel TIFF files to JPEG files (3 JPEG files per TIFF file). This works, but I would like to pass some parameters to the function that processes each file (short of using global variables).
Here is the relevant part of the script where I have tried to pass the parameter:
use File::Find;
sub findFiles
{
my $IsDryRun2 = ${$_[0]}{anInIsDryRun2};
}
find ( { wanted => \&findFiles, anInIsDryRun2 => $isDryRun }, $startDir);
$isDryRun is a scalar. $startDir is a string, full path to a directory.
$IsDryRun2 is not set:
Use of uninitialized value $IsDryRun2 in concatenation (.) or string at
TIFFconvert.pl line 197 (#1)
(W uninitialized) An undefined value was used as if it were already
defined. It was interpreted as a "" or a 0, but maybe it was a mistake.
To suppress this warning assign a defined value to your variables.
(The old call without parameters was: find ( \&findFiles, $startDir); )
Test platform (but the production home will be a Linux machine, Ubuntu 9.1, Perl 5.10, 64 bit): ActiveState Perl 64 bit. Windows XP. From perl -v: v5.10.0 built for MSWin32-x64-multi-thread Binary build 1004 [287188] provided by ActiveState.
You need to create a sub reference that calls your wanted sub with the desired parameters:
find(
sub {
findFiles({ anInIsDryRun2 => $isDryRun });
},
$startDir
);
This is, more-or-less, currying. It's just not pretty currying. :)
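The same idea can be packaged as a small factory that returns the callback. In this sketch make_wanted and the .tif/.tiff filter are my own illustrative choices, and a temporary directory stands in for $startDir:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical factory: returns a "wanted" callback that remembers its own
# parameters ($is_dry_run, $found) by closing over them.
sub make_wanted {
    my ($is_dry_run, $found) = @_;
    return sub {
        # find() sets $_ to the file name inside the current directory.
        return unless -f $_ && /\.tiff?$/i;
        push @$found, $File::Find::name;
        my $verb = $is_dry_run ? "Would convert" : "Converting";
        print "$verb $File::Find::name\n";
    };
}

# Demo on a throwaway directory holding one TIFF and one other file.
my $dir = tempdir(CLEANUP => 1);
for my $name ('a.tif', 'b.txt') {
    open my $fh, '>', "$dir/$name" or die "Can't create $dir/$name: $!";
    close $fh;
}

my @found;
find(make_wanted(1, \@found), $dir);
```

No global variables are needed: each call to make_wanted produces a callback with its own copy of the parameters.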
You can create any sort of code reference you like. You don't have to use a reference to a named subroutine. For many examples of how to do this, see my File::Find::Closures module. I created that module to answer precisely this question.
See the PerlMonks entry Why I hate File::Find and how I (hope I) fixed it describing how to do it with closures.
#
# -----------------------------------------------------------------------------
# Read directory recursively and return only the files matching the regex
# for the file extension. Example: Get all the .pl or .pm files:
# my $arrRefTxtFiles = $objFH->doReadDirGetFilesByExtension ($dir, 'pl|pm')
# -----------------------------------------------------------------------------
sub doReadDirGetFilesByExtension {
my $self = shift; # Remove this if you are not calling OO style
my $dir = shift;
my $ext = shift;
my @arr_files = ();
# File::Find accepts ONLY a single function call, without params, hence:
find(wrapp_wanted_call(\&filter_file_with_ext, $ext, \@arr_files), $dir);
return \@arr_files;
}
#
# -----------------------------------------------------------------------------
# Return only the file with the passed extensions
# -----------------------------------------------------------------------------
sub filter_file_with_ext {
my $ext = shift;
my $arr_ref_files = shift;
my $F = $File::Find::name;
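# Fill into the array behind the array reference any file matching
# the ext regex. The (?:...) group keeps an alternation such as 'pl|pm'
# inside the extension; -f tests $_ because find() has chdir'ed into the directory.
push @$arr_ref_files, $F if (-f $_ and $F =~ /\.(?:$ext)$/);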
}
#
# -----------------------------------------------------------------------------
# The wrapper around the wanted function
# -----------------------------------------------------------------------------
sub wrapp_wanted_call {
my ($function, $param1, $param2) = @_;
sub {
$function->($param1, $param2);
}
}