We have a module at work that is included in most scripts to create a logging event that records who invoked the script and which command-line arguments were passed to it. Currently it simply uses a dump of @ARGV.
However, we now want to include this functionality in scripts that may receive sensitive information on the command line. We therefore still want to log the options passed to the script, but with the values masked, for example with s/(?<=.{2})./X/sg
For example:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dump qw(dd);
use Getopt::Long qw(GetOptions);
local @ARGV = ( '-i', '--name' => 'value', '--password' => 'secure info', '--list' => 'foobar', '--list' => 'two' );
# The GetOptions call below specifies the allowed command line options
# to be parsed and validated.
#
# I want some way to accomplish the same but WITHOUT having to specify
# anything.
#
# Something like: GetOptions( \my %hash ); # Just do it without validation
GetOptions( \my %hash, 'i', 'name=s', 'password=s', 'list=s@' );
for ( values %hash ) {
    for ( ref($_) ? @$_ : $_ ) {
        s/(?<=.{2})./X/sg;
    }
}
dd \%hash; # The command line options are then generically logged to a file with the values masked.
Outputs:
{
i => 1,
list => ["foXXXX", "twX"],
name => "vaXXX",
password => "seXXXXXXXXX",
}
The module I'm used to using for CLI parsing is Getopt::Long.
Is there a way to get Getopt::Long to not validate, but simply generically parse the options into a hash without having to specify any of the allowed options? Alternatively, is there another module that would give this ability?
I am not sure how Getopt::Long affects security, but I can think of a couple of ways to limit how much it works with provided arguments.
When a user subroutine is used to process options
It is up to the subroutine to store the value, or do whatever it thinks is appropriate.
I assume that the code just passes things to the sub. Then you can put them away as you wish
GetOptions(sensitive_opt => sub { $sensitive{$_[0]} = $_[1] }, ...) or usage();
Another way would be to not declare the sensitive options to Getopt::Long at all, but to provide the argument callback <>, which runs for each unrecognized item on the command line; those can then be processed by hand. For this the pass_through configuration option needs to be enabled:
use Getopt::Long qw(:config pass_through); # before 2.24, call Getopt::Long::Configure('pass_through') instead
GetOptions(opt => \$normal_opt, ... '<>' => \&process_sensitive) ...
Then the given options are handled normally while the (expected) sensitive ones are processed by hand in process_sensitive().
The drawback here is that the options unmentioned in GetOptions are literally untouched and passed to the sub as mere words, and one at a time. So there would be some work to do.
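As a rough illustration of that "work to do", here is a minimal sketch of a catch-all callback that pairs undeclared options with their values and masks them. The helper names parse_and_mask and mask are made up for this example, and the rule "a bare word following an undeclared option is its value" is an assumption: without a spec there is no way to know whether an option takes a value, so a flag followed by a positional argument (or a value that starts with a dash) will be misread.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config pass_through);

# Mask everything after the first two characters, as in the question.
sub mask {
    my ($value) = @_;
    $value =~ s/(?<=.{2})./X/sg;
    return $value;
}

# Hypothetical helper: parse the declared option (-i) normally and gather
# every undeclared option, with masked values, via the '<>' catch-all.
sub parse_and_mask {
    my @args = @_;
    my ( $i_flag, %masked, $pending );

    my $catch_all = sub {
        my $word = "$_[0]";    # stringify in case an object is passed
        if ( $word =~ /^--?([^=]+)(?:=(.*))?$/s ) {
            my ( $name, $value ) = ( $1, $2 );
            $masked{$name} //= [];
            if ( defined $value ) {    # --name=value form
                push @{ $masked{$name} }, mask($value);
                undef $pending;
            }
            else {                     # value, if any, is the next word
                $pending = $name;
            }
        }
        elsif ( defined $pending ) {
            # Assumption: a bare word after an undeclared option is its value.
            push @{ $masked{$pending} }, mask($word);
            undef $pending;
        }
        else {
            push @{ $masked{''} }, mask($word);    # positional argument
        }
    };

    GetOptionsFromArray( \@args, 'i' => \$i_flag, '<>' => $catch_all )
        or die "option parsing failed\n";
    return ( $i_flag, \%masked );
}

my ( $i_flag, $masked ) = parse_and_mask(
    '-i', '--name', 'value', '--password=secure info',
    '--list', 'foobar', '--list', 'two',
);
print "$_: @{ $masked->{$_} }\n" for sort keys %$masked;
```

Each undeclared option ends up as a key whose value is an array reference of masked values, so repeated options such as --list keep all their values and a bare flag gets an empty list.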
Related
I am new to Perl and I'm confused with its handling of optional arguments.
If I have a perl script that's invoked with something along the lines of:
plgrep [-f] < perl regular expression > < file/directory list >
How would I determine whether or not the -f operator is given or not on the command line?
All of the parameters passed to your program appear in the array @ARGV, so you can simply check whether any of the array elements is the string -f
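A minimal sketch of that manual check, assuming an exact match on -f is what is wanted (the helper name has_f_flag is invented for this example). Note that, unlike Getopt::Long, this leaves -f in @ARGV.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the number of arguments that are exactly "-f".
sub has_f_flag {
    my @args = @_;
    return scalar( grep { $_ eq '-f' } @args );
}

my $f_given = has_f_flag(@ARGV);
print $f_given ? "-f was given\n" : "-f was not given\n";
```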
But if you are writing a program that uses many different options in combination, you may find it simpler to use the Getopt::Long module, which allows you to specify which parameters are optional, which take values, whether there are multiple synonyms for an option, etc.
A call to GetOptions allows you to specify the parameters that your program expects, and will remove from @ARGV any that appear in the command line, saving indicators in simple Perl variables that reflect which were provided and what values, if any, they had
For instance, in the simple case that you describe, you could write your code like this
use strict;
use warnings 'all';
use feature 'say';
use Getopt::Long;
use Data::Dump;
say "\nBefore GetOptions";
dd \@ARGV;
GetOptions( f => \my $f_option);
say "\nAfter GetOptions";
dd $f_option;
dd \@ARGV;
output
Before GetOptions
["-f", "regexp", "file"]
After GetOptions
1
["regexp", "file"]
So you can see that before the call to GetOptions, @ARGV contains all of the data in the command line. But afterwards, the -f has been removed and variable $f_option is set to 1 to indicate that the option was specified
Use Getopt::Long. You could, of course, parse @ARGV (which contains the command line arguments) by hand, but there is no reason to do that when good modules exist for the job.
use warnings;
use strict;
use Getopt::Long;
# Set up defaults here if you wish
my ($flag, $integer, $float, $string);
usage(), exit if not GetOptions(
'f|flag!' => \$flag,
'integer=i' => \$integer,
'float:f' => \$float,
'string:s' => \$string
);
# The script now goes. Has the flag been supplied?
if (defined($flag)) { print "Got flag: $flag\n" } # it's 1
else {
# $flag variable is 'undef'
}
sub usage {
print "Usage: $0 [options]\n"; # -f or -flag, etc
}
The $flag can simply be tested for truth as well, if that is sufficient. To only check whether -f is there or not, all you need is: GetOptions('f' => \$flag); if ($flag) { };.
The module checks whether the invocation specifies arguments as they are expected. These need not be entered, they are "options." However, for an unexpected invocation a die or warn message is printed (and in the above code our usage message is also printed and the script exits). So for script.pl -a the script exits with messages (from module and sub).
Abbreviations of option names are OK, if unambiguous; script.pl -fl 0.5 exits with messages (-flag or -float?) while script.pl -i 5 is OK and $integer is set to 5. On the other hand, if an integer is not supplied after -i that is an error, since that option is defined to take one. Multiple names for options can be specified, like f|flag. Etc. There is far more.
If I have a command line like:
my_script.pl -foo -WHATEVER
My script knows about --foo, and I want Getopt to set variable $opt_foo, but I don't know anything about -WHATEVER. How can I tell Getopt to parse out the options that I've told it about, and then get the rest of the arguments in a string variable or a list?
An example:
use strict;
use warnings;
use Getopt::Long;
my $foo;
GetOptions('foo' => \$foo);
print 'remaining options: ', @ARGV;
Then, issuing
perl getopttest.pl -foo -WHATEVER
gives
Unknown option: whatever
remaining options:
You need to configure the "pass_through" option via Getopt::Long::Configure("pass_through");
Then it supports actual options (i.e. things starting with "-"), without needing the special "--" delimiter to signify the end of the "real" options.
Here's perldoc quote:
pass_through (default: disabled)
Options that are unknown, ambiguous or supplied with an invalid option value are passed through in @ARGV instead of being flagged as errors. This makes it possible to write wrapper scripts that process only part of the user supplied command line arguments, and pass the remaining options to some other program.
Here's an example
$ cat my_script.pl
#!/usr/local/bin/perl5.8 -w
use Getopt::Long;
Getopt::Long::Configure("pass_through");
use Data::Dumper;
my %args;
GetOptions(\%args, "foo") or die "GetOption returned 0\n";
print Data::Dumper->Dump([\@ARGV],["ARGV"]);
$ ./my_script.pl -foo -WHATEVER
$ARGV = [
'-WHATEVER'
];
Aren't the remaining (unparsed) values simply left behind in @ARGV? If your extra content starts with dashes, you will need to indicate the end of the options list with a --:
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;
use Data::Dumper;
my $foo;
my $result = GetOptions ("foo" => \$foo);
print Dumper([ $foo, \@ARGV ]);
Then calling:
my_script.pl --foo -- --WHATEVER
gives:
$VAR1 = [
1,
[
'--WHATEVER'
]
];
PS. In MooseX::Getopt, the "remaining" options from the command line are put into the extra_argv attribute as an arrayref -- so I'd recommend converting!
I think the answer here, sadly though, is "no, there isn't a way to do it exactly like you ask, using Getopt::Long, without parsing @ARGV on your own." Ether has a decent workaround, though. It's a feature as far as most people are concerned that any option-like argument is captured as an error. Normally, you can do
GetOptions('foo' => \$foo)
or die "Whups, got options we don't recognize!";
to capture/prevent odd options from being passed, and then you can correct the user on usage. Alternatively, you can simply pass through and ignore them.
I want to handle a feature that seems almost natural for programs, and I don't know how to handle it with the Perl Getopt packages (whether Std or Long).
I would like something like:
./perlscript <main option> [some options like -h or --output-file some_name]
Options will be handled with - or --, but I want to be able to let the user give me the main and needed option without dashes.
Is Getopt able to do that, or do I have to handle it by hand?
It sounds as though you are talking about non-options -- basic command-line arguments. They can be accessed with @ARGV. The Getopt modules will pass regular arguments through to your script unmolested:
use strict;
use warnings;
use Getopt::Long;
GetOptions (
'foo' => \my $foo,
'bar=s' => \my $bar,
);
my @main_args = @ARGV;
# For example: perl script.pl --foo --bar XXX 1 2 3
# Produces: foo=1 bar=XXX main_args=1 2 3
print "foo=$foo bar=$bar main_args=@main_args\n";
If you want to have it written without a -, and it's also not optional (as you specify), then by any reasoning it isn't an option at all, but an argument. You should simply read it yourself via
my $mainarg = shift
and then let Getopt do its thing. (You might want to check $#ARGV afterwards to verify that the main argument was actually given.)
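A minimal sketch of that order of operations, using the option names from the question (-h and --output-file). The helper name parse_command_line and the usage text are invented for illustration, and GetOptionsFromArray is used instead of GetOptions only so that the routine can be exercised with any argument list.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: take the mandatory first argument, then parse options.
sub parse_command_line {
    my @args = @_;

    # The main argument comes first and must not look like an option.
    my $main = shift @args;
    die "usage: $0 <main> [-h] [--output-file NAME]\n"
        if !defined $main || $main =~ /^-/;

    my %opt;
    GetOptionsFromArray( \@args, \%opt, 'h', 'output-file=s' )
        or die "bad options\n";

    # Whatever is left in @args are further non-option arguments.
    return ( $main, \%opt, \@args );
}
```

Called as parse_command_line(@ARGV), this returns the main argument, a hash reference of the parsed options, and an array reference of any remaining arguments.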
I know how to use Perl's Getopt::Long, but I'm not sure how I can configure it to accept any "--key=value" pair that hasn't been explicitly defined and stick it in a hash. In other words, I don't know ahead of time what options the user may want, so there's no way for me to define all of them, yet I want to be able to parse them all.
Suggestions? Thanks ahead of time.
The Getopt::Long documentation suggests a configuration option that might help:
pass_through (default: disabled)
Options that are unknown, ambiguous or supplied
with an invalid option value are passed through
in @ARGV instead of being flagged as errors.
This makes it possible to write wrapper scripts
that process only part of the user supplied
command line arguments, and pass the remaining
options to some other program.
Once the regular options are parsed, you could use code such as that provided by runrig to parse the ad hoc options.
Getopt::Long doesn't do that. You can parse the options yourself...e.g.
my %opt;
my @OPTS = @ARGV;
for ( @OPTS ) {
    if ( /^--(\w+)=(\w+)$/ ) {
        $opt{$1} = $2;
        shift @ARGV;
    } elsif ( /^--$/ ) {
        shift @ARGV;
        last;
    } else {
        last; # first non-option argument: stop so the shifts stay in sync with @ARGV
    }
}
Or modify Getopt::Long to handle it (or modify the above code to handle more kinds of options if you need that).
I'm a little partial, but I've used Getopt::Whatever in the past to parse unknown arguments.
Potentially, you could use the "Options with hash values" feature.
For example, I wanted to allow users to set arbitrary filters when parsing through an array of objects.
GetOptions(my $options = {}, 'foo=s', 'filter=s%');
my $filters = $options->{filter};
And then call it like
perl ./script.pl --foo bar --filter baz=qux --filter hail=eris
Which would build something like..
$options = {
'filter' => {
'hail' => 'eris',
'baz' => 'qux'
},
'foo' => 'bar'
};
And of course $filters will have the value associated with 'filter'
Good luck! I hope someone found this helpful.
From the documentation:
Argument Callback
A special option 'name' <> can be used to designate a subroutine to handle non-option arguments. When GetOptions() encounters an argument that does not look like an option, it will immediately call this subroutine and passes it one parameter: the argument name.
Well, actually it is an object that stringifies to the argument name.
For example:
my $width = 80;
sub process { ... }
GetOptions ('width=i' => \$width, '<>' => \&process);
When applied to the following command line:
arg1 --width=72 arg2 --width=60 arg3
This will call process("arg1") while $width is 80, process("arg2") while $width is 72, and process("arg3") while $width is 60.
This feature requires configuration option permute, see section
"Configuring Getopt::Long".
This is a good time to roll your own option parser. None of the modules that I've seen on the CPAN provide this type of functionality, and you could always look at their implementations to get a good sense of how to handle the nuts and bolts of parsing.
As an aside, this type of code makes me hate Getopt variants:
use Getopt::Long;
&GetOptions(
'name' => \$value
);
The inconsistent capitalization is maddening, even for people who have seen and used this style of code for a long time.
I have a Perl script that sets up variables near the top for directories and files that it will use. It also requires a few variables to be set as command-line arguments.
Example:
use Getopt::Long;
my ($mount_point, $sub_dir, $database_name, $database_schema);
# Populate variables from the command line:
GetOptions(
'mount_point=s' => \$mount_point,
'sub_dir=s' => \$sub_dir,
'database_name=s' => \$database_name,
'database_schema=s' => \$database_schema
);
# ... validation of required arguments here
################################################################################
# Directory variables
################################################################################
my $input_directory = "/${mount_point}/${sub_dir}/input";
my $output_directory = "/${mount_point}/${sub_dir}/output";
my $log_directory = "/${mount_point}/${sub_dir}/log";
my $database_directory = "/db/${database_name}";
my $database_scripts = "${database_directory}/scripts";
################################################################################
# File variables
################################################################################
my $input_file = "${input_directory}/input_file.dat";
my $output_file = "${output_directory}/output_file.dat";
# ... etc
This works fine in my dev, test, and production environments. However, I was trying to make it easier to override certain variables (without going into the debugger) for development and testing. (For example, if I want to set my input_file = "/tmp/my_input_file.dat"). My thought was to use the GetOptions function to handle this, something like this:
GetOptions(
'input_directory=s' => \$input_directory,
'output_directory=s' => \$output_directory,
'database_directory=s' => \$database_directory,
'log_directory=s' => \$log_directory,
'database_scripts=s' => \$database_scripts,
'input_file=s' => \$input_file,
'output_file=s' => \$output_file
);
GetOptions can only be called once (as far as I know). The first 4 arguments in my first snippet are required, the last 7 directly above are optional. I think an ideal situation would be to set up the defaults as in my first code snippet, and then somehow override any of them that have been set if arguments were passed at the command line. I thought about storing all my options in a hash and then using that hash when setting up each variable with the default value unless an entry exists in the hash, but that seems to add a lot of additional logic. Is there a way to call GetOptions in two different places in the script?
Not sure if that makes any sense.
Thanks!
It sounds like you need to change your program to use configuration files rather than hard-coded configuration. I devoted an entire chapter of Mastering Perl to this. You don't want to change source code to test the program.
There are many Perl modules on CPAN that make configuration files an easy feature to add. Choose the one that works best for your input data.
Once you get a better configuration model in place, you can easily set default values, take values from multiple places (files, command-line, etc), and easily test the program with different values.
Here's another approach. It uses arrays of names and a hash to store the options. It makes all options truly optional, but validates the required ones unless you include "--debug" on the command line. Regardless of whether you use "--debug", you can override any of the others.
You could do more explicit logic checks if that's important to you, of course. I included "--debug" as an example of how to omit the basic options like "mount_point" if you're just going to override the "input_file" and "output_file" variables anyway.
The main idea here is that by keeping sets of option names in arrays, you can include logic checks against the groups with relatively little code.
use Getopt::Long;
my @required_opts = qw(
mount_point
sub_dir
database_name
database_schema
);
my @internal_opts = qw(
input_directory
output_directory
log_directory
database_directory
database_scripts
input_file
output_file
);
my @opt_spec = ("debug", map { "$_:s" } @required_opts, @internal_opts);
# Populate variables from the command line:
GetOptions( \(my %opts), @opt_spec );
# check required options unless --debug was given
my @errors = grep { ! exists $opts{$_} } @required_opts;
if ( @errors && ! $opts{debug} ) {
    die "$0: missing required option(s): @errors\n";
}
################################################################################
# Directory variables
###############################################################################
$opts{input_directory} ||= "/$opts{mount_point}/$opts{sub_dir}/input";
$opts{output_directory} ||= "/$opts{mount_point}/$opts{sub_dir}/output";
$opts{log_directory} ||= "/$opts{mount_point}/$opts{sub_dir}/log";
$opts{database_directory} ||= "/db/$opts{database_name}";
$opts{database_scripts} ||= "$opts{database_directory}/scripts";
################################################################################
# File variables
################################################################################
$opts{input_file} ||= "$opts{input_directory}/input_file.dat";
$opts{output_file} ||= "$opts{output_directory}/output_file.dat";
# ... etc
I think what I would do is set input_directory et al to "undef", and then put them in the getopts, and then afterwards, test if they're still undef and if so assign them as shown. If your users are technically sophisticated enough to understand "if I give a relative path it's relative to $mount_point/$sub_dir", then I'd do additional parsing looking for an initial "/".
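A minimal sketch of that idea, cut down to four of the question's options to keep it short. The helper name resolve_paths is invented for this example, and the "required" check is only one possible policy.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: leave the overridable values undef, parse once,
# then fill in defaults only where nothing was supplied.
sub resolve_paths {
    my @args = @_;
    my ( $mount_point, $sub_dir, $input_directory, $input_file );

    GetOptionsFromArray(
        \@args,
        'mount_point=s'     => \$mount_point,
        'sub_dir=s'         => \$sub_dir,
        'input_directory=s' => \$input_directory,
        'input_file=s'      => \$input_file,
    ) or die "bad options\n";

    die "mount_point and sub_dir are required\n"
        unless defined $mount_point && defined $sub_dir;

    # Only compute a default where the command line left the value undef.
    $input_directory //= "/$mount_point/$sub_dir/input";
    $input_file      //= "$input_directory/input_file.dat";

    return ( $input_directory, $input_file );
}
```

Because the defaults are computed after parsing, overriding --input_directory also changes the default for input_file, while overriding --input_file alone leaves the directory untouched.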
Getopt::Long can parse an arbitrary array instead of @ARGV (see GetOptionsFromArray), and it can be called more than once. Read the documentation.
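For what it is worth, a sketch of two separate parsing passes over the same argument list might look like the following. The helper name parse_in_two_passes is made up, only one required and one optional option from the question are shown, and toggling pass_through around the first call is one possible way to let the first pass ignore options that belong to the second.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: first pass reads the required option, second pass
# reads an optional override whose default depends on the first.
sub parse_in_two_passes {
    my @args = @_;
    my ( $mount_point, $input_file );

    # Pass 1: unknown options are left in @args instead of being errors.
    Getopt::Long::Configure( 'pass_through', 'permute' );
    GetOptionsFromArray( \@args, 'mount_point=s' => \$mount_point )
        or die "bad options\n";
    die "mount_point is required\n" unless defined $mount_point;

    # Pass 2: strict again, so anything still unknown is reported.
    Getopt::Long::Configure('no_pass_through');
    $input_file = "/$mount_point/input_file.dat";    # default
    GetOptionsFromArray( \@args, 'input_file=s' => \$input_file )
        or die "bad options\n";

    return ( $mount_point, $input_file );
}
```

The same pattern works with plain GetOptions on @ARGV; the array variant just makes it easier to test.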