perl parse command line arguments using shift command - perl

I have a question regarding parsing command line arguments and the use of the shift command in Perl.
I wanted to use this line to launch my Perl script
/home/scripts/test.pl -a --test1 -b /path/to/file/file.txt
So I want to parse the command line arguments. This is part of my script where I do that
if ($arg eq "-a") {
    $main::john = shift(@arguments);
} elsif ($arg eq "-b") {
    $main::doe = shift(@arguments);
}
I then want to use these arguments in a $command variable that will be executed afterwards:
my $var1=$john;
my $var2=$doe;
my $command = "/path/to/tool/tool --in $line --out $outputdir $var1 $var2";
&execute($command);
Now here are two problems that I encounter:
It should not be obligatory to specify -a & -b at the command line. But what happens now is that when I don't specify -a, I get the message that I'm using an uninitialized value at the line where the variable is defined
Second problem: $var2 will now equal $doe, so in this case it will be /path/to/file/file.txt. However, I want $var2 to be equal to --text /path/to/file/file.txt. Where should I specify this --text? It cannot be hard-coded in $command, because that causes a problem when I don't specify -b. Should I do it when I define $doe, and if so, how?

You should build your command string according to the contents of the variables
Like this
my $var1 = $john;
my $var2 = $doe;
my $command = "/path/to/tool/tool --in $line --out $outputdir";
$command .= " $var1" if defined $var1;
$command .= " --text $var2" if defined $var2;
execute($command);
Also
Don't use ampersands & when you are calling a Perl subroutine. That hasn't been good practice for eighteen years now
Don't use package variables like $main::xxx. Lexical variables (declared with my) are almost all that is necessary
As Alnitak says in the comment, you should really be using the Getopt::Long module to avoid introducing errors into your command-line parsing
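A minimal sketch of the Getopt::Long approach, combined with the conditional command building shown above. The helper name build_command and the sample values 'in.txt' and 'out' are assumptions made for illustration; they stand in for the asker's $line and $outputdir. Both options are optional, so no uninitialized-value warning is triggered when one is omitted.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse an argument list and build the tool's command string.
sub build_command {
    my ($line, $outputdir, @args) = @_;
    my ($john, $doe);
    GetOptionsFromArray(\@args,
        'a=s' => \$john,    # -a VALUE (optional)
        'b=s' => \$doe,     # -b FILE  (optional)
    ) or die "Usage: $0 [-a VALUE] [-b FILE]\n";

    my $command = "/path/to/tool/tool --in $line --out $outputdir";
    $command .= " $john"       if defined $john;   # only added when -a was given
    $command .= " --text $doe" if defined $doe;    # only added when -b was given
    return $command;
}

print build_command('in.txt', 'out', '-a', '--test1', '-b', '/path/to/file/file.txt'), "\n";
print build_command('in.txt', 'out'), "\n";
```

In a real script, GetOptions(...) would read @ARGV directly; GetOptionsFromArray is used here only so the example can be run without command-line arguments.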

Getopt::Long might be an option: http://search.cpan.org/~jv/Getopt-Long-2.48/lib/Getopt/Long.pm
Regarding your sample:
You didn't say what should happen if -a or -b are missing, but defaults may solve your problem:
# Use 'john' as default if $var1 is not set
my $var1 = $john // 'john';
# Use an empty value as default if $var2 is not set
my $var2 = $doe // '';
Regarding the --text prefix:
Do you want to set it always?
my $command = "/path/to/tool/tool --in $line --out $outputdir $var1 --text $var2";
Or do you want to set it only if -b (stored in $doe) has been given?
# Prefix
my $var2 = "--text $doe";
# Prefix with default
my $var2 = defined $doe ? "--text $doe" : '';
# Same, but long format
my $var2 = ''; # Set default
if (defined $doe) {
    $var2 = "--text $doe";
}

Related

(Perl) Is it possible to have interpolated variables when a string is read from a file?

I am working on a script that has some variables which are passed on to a string and then printed out. The initial string was only 6 lines, so I didn't need an external file for it, but I now have a new string which can fill over 1000 lines. The new string also has some fields that are to be replaced by variables declared in the script.
The text file has something like:
Hello $name
The code is supposed to have several parts to it.
Declaration of variable
my $name = 'Foo';
Open file and read it into a string.
my $content;
open(my $fh, '<', $filename) or die "cannot open file $filename";
{
local $/;
$content = <$fh>;
}
close($fh);
Print string
print $content
Expected outcome:
Hello Foo
I am wondering if it's possible to read "Hello $name" from a file but print it as "Hello Foo" since the variable name is declared as Foo.
So you want your file to be a template. Why not use a proper template language like this one?
use Template qw( );
my %vars = (
name => "Foo",
);
my $tt = Template->new();
$tt->process($qfn, \%vars)
or die($tt->error());
Template:
Hello [% name %]
The output can be captured instead of printed by using ->process's third arg.
Simplest way:
my $foo = 'Fred';
my $bar = 'Barney';
my $string = 'Say hello to $foo and $bar';
say eval qq{"$string"}
The correct answer to the question (as you've already seen) is to use a proper templating system instead.
But it's worth noting that this is answered in the Perl FAQ.
How can I expand variables in text strings?
If you can avoid it, don't, or if you can use a templating system, such as Text::Template or Template Toolkit, do that instead. You might even be able to get the job done with sprintf or printf:
my $string = sprintf 'Say hello to %s and %s', $foo, $bar;
However, for the one-off simple case where I don't want to pull out a full templating system, I'll use a string that has two Perl scalar variables in it. In this example, I want to expand $foo and $bar to their variable's values:
my $foo = 'Fred';
my $bar = 'Barney';
$string = 'Say hello to $foo and $bar';
One way I can do this involves the substitution operator and a double /e flag. The first /e evaluates $1 on the replacement side and turns it into $foo. The second /e starts with $foo and replaces it with its value. $foo, then, turns into 'Fred', and that's finally what's left in the string:
$string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'
The /e will also silently ignore violations of strict, replacing undefined variable names with the empty string. Since I'm using the /e flag (twice even!), I have all of the same security problems I have with eval in its string form. If there's something odd in $foo, perhaps something like @{[ system "rm -rf /" ]}, then I could get myself in trouble.
To get around the security problem, I could also pull the values from a hash instead of evaluating variable names. Using a single /e, I can check the hash to ensure the value exists, and if it doesn't, I can replace the missing value with a marker, in this case ??? to signal that I missed something:
my $string = 'This has $foo and $bar';
my %Replacements = (
foo => 'Fred',
);
# $string =~ s/\$(\w+)/$Replacements{$1}/g;
$string =~ s/\$(\w+)/
exists $Replacements{$1} ? $Replacements{$1} : '???'
/eg;
print $string;
If you're going to be using Perl, then it's really worth your while to spend an afternoon getting to know the FAQ.
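Applied to the original question, the hash-based substitution from the FAQ might look like the sketch below. The helper name expand_template is an assumption, and $content is set inline here to stand in for the text slurped from the file.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $name placeholders with values from a hash.
# Unknown names become '???' instead of being evaluated as Perl code.
sub expand_template {
    my ($text, $vars) = @_;
    $text =~ s/\$(\w+)/exists $vars->{$1} ? $vars->{$1} : '???'/ge;
    return $text;
}

my $content = "Hello \$name\n";    # stands in for the slurped file contents
print expand_template($content, { name => 'Foo' });
```

Because only hash lookups are performed, nothing in the file is ever run through eval.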

How to get array of hash arguments using Getopt::Long lib in perl?

I want to take arguments as an array of hashes by using Getopt::Long in my script.
Consider the following command line example:
perl testing.pl --systems id=sys_1 ip_address=127.0.0.1 id=sys_2 ip_address=127.0.0.2
For the sake of simplicity, I'm using two systems and only two sub arguments of each system, i.e., id and ip_address. Ideally, the number of systems is dynamic; it may contain 1, 2 or more and so with the number of arguments of each system.
My script should handle these arguments so that they are stored in the @systems array, with each element being a hash containing id and ip_address.
Is there any way in Getopt::Long to achieve this without parsing it myself?
Following is pseudocode for what I'm trying to achieve:
testing.pl
use Getopt::Long;
my @systems;
GetOptions('systems=s' => \@systems);
foreach (@systems) {
print $_->{id},' ', $_->{ip_address};
}
Here is an attempt, there might be more elegant solutions:
GetOptions('systems=s{1,}' => \my @temp );
my @systems;
while (@temp) {
    my $value1 = shift @temp;
    $value1 =~ s/^(\w+)=//; my $key1 = $1;
    my $value2 = shift @temp;
    $value2 =~ s/^(\w+)=//; my $key2 = $1;
    push @systems, { $key1 => $value1, $key2 => $value2 };
}
for (@systems) {
    print $_->{id},' ', $_->{ip_address}, "\n";
}
Output:
sys_1 127.0.0.1
sys_2 127.0.0.2
I actually think this is a design problem more than a problem with Getopt::Long: supporting multiple, paired values passed as command-line arguments is something that you'd be far better off avoiding.
There's a reason that Getopt::Long doesn't really support it: it's not a scalable solution.
How about instead just reading the values from STDIN?:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my %systems = do { local $/; <> } =~ m/id=(\w+) ip_address=([\d\.]+)/mg; # <> reads the named file, or STDIN if no file is given
print Dumper \%systems;
And then you'd be able to invoke your script as:
perl testing.pl <filename_with_args>
Or similar.
And if you really must:
my %systems = "@ARGV" =~ m/id=(\w+) ip_address=([\d\.]+)/g;
Both of the above work for multiple parameters.
However, your comment on another post:
I can't because I'm fetching parameters from database and converting them into command line and then passing it to the script using system command $cmd_lines_args = '--system --id sys_1 --ip_address 127.0.0.0.1'; system("perl testing.pl $cmd_lines_args"); $cmd_lines_args I'll generate dynamically using for loop by reading from database
.. that makes this an XY Problem.
Don't do it like that:
open ( my $script, '|-', "perl testing.pl" );
print {$script} "id=some_id ip_address=10.9.8.7\n";
print {$script} "id=sys2 ip_address=10.9.8.7\n";
etc.
What you are describing,
--systems id=sys_1 ip_address=127.0.0.1 id=sys_2 ip_address=127.0.0.2
appears to be one option that takes a variable number of arguments that are pairs, and come in multiples of two. Getopt::Long's "Options with multiple values" lets you do the following:
GetOptions('systems=s{2,4}' => \@systems);
This lets you specify 2, 3 or 4 arguments, but it does not have syntax for "any even number of arguments" (to cover an arbitrary number of pairs beyond two), and you still have to unpack the id=sys_1 manually then. You can write a user-defined subroutine that handles the processing of --systems' arguments (but does not take into account missing id=...s):
my $system;
my %systems;
GetOptions('systems=s{,}' => sub {
my $option = shift;
my $pair = shift;
my ($key, $value) = split /=/, $pair;
$system = $value if $key eq 'id';
$systems{$system} = $value if $key eq 'ip_address';
});
I would however prefer one of the following schemes:
--system sys_1 127.0.0.1 --system sys_2 127.0.0.1
--system sys_1=127.0.0.1 --system sys_2=127.0.0.1
They're achieved with the following:
GetOptions('system=s{2}', \@systems);
GetOptions('system=s%', \%systems);
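As a rough illustration of the second scheme, the key=value form can be collected straight into a hash. The argument list is hard-coded here as an assumption so the example runs on its own; a real script would call GetOptions against @ARGV instead.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Sample arguments, as if given on the command line.
my @args = qw(--system sys_1=127.0.0.1 --system sys_2=127.0.0.2);

# 's%' stores each key=value pair in the hash: id => ip_address.
my %systems;
GetOptionsFromArray(\@args, 'system=s%' => \%systems)
    or die "Bad options\n";

print "$_ $systems{$_}\n" for sort keys %systems;
```

This maps each id to a single value, so it fits the two-field case; systems with more fields per entry would still need a different structure.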
I would just parse the --systems arg and quote the "hashes" like this:
perl testing.pl --systems "id=s1 ip_address=127.0.0.1 id=s2 ip_address=127.0.0.2"
Parse like perhaps so:
my($systems,@systems);
GetOptions('systems=s' => \$systems);
for(split/\s+/,$systems){
    my($key,$val)=split/=/,$_,2;
    push @systems, {} if !@systems or exists$systems[-1]{$key};
    $systems[-1]{$key}=$val;
}
print "$_->{id} $_->{ip_address}\n" for @systems;
Prints:
sys_1 127.0.0.1
sys_2 127.0.0.2

Perl parameters passing with special characters

This is a pure Perl parameters passing issue. I cannot use Get::Opt as it is not installed on every machine.
I need to pass parameters with spaces and other special chars sometimes. Three scripts to demo the process. Is there a better way to do this?
[gliang@www stackoverflow]$ perl parameter_wrapper.pl
prep.pl #<5> parameters
prep_v2.pl #<5> parameters
<aaa_777-1>
<bbb-6666-2>
<Incomplete QA>
<-reason>
<too long, mail me at ben@example.com :)>
cat parameter_wrapper.pl
#!/usr/bin/perl -w
use strict;
# call prep.pl with 5 parameters
my $cmd = "./prep.pl aaa_777-1 bbb-6666-2 'Incomplete QA' -reason 'too long, mail me at ben\@example.com :)\n'";
system($cmd);
cat prep.pl
#!/usr/bin/perl -w
use strict;
my @parameters = @ARGV;
my $count = scalar(@parameters);
my @parameters_new = wrap_parameters(@parameters);
my $cmd = "./prep_v2.pl @parameters_new";
print "prep.pl #<$count> parameters\n";
system($cmd);
sub wrap_parameters {
    my @parameters = @_;
    my @parameters_new;
    foreach my $var(@parameters) {
        $var = quotemeta($var);
        push(@parameters_new, $var);
    }
    return @parameters_new;
}
cat prep_v2.pl
#!/usr/bin/perl -w
use strict;
my @parameters = @ARGV;
my $count = scalar(@parameters);
print "prep_v2.pl #<$count> parameters\n";
foreach my $var (@parameters) {
#print "<$var>\n";
}
Getopt::Long has been part of the Perl core since Perl 5 was first released in 1994. Are you sure it's not available on the machines you're looking to deploy on? In your comment you refer to it as "Get::Opt", so could you have made a mistake while checking the machines?
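On the quoting itself: one common way to avoid escaping parameters by hand is the list form of system, which bypasses the shell entirely. The sketch below is an illustration under assumptions, not the asker's code: the child is a perl one-liner (run via $^X) standing in for prep_v2.pl, so the example works without the other scripts.

```perl
use strict;
use warnings;

my @params = ('aaa_777-1', 'bbb-6666-2', 'Incomplete QA', '-reason',
              'too long, mail me at ben@example.com :)');

# List form of system: no shell is involved, so each list element reaches
# the child as exactly one argument, with spaces and punctuation intact.
# $^X is the running perl; the -e one-liner stands in for prep_v2.pl.
my $status = system($^X, '-e', 'print "<$_>\n" for @ARGV', @params);
die "child failed: $status\n" if $status != 0;
```

With this form, neither quotemeta nor manual single quotes are needed, since nothing is ever re-parsed by a shell.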

Why is this map function so complicated?

When I run the below script I get
$VAR1 = 'ssh -o Compression=yes -o ConnectTimeout=333 remoteIp \'mbuffer -q -s 128k -m mbufferSize -4 -I mbufferPort|zfs recv recvOpt dstDataSet\'';
which leads me to think that all $shellQuote does is convert an array to a string and add a ' at the beginning and end, plus add a | between two arrays. But I can't figure out the purpose of the map function.
The script is a super simplified version of this in order to figure out what exactly $shellQuote does.
Question
$shellQuote looks very complicated. Does it do anything else I am missing?
#!/usr/bin/perl
use Data::Dumper;
use warnings;
use strict;
my $shellQuote = sub {
    my @return;
    for my $group (@_){
        my @args = @$group;
        for (@args){
            s/'/'"'"'/g;
        }
        push @return, join ' ', map {/^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'}} @args;
    }
    return join '|', @return;
};
sub buildRemoteRefArray {
    my $remote = shift;
    my @sshCmdArray = (qw(ssh -o Compression=yes -o), 'ConnectTimeout=' . '333');
    if ($remote){
        return [@sshCmdArray, $remote, $shellQuote->(@_)];
    }
    return @_;
};
my @recvCmd = buildRemoteRefArray('remoteIp', ['mbuffer', (qw(-q -s 128k -m)), 'mbufferSize', '-4', '-I', 'mbufferPort'], ['zfs', 'recv', 'recvOpt', 'dstDataSet']);
my $cmd = $shellQuote->(@recvCmd);
print Dumper $cmd;
The map function, by which I assume you mean
map {/^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'}} @args
checks each argument to see if it is a legal shell token or not. Legal shell tokens are passed through; anything with a suspicious character gets enclosed in '' quotes.
Bear in mind that your example has two calls to $shellQuote, not just one; you're printing:
print Dumper($shellQuote->(
[
qw(ssh -o Compression=yes -o),
'ConnectTimeout=' . '333',
'remoteIp',
$shellQuote->(
[
'mbuffer',
(qw(-q -s 128k -m)),
'mbufferSize',
'-4',
'-I',
'mbufferPort',
],
[
'zfs',
'recv',
'recvOpt',
'dstDataSet',
],
),
]
));
Here I've indented the arguments to each shell command one step further than the command, to make the structure of the list clearer. So your '' quotes are coming from the outer $shellQuote, which recognizes that the inner $shellQuote has put spaces into its result; the | is coming from the inner $shellQuote, which uses it to combine the two array refs passed to it.
Breaking the map function down, map { expr } @args means 'evaluate expr for each element of @args and make a list of the results'.
/^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'} is a ternary expression (Googleable term). $_ is the current element of @args, and /re/i is true if and only if $_ matches the given regular expression (Googleable term) (case insensitive). The whole expression means 'if the current element of @args contains only the listed characters (ASCII letters, ASCII digits, and the characters -, /, @, =, and _), return it as-is; otherwise return it wrapped in single quotes'.
The for loop, before that, replaces each ' in each element of @args with '"'"', which is a particular way of embedding a single quote into a single-quoted string in sh.
Ignore your code for a second and look at this one as it's a bit clearer.
# shell_quote_arg("foo bar") => 'foo bar'
sub shell_quote_arg {
    my ($s) = @_;
    return $s if $s !~ /[^-\/@=_0-9a-z]/i;
    $s =~ s/'/'"'"'/g; # '
    return qq{'$s'}
}
# shell_quote("echo", "foo bar") => echo 'foo bar'
sub shell_quote {
    return join ' ', map { shell_quote_arg($_) } @_;
}
my $remote_shell_cmd1 = shell_quote('mbuffer', 'arg1a', 'arg1b');
my $remote_shell_cmd2 = shell_quote('zfs', 'arg2a', 'arg2b');
my $remote_shell_cmd = join(' | ', $remote_shell_cmd1, $remote_shell_cmd2);
my $local_shell_cmd = shell_quote('ssh', $host, $remote_shell_cmd);
My shell_quote is used to build a shell command from a program name and argument. For example,
shell_quote('zfs', 'recv', 'recvOpt', 'dstDataSet')
returns
zfs recv recvOpt dstDataSet
So why not just use join(' ', 'zfs', 'recv', 'recvOpt', 'dstDataSet')? Because characters such as spaces, $ and ' have special meaning to the shell. shell_quote needs to do extra work if these are present. For example,
shell_quote('echo', q{He's got $100})
returns
echo 'He'"'"'s got $100' # When run, uses echo to output: He's got $100
The shellQuote you showed does the same thing as my shell_quote, but it also does the join('|', ...) you see in my code.
By the way, notice that shellQuote is called twice. The first time, it's used to build the command to execute on the remote machine, as the following does:
my $remote_shell_cmd1 = shell_quote('mbuffer', 'arg1a', 'arg1b');
my $remote_shell_cmd2 = shell_quote('zfs', 'arg2a', 'arg2b');
my $remote_shell_cmd = join(' | ', $remote_shell_cmd1, $remote_shell_cmd2);
The second time, it's used to build the command to execute on the local machine, as the following does:
my $local_shell_cmd = shell_quote('ssh', $host, $remote_shell_cmd);

Is there a way to check, if an argument is passed in single quotes?

Is there a (best) way to check, if $uri was passed in single quotes?
#!/usr/local/bin/perl
use warnings;
use 5.012;
my $uri = shift;
# uri_check
# ...
Added this example, to make my problem more clear.
#!/usr/local/bin/perl
use warnings;
use 5.012;
use URI;
use URI::Escape;
use WWW::YouTube::Info::Simple;
use Term::Clui;
my $uri = shift;
# uri check here
$uri = URI->new( $uri );
my %params = $uri->query_form;
die "Malformed URL or missing parameter" if $params{v} eq '';
my $video_id = uri_escape( $params{v} );
my $yt = WWW::YouTube::Info::Simple->new( $video_id );
my $info = $yt->get_info();
my $res = $yt->get_resolution();
my #resolution;
for my $fmt ( sort { $a <=> $b } keys %$res ) {
push #resolution, sprintf "%d : %s", $fmt, $res->{$fmt};
}
# with an uri-argument which is not passed in single quotes
# the script doesn't get this far
my $fmt = choose( 'Resolution', #resolution );
$fmt = ( split /\s:\s/, $fmt )[0];
say $fmt;
You can't; bash parses the quotes before the string is passed to the Perl interpreter.
To expand on Blagovest's answer...
perl program http://example.com/foo?bar=23&thing=42 is interpreted by the shell as:
Execute perl and pass it the arguments program and http://example.com/foo?bar=23
Make it run in the background (that's what & means)
Interpret thing=42 as setting the shell variable thing to 42
You might have expected an error like -bash: thing: command not found, but in this case bash interpreted thing=42 as a valid instruction.
The shell handles the quoting and Perl has no knowledge of that. Perl can't issue an error message, it just sees arguments after shell processing. It never even sees the &. This is just one of those Unix things you'll have to learn to live with. The shell is a complete programming environment, for better or worse.
There are other shells which dumb things down quite a bit so you can avoid this issue, but really you're better off learning the quirks and powers of a real shell.
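A small experiment can make this visible. The sketch below assumes a POSIX-style shell behind Perl's backticks; it starts a child perl (via $^X) with a single-quoted URL and has the child print its first argument. The URL is the example one from above.

```perl
use strict;
use warnings;

my $url = 'http://example.com/foo?bar=23&thing=42';

# Backticks hand the command line to the shell, which consumes the single
# quotes before starting the child perl; the child prints its first argument.
my $seen = `"$^X" -e 'print \$ARGV[0]' '$url'`;

print "child saw: $seen\n";
```

The child receives the bare URL with no trace of the quotes, which is why the script has nothing to check for.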