In Perl, how can I concisely check if a $variable is defined and contains a non zero length string? - perl

I currently use the following Perl to check if a variable is defined and contains text. I have to check defined first to avoid an 'uninitialized value' warning:
if (defined $name && length $name > 0) {
# do something with $name
}
Is there a better (presumably more concise) way to write this?

You often see the check for definedness so you don't have to deal with the warning for using an undef value (and in Perl 5.10 it tells you the offending variable):
Use of uninitialized value $name in ...
So, to get around this warning, people come up with all sorts of code, and that code starts to look like an important part of the solution rather than the bubble gum and duct tape that it is. Sometimes, it's better to show what you are doing by explicitly turning off the warning that you are trying to avoid:
{
no warnings 'uninitialized';
if( length $name ) {
...
}
}
In other cases, using some sort of null value instead of the actual data gets around the problem. With Perl 5.10's defined-or operator, give length an explicit empty string (defined, and gives back zero length) instead of the variable that would trigger the warning:
use 5.010;
if( length( $name // '' ) ) {
...
}
In Perl 5.12, it's a bit easier because length on an undefined value also returns undefined. That might seem like a bit of silliness, but that pleases the mathematician I might have wanted to be. That doesn't issue a warning, which is the reason this question exists.
use 5.012;
use warnings;
my $name;
if( length $name ) { # no warning
...
}

As mobrule indicates, you could use the following instead for a small savings:
if (defined $name && $name ne '') {
# do something with $name
}
You could ditch the defined check and get something even shorter, e.g.:
if ($name ne '') {
# do something with $name
}
But in the case where $name is not defined, although the logic flow will work just as intended, if you are using warnings (and you should be), then you'll get the following admonishment:
Use of uninitialized value in string ne
So, if there's a chance that $name might not be defined, you really do need to check for definedness first and foremost in order to avoid that warning. As Sinan Ünür points out, you can use Scalar::MoreUtils to get code that does exactly that (checks for definedness, then checks for zero length) out of the box, via the empty() method:
use Scalar::MoreUtils qw(empty);
if(not empty($name)) {
# do something with $name
}

First, since length always returns a non-negative number,
if ( length $name )
and
if ( length $name > 0 )
are equivalent.
If you are OK with replacing an undefined value with an empty string, you can use Perl 5.10's //= operator which assigns the RHS to the LHS unless the LHS is defined:
#!/usr/bin/perl
use feature qw( say );
use strict; use warnings;
my $name;
say 'nonempty' if length($name //= '');
say "'$name'";
Note the absence of warnings about an uninitialized variable as $name is assigned the empty string if it is undefined.
However, if you do not want to depend on 5.10 being installed, use the functions provided by Scalar::MoreUtils. For example, the above can be written as:
#!/usr/bin/perl
use strict; use warnings;
use Scalar::MoreUtils qw( define );
my $name;
print "nonempty\n" if length($name = define $name);
print "'$name'\n";
If you don't want to clobber $name, use default.

In cases where I don't care whether the variable is undef or equal to '', I usually summarize it as:
$name = "" unless defined $name;
if($name ne '') {
# do something with $name
}

You could say
$name ne ""
instead of
length $name > 0

It isn't always possible to do repetitive things in a simple and elegant way.
Just do what you always do when you have common code that gets replicated across many projects:
Search CPAN, someone may have already the code for you. For this issue I found Scalar::MoreUtils.
If you don't find something you like on CPAN, make a module and put the code in a subroutine:
package My::String::Util;
use strict;
use warnings;
our #ISA = qw( Exporter );
our #EXPORT = ();
our #EXPORT_OK = qw( is_nonempty);
use Carp qw(croak);
sub is_nonempty ($) {
croak "is_nonempty() requires an argument"
unless #_ == 1;
no warnings 'uninitialized';
return( defined $_[0] and length $_[0] != 0 );
}
1;
=head1 BOILERPLATE POD
blah blah blah
=head3 is_nonempty
Returns true if the argument is defined and has non-zero length.
More boilerplate POD.
=cut
Then in your code call it:
use My::String::Util qw( is_nonempty );
if ( is_nonempty $name ) {
# do something with $name
}
Or if you object to prototypes and don't object to the extra parens, skip the prototype in the module, and call it like: is_nonempty($name).

The excellent library Type::Tiny provides an framework with which to build type-checking into your Perl code. What I show here is only the thinnest tip of the iceberg and is using Type::Tiny in the most simplistic and manual way.
Be sure to check out the Type::Tiny::Manual for more information.
use Types::Common::String qw< NonEmptyStr >;
if ( NonEmptyStr->check($name) ) {
# Do something here.
}
NonEmptyStr->($name); # Throw an exception if validation fails

How about
if (length ($name || '')) {
# do something with $name
}
This isn't quite equivalent to your original version, as it will also return false if $name is the numeric value 0 or the string '0', but will behave the same in all other cases.
In perl 5.10 (or later), the appropriate approach would be to use the defined-or operator instead:
use feature ':5.10';
if (length ($name // '')) {
# do something with $name
}
This will decide what to get the length of based on whether $name is defined, rather than whether it's true, so 0/'0' will handle those cases correctly, but it requires a more recent version of perl than many people have available.

if ($name )
{
#since undef and '' both evaluate to false
#this should work only when string is defined and non-empty...
#unless you're expecting someting like $name="0" which is false.
#notice though that $name="00" is not false
}

Related

How to check if an array has an element that is not integer?

Firstly, I have a hash:
$hWriteHash{'Identifier'}{'number'} that contains = 1#2#12#A24#48
Then I split this by "#" and then put it in the #arr_of_tenors variable.
Here's the code:
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
use Data::Dumper;
$nums = $hWriteHash{'Identifier'}{'number'};
my #arr_of_tenors = split("#", $nums);
print("#arr_of_tenors\n");
The output is 1 2 12 A24 48
Now, my goal is if the array has an element that's not an integer which is A24, it will go to die function.
Here's what I've tried.
if(not looks_like_number(#arr_of_tenors)){
die "ERROR: Array has an element that's not an integer.";
}else{
print("All good");
}
Obviously, the only acceptable format should be integers only.
I've tried to use looks_like_number but it didn't work. It always goes to else statement.
I know that there is another option which is grep + regex. But as much as possible, I don't want to use regular expressions on this one if there is a function that does the same job. Also, as much as possible, I don't want to iterate each element.
How does looks_like_number works?
Am I missing something?
Here's yet another way:
use Types::Common qw( Int );
if ( Int->all( #arr_of_tenors ) ) {
# all integers
}
else {
# at least one non-integer
}
And another, because why not?
use Types::Common qw( ArrayRef Int );
if ( ArrayRef->of( Int )->check( \#arr_of_tenors ) ) {
# all integers
}
else {
# at least one non-integer
}
How does looks_like_number works? Am I missing something?
It checks one thing at a time, you fed many things (an array). You need to traverse those many things and make a decision.
You want to error out if not all of the elements look like an integer, right? Then you can use notall from the core module List::Util:
use strict;
use warnings;
use List::Util qw(notall);
my $nums = "1#2#12#A24#48"; # $hWriteHash{"Identifier"}{"number"};
my #arr_of_tenors = split("#", $nums);
if (notall { /^-?\d+\z/ } #arr_of_tenors) {
die "ERROR: Array has an element that doesn't look like an integer.";
}
else {
print "All good\n";
}
which dies with
ERROR: Array has an element that doesn't look like an integer.
The notall function performs the mentioned traversal for you and subjects the predicate (the block above) to each element of the list in turn. Returns true if not all of the elements satisfies the condition; false otherwise. It also shortcircuits, i.e., immediately returns true if it sees a noncomplying element.
Noting that i changed looks_like_number to an integer check with a regex as the former accepts more, e.g., 48.7 etc. But if you are sure the incoming values are integer-like, you can replace the regex with looks_like_number($_) in the block above.
You can use List::Util::any to check if any element of the array does not look like a number:
use warnings;
use strict;
use Scalar::Util qw(looks_like_number);
use Data::Dumper;
use List::Util qw(any);
my $sKey = 'abc';
my %hWriteHash;
$hWriteHash{$sKey}{'number'} = '1#2#12#A24#48';
my $nums = $hWriteHash{$sKey}{'number'};
my #arr_of_tenors = split("#", $nums);
print("#arr_of_tenors\n");
if (any { not looks_like_number($_) } #arr_of_tenors) {
die "ERROR: Array has an element that's not an integer.";
}else{
print("All good");
}
print "\n";
From the docs:
Many cases of using grep in a conditional can be written using any
instead, as it can short-circuit after the first true result.
This works with the input you provided. However, looks_like_number will also be true for numbers like 5.37.

Perl - How to create commands that users can input in console?

I'm just starting in Perl and I'm quite enjoying it. I'm writing some basic functions, but what I really want to be able to do is to use those functions intelligently using console commands. For example, say I have a function adding two numbers. I'd want to be able to type in console "add 2, 4" and read the first word, then pass the two numbers as parameters in an "add" function. Essentially, I'm asking for help in creating some basic scripting using Perl ^^'.
I have some vague ideas about how I might do this in VB, but Perl, I have no idea where I'd start, or what functions would be useful to me. Is there something like VB.net's "Split" function where you can break down the contents of a scalar into an array? Is there a simple way to analyse one word at a time in a scalar, or iterate through a scalar until you hit a separator, for example?
I hope you can help, any suggestions are appreciated! Bear in mind, I'm no expert, I started Perl all of a few weeks ago, and I've only been doing VB.net half a year.
Thank you!
Edit: If you're not sure what to suggest and you know any simple/intuitive resources that might be of help, that would also be appreciated.
Its rather easy to make a script which dispatches to a command by name. Here is a simple example:
#!/usr/bin/env perl
use strict;
use warnings;
# take the command name off the #ARGV stack
my $command_name = shift;
# get a reference to the subroutine by name
my $command = __PACKAGE__->can($command_name) || die "Unknown command: $command_name\n";
# execute the command, using the rest of #ARGV as arguments
# and print the return with a trailing newline
print $command->(#ARGV);
print "\n";
sub add {
my ($x, $y) = #_;
return $x + $y;
}
sub subtract {
my ($x, $y) = #_;
return $x - $y;
}
This script (say its named myscript.pl) can be called like
$ ./myscript.pl add 2 3
or
$ ./myscript.pl subtract 2 3
Once you have played with that for a while, you might want to take it further and use a framework for this kind of thing. There are several available, like App::Cmd or you can take the logic shown above and modularize as you see fit.
You want to parse command line arguments. A space serves as the delimiter, so just do a ./add.pl 2 3 Something like this:
$num1=$ARGV[0];
$num2=$ARGV[1];
print $num1 + $num2;
will print 5
Here is a short implementation of a simple scripting language.
Each statement is exactly one line long, and has the following structure:
Statement = [<Var> =] <Command> [<Arg> ...]
# This is a regular grammar, so we don't need a complicated parser.
Tokens are seperated by whitespace. A command may take any number of arguments. These can either be the contents of variables $var, a string "foo", or a number (int or float).
As these are Perl scalars, there is no visible difference between strings and numbers.
Here is the preamble of the script:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
strict and warnings are essential when learning Perl, else too much weird stuff would be possible. The use 5.010 is a minimum version, it also defines the say builtin (like a print but appends a newline).
Now we declare two global variables: The %env hash (table or dict) associates variable names with their values. %functions holds our builtin functions. The values are anonymous functions.
my %env;
my %functions = (
add => sub { $_[0] + $_[1] },
mul => sub { $_[0] * $_[1] },
say => sub { say $_[0] },
bye => sub { exit 0 },
);
Now comes our read-eval-loop (we don't print by default). The readline operator <> will read from the file specified as the first command line argument, or from STDIN if no filename is provided.
while (<>) {
next if /^\s*\#/; # jump comment lines
# parse the line. We get a destination $var, a $command, and any number of #args
my ($var, $command, #args) = parse($_);
# Execute the anonymous sub specified by $command with the #args
my $value = $functions{ $command }->(#args);
# Store the return value if a destination $var was specified
$env{ $var } = $value if defined $var;
}
That was fairly trivial. Now comes some parsing code. Perl “binds” regexes to strings with the =~ operator. Regexes may look like /foo/ or m/foo/. The /x flags allows us to include whitespace in our regex that doesn't match actual whitespace. The /g flag matches globally. This also enables the \G assertion. This is where the last successful match ended. The /c flag is important for this m//gc style parsing to consume one match at a time, and to prevent the position of the regex engine in out string to being reset.
sub parse {
my ($line) = #_; # get the $line, which is a argument
my ($var, $command, #args); # declare variables to be filled
# Test if this statement has a variable declaration
if ($line =~ m/\G\s* \$(\w+) \s*=\s* /xgc) {
$var = $1; # assign first capture if successful
}
# Parse the function of this statement.
if ($line =~ m/\G\s* (\w+) \s*/xgc) {
$command = $1;
# Test if the specified function exists in our %functions
if (not exists $functions{$command}) {
die "The command $command is not known\n";
}
} else {
die "Command required\n"; # Throw fatal exception on parse error.
}
# As long as our matches haven't consumed the whole string...
while (pos($line) < length($line)) {
# Try to match variables
if ($line =~ m/\G \$(\w+) \s*/xgc) {
die "The variable $1 does not exist\n" if not exists $env{$1};
push #args, $env{$1};
}
# Try to match strings
elsif ($line =~ m/\G "([^"]+)" \s*/xgc) {
push #args, $1;
}
# Try to match ints or floats
elsif ($line =~ m/\G (\d+ (?:\.\d+)? ) \s*/xgc) {
push #args, 0+$1;
}
# Throw error if nothing matched
else {
die "Didn't understand that line\n";
}
}
# return our -- now filled -- vars.
return $var, $command, #args;
}
Perl arrays can be handled like linked list: shift removes and returns the first element (pop does the same to the last element). push adds an element to the end, unshift to the beginning.
Out little programming language can execute simple programs like:
#!my_little_language
$a = mul 2 20
$b = add 0 2
$answer = add $a $b
say $answer
bye
If (1) our perl script is saved in my_little_language, set to be executable, and is in the system PATH, and (2) the above file in our little language saved as meaning_of_life.mll, and also set to be executable, then
$ ./meaning_of_life
should be able to run it.
Output is obviously 42. Note that our language doesn't yet have string manipulation or simple assignment to variables. Also, it would be nice to be able to call functions with the return value of other functions directly. This requires some sort of parens, or precedence mechanism. Also, the language requires better error reporting for batch processing (which it already supports).

How can I "store" an operator inside a variable in Perl?

I'm looking for a way to do this in Perl:
$a = "60"; $b = "< 80";
if ( $a $b ) { then .... }
Here, $b "holds" an operator... can I do this? Maybe some other way?
It's nice to see how people discover functional programming. :-)
Luckily, Perl has capabilities to create and store functions on-the-fly. For example, the sample in your question will look like this:
$a = "60"; $b = sub { $_[0] < 80 };
if ( $b->($a) ) { .... }
In this example, a reference to the anonymous subroutine is stored in $b, the sub having the same syntax for argument passing as a usual one. -> is then used to call-by-reference (the same syntax you probably use for references to arrays and hashes).
But, of course, if you want just to construct Perl expressions from arbitrary strings, you might want to use eval:
$a = "60"; $b = " < 80";
if ( eval ("$a $b") ) { .... }
However, doing this via eval is not safe, if the string you're eval-ing contains parts that come as user input. Sinan Ünür explained it perfectly in his answer-comment.
How about defining a function that wraps the needed condition:
my $cond = sub { $_[0] < 80 };
if ( $cond->( $a ) ) {
...
}
This should be a comment but comments are too cramped for something like this so I am making it CW.
For the case which you showed where the contents of the variables that are going to be passed to string eval, the accepted solution is correct.
If, however, the contents of $a and $b come from user input, then take a look at the following script:
#!/usr/bin/perl
use strict; use warnings;
my $x = '80';
my $y = '; warn "evil laugh!\n"; exit';
if ( eval ($x . $y) ) {
print "it worked!!!\n";
}
If the strings are entered by the user, there is nothing preventing the user from passing to your program the string ';system "rm -rf /bin"'.
So, the correct solution to your question would require writing or using an expression parser.
BTW, you should not use $a and $b as variable names as the are magical package local variables used by sort and as such they are exempt from strict — and you must always use strict and warnings in your programs.
$a = "60"; $b = "< 80";
if( eval($a. $b)){
print "ok";
}
see perldoc eval for more
I wonder if Number::Compare is of any interest here. From the example:
Number::Compare->new(">1Ki")->test(1025); # is 1025 > 1024
my $c = Number::Compare->new(">1M");
$c->(1_200_000); # slightly terser invocation
Safer form if you trust (or can sufficiently validate) $op and don't trust the safety of the inputs:
my $compare_x = $user_input_x;
my $compare_y = $user_input_y;
my $op = <some safe non-user-input, or otherwise checked against a safe list>;
if ( eval("\$compare_x $op \$compare_y") )
{
...
}

Why does Perl complain about "Use of uninitialized value" in my CGI script?

I am cleaning my Perl code for production release and came across a weird warning in the Apache error log.
It says:
[Thu Nov 5 15:19:02 2009] Clouds.pm: Use of uninitialized value $name in substitution (s///) at /home/mike/workspace/olefa/mod-bin/OSA/Clouds.pm line 404.
The relevant code is here:
my $name = shift #_;
my $name_options = shift #_;
$name_options = $name_options eq 'unique' ? 'u'
: $name_options eq 'overwrite' ? 'o'
: $name_options eq 'enumerate' ? 'e'
: $name_options =~ m/^(?:u|o|e)$/ ? $name_options
: q();
if ($name_options ne 'e') {
$name =~ s/ /_/g;
}
So, why the warning of an uninitialized variable as it is clearly initialized?
The warning simply means that $name was never filled with a value, and you tried doing a substitution operation (s///) on it. The default value of a variable is undefined (undef).
Looking back through your script, $name gets its value from #_. This means either #_ was empty, or had its first value as undef.
Depending on what your subroutine needs to do, validate your values before you use them. In this case, since you need something in $name, croak if there isn't something in that variable. You'll get an error message from the perspective of the caller and you'll find the culprit.
Also, you can lose the complexity of the conditional operator chain by making it a hash lookup, which also gives you a chance to initialize $name_option. In your fallback case, you leave $name_option undefined:
use 5.010;
use Carp;
BEGIN {
my %valid_name_options = map {
$_
substr( $_, 0, 1 ),
} qw( unique overwrite enumerate );
some_sub {
my( $name, $name_options ) = #_;
croak( "Name is not defined!" ) unless defined $name;
$name_options = $valid_name_options{$name_options} // '';
if ($name_options ne 'e') {
$name =~ s/ /_/g;
}
...
}
}
Debug by divide and conquer
It is quite usual to run into bugs that are "obviously impossible" --- at the first glance. I usually try to confirm my assumptions with simple print statements (or equivalent ways to get some information back from the program: For CGI scripts, a simple print can ruin your headers).
So, I would put a statement like
print "testing: ", defined($name)? "defined: '$name'" : "undef", "\n";
into the code at the suspected line. You might be surprised about the possible output options:
"testing: undef" --- this means that your function was called with an undefined first argument. Unlikely but possible. You may want to use caller() to find out from where it was called and check the data there.
no output at all! Maybe you look at the wrong source file or at the wrong line. (I don't suspect that's the case here, but it does happen to me).
"testing: defined: 'some data'" --- oops. ask a question on stackoverflow.

How would I do the equivalent of Prototype's Enumerator.detect in Perl with the least amount of code?

Lately I've been thinking a lot about functional programming. Perl offers quite a few tools to go that way, however there's something I haven't been able to find yet.
Prototype has the function detect for enumerators, the descriptions is simply this:
Enumerator.detect(iterator[, context]) -> firstElement | undefined
Finds the first element for which the iterator returns true.
Enumerator in this case is any list while iterator is a reference to a function, which is applied in turn on each element of the list.
I am looking for something like this to apply in situations where performance is important, i.e. when stopping upon encountering a match saves time by disregarding the rest of the list.
I am also looking for a solution that would not involve loading any extra module, so if possible it should be done with builtins only. And if possible, it should be as concise as this for example:
my #result = map function #array;
You say you don't want a module, but this is exactly what the first function in List::Util does. That's a core module, so it should be available everywhere.
use List::Util qw(first);
my $first = first { some condition } #array;
If you insist on not using a module, you could copy the implementation out of List::Util. If somebody knew a faster way to do it, it would be in there. (Note that List::Util includes an XS implementation, so that's probably faster than any pure-Perl approach. It also has a pure-Perl version of first, in List::Util::PP.)
Note that the value being tested is passed to the subroutine in $_ and not as a parameter. This is a convenience when you're using the first { some condition} #values form, but is something you have to remember if you're using a regular subroutine. Some more examples:
use 5.010; # I want to use 'say'; nothing else here is 5.10 specific
use List::Util qw(first);
say first { $_ > 3 } 1 .. 10; # prints 4
sub wanted { $_ > 4 }; # note we're using $_ not $_[0]
say first \&wanted, 1 .. 10; # prints 5
my $want = \&wanted; # Get a subroutine reference
say first \&$want, 1 .. 10; # This is how you pass a reference in a scalar
# someFunc expects a parameter instead of looking at $_
say first { someFunc($_) } 1 .. 10;
Untested since I don't have Perl on this machine, but:
sub first(\&#) {
my $pred = shift;
die "First argument to "first" must be a sub" unless ref $pred eq 'CODE';
for my $val (#_) {
return $val if $pred->($val);
}
return undef;
}
Then use it as:
my $first = first { sub performing test } #list;
Note that this doesn't distinguish between no matches in the list and one of the elements in the list being an undefined value and having that match.
Just since its not here, a Perl function definition of first that localizes $_ for its block:
sub first (&#) {
my $code = shift;
for (#_) {return $_ if $code->()}
undef
}
my #array = 1 .. 10;
say first {$_ > 5} #array; # prints 6
While it will work fine, I don't advocate using this version, since List::Util is a core module (installed by default), and its implementation of first will usually use the XS version (written in C) which is much faster.