If-statement condition reduction - perl

Suppose it is:
...
use Config::Properties::Simple;
...
my $p = Config::Properties::Simple->new( file => $propfile );
...
$str = $p->getProperty('prop');
...
Is
...
if ( defined $str and $str ne "" ) { #1
...
equivalent to
...
if ($str) { #2
...
?
If not, is there a way to simplify the #1 marked statement?

No, they're not the same if $str is "0".
You can simplify the statement by just checking the length:
if (length $str) { ...
In recent versions of Perl, length(undef) is undef without any warning generated. And using undef as a boolean doesn't generate a warning either.
(By "recent" I mean 5.12 and up. Previously, length(undef) would produce "Use of uninitialized value in length" if you have warnings turned on, which you should.)
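To make the difference concrete, here is a small sketch that runs the three candidate tests against the edge cases. The sub names (check_original, check_truth, check_length) are made up for illustration, and the length variant assumes Perl 5.12 or later so that length(undef) stays quiet:

```perl
use strict;
use warnings;

# Hypothetical helpers, one per candidate condition.
sub check_original { my ($s) = @_; return defined $s && $s ne "" ? 1 : 0 }
sub check_truth    { my ($s) = @_; return $s ? 1 : 0 }
# Assumes Perl 5.12+, where length(undef) returns undef without a warning.
sub check_length   { my ($s) = @_; return length($s) ? 1 : 0 }

for my $value ( undef, "", "0", "abc" ) {
    my $label = defined $value ? "'$value'" : 'undef';
    printf "%-6s original=%d truth=%d length=%d\n",
        $label,
        check_original($value), check_truth($value), check_length($value);
}
```

The only row where the columns disagree is '0': the original test and the length test accept it, plain truthiness rejects it.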

No. It's different for $str=0; and $str="0"; for starters.
Maybe. Depends on what values $str can have, what you are checking for and what version of Perl you want to support. Possibilities:
if ($str)
if (length($str))
if (defined($str))

Related

Where is the error "Use of uninitialized value in string ne" coming from?

Where is the uninitialised value in the below code?
#!/usr/bin/perl
use warnings;
my @sites = (undef, "a", "b");
my $sitecount = 1;
my $url;
while (($url = $sites[$sitecount]) ne undef) {
$sitecount++;
}
Output:
Use of uninitialized value in string ne at t.pl line 6.
Use of uninitialized value in string ne at t.pl line 6.
Use of uninitialized value in string ne at t.pl line 6.
Use of uninitialized value in string ne at t.pl line 6.
You can't use undef in a string comparison without a warning.
if ("a" ne undef) { ... }
will raise a warning. If you want to test if a variable is defined or not, use:
if (defined $var) { ... }
Comments about the original question:
That's a strange way to iterate over an array. The more usual way of doing this would be:
foreach my $url (@sites) { ... }
and drop the $sitecount variable completely, and don't overwrite $url in the loop body. Also drop the undef value in that array. If you don't want to remove that undef for some reason (or expect undefined values to be inserted in there), you could do:
foreach my $url (@sites) {
next unless defined $url;
...
}
If you do want to test for undefined with your form of loop construct, you'd need:
while (defined $sites[$sitecount]) {
my $url = $sites[$sitecount];
...
$sitecount++;
}
to avoid the warnings, but beware of autovivification, and that loop would stop short if you have undefs mixed in between other live values.
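A quick sketch of that last caveat, using an array with an undef in the middle. The two sub names are hypothetical, chosen just for this comparison:

```perl
use strict;
use warnings;

# Hypothetical helper: walk from $start until the first undefined element.
sub collect_until_undef {
    my ( $start, @sites ) = @_;
    my @seen;
    my $i = $start;
    while ( defined $sites[$i] ) {
        push @seen, $sites[$i];
        $i++;
    }
    return @seen;
}

# Hypothetical helper: keep every defined element, wherever it sits.
sub collect_defined {
    my (@sites) = @_;
    return grep { defined } @sites;
}

my @sites = ( undef, "a", undef, "b" );
print "while/defined: @{[ collect_until_undef( 1, @sites ) ]}\n";
print "skip undefs:   @{[ collect_defined(@sites) ]}\n";
```

The while form stops at the gap and never reaches "b"; filtering with defined sees both values.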
The correct answers have already been given (defined is how you check a value for definedness), but I wanted to add something.
In perlop you will read this description of ne:
Binary "ne" returns true if the left argument is stringwise not equal
to the right argument.
Note the use of "stringwise". It basically means that just like with other operators, such as ==, where the argument type is pre-defined, any arguments to ne will effectively be converted to strings before the operation is performed. This is to accommodate operations such as:
if ($foo == "1002") # string "1002" is converted to a number
if ($foo eq 1002) # number 1002 is converted to a string
Perl has no fixed data types, and relies on conversion of data. In this case, undef (which incidentally is not a value, it is a function: undef(), which returns the undefined value), is converted to a string. This conversion will cause false positives that may be hard to detect if warnings is not in effect.
Consider:
perl -e 'print "" eq undef() ? "yes" : "no"'
This will print "yes", even though clearly the empty string "" is not equal to not defined. By using warnings, we can catch this error.
What you want is probably something like:
for my $url (@sites) {
last unless defined $url;
...
}
Or, if you want to skip to a certain array element:
my $start = 1;
for my $index ($start .. $#sites) {
last unless defined $sites[$index];
...
}
Same basic principle, but using an array slice, and avoiding indexes:
my $start = 1;
for my $url (@sites[$start .. $#sites]) {
last unless defined $url;
...
}
Note that the use of last instead of next is the logical equivalent of your while loop condition: When an undefined value is encountered, the loop is exited.
More debugging: http://codepad.org/Nb5IwX0Q
If you, like in this paste above, print out the iteration counter and the value, you will quite clearly see when the different warnings appear. You get one warning for the first comparison "a" ne undef, one for the second, and two for the last. The last warnings come when $sitecount exceeds the max index of @sites, and you are comparing two undefined values with ne.
Perhaps the message would be better to understand if it was:
You are trying to compare an uninitialized value with a string.
The uninitialized value is, of course, undef.
To explicitly check if $something is defined, you need to write
defined $something
ne is for string comparison, and undef is not a string:
#!/usr/bin/perl
use warnings;
('l' ne undef) ? 0 : 0;
Use of uninitialized value in string ne at t.pl line 3.
It does work, but you get a [slightly confusing] warning (at least with use warnings) because undef is not an "initialized value" for ne to use.
Instead, use the operator defined to find whether a value is defined:
#!/usr/bin/perl
use warnings;
my @sites = (undef, "a", "b");
my $sitecount = 1;
my $url;
while (defined $sites[$sitecount]) { # <----------
$url = $sites[$sitecount];
# ...
$sitecount++;
}
... or loop over the @sites array more conventionally, as Mat explores in his answer.

Perl throws an error message about syntax

So, building off a question about string matching (this thread), I am working on implementing that info in solution 3 into a working solution to the problem I am working on.
However, I am getting errors, specifically about this line of the below function:
next if @$args->{search_in} !~ /@$cur[1]/;
syntax error at ./db_index.pl line 16, near "next "
My question as a perl newbie is what am I doing wrong here?
sub search_for_key
{
my ($args) = @_;
foreach $row(@{$args->{search_ary}}){
print "@$row[0] : @$row[1]\n";
}
my $thiskey = NULL;
foreach $cur (@{$args->{search_ary}}){
print "\n" . @$cur[1] . "\n"
next if @$args->{search_in} !~ /@$cur[1]/;
$thiskey = @$cur[0];
last;
last;
}
return $thiskey;
}
You left off the semicolon at the end of the previous line. That's what caused the syntax error, anyway. I think you're also misusing $args, but it's hard to be sure about that without knowing how you're calling this function.
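For what it's worth, here is a sketch of how the sub might look once the syntax error is fixed. It rests on a guess about the call shape that the question does not confirm: a single hash reference whose search_in is a string and whose search_ary is a list of [key, pattern] pairs. The sample data is invented.

```perl
use strict;
use warnings;

# Assumed call shape (a guess, not confirmed by the question):
#   search_for_key({ search_in => $string, search_ary => [ [ $key, $pattern ], ... ] })
sub search_for_key {
    my ($args) = @_;
    for my $cur ( @{ $args->{search_ary} } ) {
        my ( $key, $pattern ) = @$cur;
        # Return the first key whose pattern matches the search string.
        return $key if $args->{search_in} =~ /$pattern/;
    }
    return;    # nothing matched
}

# Invented sample data, purely for illustration.
my $key = search_for_key({
    search_in  => 'error: disk full',
    search_ary => [ [ net => 'timeout' ], [ disk => 'disk' ] ],
});
print defined $key ? "matched $key\n" : "no match\n";
```

Returning early from the loop and using a bare return for "not found" removes the need for the NULL placeholder and the $thiskey variable.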
There are several issues here.
Are you adding use strict; and use warnings; at the top of your script before you do anything else? You only posted the sub, but it is clear that you are not using these.
What is NULL? (strict will not let you use barewords...) Be sure to read What is Truth in Perl? The more Perly way to deal with "true" or "false" is defined / undef, or exists, or to test specifically for a value chosen as a convention.
Missing ; after print "\n" . @$cur[1] . "\n"
Your data structures seem way too complicated. From what I can tell, you are passing a reference to a hash of arrays, true? When your data structures get really obscure, back up and look at what you are trying to do...
Perl gives you plenty of ways to shoot yourself in the foot. It is not strictly typed, and you will do yourself (and your readers) a favor by naming references as a derivative of what they refer to. So instead of $args use $ref2HoArefs, for example.
Side note: are you sure you can't just use a hash for what you're doing? It seems awfully complicated to do something so simple:
my %hash = (
key1 => 'value1',
key2 => 'value2',
);
exists $hash{$search_in}; # true/false.
my $result = $hash{$search_in}; # returns 'value1' when $search_in is 'key1'
Or if you need to search by value:
my %flip = reverse %hash;
$result = $flip{$search_in};
And if you really need a regex key ( or value ) lookup:
sub string_match {
my ($lookup_hash, $key ) = @_;
for my $hash_key ( keys %{ $lookup_hash } ){ # iterate over keys only, not keys and values
return $hash_key if $key =~ $lookup_hash->{$hash_key};
}
return; # not found.
}
my $k = string_match({
'whitespace at end' => qr/\s+$/,
'whitespace at start' => qr/^\s+/,
}, "Some Garbage string "); # k == whitespace at end

Why does Perl complain about "Use of uninitialized value" in my CGI script?

I am cleaning my Perl code for production release and came across a weird warning in the Apache error log.
It says:
[Thu Nov 5 15:19:02 2009] Clouds.pm: Use of uninitialized value $name in substitution (s///) at /home/mike/workspace/olefa/mod-bin/OSA/Clouds.pm line 404.
The relevant code is here:
my $name = shift @_;
my $name_options = shift @_;
$name_options = $name_options eq 'unique' ? 'u'
: $name_options eq 'overwrite' ? 'o'
: $name_options eq 'enumerate' ? 'e'
: $name_options =~ m/^(?:u|o|e)$/ ? $name_options
: q();
if ($name_options ne 'e') {
$name =~ s/ /_/g;
}
So, why the warning of an uninitialized variable as it is clearly initialized?
The warning simply means that $name was never filled with a value, and you tried doing a substitution operation (s///) on it. The default value of a variable is undefined (undef).
Looking back through your script, $name gets its value from @_. This means either @_ was empty, or had its first value as undef.
Depending on what your subroutine needs to do, validate your values before you use them. In this case, since you need something in $name, croak if there isn't something in that variable. You'll get an error message from the perspective of the caller and you'll find the culprit.
Also, you can lose the complexity of the conditional operator chain by making it a hash lookup, which also gives you a chance to initialize $name_options. In your fallback case, you leave $name_options undefined:
use 5.010;
use Carp;
BEGIN {
# Map each long option name to its one-letter code: unique => 'u', and so on.
my %valid_name_options = map {
$_ => substr( $_, 0, 1 )
} qw( unique overwrite enumerate );
sub some_sub {
my( $name, $name_options ) = @_;
croak( "Name is not defined!" ) unless defined $name;
$name_options = $valid_name_options{$name_options} // '';
if ($name_options ne 'e') {
$name =~ s/ /_/g;
}
...
}
}
Debug by divide and conquer
It is quite usual to run into bugs that are "obviously impossible" --- at the first glance. I usually try to confirm my assumptions with simple print statements (or equivalent ways to get some information back from the program: For CGI scripts, a simple print can ruin your headers).
So, I would put a statement like
print "testing: ", defined($name)? "defined: '$name'" : "undef", "\n";
into the code at the suspected line. You might be surprised about the possible output options:
"testing: undef" --- this means that your function was called with an undefined first argument. Unlikely but possible. You may want to use caller() to find out from where it was called and check the data there.
no output at all! Maybe you look at the wrong source file or at the wrong line. (I don't suspect that's the case here, but it does happen to me).
"testing: defined: 'some data'" --- oops. ask a question on stackoverflow.

In Perl, how can I concisely check if a $variable is defined and contains a non zero length string?

I currently use the following Perl to check if a variable is defined and contains text. I have to check defined first to avoid an 'uninitialized value' warning:
if (defined $name && length $name > 0) {
# do something with $name
}
Is there a better (presumably more concise) way to write this?
You often see the check for definedness so you don't have to deal with the warning for using an undef value (and in Perl 5.10 it tells you the offending variable):
Use of uninitialized value $name in ...
So, to get around this warning, people come up with all sorts of code, and that code starts to look like an important part of the solution rather than the bubble gum and duct tape that it is. Sometimes, it's better to show what you are doing by explicitly turning off the warning that you are trying to avoid:
{
no warnings 'uninitialized';
if( length $name ) {
...
}
}
In other cases, using some sort of null value instead of the actual data gets around the problem. With Perl 5.10's defined-or operator, give length an explicit empty string (defined, and gives back zero length) instead of the variable that would trigger the warning:
use 5.010;
if( length( $name // '' ) ) {
...
}
In Perl 5.12, it's a bit easier because length on an undefined value also returns undefined. That might seem like a bit of silliness, but that pleases the mathematician I might have wanted to be. That doesn't issue a warning, which is the reason this question exists.
use 5.012;
use warnings;
my $name;
if( length $name ) { # no warning
...
}
As mobrule indicates, you could use the following instead for a small savings:
if (defined $name && $name ne '') {
# do something with $name
}
You could ditch the defined check and get something even shorter, e.g.:
if ($name ne '') {
# do something with $name
}
But in the case where $name is not defined, although the logic flow will work just as intended, if you are using warnings (and you should be), then you'll get the following admonishment:
Use of uninitialized value in string ne
So, if there's a chance that $name might not be defined, you really do need to check for definedness first and foremost in order to avoid that warning. As Sinan Ünür points out, you can use Scalar::MoreUtils to get code that does exactly that (checks for definedness, then checks for zero length) out of the box, via the empty() method:
use Scalar::MoreUtils qw(empty);
if(not empty($name)) {
# do something with $name
}
First, since length always returns a non-negative number,
if ( length $name )
and
if ( length $name > 0 )
are equivalent.
If you are OK with replacing an undefined value with an empty string, you can use Perl 5.10's //= operator which assigns the RHS to the LHS unless the LHS is defined:
#!/usr/bin/perl
use feature qw( say );
use strict; use warnings;
my $name;
say 'nonempty' if length($name //= '');
say "'$name'";
Note the absence of warnings about an uninitialized variable as $name is assigned the empty string if it is undefined.
However, if you do not want to depend on 5.10 being installed, use the functions provided by Scalar::MoreUtils. For example, the above can be written as:
#!/usr/bin/perl
use strict; use warnings;
use Scalar::MoreUtils qw( define );
my $name;
print "nonempty\n" if length($name = define $name);
print "'$name'\n";
If you don't want to clobber $name, use default.
In cases where I don't care whether the variable is undef or equal to '', I usually summarize it as:
$name = "" unless defined $name;
if($name ne '') {
# do something with $name
}
You could say
$name ne ""
instead of
length $name > 0
It isn't always possible to do repetitive things in a simple and elegant way.
Just do what you always do when you have common code that gets replicated across many projects:
Search CPAN, someone may already have written the code for you. For this issue I found Scalar::MoreUtils.
If you don't find something you like on CPAN, make a module and put the code in a subroutine:
package My::String::Util;
use strict;
use warnings;
require Exporter; # load Exporter so that inheriting from it works
our @ISA = qw( Exporter );
our @EXPORT = ();
our @EXPORT_OK = qw( is_nonempty );
use Carp qw(croak);
sub is_nonempty ($) {
croak "is_nonempty() requires an argument"
unless @_ == 1;
no warnings 'uninitialized';
return( defined $_[0] and length $_[0] != 0 );
}
1;
=head1 BOILERPLATE POD
blah blah blah
=head3 is_nonempty
Returns true if the argument is defined and has non-zero length.
More boilerplate POD.
=cut
Then in your code call it:
use My::String::Util qw( is_nonempty );
if ( is_nonempty $name ) {
# do something with $name
}
Or if you object to prototypes and don't object to the extra parens, skip the prototype in the module, and call it like: is_nonempty($name).
The excellent library Type::Tiny provides a framework with which to build type-checking into your Perl code. What I show here is only the thinnest tip of the iceberg and is using Type::Tiny in the most simplistic and manual way.
Be sure to check out the Type::Tiny::Manual for more information.
use Types::Common::String qw< NonEmptyStr >;
if ( NonEmptyStr->check($name) ) {
# Do something here.
}
NonEmptyStr->($name); # Throw an exception if validation fails
How about
if (length ($name || '')) {
# do something with $name
}
This isn't quite equivalent to your original version, as it will also return false if $name is the numeric value 0 or the string '0', but will behave the same in all other cases.
In perl 5.10 (or later), the appropriate approach would be to use the defined-or operator instead:
use feature ':5.10';
if (length ($name // '')) {
# do something with $name
}
This decides what to take the length of based on whether $name is defined, rather than whether it's true, so it handles 0/'0' correctly, but it requires a more recent version of perl than many people have available.
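A side-by-side sketch of the two forms, for anyone who wants to see exactly where they part ways. The sub names are invented for the comparison, and the // version needs Perl 5.10 or later:

```perl
use strict;
use warnings;
use 5.010;    # the // operator needs Perl 5.10+

# Hypothetical helpers: same test, differing only in || versus //.
sub nonempty_or         { my ($name) = @_; return length( $name || '' ) ? 1 : 0 }
sub nonempty_defined_or { my ($name) = @_; return length( $name // '' ) ? 1 : 0 }

for my $name ( undef, '', '0', 'abc' ) {
    my $label = defined $name ? "'$name'" : 'undef';
    printf "%-6s ||: %d  //: %d\n",
        $label, nonempty_or($name), nonempty_defined_or($name);
}
```

Only the '0' row differs: || swaps it for the empty string because it is false, while // keeps it because it is defined.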
if ($name )
{
#since undef and '' both evaluate to false
#this should work only when string is defined and non-empty...
#unless you're expecting something like $name="0" which is false.
#notice though that $name="00" is not false
}

How do I tell if a variable has a numeric value in Perl?

Is there a simple way in Perl that will allow me to determine if a given variable is numeric? Something along the lines of:
if (is_number($x))
{ ... }
would be ideal. A technique that won't throw warnings when the -w switch is being used is certainly preferred.
Use Scalar::Util::looks_like_number() which uses the internal Perl C API's looks_like_number() function, which is probably the most efficient way to do this.
Note that the strings "inf" and "infinity" are treated as numbers.
Example:
#!/usr/bin/perl
use warnings;
use strict;
use Scalar::Util qw(looks_like_number);
my @exprs = qw(1 5.25 0.001 1.3e8 foo bar 1dd inf infinity);
foreach my $expr (@exprs) {
print "$expr is", looks_like_number($expr) ? '' : ' not', " a number\n";
}
Gives this output:
1 is a number
5.25 is a number
0.001 is a number
1.3e8 is a number
foo is not a number
bar is not a number
1dd is not a number
inf is a number
infinity is a number
See also:
perldoc Scalar::Util
perldoc perlapi for looks_like_number
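If the "inf" and "infinity" cases are unwanted, one possible workaround is to wrap the call and reject those spellings afterwards. This is only a sketch: is_finite_number is a made-up name, and the extra pattern is my own addition rather than anything Scalar::Util provides:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical wrapper: accept what looks_like_number() accepts,
# minus anything spelled like infinity or not-a-number.
sub is_finite_number {
    my ($x) = @_;
    return 0 unless defined $x && looks_like_number($x);
    return $x =~ /inf|nan/i ? 0 : 1;
}

for my $expr (qw(1 5.25 1.3e8 foo inf infinity)) {
    print "$expr is", is_finite_number($expr) ? '' : ' not', " a finite number\n";
}
```

The second check is safe for ordinary decimal input because the only letter a plain decimal number can contain is the exponent marker e.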
The original question was how to tell if a variable was numeric, not if it "has a numeric value".
There are a few operators that have separate modes of operation for numeric and string operands, where "numeric" means anything that was originally a number or was ever used in a numeric context (e.g. in $x = "123"; 0+$x, before the addition, $x is a string, afterwards it is considered numeric).
One way to tell is this:
if ( length( do { no warnings "numeric"; $x & "" } ) ) {
print "$x is numeric\n";
}
If the bitwise feature is enabled (it makes & a numeric-only operator and adds a separate string operator, &.), you must disable it:
if ( length( do { no if $] >= 5.022, "feature", "bitwise"; no warnings "numeric"; $x & "" } ) ) {
print "$x is numeric\n";
}
(bitwise is available in perl 5.022 and above, and enabled by default if you use 5.028; or above.)
Check out the CPAN module Regexp::Common. I think it does exactly what you need and handles all the edge cases (e.g. real numbers, scientific notation, etc). e.g.
use Regexp::Common;
if ($var =~ /$RE{num}{real}/) { print q{a number}; }
Usually number validation is done with regular expressions. This code will determine if something is numeric as well as check for undefined variables so as not to throw warnings:
sub is_integer {
defined $_[0] && $_[0] =~ /^[+-]?\d+$/;
}
sub is_float {
defined $_[0] && $_[0] =~ /^[+-]?\d+(\.\d+)?$/;
}
Here's some reading material you should look at.
A simple (and maybe simplistic) answer to the question "is the content of $x numeric?" is the following:
if ($x eq $x+0) { .... }
It does a textual comparison of the original $x with the $x converted to a numeric value.
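The "simplistic" part shows up with values that are numeric but not in canonical form. A small sketch (roundtrips is a name I made up for the test above, with the numeric warning silenced so it can be fed anything):

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the "$x eq $x+0" test.
sub roundtrips {
    my ($x) = @_;
    no warnings 'numeric';    # keep quiet for inputs like 'abc'
    return $x eq $x + 0 ? 1 : 0;
}

for my $input ( '42', '0', '1.0', '1e3', '007', 'abc' ) {
    printf "%-5s %s\n", $input, roundtrips($input) ? 'passes' : 'fails';
}
```

'42' and '0' pass, but '1.0', '1e3' and '007' fail along with 'abc', because they stringify back as 1, 1000 and 7. So the test really asks "is this already in the form Perl would print?", which is stricter than "is this a number?".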
Not perfect, but you can use a regex:
sub isnumber
{
shift =~ /^-?\d+\.?\d*$/;
}
A slightly more robust regex can be found in Regexp::Common.
It sounds like you want to know if Perl thinks a variable is numeric. Here's a function that traps that warning:
sub is_number{
my $n = shift;
my $ret = 1;
local $SIG{"__WARN__"} = sub {$ret = 0}; # local: the previous handler is restored when the sub returns
eval { my $x = $n + 1 };
return $ret
}
Another option is to turn off the warning locally:
{
no warnings "numeric"; # Ignore "isn't numeric" warning
... # Use a variable that might not be numeric
}
Note that non-numeric variables will be silently converted to 0, which is probably what you wanted anyway.
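One detail worth adding: the conversion uses any leading numeric part of the string, so only strings with no numeric prefix end up as 0. A minimal sketch, with to_number as a made-up helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: numify a value with the "isn't numeric" warning switched off.
sub to_number {
    my ($value) = @_;
    no warnings 'numeric';
    return $value + 0;
}

for my $input ( 'abc', '12abc', '3.5kg', '42' ) {
    printf "%-7s => %s\n", "'$input'", to_number($input);
}
```

Here 'abc' becomes 0, while '12abc' becomes 12 and '3.5kg' becomes 3.5, which may or may not be what you want once the warning is off.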
Regexes aren't perfect... this is:
use Try::Tiny;
sub is_numeric {
my ($x) = @_;
my $numeric = 1;
try {
use warnings FATAL => qw/numeric/;
0 + $x;
}
catch {
$numeric = 0;
};
return $numeric;
}
Try this:
if (($x !~ /\D/) && ($x ne "")) { ... }
I found this interesting though
if ( $value + 0 eq $value) {
# A number
push @args, $value;
} else {
# A string
push @args, "'$value'";
}
Personally I think that the way to go is to rely on Perl's internal context to make the solution bullet-proof. A good regexp could match all the valid numeric values and none of the non-numeric ones (or vice versa), but as there is a way of employing the same logic the interpreter is using it should be safer to rely on that directly.
As I tend to run my scripts with -w, I had to combine the idea of comparing the result of "value plus zero" to the original value with the no warnings based approach of @ysth:
do {
no warnings "numeric";
if ($x + 0 ne $x) { return "not numeric"; } else { return "numeric"; }
}
You can use Regular Expressions to determine if $foo is a number (or not).
Take a look here:
How do I determine whether a scalar is a number
There is a highly upvoted accepted answer around using a library function, but it includes the caveat that "inf" and "infinity" are accepted as numbers. I see some regex stuff for answers too, but they seem to have issues. I tried my hand at writing some regex that would work better (I'm sorry it's long)...
/^0$|^[+-]?[1-9][0-9]*$|^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$|^[+-]?[0-9]?\.[0-9]+$|^[+-]?[1-9][0-9]*\.[0-9]+$/
That's really 5 patterns separated by "or"...
Zero: ^0$
It's a kind of special case. It's the only integer that can start with 0.
Integers: ^[+-]?[1-9][0-9]*$
That makes sure the first digit is 1 to 9 and allows 0 to 9 for any of the following digits.
Scientific Numbers: ^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$
Uses the same idea that the base number can't start with zero since in proper scientific notation you start with the most significant digit (meaning the first digit won't be zero). However, my pattern allows for multiple digits left of the decimal point. That's incorrect, but I've already spent too much time on this... you could replace the [1-9][0-9]* with just [0-9] to force a single digit before the decimal point and allow for zeroes.
Short Float Numbers: ^[+-]?[0-9]?\.[0-9]+$
This is like a zero integer. It's special in that it can start with 0 if there is only one digit left of the decimal point. It does overlap the next pattern though...
Long Float Numbers: ^[+-]?[1-9][0-9]*\.[0-9]+$
This handles most float numbers and allows more than one digit left of the decimal point while still enforcing that the higher number of digits can't start with 0.
The simple function...
sub is_number {
my $testVal = shift;
return $testVal =~ /^0$|^[+-]?[1-9][0-9]*$|^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$|^[+-]?[0-9]?\.[0-9]+$|^[+-]?[1-9][0-9]*\.[0-9]+$/;
}
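A quick way to get a feel for what this pattern accepts is to run it over a handful of inputs. The sketch below repeats the function so it runs on its own; the list of test strings is my own choice:

```perl
use strict;
use warnings;

# Same function as above, repeated so this snippet is self-contained.
sub is_number {
    my $testVal = shift;
    return $testVal =~ /^0$|^[+-]?[1-9][0-9]*$|^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$|^[+-]?[0-9]?\.[0-9]+$|^[+-]?[1-9][0-9]*\.[0-9]+$/;
}

# Sample inputs picked for illustration.
for my $input ( '0', '42', '-3.14', '1.5e10', '0.5', '007', '1e+5', '-0', 'inf' ) {
    printf "%-8s %s\n", "'$input'", is_number($input) ? 'number' : 'not a number';
}
```

The first five are accepted and 'inf' is rejected as intended, but '007', '1e+5' and '-0' are rejected too: leading zeros are disallowed by design, the exponent only allows an optional minus sign, and the zero case has no room for a sign.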
if ( defined $x && $x !~ m/\D/ ) {}
or
$x = 0 if ! $x;
if ( $x !~ m/\D/) {}
This is a slight variation on Veekay's answer but let me explain my reasoning for the change.
Performing a regex match on an undefined value triggers a "Use of uninitialized value" warning under use warnings, and aborts the script in environments where warnings are fatal. Testing whether the value is defined, or setting a default as I did in the alternative example, before running the expression will, at a minimum, save your error log.