Change meaning of the operator "+" in perl - perl

Currently "+" in perl means addition, in my project, we do string concatenation a lot. I know we can concatention with "." operator, like:
$x = $a . $b; #will concatenate string $a, and string $b
But "+" feels better. Wonder if there is a magic to make the following do concatenation.
$x = $a + $b;
Even better, make the it check the operator type, if both variables ($a, $b) are numbers, then do "addition" in the usual sense, otherwise, do concatenation.
I know in C++, one can overload the operator. Hope there is something similar in perl.
Thanks.

Yes, Perl too offers operator overloading.
package UnintuitiveString;
use Scalar::Util qw/looks_like_number/;
use overload '+' => \&concat,
'.' => \&concat,
'""' => \&as_string;
# Additionally, the following operators *have* to be overridden
# I suggest you raise an exception if an implementation does not make sense
# - * / % ** << >> x
# <=> cmp
# & | ^ ~
# atan2 cos sin exp log sqrt int
# 0+ bool
# ~~
sub new {
my ($class, $val) = #_;
return bless \$val => $class;
}
sub concat {
my ($self, $other, $swap) = #_;
# check for append mode
if (not defined $swap) {
$$self .= "$other";
return $self;
}
($self, $other) = ($other, $self) if $swap;
return UnintuitiveString->new("$self" . "$other");
}
sub as_string {
my ($self) = #_;
return $$self;
}
sub as_number {
my ($self) = #_;
return 0+$$self if looks_like_number $$self;
return undef;
}
Now we can do weird stuff like:
my $foo = UnintuitiveString->new(4);
my $bar = UnintuitiveString->new(2);
print $foo + $bar, "\n"; # "42"
my ($num_x, $num_y) = map { $_->as_number } $foo, $bar;
print $num_x + $num_y, "\n"; # "6"
$foo += 6;
print $foo + "\n"; # "46"
But just because we can do such things does not at all mean that we should:
Perl already has a concatenation operator: .. It's perfectly fine to use that.
Operator overloading comes at a massive performance cost. What previously was a single opcode in perl's VM is now a series of method calls and intermediate copies.
Changing the meaning of your operators is extremely confusing for people who actually know Perl. I stumbled a few times with the test cases above, when I was surprised that $foo + 6 wouldn't produce 10.
Perl's scalars are not a number or a string, they are both at the same time and are interpreted as one or the other depending on their usage context. This is actually half-true, and the scalars have different representations. They could be a string (PV), an integer (IV), a float (NV). However, once a PV is used in a numerical context like addition, a numerical value is determined and saved alongside the string, and we get an PVIV or PVNV. The reverse is also true: when a number is used in a stringy context, the formatted string is saved alongside the number. The looks_like_number function mentioned above determines whether a given string could represent a valid number like "42" or "NaN". Because just using a scalar in some context can change the representation, checking that a given scalar is a PV does not guarantee that it was intended to be a string, and an IV does not guarantee that it was intended to be an integer.
Perl has two sets of operators for a very good reason: If the “type” of a scalar is fluid, we need another way to explicitly request certain behavior. E.g. Perl has numeric comparison operators < <= == != >= > <=> and stringy comparison operators lt le eq ne ge gt cmp which can behave very differently: 4 XXX 12 will be -1 for <=> (because 4 is numerically smaller than 12), but 1 for cmp (because 4 comes later than 1 in most collation orders).
Other languages suffer a lot from having operators coerce their operands to required types but not offering two sets of operators. E.g. in Java, + is overloaded to concat strings. However, this leads to a loss of commutativity and associativity. Given three values x, y, z which can be either strings or numbers, we get different results for:
x + y and y + x – string concatenation is not commutative, whereas numeric addition is.
(x + y) + z and x + (y + z) – the + is not associative as soon as one string enters the playing field. Consider x = 1, y = 2, z = "4". Then the first evaluation order leads to "34", whereas the second leads to "124".
In Java, this is not a problem, because the language is statically typed, and because there are very few coercions (autoboxing, autounboxing, widening conversions, and stringification in concatenation). However, JavaScript (which is dynamically typed and will perform conversions from strings to numbers for other operators) shows the exact same behavior. Oops.
Stop this madness. Now. Perl's set of operators (barring smartmatch) is one of the best designed parts of the language (and its type system one of the worst parts from a modern viewpoint). If you dislike Perl because its operators make sense, you are free to use PHP instead (which, by the way, also uses . for concatenation to avoid such issues) :P

Related

Perl dereferencing in non-strict mode

In Perl, if I have:
no strict;
#ARY = (58, 90);
To operate on an element of the array, say it, the 2nd one, I would write (possibly as part of a larger expression):
$ARY[1] # The most common way found in Perldoc's idioms.
Though, for some reason these also work:
#ARY[1]
#{ARY[1]}
Resulting all in the same object:
print (\$ARY[1]);
print (\#ARY[1]);
print (\#{ARY[1]});
Output:
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
What is the syntax rules that enable this sort of constructs? How far could one devise reliable program code with each of these constructs, or with a mix of all of them either? How interchangeable are these expressions? (always speaking in a non-strict context).
On a concern of justifying how I come into this question, I agree "use strict" as a better practice, still I'm interested at some knowledge on build-up non-strict expressions.
In an attemp to find myself some help to this uneasiness, I came to:
The notion on "no strict;" of not complaining about undeclared
variables and quirk syntax.
The prefix dereference having higher precedence than subindex [] (perldsc § "Caveat on precedence").
The clarification on when to use # instead of $ (perldata § "Slices").
The lack of "[]" (array subscript / slice) description among the Perl's operators (perlop), which lead me to think it is not an
operator... (yet it has to be something else. But, what?).
For what I learned, none of these hints, put together, make me better understand my issue.
Thanks in advance.
Quotation from perlfaq4:
What is the difference between $array[1] and #array[1]?
The difference is the sigil, that special character in front of the array name. The $ sigil means "exactly one item", while the # sigil means "zero or more items". The $ gets you a single scalar, while the # gets you a list.
Please see: What is the difference between $array[1] and #array[1]?
#ARY[1] is indeed a slice, in fact a slice of only one member. The difference is it creates a list context:
#ar1[0] = qw( a b c ); # List context.
$ar2[0] = qw( a b c ); # Scalar context, the last value is returned.
print "<#ar1> <#ar2>\n";
Output:
<a> <c>
Besides using strict, turn warnings on, too. You'll get the following warning:
Scalar value #ar1[0] better written as $ar1[0]
In perlop, you can read that "Perl's prefix dereferencing operators are typed: $, #, %, and &." The standard syntax is SIGIL { ... }, but in the simple cases, the curly braces can be omitted.
See Can you use string as a HASH ref while "strict refs" in use? for some fun with no strict refs and its emulation under strict.
Extending choroba's answer, to check a particular context, you can use wantarray
sub context { return wantarray ? "LIST" : "SCALAR" }
print $ary1[0] = context(), "\n";
print #ary1[0] = context(), "\n";
Outputs:
SCALAR
LIST
Nothing you did requires no strict; other than to hide your error of doing
#ARY = (58, 90);
when you should have done
my #ARY = (58, 90);
The following returns a single element of the array. Since EXPR is to return a single index, it is evaluated in scalar context.
$array[EXPR]
e.g.
my #array = qw( a b c d );
my $index = 2;
my $ele = $array[$index]; # my $ele = 'c';
The following returns the elements identified by LIST. Since LIST is to return 0 or more elements, it must be evaluated in list context.
#array[LIST]
e.g.
my #array = qw( a b c d );
my #indexes ( 1, 2 );
my #slice = $array[#indexes]; # my #slice = qw( b c );
\( $ARY[$index] ) # Returns a ref to the element returned by $ARY[$index]
\( #ARY[#indexes] ) # Returns refs to each element returned by #ARY[#indexes]
${foo} # Weird way of writing $foo. Useful in literals, e.g. "${foo}bar"
#{foo} # Weird way of writing #foo. Useful in literals, e.g. "#{foo}bar"
${foo}[...] # Weird way of writing $foo[...].
Most people don't even know you can use these outside of string literals.

Perl function protoypes

Why do we use function protoypes in Perl?
What are the different prototypes available? How to use them?
Example: $$,$#,\## what do they mean?
You can find the description in the official documentation: http://perldoc.perl.org/perlsub.html#Prototypes
But more important: read why you should not use function prototytpes" Why are Perl 5's function prototypes bad?
To write some functions, prototypes are absolutely neccessary, as they change the way arguments are passed, the sub invocations are parsed, and in what context the arguments are evaluated.
Below are discussions on prototypes with the builtins open and bless, as well as the effect on user-written code like a fold_left subroutine. I come to the conclusion that there are a few scenarios where they are useful, but they are generally not a good mechanism to cope with signatures.
Example: CORE::open
Some builtin functions have prototypes, e.g open. You can get the prototype of any function like say prototype "CORE::open". We get *;$#. This means:
The * takes a bareword, glob, globref or scalar. E.g. STDOUT or my $fh.
The ; makes the following arguments optional.
The $ evaluates the next item in scalar context. We'll see in a minute why this is good.
The # allows any number of arguments.
This allows invocations like
open FOO; (very bad style, equivalent to open FOO, our $FOO)
open my $fh, #array;, which parses as open my $fh, scalar(#array). Useless
open my $fh, "<foo.txt"; (bad style, allows shell injection)
open my $fh, "<", "foo.txt"; (good three-arg-open)
open my $fh, "-|", #command; (now #command is evaluated in list context, i.e. is flattened)
So why should the second argument have scalar context? (1) either you use traditional two-arg-open. Then it isn't difficult to access the first element. (2) Or you want 3-arg-open (rather: multiarg). Then having an explicit mode in the source code is neccessary, which is good style and reduces action at a distance. So this forces you to decide between the outdated flexible 2-arg or the safe multi-arg.
Further restrictions, like that the < mode can only take one filename, while -| takes at least one string (the command) plus any number of arguments, are implemented on a non-syntactic level.
Example: CORE::bless
Another interesting example is the bless function. Its prototype is $;$. I.e. takes one or two scalars.
This allows bless $self; (blesses into current package), or the better bless $self, $class. However, my #array = ($self, $class); bless #array does not work, as scalar context is imposed on the first arg. So the first argument is not a reference, but the number 2. This reduces action at a distance, and fails rather than providing a probably wrong interpretation: both bless $array[0], $array[1] or bless \#array could have been meant here. So prototypes help and augment input validation, but are no substitute for it.
Example fold_left
Let us define a function fold_left that takes a list and an action as arguments. It performs this action on the first two values of the list, and replaces them with the result. This loops until only one element, the return value is left.
Simple implementation:
sub fold_left {
my $code = shift;
while ($#_) { # loop while more than one element
my ($x, $y) = splice #_, 0, 2;
unshift #_, $code->($x, $y);
}
return $_[0];
}
This can be called like
my $sum = fold_left sub{ $_[0] + $_[1] }, 1 .. 10;
my $str = fold_left sub{ "$_[0] $_[1]" }, 1 .. 10;
my $undef = fold_left;
my $runtime_error = fold_left \"foo", 1..10;
But this is unsatisfactory: we know that the first argument is a sub, so the sub keyword is redundant. Also, We can call it without a sub, which we want to be illegal. With prototypes, we can work around that:
sub fold_left (&#) { ... }
The & states that we'll take a coderef. If this is the first argument, this allows the sub keyword and the comma after the sub block to be omitted. Now we can do
my $sum = fold_left { $_[0] + $_[1] } 1 .. 10; # aka List::Util::sum(1..10);
my $str = fold_left { "$_[0] $_[1]" } 1 .. 10; # aka join " ", 1..10;
my $compile_error1 = fold_left; # ERROR: not enough arguments
my $compile_error2 = fold_left "foo", 1..10; # ERROR: type of arg 1 must be sub{} or block.
which is reminiscent of map {...} #list
On backslash prototypes
Backslash prototypes allow to capture typed references to arguments without imposing context. This is good when we want to pass an array without flattening it. E.g.
sub mypush (\##) {
my ($arrayref, #push_these) = #_;
my $len = #$arrayref;
#$arrayref[$len .. $len + $#push_these] = #push_these;
}
my #array;
mypush #array, 1, 2, 3;
You can think of the \ protecting the # like in regexes, thus requiring a literal # character on the argument. This is where prototypes are a sad story: Requiring literal characters is a bad idea. We can't even pass a reference directly, we have to dereference it first:
my $array = [];
mypush #$array, 1, 2, 3;
even though the called code sees and wants exactly that reference. From v14 on, the + can be used instead. It accepts an array, arrayref, hash or hashref (actually, it's like $ on scalar arguments, and \[#%] on hashes and arrays). This proto does no type validation, It'll just make sure you receive a reference unless the argument already is scalar.
sub mypush (+#) { ... }
my #array;
mypush #array, 1, 2, 3;
my $array_ref = [];
mypush $array_ref, 1, 2, 3; # works as well! yay
my %hash;
mypush %hash, 1, 2, 3; # syntactically legal, but will throw fatal on dereferencing.
mypush "foo", 1, 2, 3; # ditto
Conclusion
Prototypes are a great way to bend Perl to your will. Recently I was investigating how pattern matching from functional languages can be implemented in Perl. The match itself has the prototype $% (one scalar thing which is to be matched, and an even number of further arguments. These are pairs of patterns and code).
They are also a great way to shoot yourself in the foot, and can be downright ugly. From List::MoreUtils:
sub each_array (\#;\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#) {
return each_arrayref(#_);
}
This allows you to call it as each_array #a, #b, #c ..., but it isn't much effort to directly do each_arrayref \#a, \#b, \#c, ..., which imposes no limit on the number of parameters, and is more flexible.
Especially parameters like sub foo ($$$$$$;$$) indicate a code smell, and that you should move to named parameters, Method::Signatures, or Params::Validate.
In my experience, good prototypes are
#, % to slurp any (or an even) number of args. Note that # as sole prototype is equivalent to no prototype at all.
& leading codeblocks for nicer syntax.
$ iff you need to pad a slurpy # or %, but not on their own.
I actively dislike \# etc, and have yet to see a good use for _ aside from length (_ can be the last required argument in a prototype. If no explicit value is given, $_ is used.)
Having a good documentation and requiring the user of your subs to include the occasional backslash before your arguments is generally preferable to unexpected action at a distance or having scalar context imposed surprisingly.
Prototypes can be overridden like &foo(#args), and aren't honoured on method calls, so they are already useless here.

How exactly does Perl handle operator chaining?

So I have this bit of code that does not work:
print $userInput."\n" x $userInput2; #$userInput = string & $userInput2 is a integer
It prints it out once fine if the number is over 0 of course, but it doesn't print out the rest if the number is greater than 1. I come from a java background and I assume that it does the concatenation first, then the result will be what will multiply itself with the x operator. But of course that does not happen. Now it works when I do the following:
$userInput .= "\n";
print $userInput x $userInput2;
I am new to Perl so I'd like to understand exactly what goes on with chaining, and if I can even do so.
You're asking about operator precedence. ("Chaining" usually refers to chaining of method calls, e.g. $obj->foo->bar->baz.)
The Perl documentation page perlop starts off with a list of all the operators in order of precedence level. x has the same precedence as other multiplication operators, and . has the same precedence as other addition operators, so of course x is evaluated first. (i.e., it "has higher precedence" or "binds more tightly".)
As in Java you can resolve this with parentheses:
print(($userInput . "\n") x $userInput2);
Note that you need two pairs of parentheses here. If you'd only used the inner parentheses, Perl would treat them as indicating the arguments to print, like this:
# THIS DOESN'T WORK
print($userInput . "\n") x $userInput2;
This would print the string once, then duplicate print's return value some number of times. Putting space before the ( doesn't help since whitespace is generally optional and ignored. In a way, this is another form of operator precedence: function calls bind more tightly than anything else.
If you really hate having more parentheses than strictly necessary, you can defeat Perl with the unary + operator:
print +($userInput . "\n") x $userInput2;
This separates the print from the (, so Perl knows the rest of the line is a single expression. Unary + has no effect whatsoever; its primary use is exactly this sort of situation.
This is due to precedence of . (concatenation) operator being less than the x operator. So it ends up with:
use strict;
use warnings;
my $userInput = "line";
my $userInput2 = 2;
print $userInput.("\n" x $userInput2);
And outputs:
line[newline]
[newline]
This is what you want:
print (($userInput."\n") x $userInput2);
This prints out:
line
line
As has already been mentioned, this is a precedence issue, in that the repetition operator x has higher precedence than the concatenation operator .. However, that is not all that's going on here, and also, the issue itself comes from a bad solution.
First off, when you say
print (($foo . "\n") x $count);
What you are doing is changing the context of the repetition operator to list context.
(LIST) x $count
The above statement really means this (if $count == 3):
print ( $foo . "\n", $foo . "\n", $foo . "\n" ); # list with 3 elements
From perldoc perlop:
Binary "x" is the repetition operator. In scalar context or if the left operand is not enclosed in parentheses, it returns a string consisting of the left operand repeated the number of times specified by the right operand. In list context, if the left operand is enclosed in parentheses or is a list formed by qw/STRING/, it repeats the list. If the right operand is zero or negative, it returns an empty string or an empty list, depending on the context.
The solution works as intended because print takes list arguments. However, if you had something else that takes scalar arguments, such as a subroutine:
foo(("text" . "\n") x 3);
sub foo {
# #_ is now the list ("text\n", "text\n", "text\n");
my ($string) = #_; # error enters here
# $string is now "text\n"
}
This is a subtle difference which might not always give the desired result.
A better solution for this particular case is to not use the concatenation operator at all, because it is redundant:
print "$foo\n" x $count;
Or even use more mundane methods:
for (0 .. $count) {
print "$foo\n";
}
Or
use feature 'say'
...
say $foo for 0 .. $count;

Data types for parameters of subroutines / functions?

In Perl, can one specifiy data types for the parameters of subroutines? E.g. when using a dualvar in a numeric context like exit:
use constant NOTIFY_DIE_MAIL_SEND_FAILED => dualvar 3, 'NOTIFY_DIE_MAIL_SEND_FAILED';
exit NOTIFY_DIE_MAIL_SEND_FAILED;
How does Perl in that case know, that exit expects a numeric parameter? I didn't see a way to define data types for the parameters of subroutines like you do it in Java? (where I could understand how the data type is known as it is explicitely defined)
The whole point of the dualvar is that it behaves as a number or text depending on what you want. In cases where that's not obvious (to you more importantly than to perl) then make it clear.
exit 0 + NOTIFY_DIE_MAIL_SEND_FAILED;
As for explicitly typing parameters, that's not something built in. Perl is a much more dynamic language than Java so it's not common to check/force the type of every parameter or variable. In particular, a perl sub can accept different numbers of parameters and even different structures.
If you want to validate parameters (for an external API for example) try something like Params::Validate
In addition, Moose and Moo allow a certain level of attribute typing and even coercion.
In Perl, scalars are both numeric and stringy at the same time. It is not the variables themselves that distinguish between strings and numbers, but the operators you work with. While the addition + only uses a number, the concatenation . only uses strings.
In more strongly typing languages, e.g. Java, the addition operator doubles as addition and concatenation operator, because it can access type information.
"1" + 2 + 3 is still sick in Java, whereas Perl can cleanly distinguish between "1" + 2 + 3 == 6 and "1" . 2 . 3 eq "123".
You can force numeric or stringy context of a variable by adding 0 or concatenating the empty string:
sub foo {
my ($var) = #_;
$var += 0; # $var is numeric
$var .= ""; # $var is stringy now
}
Perl is quite different from Java in that - Perl is dynamically typed language, because it does not requires its variables to be typed at compile time..
Whereas, Java is statically typed (as you know already)
Perl determines the type of the variable depending upon the context it is used..
There can be only two context: -
List Context
Scalar Context
And the context is defined by the operator or function that is used..
For EG:-
# Define a list
#arr = qw/rohit jain/;
# Define a scalar
$num = 2
# Here perl will evaluate #arr in scalar context and take its length..
# so, below code will evaluate to : - value = 2 / 2
$value = #arr / $num;
# Here since it is used with a foreach loop, #arr will be taken as in list context
foreach (#arr) {
say $_;
}
# Above foreach loop will output: - `rohit` \n `jain` to the console..
You can force the type by:
use Scalar::Util qw(dualvar);
use constant NOTIFY_DIE_MAIL_SEND_FAILED => dualvar 3, 'NOTIFY_DIE_MAIL_SEND_FAILED';
say NOTIFY_DIE_MAIL_SEND_FAILED;
say int(NOTIFY_DIE_MAIL_SEND_FAILED);
output:
NOTIFY_DIE_MAIL_SEND_FAILED
3
How does Perl in that case know, that exit expects a numeric parameter?
exit expect a number as is part of its specification and its behaviour is kind of undefined if you pass it a non-integer value (i.e. you should not do it.
Now, in this particular case, how does dualvar manages to return either value type depending of the context?
I don't know how Scalar::Util's dualvar is implemented but you can write something similar with overload instead.
You certainly can modify the behaviour for a blessed object:
#!/usr/bin/env perl
use strict;
use warnings;
{package Dualvar;
use overload
fallback => 1,
'0+' => sub { $_[0]->{INT_VAL} },
'""' => sub { $_[0]->{STR_VAL} };
sub new {
my $class = shift;
my $self = { INT_VAL => shift, STR_VAL => shift };
bless($self,$class);
}
1;
}
my $x = Dualvar->new(31,'Therty-One');
print $x . " + One = ",$x + 1,"\n"; # Therty-One + One = 32
From the docs, it seems that overload actually changes the behaviour within the declaration scope so you should be able to change the behaviour of some common operators locally for any operand.
If exit does use one of those overloadable operations to evaluate its parameter into a integer then this solution would do.
I didn't see a way to define data types for the parameters of subroutines like you do it in Java?
As already said by others... this is not the case in Perl, at least not at compilation time, except for subroutine prototypes but these don't offer much type granularity (like int vs strings or different object classes).
Richard has mentioned some run-time alternatives you may use. I personally would recommend Moose if you don't mind the performance penalty.
What Rohit Jain said is correct. A function that wants input to follow certain rules simply has to explicitly check that the input is valid.
For example
sub foo
{
my ($param1,$param2) = shift;
$param1 =~ /^\d+$/ or die "Parameter 1 must be a positive integer.";
$param2 =~ /^(bar|baz)$/ or die "Parameter 2 must be either 'bar' or 'baz'";
...
}
This may seem like a pain, but:
The extra flexibility gained generally outweighs the work involved in doing this.
Simply having the correct data type is often not enough to ensure that you valid input, so you end up doing a lot this anyway even in a language like Java.

How do I tell if a variable has a numeric value in Perl?

Is there a simple way in Perl that will allow me to determine if a given variable is numeric? Something along the lines of:
if (is_number($x))
{ ... }
would be ideal. A technique that won't throw warnings when the -w switch is being used is certainly preferred.
Use Scalar::Util::looks_like_number() which uses the internal Perl C API's looks_like_number() function, which is probably the most efficient way to do this.
Note that the strings "inf" and "infinity" are treated as numbers.
Example:
#!/usr/bin/perl
use warnings;
use strict;
use Scalar::Util qw(looks_like_number);
my #exprs = qw(1 5.25 0.001 1.3e8 foo bar 1dd inf infinity);
foreach my $expr (#exprs) {
print "$expr is", looks_like_number($expr) ? '' : ' not', " a number\n";
}
Gives this output:
1 is a number
5.25 is a number
0.001 is a number
1.3e8 is a number
foo is not a number
bar is not a number
1dd is not a number
inf is a number
infinity is a number
See also:
perldoc Scalar::Util
perldoc perlapi for looks_like_number
The original question was how to tell if a variable was numeric, not if it "has a numeric value".
There are a few operators that have separate modes of operation for numeric and string operands, where "numeric" means anything that was originally a number or was ever used in a numeric context (e.g. in $x = "123"; 0+$x, before the addition, $x is a string, afterwards it is considered numeric).
One way to tell is this:
if ( length( do { no warnings "numeric"; $x & "" } ) ) {
print "$x is numeric\n";
}
If the bitwise feature is enabled, that makes & only a numeric operator and adds a separate string &. operator, you must disable it:
if ( length( do { no if $] >= 5.022, "feature", "bitwise"; no warnings "numeric"; $x & "" } ) ) {
print "$x is numeric\n";
}
(bitwise is available in perl 5.022 and above, and enabled by default if you use 5.028; or above.)
Check out the CPAN module Regexp::Common. I think it does exactly what you need and handles all the edge cases (e.g. real numbers, scientific notation, etc). e.g.
use Regexp::Common;
if ($var =~ /$RE{num}{real}/) { print q{a number}; }
Usually number validation is done with regular expressions. This code will determine if something is numeric as well as check for undefined variables as to not throw warnings:
sub is_integer {
defined $_[0] && $_[0] =~ /^[+-]?\d+$/;
}
sub is_float {
defined $_[0] && $_[0] =~ /^[+-]?\d+(\.\d+)?$/;
}
Here's some reading material you should look at.
A simple (and maybe simplistic) answer to the question is the content of $x numeric is the following:
if ($x eq $x+0) { .... }
It does a textual comparison of the original $x with the $x converted to a numeric value.
Not perfect, but you can use a regex:
sub isnumber
{
shift =~ /^-?\d+\.?\d*$/;
}
A slightly more robust regex can be found in Regexp::Common.
It sounds like you want to know if Perl thinks a variable is numeric. Here's a function that traps that warning:
sub is_number{
my $n = shift;
my $ret = 1;
$SIG{"__WARN__"} = sub {$ret = 0};
eval { my $x = $n + 1 };
return $ret
}
Another option is to turn off the warning locally:
{
no warnings "numeric"; # Ignore "isn't numeric" warning
... # Use a variable that might not be numeric
}
Note that non-numeric variables will be silently converted to 0, which is probably what you wanted anyway.
rexep not perfect... this is:
use Try::Tiny;
sub is_numeric {
my ($x) = #_;
my $numeric = 1;
try {
use warnings FATAL => qw/numeric/;
0 + $x;
}
catch {
$numeric = 0;
};
return $numeric;
}
Try this:
If (($x !~ /\D/) && ($x ne "")) { ... }
I found this interesting though
if ( $value + 0 eq $value) {
# A number
push #args, $value;
} else {
# A string
push #args, "'$value'";
}
Personally I think that the way to go is to rely on Perl's internal context to make the solution bullet-proof. A good regexp could match all the valid numeric values and none of the non-numeric ones (or vice versa), but as there is a way of employing the same logic the interpreter is using it should be safer to rely on that directly.
As I tend to run my scripts with -w, I had to combine the idea of comparing the result of "value plus zero" to the original value with the no warnings based approach of #ysth:
do {
no warnings "numeric";
if ($x + 0 ne $x) { return "not numeric"; } else { return "numeric"; }
}
You can use Regular Expressions to determine if $foo is a number (or not).
Take a look here:
How do I determine whether a scalar is a number
There is a highly upvoted accepted answer around using a library function, but it includes the caveat that "inf" and "infinity" are accepted as numbers. I see some regex stuff for answers too, but they seem to have issues. I tried my hand at writing some regex that would work better (I'm sorry it's long)...
/^0$|^[+-]?[1-9][0-9]*$|^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$|^[+-]?[0-9]?\.[0-9]+$|^[+-]?[1-9][0-9]*\.[0-9]+$/
That's really 5 patterns separated by "or"...
Zero: ^0$
It's a kind of special case. It's the only integer that can start with 0.
Integers: ^[+-]?[1-9][0-9]*$
That makes sure the first digit is 1 to 9 and allows 0 to 9 for any of the following digits.
Scientific Numbers: ^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$
Uses the same idea that the base number can't start with zero since in proper scientific notation you start with the highest significant bit (meaning the first number won't be zero). However, my pattern allows for multiple digits left of the decimal point. That's incorrect, but I've already spent too much time on this... you could replace the [1-9][0-9]* with just [0-9] to force a single digit before the decimal point and allow for zeroes.
Short Float Numbers: ^[+-]?[0-9]?\.[0-9]+$
This is like a zero integer. It's special in that it can start with 0 if there is only one digit left of the decimal point. It does overlap the next pattern though...
Long Float Numbers: ^[+-]?[1-9][0-9]*\.[0-9]+$
This handles most float numbers and allows more than one digit left of the decimal point while still enforcing that the higher number of digits can't start with 0.
The simple function...
sub is_number {
my $testVal = shift;
return $testVal =~ /^0$|^[+-]?[1-9][0-9]*$|^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$|^[+-]?[0-9]?\.[0-9]+$|^[+-]?[1-9][0-9]*\.[0-9]+$/;
}
if ( defined $x && $x !~ m/\D/ ) {}
or
$x = 0 if ! $x;
if ( $x !~ m/\D/) {}
This is a slight variation on Veekay's answer but let me explain my reasoning for the change.
Performing a regex on an undefined value will cause error spew and will cause the code to exit in many if not most environments. Testing if the value is defined or setting a default case like i did in the alternative example before running the expression will, at a minimum, save your error log.