How do I tell if a variable has a numeric value in Perl? - perl

Is there a simple way in Perl that will allow me to determine if a given variable is numeric? Something along the lines of:
if (is_number($x))
{ ... }
would be ideal. A technique that won't throw warnings when the -w switch is being used is certainly preferred.

Use Scalar::Util::looks_like_number() which uses the internal Perl C API's looks_like_number() function, which is probably the most efficient way to do this.
Note that the strings "inf" and "infinity" are treated as numbers.
Example:
#!/usr/bin/perl
use warnings;
use strict;
use Scalar::Util qw(looks_like_number);
my #exprs = qw(1 5.25 0.001 1.3e8 foo bar 1dd inf infinity);
foreach my $expr (#exprs) {
print "$expr is", looks_like_number($expr) ? '' : ' not', " a number\n";
}
Gives this output:
1 is a number
5.25 is a number
0.001 is a number
1.3e8 is a number
foo is not a number
bar is not a number
1dd is not a number
inf is a number
infinity is a number
See also:
perldoc Scalar::Util
perldoc perlapi for looks_like_number

The original question was how to tell if a variable was numeric, not if it "has a numeric value".
There are a few operators that have separate modes of operation for numeric and string operands, where "numeric" means anything that was originally a number or was ever used in a numeric context (e.g. in $x = "123"; 0+$x, before the addition, $x is a string, afterwards it is considered numeric).
One way to tell is this:
if ( length( do { no warnings "numeric"; $x & "" } ) ) {
print "$x is numeric\n";
}
If the bitwise feature is enabled, that makes & only a numeric operator and adds a separate string &. operator, you must disable it:
if ( length( do { no if $] >= 5.022, "feature", "bitwise"; no warnings "numeric"; $x & "" } ) ) {
print "$x is numeric\n";
}
(bitwise is available in perl 5.022 and above, and enabled by default if you use 5.028; or above.)

Check out the CPAN module Regexp::Common. I think it does exactly what you need and handles all the edge cases (e.g. real numbers, scientific notation, etc). e.g.
use Regexp::Common;
if ($var =~ /$RE{num}{real}/) { print q{a number}; }

Usually number validation is done with regular expressions. This code will determine if something is numeric as well as check for undefined variables as to not throw warnings:
sub is_integer {
defined $_[0] && $_[0] =~ /^[+-]?\d+$/;
}
sub is_float {
defined $_[0] && $_[0] =~ /^[+-]?\d+(\.\d+)?$/;
}
Here's some reading material you should look at.

A simple (and maybe simplistic) answer to the question is the content of $x numeric is the following:
if ($x eq $x+0) { .... }
It does a textual comparison of the original $x with the $x converted to a numeric value.

Not perfect, but you can use a regex:
sub isnumber
{
shift =~ /^-?\d+\.?\d*$/;
}

A slightly more robust regex can be found in Regexp::Common.
It sounds like you want to know if Perl thinks a variable is numeric. Here's a function that traps that warning:
sub is_number{
my $n = shift;
my $ret = 1;
$SIG{"__WARN__"} = sub {$ret = 0};
eval { my $x = $n + 1 };
return $ret
}
Another option is to turn off the warning locally:
{
no warnings "numeric"; # Ignore "isn't numeric" warning
... # Use a variable that might not be numeric
}
Note that non-numeric variables will be silently converted to 0, which is probably what you wanted anyway.

rexep not perfect... this is:
use Try::Tiny;
sub is_numeric {
my ($x) = #_;
my $numeric = 1;
try {
use warnings FATAL => qw/numeric/;
0 + $x;
}
catch {
$numeric = 0;
};
return $numeric;
}

Try this:
If (($x !~ /\D/) && ($x ne "")) { ... }

I found this interesting though
if ( $value + 0 eq $value) {
# A number
push #args, $value;
} else {
# A string
push #args, "'$value'";
}

Personally I think that the way to go is to rely on Perl's internal context to make the solution bullet-proof. A good regexp could match all the valid numeric values and none of the non-numeric ones (or vice versa), but as there is a way of employing the same logic the interpreter is using it should be safer to rely on that directly.
As I tend to run my scripts with -w, I had to combine the idea of comparing the result of "value plus zero" to the original value with the no warnings based approach of #ysth:
do {
no warnings "numeric";
if ($x + 0 ne $x) { return "not numeric"; } else { return "numeric"; }
}

You can use Regular Expressions to determine if $foo is a number (or not).
Take a look here:
How do I determine whether a scalar is a number

There is a highly upvoted accepted answer around using a library function, but it includes the caveat that "inf" and "infinity" are accepted as numbers. I see some regex stuff for answers too, but they seem to have issues. I tried my hand at writing some regex that would work better (I'm sorry it's long)...
/^0$|^[+-]?[1-9][0-9]*$|^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$|^[+-]?[0-9]?\.[0-9]+$|^[+-]?[1-9][0-9]*\.[0-9]+$/
That's really 5 patterns separated by "or"...
Zero: ^0$
It's a kind of special case. It's the only integer that can start with 0.
Integers: ^[+-]?[1-9][0-9]*$
That makes sure the first digit is 1 to 9 and allows 0 to 9 for any of the following digits.
Scientific Numbers: ^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$
Uses the same idea that the base number can't start with zero since in proper scientific notation you start with the highest significant bit (meaning the first number won't be zero). However, my pattern allows for multiple digits left of the decimal point. That's incorrect, but I've already spent too much time on this... you could replace the [1-9][0-9]* with just [0-9] to force a single digit before the decimal point and allow for zeroes.
Short Float Numbers: ^[+-]?[0-9]?\.[0-9]+$
This is like a zero integer. It's special in that it can start with 0 if there is only one digit left of the decimal point. It does overlap the next pattern though...
Long Float Numbers: ^[+-]?[1-9][0-9]*\.[0-9]+$
This handles most float numbers and allows more than one digit left of the decimal point while still enforcing that the higher number of digits can't start with 0.
The simple function...
sub is_number {
my $testVal = shift;
return $testVal =~ /^0$|^[+-]?[1-9][0-9]*$|^[+-]?[1-9][0-9]*(\.[0-9]+)?([eE]-?[1-9][0-9]*)?$|^[+-]?[0-9]?\.[0-9]+$|^[+-]?[1-9][0-9]*\.[0-9]+$/;
}

if ( defined $x && $x !~ m/\D/ ) {}
or
$x = 0 if ! $x;
if ( $x !~ m/\D/) {}
This is a slight variation on Veekay's answer but let me explain my reasoning for the change.
Performing a regex on an undefined value will cause error spew and will cause the code to exit in many if not most environments. Testing if the value is defined or setting a default case like i did in the alternative example before running the expression will, at a minimum, save your error log.

Related

How to convert number to one type in perl [duplicate]

How do I fix this code so that 1.1 + 2.2 == 3.3? What is actually happening here that's causing this behavior? I'm vaguely familiar with rounding problems and floating point math, but I thought that applied to division and multiplication only and would be visible in the output.
[me#unixbox1:~/perltests]> cat testmathsimple.pl
#!/usr/bin/perl
use strict;
use warnings;
check_math(1, 2, 3);
check_math(1.1, 2.2, 3.3);
sub check_math {
my $one = shift;
my $two = shift;
my $three = shift;
if ($one + $two == $three) {
print "$one + $two == $three\n";
} else {
print "$one + $two != $three\n";
}
}
[me#unixbox1:~/perltests]> perl testmathsimple.pl
1 + 2 == 3
1.1 + 2.2 != 3.3
Edit:
Most of the answers thus far are along the lines of "it's a floating point problem, duh" and are providing workarounds for it. I already suspect that to be the problem. How do I demonstrate it? How do I get Perl to output the long form of the variables? Storing the $one + $two computation in a temp variable and printing it doesn't demonstrate the problem.
Edit:
Using the sprintf technique demonstrated by aschepler, I'm now able to "see" the problem. Further, using bignum, as recommended by mscha and rafl, fixes the problem of the comparison not being equal. However, the sprintf output still indicates that the numbers aren't "correct". That's leaving a modicum of doubt about this solution.
Is bignum a good way to resolve this? Are there any possible side effects of bignum that we should look out for when integrating this into a larger, existing, program?
See What Every Computer Scientist Should Know About Floating-Point Arithmetic.
None of this is Perl specific: There are an uncountably infinite number of real numbers and, obviously, all of them cannot be represented using only a finite number of bits.
The specific "solution" to use depends on your specific problem. Are you trying to track monetary amounts? If so, use the arbitrary precision numbers (use more memory and more CPU, get more accurate results) provided by bignum. Are you doing numeric analysis? Then, decide on the precision you want to use, and use sprintf (as shown below) and eq to compare.
You can always use:
use strict; use warnings;
check_summation(1, $_) for [1, 2, 3], [1.1, 2.2, 3.3];
sub check_summation {
my $precision = shift;
my ($x, $y, $expected) = #{ $_[0] };
my $result = $x + $y;
for my $n ( $x, $y, $expected, $result) {
$n = sprintf('%.*f', $precision, $n);
}
if ( $expected eq $result ) {
printf "%s + %s = %s\n", $x, $y, $expected;
}
else {
printf "%s + %s != %s\n", $x, $y, $expected;
}
return;
}
Output:
1.0 + 2.0 = 3.0
1.1 + 2.2 = 3.3
"What Every Computer Scientist Should Know About Floating-Point Arithmetic"
Basically, Perl is dealing with floating-point numbers, while you are probably expecting it to use fixed-point. The simplest way to handle this situation is to modify your code so that you are using whole integers everywhere except, perhaps, in a final display routine. For example, if you're dealing with USD currency, store all dollar amounts in pennies. 123 dollars and 45 cents becomes "12345". That way there is no floating point ambiguity during add and subtract operations.
If that's not an option, consider Matt Kane's comment. Find a good epsilon value and use it whenever you need to compare values.
I'd venture to guess that most tasks don't really need floating point, however, and I'd strongly suggest carefully considering whether or not it is the right tool for your task.
A quick way to fix floating points is to use bignum. Simply add a line
use bignum;
to the top of your script.
There are performance implications, obviously, so this may not be a good solution for you.
A more localized solution is to use Math::BigFloat explicitly where you need better accuracy.
From The Floating-Point Guide:
Why don’t my numbers, like 0.1 + 0.2 add up to a nice round 0.3, and
instead I get a weird result like
0.30000000000000004?
Because internally, computers use a
format (binary floating-point) that
cannot accurately represent a number
like 0.1, 0.2 or 0.3 at all.
When the code is compiled or
interpreted, your “0.1” is already
rounded to the nearest number in that
format, which results in a small
rounding error even before the
calculation happens.
What can I do to avoid this problem?
That depends on what kind of
calculations you’re doing.
If you really need your results to add up exactly, especially when you
work with money: use a special decimal
datatype.
If you just don’t want to see all those extra decimal places: simply
format your result rounded to a fixed
number of decimal places when
displaying it.
If you have no decimal datatype available, an alternative is to work
with integers, e.g. do money
calculations entirely in cents. But
this is more work and has some
drawbacks.
Youz could also use a "fuzzy compare" to determine whether two numbers are close enough to assume they'd be the same using exact math.
To see precise values for your floating-point scalars, give a big precision to sprintf:
print sprintf("%.60f", 1.1), $/;
print sprintf("%.60f", 2.2), $/;
print sprintf("%.60f", 3.3), $/;
I get:
1.100000000000000088817841970012523233890533447265625000000000
2.200000000000000177635683940025046467781066894531250000000000
3.299999999999999822364316059974953532218933105468750000000000
Unfortunately C99's %a conversion doesn't seem to work. perlvar mentions an obsolete variable $# which changes the default format for printing a number, but it breaks if I give it a %f, and %g refuses to print "non-significant" digits.
abs($three - ($one + $two)) < $some_Very_small_number
Use sprintf to convert your variable into a formatted string, and then compare the resulting string.
# equal( $x, $y, $d );
# compare the equality of $x and $y with precision of $d digits below the decimal point.
sub equal {
my ($x, $y, $d) = #_;
return sprintf("%.${d}g", $x) eq sprintf("%.${d}g", $y);
}
This kind of problem occurs because there is no perfect fixed-point representation for your fractions (0.1, 0.2, etc). So the value 1.1 and 2.2 are actually stored as something like 1.10000000000000...1 and 2.2000000....1, respectively (I am not sure if it becomes slightly bigger or slightly smaller. In my example I assume they become slightly bigger). When you add them together, it becomes 3.300000000...3, which is larger than 3.3 which is converted to 3.300000...1.
Number::Fraction lets you work with rational numbers (fractions) instead of decimals, something like this (':constants' is imported to automatically convert strings like '11/10' into Number::Fraction objects):
use strict;
use warnings;
use Number::Fraction ':constants';
check_math(1, 2, 3);
check_math('11/10', '22/10', '33/10');
sub check_math {
my $one = shift;
my $two = shift;
my $three = shift;
if ($one + $two == $three) {
print "$one + $two == $three\n";
} else {
print "$one + $two != $three\n";
}
}
which prints:
1 + 2 == 3
11/10 + 11/5 == 33/10

Change meaning of the operator "+" in perl

Currently "+" in perl means addition, in my project, we do string concatenation a lot. I know we can concatention with "." operator, like:
$x = $a . $b; #will concatenate string $a, and string $b
But "+" feels better. Wonder if there is a magic to make the following do concatenation.
$x = $a + $b;
Even better, make the it check the operator type, if both variables ($a, $b) are numbers, then do "addition" in the usual sense, otherwise, do concatenation.
I know in C++, one can overload the operator. Hope there is something similar in perl.
Thanks.
Yes, Perl too offers operator overloading.
package UnintuitiveString;
use Scalar::Util qw/looks_like_number/;
use overload '+' => \&concat,
'.' => \&concat,
'""' => \&as_string;
# Additionally, the following operators *have* to be overridden
# I suggest you raise an exception if an implementation does not make sense
# - * / % ** << >> x
# <=> cmp
# & | ^ ~
# atan2 cos sin exp log sqrt int
# 0+ bool
# ~~
sub new {
my ($class, $val) = #_;
return bless \$val => $class;
}
sub concat {
my ($self, $other, $swap) = #_;
# check for append mode
if (not defined $swap) {
$$self .= "$other";
return $self;
}
($self, $other) = ($other, $self) if $swap;
return UnintuitiveString->new("$self" . "$other");
}
sub as_string {
my ($self) = #_;
return $$self;
}
sub as_number {
my ($self) = #_;
return 0+$$self if looks_like_number $$self;
return undef;
}
Now we can do weird stuff like:
my $foo = UnintuitiveString->new(4);
my $bar = UnintuitiveString->new(2);
print $foo + $bar, "\n"; # "42"
my ($num_x, $num_y) = map { $_->as_number } $foo, $bar;
print $num_x + $num_y, "\n"; # "6"
$foo += 6;
print $foo + "\n"; # "46"
But just because we can do such things does not at all mean that we should:
Perl already has a concatenation operator: .. It's perfectly fine to use that.
Operator overloading comes at a massive performance cost. What previously was a single opcode in perl's VM is now a series of method calls and intermediate copies.
Changing the meaning of your operators is extremely confusing for people who actually know Perl. I stumbled a few times with the test cases above, when I was surprised that $foo + 6 wouldn't produce 10.
Perl's scalars are not a number or a string, they are both at the same time and are interpreted as one or the other depending on their usage context. This is actually half-true, and the scalars have different representations. They could be a string (PV), an integer (IV), a float (NV). However, once a PV is used in a numerical context like addition, a numerical value is determined and saved alongside the string, and we get an PVIV or PVNV. The reverse is also true: when a number is used in a stringy context, the formatted string is saved alongside the number. The looks_like_number function mentioned above determines whether a given string could represent a valid number like "42" or "NaN". Because just using a scalar in some context can change the representation, checking that a given scalar is a PV does not guarantee that it was intended to be a string, and an IV does not guarantee that it was intended to be an integer.
Perl has two sets of operators for a very good reason: If the “type” of a scalar is fluid, we need another way to explicitly request certain behavior. E.g. Perl has numeric comparison operators < <= == != >= > <=> and stringy comparison operators lt le eq ne ge gt cmp which can behave very differently: 4 XXX 12 will be -1 for <=> (because 4 is numerically smaller than 12), but 1 for cmp (because 4 comes later than 1 in most collation orders).
Other languages suffer a lot from having operators coerce their operands to required types but not offering two sets of operators. E.g. in Java, + is overloaded to concat strings. However, this leads to a loss of commutativity and associativity. Given three values x, y, z which can be either strings or numbers, we get different results for:
x + y and y + x – string concatenation is not commutative, whereas numeric addition is.
(x + y) + z and x + (y + z) – the + is not associative as soon as one string enters the playing field. Consider x = 1, y = 2, z = "4". Then the first evaluation order leads to "34", whereas the second leads to "124".
In Java, this is not a problem, because the language is statically typed, and because there are very few coercions (autoboxing, autounboxing, widening conversions, and stringification in concatenation). However, JavaScript (which is dynamically typed and will perform conversions from strings to numbers for other operators) shows the exact same behavior. Oops.
Stop this madness. Now. Perl's set of operators (barring smartmatch) is one of the best designed parts of the language (and its type system one of the worst parts from a modern viewpoint). If you dislike Perl because its operators make sense, you are free to use PHP instead (which, by the way, also uses . for concatenation to avoid such issues) :P

How to validate number in perl?

I know that there is a library that do that
use Scalar::Util qw(looks_like_number);
yet I want to do it using perl regular expression. And I want it to work for double numbers not for only integers.
so I want something better than this
$var =~ /^[+-]?\d+$/
thanks.
Constructing a single regular expression to validate a number is really difficult. There simply are too many criteria to consider. Perlfaq4 contains a section "How do I determine whether a scalar is a number/whole/integer/float?
The code from that documentation shows the following tests:
if (/\D/) {print "has nondigits\n" }
if (/^\d+$/) {print "is a whole number\n" }
if (/^-?\d+$/) {print "is an integer\n" }
if (/^[+-]?\d+$/) {print "is a +/- integer\n" }
if (/^-?\d+\.?\d*$/) {print "is a real number\n" }
if (/^-?(?:\d+(?:\.\d*)?|\.\d+)$/) {print "is a decimal number\n"}
if (/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/) {
print "is a C float\n"
}
The first test disqualifies an unsigned integer.
The second test qualifies a whole number.
The third test qualifies an integer.
The fourth test qualifies a positive/negatively signed integer.
The fifth test qualifies a real number.
The sixth test qualifies a decimal number.
The seventh test qualifies a number in c-style scientific notation.
So if you were using those tests (excluding the first one) you would have to verify that one or more of the tests passes. Then you've got a number.
Another method, since you don't want to use the module Scalar::Util, you can learn from the code IN Scalar::Util. The looks_like_number() function is set up like this:
sub looks_like_number {
local $_ = shift;
# checks from perlfaq4
return $] < 5.009002 unless defined;
return 1 if (/^[+-]?\d+$/); # is a +/- integer
return 1 if (/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/); # a C float
return 1 if ($] >= 5.008 and /^(Inf(inity)?|NaN)$/i)
or ($] >= 5.006001 and /^Inf$/i);
0;
}
You should be able to use the portions of that function that are applicable to your situation.
I would like to point out, however, that Scalar::Util is a core Perl module; it ships with Perl, just like strict does. The best practice of all is probably to just use it.
You should use Regexp::Common, most patterns are more complicated than you realize.
use Regexp::Common;
my $real = 3.14159;
print "Real" if $real =~ /$RE{num}{real}/;
However, the pattern is not anchored by default, so a stricter version is:
my $real_pat = $RE{num}{real};
my $real = 3.14159;
print "Real" if $real =~ /^$real_pat$/;
Well first you should make sure that the number does not contain any commas so you do this:
$var =~ s/,//g; # remove all the commas
Then create another variable to do the rest of the compare.
$var2=$var;
Then remove the . from the new variable yet only once occurrence.
$var2 =~ s/.//; # replace . with nothing to compare yet only once.
now var2 should look like an integer with no "."
so do this:
if($var2 !~ /^[+-]?\d+$/){
print "not valid";
}else{
#use var1
}
you can fix this code and write it as a function if you need to use it more than once.
Cheers!

Best way to avoid "isn't numeric in numeric eq (==)"-warning

#!/usr/bin/env perl
use warnings;
use 5.12.2;
my $c = 'f'; # could be a number too
if ( $c eq 'd' || $c == 9 ) {
say "Hello, world!";
}
What is the best way, to avoid the 'Argument "f" isn't numeric in numeric eq (==) at ./perl.pl line 7.'-warning?
I suppose in this case I could use "eq" two times, but that doesn't look good.
use Scalar::Util 'looks_like_number';
if ( $c eq 'd' || ( looks_like_number($c) && $c == 9 ) ) {
say "Hello, world!";
}
You could also disable this category of warnings temporarily:
{
no warnings 'numeric';
# your code
}
Not sure why you want to avoid the warning. The warning is telling you that there's a potential problem in your program.
If you're going to compare a number with a string that contains unknown data, then you're either going to have to use 'eq' for the comparison or clean up the data in some way so that you know it looks like a number.
The obvious way to avoid a warning about comparing a non-numeric to a numeric is not to do it! Warnings are there for your benefit - they should not be ignored, or worked around.
To answer what is the best way you need to provide more context - i.e. what does $c represent, and why is it necessary to compare it do 'd' or 9 (and why not use $c eq '9')?
Using a regular expression to see if that is a number:
if(($num=~/\d/) && ($num >= 0) && ($num < 10))
{
# to do thing number
}

Is 999...9 a real number in Perl?

sub is_integer {
defined $_[0] && $_[0] =~ /^[+-]?\d+$/;
}
sub is_float {
defined $_[0] && $_[0] =~ /^[+-]?\d+(\.\d+)?$/;
}
For the code mentioned above, if we give input as 999999999999999999999999999999999999999999, it is giving output as not real number.
Why it is behaving like that?
I forgot to mention one more thing:
If I am using this code for $x as the above value:
if($x > 0 || $x <= 0 ) {
print "Real";
}
Output is real.
How is this possible?
$ perl -e 'print 999999999999999999999999999999999999999999'
1e+42
i.e. Perl uses scientific representation for this number and that is why your regexp doesn't match.
Use the looks_like_number function from Scalar::Util (which is a core module).
use Scalar::Util qw( looks_like_number );
say "Number" if looks_like_number 999999999999999999999999999999999999999999;
# above prints "Number"
Just to add one more thing. As others have explained, the number you are working with is out of range for a Perl integer (unless you are on a 140 bit machine). Therefore, the variable will be stored as a floating point number. Regular expressions operate on strings. Therefore, the number is converted to its string representation before the regular expression operates on it.
Others have explained what is going on: out of the box, Perl can't handle numbers that large without using scientific notation.
If you need to work with large numbers, take a look at bignum or its components, such as Math::BigInt. For example:
use strict;
use warnings;
use Math::BigInt;
my $big_str = '900000000000000000000000000000000000000';
my $big_num = Math::BigInt->new($big_str);
$big_num ++;
print "Is integer: $big_num\n" if is_integer($big_num);
sub is_integer {
defined $_[0] && $_[0] =~ /^[+-]?\d+$/;
}
Also, you may want to take a look at bignum in the Perl documentation.