Why don't I get zero when I subtract the same floating point number from itself in Perl? [duplicate] - perl

This question already has answers here:
Closed 13 years ago.
Possible Duplicates:
Why is floating point arithmetic in C# imprecise?
Why does ghci say that 1.1 + 1.1 + 1.1 > 3.3 is True?
#!/usr/bin/perl
$l1 = "0+0.590580+0.583742+0.579787+0.564928+0.504538+0.459805+0.433273+0.384211+0.3035810";
$l2 = "0+0.590580+0.583742+0.579788+0.564928+0.504538+0.459805+0.433272+0.384211+0.3035810";
$val1 = eval ($l1);
$val2 = eval ($l2);
$diff = (($val1 - $val2)/$val1)*100;
print " (($val1 - $val2)/$val1)*100 ==> $diff\n";
Surprisingly the output ended up to be
((4.404445 - 4.404445)/4.404445)*100 ==> -2.01655014354845e-14.
Is it not supposed to be a ZERO????
Can any one explain this please......

What every computer scientist should know about floating point arithmetic
See Why is floating point arithmetic in C# imprecise?
This isn't Perl related, but floating point related.

It's pretty close to zero, which is what I'd expect.
Why's it supposed to be zero? 0.579787 != 0.579788 and 0.433273 != 0.433272. It's likely that none of these have an exact floating point representation so you should expect some inaccuracies.

From perlfaq4's answer to Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?:
Internally, your computer represents floating-point numbers in binary. Digital (as in powers of two) computers cannot store all numbers exactly. Some real numbers lose precision in the process. This is a problem with how computers store numbers and affects all computer languages, not just Perl.
perlnumber shows the gory details of number representations and conversions.
To limit the number of decimal places in your numbers, you can use the printf or sprintf function. See the "Floating Point Arithmetic" for more details.
printf "%.2f", 10/3;
my $number = sprintf "%.2f", 10/3;

When you change the two strings to be equal (there are 2 digits different between $l1 and $l2) it does indeed result in zero.
What it is demonstrating is that you can create 2 different floating point numbers ($val1 and $val2) that look the same when printed out, but internally have a tiny difference. These differences can be magnified up if you're not careful.
Vinko Vrsalovic posted some good links to explain why.

Related

Simple addition in Perl causing result to be off by small fraction [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicates:
Why is floating point arithmetic in C# imprecise?
Why does ghci say that 1.1 + 1.1 + 1.1 > 3.3 is True?
#!/usr/bin/perl
$l1 = "0+0.590580+0.583742+0.579787+0.564928+0.504538+0.459805+0.433273+0.384211+0.3035810";
$l2 = "0+0.590580+0.583742+0.579788+0.564928+0.504538+0.459805+0.433272+0.384211+0.3035810";
$val1 = eval ($l1);
$val2 = eval ($l2);
$diff = (($val1 - $val2)/$val1)*100;
print " (($val1 - $val2)/$val1)*100 ==> $diff\n";
Surprisingly the output ended up to be
((4.404445 - 4.404445)/4.404445)*100 ==> -2.01655014354845e-14.
Is it not supposed to be a ZERO????
Can any one explain this please......
What every computer scientist should know about floating point arithmetic
See Why is floating point arithmetic in C# imprecise?
This isn't Perl related, but floating point related.
It's pretty close to zero, which is what I'd expect.
Why's it supposed to be zero? 0.579787 != 0.579788 and 0.433273 != 0.433272. It's likely that none of these have an exact floating point representation so you should expect some inaccuracies.
From perlfaq4's answer to Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?:
Internally, your computer represents floating-point numbers in binary. Digital (as in powers of two) computers cannot store all numbers exactly. Some real numbers lose precision in the process. This is a problem with how computers store numbers and affects all computer languages, not just Perl.
perlnumber shows the gory details of number representations and conversions.
To limit the number of decimal places in your numbers, you can use the printf or sprintf function. See the "Floating Point Arithmetic" for more details.
printf "%.2f", 10/3;
my $number = sprintf "%.2f", 10/3;
When you change the two strings to be equal (there are 2 digits different between $l1 and $l2) it does indeed result in zero.
What it is demonstrating is that you can create 2 different floating point numbers ($val1 and $val2) that look the same when printed out, but internally have a tiny difference. These differences can be magnified up if you're not careful.
Vinko Vrsalovic posted some good links to explain why.

How to stop matlab truncating long numbers

These two long numbers are the same except for the last digit.
test = [];
test(1) = 33777100285870080;
test(2) = 33777100285870082;
but the last digit is lost when the numbers are put in the array:
unique(test)
ans = 3.3777e+16
How can I prevent this? The numbers are ID codes and losing the last digit is screwing everything up.
Matlab uses 64-bit floating point representation by default for numbers. Those have a base-10 16-digit precision (more or less) and your numbers seem to exceed that.
Use something like uint64 to store your numbers:
> test = [uint64(33777100285870080); uint64(33777100285870082)];
> disp(test(1));
33777100285870080
> disp(test(2));
33777100285870082
This is really a rounding error, not a display error. To get the correct strings for output purposes, use int2str, because, again, num2str uses a 64-bit floating point representation, and that has rounding errors in this case.
To add more explanation to #rubenvb's solution, your values are greater than flintmax for IEEE 754 double precision floating-point, i.e, greater than 2^53. After this point not all integers can be exactly represented as doubles. See also this related question.

sprintf : fixed point, big numbers and precision loss

I need to read some numbers in a database and write them into a text file using Perl.
In the table where are the numbers, the data format is defined as numeric (25,5) (it reads 25 digits, including 5 decimals).
I format the numbers in my file with a sprintf "%.5f", $myvalue to force 5 decimals and I just noticed that for greats values, there is a precision loss for numbers with more than 17 digits :
db = 123.12345
file = 123.12345 (OK)
db = 12345678901234891.12345
file = 12345678901234892.00000 (seems to be rounded to upper integer)
db = 12345678901234567890.12345
file = 12345678901234567000.00000 (truncation ?)
What is Perl's greatest precision for fixed decimal numbers?
I am aware of the concepts and limitations of floating point arithmetic in general, but I am not a Perl monk and I do not know the internals of Perl so I don't know if it is normal (or if it is related at all to floating point). I am not sure either if it is a internal limitation of Perl, or a problem related to the sprintf processing.
Is there a workaround or a dedicated module that could help with that problem?
Some notable points :
this is an additional feature of a system that already uses Perl, so using another tool is not an option
the data being crunched is financial so I need to keep every cent and I cannot cope with a +/- 10 000 units precision :^S
Once again, I am finding a solution right after asking SO. I am putting my solution here, to help a future visitor :
replace
$myout = sprintf "%.5f", $myvalue;
by
use Math::BigFloat;
$myout = Math::BigFloat->new($myvalue)->ffround( -5 )->bstr;
Without modules like Math::BigFloat, everything above 16 digits is pure magic... e.g.
perl -e 'printf "*10^%02d: %-.50g\n", $_, log(42)*(10**$_) for (0..20)'
produces
*10^00: 3.7376696182833684112267746968427672982215881347656
*10^01: 37.376696182833683224089327268302440643310546875
*10^02: 373.76696182833683224089327268302440643310546875
*10^03: 3737.6696182833684360957704484462738037109375
*10^04: 37376.6961828336861799471080303192138671875
*10^05: 373766.96182833681814372539520263671875
*10^06: 3737669.6182833681814372539520263671875
*10^07: 37376696.18283368647098541259765625
*10^08: 373766961.82833683490753173828125
*10^09: 3737669618.283368587493896484375
*10^10: 37376696182.83368682861328125
*10^11: 373766961828.33685302734375
*10^12: 3737669618283.36865234375
*10^13: 37376696182833.6875
*10^14: 373766961828336.8125
*10^15: 3737669618283368.5
*10^16: 37376696182833688
*10^17: 373766961828336832
*10^18: 3737669618283368448
*10^19: 37376696182833684480
*10^20: 373766961828336828416
What is Perl's greatest precision for fixed decimal numbers?
Perl doesn't have fixed point decimal numbers. Very few languages do, actually. You could use a module like Math::FixedPoint, though
Perl is storing your values as floating-point numbers internally.1 The precision is dependent on how your version of Perl is compiled, but it's probably a 64-bit double.
C:\>perl -MConfig -E "say $Config::Config{doublesize}"
8
A 64-bit double-precision float2 has a 53-bit significand (a.k.a. fraction or mantissa) which gives it approximately 16 decimal characters of precision. Your database is defined as storing 25 characters of precision. You'll be fine if you treat the data as a string but if you treat it as a number you'll lose precision.
Perl's bignum pragma provides transparent support for arbitrarily large numbers. It can slow things down considerably so limit its use to the smallest possible scope. If you want big floats only (without making other numeric types "big") use Math::BigFloat instead.
1. Internally, perl uses a datatype called an SV that can hold floats, ints, and/or strings simultaneously.
2. Assuming IEEE 754 format.
Alternatively, if you're just transferring the values from the database to a text file and not operating on them as numbers, then have the DB format them as strings. Then read and print them as strings (perhaps using "printf '%s'"). For example:
select Big_fixed_point_col(format '-Z(24)9.9(5)')(CHAR(32))

Perl for loop going haywire [duplicate]

This question already has answers here:
Does scientific notation affect Perl's precision?
(1 answer)
Perl int function and padding zeros
(2 answers)
Closed 10 years ago.
I have a simple for loop in Perl
for ($i=0; $i <= 360; $i += 0.01)
{
print "$i ";
}
Why is it that when I run this code I get the following output, where as soon as it gets to 0.81 it suddenly starts to add in a load more decimal places? I know I could simply round up to avoid this issue but I was wondering why it happens. An increment of 0.01 does not seem at all crazy to do.
0.77
0.78
0.79
0.8
0.81
0.820000000000001
0.830000000000001
0.840000000000001
0.850000000000001
0.860000000000001
0.870000000000001
Computers use binary representations. Not all decimal floating point numbers have exact representations in binary notation, so some error can occur (its actually a rounding difference). This is the same reason why you shouldn't use floating point numbers for monetary values:
(Picture taken from dailywtf)
Most elegant way to get around this issue is using integers for calculations, dividing them to the correct number of decimal places and using sprintf to limit the number of decimal places printed. This will both make sure:
There's always to correct result printed
The rounding error doesn't accumulate
Try this code:
#!/usr/bin/perl
for ($i=0; $i <= 360*100; $i += 1) {
printf "%.2f \n", $i/100;
}
Basically, because the decimal number 0.01 does not have an exact representation in binary floating point, so over time, adding the best approximation to 0.01 deviates from the answer you'd like.
This is basic property of (binary) floating point arithmetic and not peculiar to Perl. What Every Computer Scientist Should Know About Floating-Point Arithmetic is the standard reference, and you can find it very easily with a Google search.
See also: C compiler bug (floating point arithmetic) and no doubt a myriad other questions.
Kernighan & Plauger say, in their old but classic book "The Elements of Programming Style", that:
A wise old programmer once said "floating point numbers are like little piles of sand; every time you move one, you lose a little sand and gain a little dirt".
They also say:
10 * 0.1 is hardly ever 1.0
Both sayings point out that floating point arithmetic is not precise.
Note that some modern CPUs (IBM PowerPC) have IEEE 754:2008 decimal floating point arithmetic built-in. If Perl used the correct types (it probably doesn't), then your calculation would be exact.
To demonstrate Jonathan's answer, if you code up the same loop structure in C, you will get the same results. Although C and Perl may compile differently and be ran on different machines the underlying floating point arithmetic rules should cause consistent outputs. Note: Perl uses double-precision floating point for its floating point representation whereas in C the coder explicitly chooses float or double.
Loop in C:
#include <stdio.h>
int main() {
double i;
for(i=0;i<=1;i+=.01) {
printf("%.15f\n",i);
}
}
Output:
0.790000000000000
0.800000000000000
0.810000000000000
0.820000000000001
0.830000000000001
0.840000000000001
0.850000000000001
To demonstrate the point even further, code the loop in C but now use single-precision floating point arithmetic and see that the output is less precise and even more erratic.
Output:
0.000000000000000
0.009999999776483
0.019999999552965
0.029999999329448
0.039999999105930
0.050000000745058
0.060000002384186
0.070000000298023
0.079999998211861
0.089999996125698
0.099999994039536
1/10 is periodic in binary just like 1/3 is periodic in decimal. As such, it cannot be accurately stored in a floating point number.
>perl -E"say sprintf '%.17f', 0.1"
0.10000000000000001
Either work with integers
for (0*100..360*100) {
my $i = $_/100;
print "$i ";
}
Or do lots of rounding
for (my $i=0; $i <= 360; $i = sprintf('%.2f', $i + 0.01)) {
print "$i ";
}

What's the smallest non-zero, positive floating-point number in Perl?

I have a program in Perl that works with probabilities that can occasionally be very small. Because of rounding error, sometimes one of the probabilities comes out to be zero. I'd like to do a check for the following:
use constant TINY_FLOAT => 1e-200;
my $prob = calculate_prob();
if ( $prob == 0 ) {
$prob = TINY_FLOAT;
}
This works fine, but I actually see Perl producing numbers that are smaller than 1e-200 (I just saw a 8.14e-314 fly by). For my application I can change calculate_prob() so that it returns the maximum of TINY_FLOAT and the actual probability, but this made me curious about how floating point numbers are handled in Perl.
What's the smallest positive floating-point value in Perl? Is it platform-dependent? If so, is there a quick program that I can use to figure it out on my machine?
According to perldoc perlnumber, Perl uses the native floating point format where native is defined as whatever the C compiler that was used to compile it used. If you are more worried about precision/accuracy than speed, take a look at bignum.
The other answers are good. Here is how to find out the approximate ε if you did not know any of that information and could not post your question on SO ;-)
#!/usr/bin/perl
use strict;
use warnings;
use constant MAX_COUNT => 2000;
my ($x, $c);
for (my $y = 1; $y; $y /= 2) {
$x = $y;
# guard against too many iterations
last if ++$c > MAX_COUNT;
}
printf "%d : %.20g\n", $c, $x;
Output:
C:\Temp> thj
1075 : 4.9406564584124654e-324
It may be important to note that that smallest number is what's called a subnormal number, and math done on it may produce surprising results:
$ perl -wle'$x = 4.94e-324; print for $x, $x*1.4, $x*.6'
4.94065645841247e-324
4.94065645841247e-324
4.94065645841247e-324
That's because it uses the smallest allowed (base-2) exponent and a mantissa of the form (base-2) 0.0000000...0001. Larger, but still subnormal, numbers will also have a mantissa beginning 0. and an increasing range of precision.
I actually don't know how perl represents floating point numbers (and I think this is something you configure when you build perl), but if we assume that IEEE 754 is used then epsilon for a 64 bit floating point number is 4.94065645841247E-324.