This question already has answers here:
Does scientific notation affect Perl's precision?
(1 answer)
Perl int function and padding zeros
(2 answers)
Closed 10 years ago.
I have a simple for loop in Perl
for ($i=0; $i <= 360; $i += 0.01)
{
print "$i ";
}
Why is it that when I run this code I get the following output, where as soon as it gets to 0.81 it suddenly starts to add in a load more decimal places? I know I could simply round up to avoid this issue but I was wondering why it happens. An increment of 0.01 does not seem at all crazy to do.
0.77
0.78
0.79
0.8
0.81
0.820000000000001
0.830000000000001
0.840000000000001
0.850000000000001
0.860000000000001
0.870000000000001
Computers use binary representations. Not all decimal floating point numbers have exact representations in binary notation, so some error can occur (its actually a rounding difference). This is the same reason why you shouldn't use floating point numbers for monetary values:
(Picture taken from dailywtf)
Most elegant way to get around this issue is using integers for calculations, dividing them to the correct number of decimal places and using sprintf to limit the number of decimal places printed. This will both make sure:
There's always to correct result printed
The rounding error doesn't accumulate
Try this code:
#!/usr/bin/perl
for ($i=0; $i <= 360*100; $i += 1) {
printf "%.2f \n", $i/100;
}
Basically, because the decimal number 0.01 does not have an exact representation in binary floating point, so over time, adding the best approximation to 0.01 deviates from the answer you'd like.
This is basic property of (binary) floating point arithmetic and not peculiar to Perl. What Every Computer Scientist Should Know About Floating-Point Arithmetic is the standard reference, and you can find it very easily with a Google search.
See also: C compiler bug (floating point arithmetic) and no doubt a myriad other questions.
Kernighan & Plauger say, in their old but classic book "The Elements of Programming Style", that:
A wise old programmer once said "floating point numbers are like little piles of sand; every time you move one, you lose a little sand and gain a little dirt".
They also say:
10 * 0.1 is hardly ever 1.0
Both sayings point out that floating point arithmetic is not precise.
Note that some modern CPUs (IBM PowerPC) have IEEE 754:2008 decimal floating point arithmetic built-in. If Perl used the correct types (it probably doesn't), then your calculation would be exact.
To demonstrate Jonathan's answer, if you code up the same loop structure in C, you will get the same results. Although C and Perl may compile differently and be ran on different machines the underlying floating point arithmetic rules should cause consistent outputs. Note: Perl uses double-precision floating point for its floating point representation whereas in C the coder explicitly chooses float or double.
Loop in C:
#include <stdio.h>
int main() {
double i;
for(i=0;i<=1;i+=.01) {
printf("%.15f\n",i);
}
}
Output:
0.790000000000000
0.800000000000000
0.810000000000000
0.820000000000001
0.830000000000001
0.840000000000001
0.850000000000001
To demonstrate the point even further, code the loop in C but now use single-precision floating point arithmetic and see that the output is less precise and even more erratic.
Output:
0.000000000000000
0.009999999776483
0.019999999552965
0.029999999329448
0.039999999105930
0.050000000745058
0.060000002384186
0.070000000298023
0.079999998211861
0.089999996125698
0.099999994039536
1/10 is periodic in binary just like 1/3 is periodic in decimal. As such, it cannot be accurately stored in a floating point number.
>perl -E"say sprintf '%.17f', 0.1"
0.10000000000000001
Either work with integers
for (0*100..360*100) {
my $i = $_/100;
print "$i ";
}
Or do lots of rounding
for (my $i=0; $i <= 360; $i = sprintf('%.2f', $i + 0.01)) {
print "$i ";
}
Related
This question already has answers here:
Closed 13 years ago.
Possible Duplicates:
Why is floating point arithmetic in C# imprecise?
Why does ghci say that 1.1 + 1.1 + 1.1 > 3.3 is True?
#!/usr/bin/perl
$l1 = "0+0.590580+0.583742+0.579787+0.564928+0.504538+0.459805+0.433273+0.384211+0.3035810";
$l2 = "0+0.590580+0.583742+0.579788+0.564928+0.504538+0.459805+0.433272+0.384211+0.3035810";
$val1 = eval ($l1);
$val2 = eval ($l2);
$diff = (($val1 - $val2)/$val1)*100;
print " (($val1 - $val2)/$val1)*100 ==> $diff\n";
Surprisingly the output ended up to be
((4.404445 - 4.404445)/4.404445)*100 ==> -2.01655014354845e-14.
Is it not supposed to be a ZERO????
Can any one explain this please......
What every computer scientist should know about floating point arithmetic
See Why is floating point arithmetic in C# imprecise?
This isn't Perl related, but floating point related.
It's pretty close to zero, which is what I'd expect.
Why's it supposed to be zero? 0.579787 != 0.579788 and 0.433273 != 0.433272. It's likely that none of these have an exact floating point representation so you should expect some inaccuracies.
From perlfaq4's answer to Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?:
Internally, your computer represents floating-point numbers in binary. Digital (as in powers of two) computers cannot store all numbers exactly. Some real numbers lose precision in the process. This is a problem with how computers store numbers and affects all computer languages, not just Perl.
perlnumber shows the gory details of number representations and conversions.
To limit the number of decimal places in your numbers, you can use the printf or sprintf function. See the "Floating Point Arithmetic" for more details.
printf "%.2f", 10/3;
my $number = sprintf "%.2f", 10/3;
When you change the two strings to be equal (there are 2 digits different between $l1 and $l2) it does indeed result in zero.
What it is demonstrating is that you can create 2 different floating point numbers ($val1 and $val2) that look the same when printed out, but internally have a tiny difference. These differences can be magnified up if you're not careful.
Vinko Vrsalovic posted some good links to explain why.
I use mod() to compare if a number's 0.01 digit is 2 or not.
if mod(5.02*100, 10) == 2
...
end
The result is mod(5.02*100, 10) = 2 returns 0;
However, if I use mod(1.02*100, 10) = 2 or mod(20.02*100, 10) = 2, it returns 1.
The result of mod(5.02*100, 10) - 2 is
ans =
-5.6843e-14
Could it be possible that this is a bug for matlab?
The version I used is R2013a. version 8.1.0
This is not a bug in MATLAB. It is a limitation of floating point arithmetic and conversion between binary and decimal numbers. Even a simple decimal number such as 0.1 has cannot be exactly represented as a binary floating point number with finite precision.
Computer floating point arithmetic is typically not exact. Although we are used to dealing with numbers in decimal format (base10), computers store and process numbers in binary format (base2). The IEEE standard for double precision floating point representation (see http://en.wikipedia.org/wiki/Double-precision_floating-point_format, what MATLAB uses) specifies the use of 64 bits to represent a binary number. 1 bit is used for the sign, 52 bits are used for the mantissa (the actual digits of the number), and 11 bits are used for the exponent and its sign (which specifies where the decimal place goes).
When you enter a number into MATLAB, it is immediately converted to binary representation for all manipulations and arithmetic and then converted back to decimal for display and output.
Here's what happens in your example:
Convert to binary (keeping only up to 52 digits):
5.02 => 1.01000001010001111010111000010100011110101110000101e2
100 => 1.1001e6
10 => 1.01e3
2 => 1.0e1
Perform multiplication:
1.01000001010001111010111000010100011110101110000101 e2
x 1.1001 e6
--------------------------------------------------------------
0.000101000001010001111010111000010100011110101110000101
0.101000001010001111010111000010100011110101110000101
+ 1.01000001010001111010111000010100011110101110000101
-------------------------------------------------------------
1.111101011111111111111111111111111111111111111111111101e8
Cutting off at 52 digits gives 1.111101011111111111111111111111111111111111111111111e8
Note that this is not the same as 1.11110110e8 which would be 502.
Perform modulo operation: (there may actually be additional error here depending on what algorithm is used within the mod() function)
mod( 1.111101011111111111111111111111111111111111111111111e8, 1.01e3) = 1.111111111111111111111111111111111111111111100000000e0
The error is exactly -2-44 which is -5.6843x10-14. The conversion between decimal and binary and the rounding due to finite precision have caused a small error. In some cases, you get lucky and rounding errors cancel out and you might still get the 'right' answer which is why you got what you expect for mod(1.02*100, 10), but In general, you cannot rely on this.
To use mod() correctly to test the particular digit of a number, use round() to round it to the nearest whole number and compensate for floating point error.
mod(round(5.02*100), 10) == 2
What you're encountering is a floating point error or artifact, like the commenters say. This is not a Matlab bug; it's just how floating point values work. You'd get the same results in C or Java. Floating point values are "approximate" types, so exact equality comparisons using == without some rounding or tolerance are prone to error.
>> isequal(1.02*100, 102)
ans =
1
>> isequal(5.02*100, 502)
ans =
0
It's not the case that 5.02 is the only number this happens for; several around 0 are affected. Here's an example that picks out several of them.
x = 1.02:1000.02;
ix = mod(x .* 100, 10) ~= 2;
disp(x(ix))
To understand the details of what's going on here (and in many other situations you'll encounter working with floats), have a read through the Wikipedia entry for "floating point", or my favorite article on it, "What Every Computer Scientist Should Know About Floating-Point Arithmetic". (That title is hyperbole; this article goes deep and I don't understand half of it. But it's a great resource.) This stuff is particularly relevant to Matlab because Matlab does everything in floating point by default.
I need to read some numbers in a database and write them into a text file using Perl.
In the table where are the numbers, the data format is defined as numeric (25,5) (it reads 25 digits, including 5 decimals).
I format the numbers in my file with a sprintf "%.5f", $myvalue to force 5 decimals and I just noticed that for greats values, there is a precision loss for numbers with more than 17 digits :
db = 123.12345
file = 123.12345 (OK)
db = 12345678901234891.12345
file = 12345678901234892.00000 (seems to be rounded to upper integer)
db = 12345678901234567890.12345
file = 12345678901234567000.00000 (truncation ?)
What is Perl's greatest precision for fixed decimal numbers?
I am aware of the concepts and limitations of floating point arithmetic in general, but I am not a Perl monk and I do not know the internals of Perl so I don't know if it is normal (or if it is related at all to floating point). I am not sure either if it is a internal limitation of Perl, or a problem related to the sprintf processing.
Is there a workaround or a dedicated module that could help with that problem?
Some notable points :
this is an additional feature of a system that already uses Perl, so using another tool is not an option
the data being crunched is financial so I need to keep every cent and I cannot cope with a +/- 10 000 units precision :^S
Once again, I am finding a solution right after asking SO. I am putting my solution here, to help a future visitor :
replace
$myout = sprintf "%.5f", $myvalue;
by
use Math::BigFloat;
$myout = Math::BigFloat->new($myvalue)->ffround( -5 )->bstr;
Without modules like Math::BigFloat, everything above 16 digits is pure magic... e.g.
perl -e 'printf "*10^%02d: %-.50g\n", $_, log(42)*(10**$_) for (0..20)'
produces
*10^00: 3.7376696182833684112267746968427672982215881347656
*10^01: 37.376696182833683224089327268302440643310546875
*10^02: 373.76696182833683224089327268302440643310546875
*10^03: 3737.6696182833684360957704484462738037109375
*10^04: 37376.6961828336861799471080303192138671875
*10^05: 373766.96182833681814372539520263671875
*10^06: 3737669.6182833681814372539520263671875
*10^07: 37376696.18283368647098541259765625
*10^08: 373766961.82833683490753173828125
*10^09: 3737669618.283368587493896484375
*10^10: 37376696182.83368682861328125
*10^11: 373766961828.33685302734375
*10^12: 3737669618283.36865234375
*10^13: 37376696182833.6875
*10^14: 373766961828336.8125
*10^15: 3737669618283368.5
*10^16: 37376696182833688
*10^17: 373766961828336832
*10^18: 3737669618283368448
*10^19: 37376696182833684480
*10^20: 373766961828336828416
What is Perl's greatest precision for fixed decimal numbers?
Perl doesn't have fixed point decimal numbers. Very few languages do, actually. You could use a module like Math::FixedPoint, though
Perl is storing your values as floating-point numbers internally.1 The precision is dependent on how your version of Perl is compiled, but it's probably a 64-bit double.
C:\>perl -MConfig -E "say $Config::Config{doublesize}"
8
A 64-bit double-precision float2 has a 53-bit significand (a.k.a. fraction or mantissa) which gives it approximately 16 decimal characters of precision. Your database is defined as storing 25 characters of precision. You'll be fine if you treat the data as a string but if you treat it as a number you'll lose precision.
Perl's bignum pragma provides transparent support for arbitrarily large numbers. It can slow things down considerably so limit its use to the smallest possible scope. If you want big floats only (without making other numeric types "big") use Math::BigFloat instead.
1. Internally, perl uses a datatype called an SV that can hold floats, ints, and/or strings simultaneously.
2. Assuming IEEE 754 format.
Alternatively, if you're just transferring the values from the database to a text file and not operating on them as numbers, then have the DB format them as strings. Then read and print them as strings (perhaps using "printf '%s'"). For example:
select Big_fixed_point_col(format '-Z(24)9.9(5)')(CHAR(32))
This question already has answers here:
Closed 13 years ago.
Possible Duplicates:
Why is floating point arithmetic in C# imprecise?
Why does ghci say that 1.1 + 1.1 + 1.1 > 3.3 is True?
#!/usr/bin/perl
$l1 = "0+0.590580+0.583742+0.579787+0.564928+0.504538+0.459805+0.433273+0.384211+0.3035810";
$l2 = "0+0.590580+0.583742+0.579788+0.564928+0.504538+0.459805+0.433272+0.384211+0.3035810";
$val1 = eval ($l1);
$val2 = eval ($l2);
$diff = (($val1 - $val2)/$val1)*100;
print " (($val1 - $val2)/$val1)*100 ==> $diff\n";
Surprisingly the output ended up to be
((4.404445 - 4.404445)/4.404445)*100 ==> -2.01655014354845e-14.
Is it not supposed to be a ZERO????
Can any one explain this please......
What every computer scientist should know about floating point arithmetic
See Why is floating point arithmetic in C# imprecise?
This isn't Perl related, but floating point related.
It's pretty close to zero, which is what I'd expect.
Why's it supposed to be zero? 0.579787 != 0.579788 and 0.433273 != 0.433272. It's likely that none of these have an exact floating point representation so you should expect some inaccuracies.
From perlfaq4's answer to Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?:
Internally, your computer represents floating-point numbers in binary. Digital (as in powers of two) computers cannot store all numbers exactly. Some real numbers lose precision in the process. This is a problem with how computers store numbers and affects all computer languages, not just Perl.
perlnumber shows the gory details of number representations and conversions.
To limit the number of decimal places in your numbers, you can use the printf or sprintf function. See the "Floating Point Arithmetic" for more details.
printf "%.2f", 10/3;
my $number = sprintf "%.2f", 10/3;
When you change the two strings to be equal (there are 2 digits different between $l1 and $l2) it does indeed result in zero.
What it is demonstrating is that you can create 2 different floating point numbers ($val1 and $val2) that look the same when printed out, but internally have a tiny difference. These differences can be magnified up if you're not careful.
Vinko Vrsalovic posted some good links to explain why.
I have a program in Perl that works with probabilities that can occasionally be very small. Because of rounding error, sometimes one of the probabilities comes out to be zero. I'd like to do a check for the following:
use constant TINY_FLOAT => 1e-200;
my $prob = calculate_prob();
if ( $prob == 0 ) {
$prob = TINY_FLOAT;
}
This works fine, but I actually see Perl producing numbers that are smaller than 1e-200 (I just saw a 8.14e-314 fly by). For my application I can change calculate_prob() so that it returns the maximum of TINY_FLOAT and the actual probability, but this made me curious about how floating point numbers are handled in Perl.
What's the smallest positive floating-point value in Perl? Is it platform-dependent? If so, is there a quick program that I can use to figure it out on my machine?
According to perldoc perlnumber, Perl uses the native floating point format where native is defined as whatever the C compiler that was used to compile it used. If you are more worried about precision/accuracy than speed, take a look at bignum.
The other answers are good. Here is how to find out the approximate ε if you did not know any of that information and could not post your question on SO ;-)
#!/usr/bin/perl
use strict;
use warnings;
use constant MAX_COUNT => 2000;
my ($x, $c);
for (my $y = 1; $y; $y /= 2) {
$x = $y;
# guard against too many iterations
last if ++$c > MAX_COUNT;
}
printf "%d : %.20g\n", $c, $x;
Output:
C:\Temp> thj
1075 : 4.9406564584124654e-324
It may be important to note that that smallest number is what's called a subnormal number, and math done on it may produce surprising results:
$ perl -wle'$x = 4.94e-324; print for $x, $x*1.4, $x*.6'
4.94065645841247e-324
4.94065645841247e-324
4.94065645841247e-324
That's because it uses the smallest allowed (base-2) exponent and a mantissa of the form (base-2) 0.0000000...0001. Larger, but still subnormal, numbers will also have a mantissa beginning 0. and an increasing range of precision.
I actually don't know how perl represents floating point numbers (and I think this is something you configure when you build perl), but if we assume that IEEE 754 is used then epsilon for a 64 bit floating point number is 4.94065645841247E-324.