How to eliminate Perl rounding errors - perl

Consider the following program:
$x=12345678901.234567000;
$y=($x-int($x))*1000000000;
printf("%f:%f\n",$x,$y);
Here's what it prints:
12345678901.234568:234567642.211914
I was expecting:
12345678901.234567:234567000
This appears to be some sort of rounding issue in Perl.
How could I change it to get 234567000 instead?
Did I do something wrong?

This is a frequently-asked question.
Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?
Internally, your computer represents floating-point numbers in binary. Digital (as in powers of two) computers cannot store all numbers exactly. Some real numbers lose precision in the process. This is a problem with how computers store numbers and affects all computer languages, not just Perl.
perlnumber shows the gory details of number representations and conversions.
To limit the number of decimal places in your numbers, you can use the printf or sprintf function. See "Floating-point Arithmetic" in perlop for more details.
printf "%.2f", 10/3;
my $number = sprintf "%.2f", 10/3;
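A small sketch of the point above, assuming a standard Perl built with IEEE 754 doubles: the sum 0.1 + 0.2 is not stored as exactly 0.3, but formatting it with sprintf rounds the noise away. The variable names are only illustrative.

```perl
use strict;
use warnings;

my $sum = 0.1 + 0.2;

# Neither 0.1 nor 0.2 has an exact binary representation,
# so the sum differs slightly from the double closest to 0.3.
my $is_exact = ($sum == 0.3) ? "yes" : "no";

# sprintf rounds to the requested number of decimal places.
my $shown = sprintf "%.2f", $sum;

print "exactly 0.3? $is_exact\n";
print "formatted:   $shown\n";
```

The stored value is unchanged by the formatting; only the string handed to the user is rounded.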

Make "use bignum;" the first line of your program.
Other answers explain what to expect when using floating point arithmetic -- that some digits towards the end are not really part of the answer. This is to make the computations do-able in a reasonable amount of time and space. If you are willing to use unbounded time and space to work with numbers, then you can use arbitrary-precision numbers and math, which is what "use bignum" enables. It's slower and uses more memory, but it works like math you learned in elementary school.
In general, it's best to learn more about how floating point math works before converting your program to arbitrary-precision math. It's only needed in very strange situations.

The whole issue of floating point precision has been answered, but you're still seeing the problem despite bignum. Why? The culprit is printf. bignum is a shallow pragma. It only affects how numbers are represented in variables and math operations. Even though bignum makes Perl do the math right, printf is still implemented in C. %f takes your precise number and turns it right back into an imprecise floating point number.
Print your numbers with just print and they should do fine. You'll have to format them manually.
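As a rough illustration of the advice above, here is the original calculation done with Math::BigFloat directly (the class that bignum uses for non-integers) and printed with plain print. Building the value from a string is an assumption of this sketch: it keeps the literal from ever being parsed as a double.

```perl
use strict;
use warnings;
use Math::BigFloat;

# Build the number from a string so it never passes through a double.
my $x = Math::BigFloat->new("12345678901.234567000");

# bfloor() equals int() here because $x is non-negative.
my $whole = $x->copy->bfloor;
my $y     = ($x - $whole) * 1_000_000_000;

# Plain print uses the object's own string conversion, not C's %f.
print "$x:$y\n";
```

For negative inputs bfloor() and int() differ, so this sketch would need adjusting there.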
The other thing you can do is to recompile Perl with -Duse64bitint -Duselongdouble, which will force Perl to internally use 64 bit integers and long double floating point numbers. This will give you a lot more accuracy, more consistently, at almost no performance cost (bignum is a bit of a performance hog for math intensive code). It's not 100% accurate like bignum, but it will affect things like printf. However, recompiling Perl this way makes it binary incompatible, so you're going to have to recompile all your extensions. If you do this, I suggest installing a fresh Perl in a different location (/usr/local/perl/64bit or something) rather than trying to manage parallel Perl installs sharing the same library.

Homework (Googlework?) for you: How are floating point numbers represented by computers?
You can only have a limited number of precise digits, everything beyond that is just the noise from base conversion (binary to decimal). That is also why the last digit of your $x appears to be 8.
$x - int($x) is 0.23456linenoise, which is also a floating point number. Multiplied by 1000000000, it gives another floating point number, with more random digits pulled from the incommensurability of the bases.

Perl does not do arbitrary precision arithmetic for its built-in floating point types. So your initial variable $x is an approximation. You can see this by doing:
$ perl -e 'printf "%.10f", 12345678901.234567000'
12345678901.2345676422
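If the value arrives as text (from a file or user input), one way to sidestep the approximation is never to convert it to a float at all. The sketch below is only an illustration; frac_digits is a hypothetical helper name, and it assumes the input is a plain decimal string without an exponent.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first $places fractional digits of a
# decimal string, padded with zeros, without converting to a float.
sub frac_digits {
    my ($decimal, $places) = @_;
    my (undef, $frac) = split /\./, $decimal, 2;
    $frac = '' unless defined $frac;
    return substr($frac . ('0' x $places), 0, $places);
}

print frac_digits("12345678901.234567000", 9), "\n";
```

Because only string operations are involved, the digits come back exactly as they were written.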

This answer works on my x64 platform, by accommodating the scale of the errors
sub safe_eq {
my($var1,$var2)=@_;
return 1 if($var1==$var2);
my $dust;
if($var2==0) { $dust=abs($var1); }
else { $dust= abs(($var1/$var2)-1); }
return 0 if($dust>5.32907051820076e-15 ); # 5.32907051820075e-15
return 1;
}
You can build on the above to solve most of your problems.
Avoid bignum if you can - it's stupendously slow - plus it will not solve any problems if you've got to store your numbers anyplace like a DB or in JSON etc.

This has to do with the (limited) accuracy of the floating point computations a computer does. Generally when comparing floating point numbers you should compare with a suitable epsilon:
$value1 == $value2 or warn;
won't work as expected in most cases. You should do
use constant EPSILON => 1.0e-10;
abs($value1 - $value2) < EPSILON or warn;
EPSILON should be chosen such that it takes into account the complexity of the computations for valueX. A large computation might lead to a much, much larger EPSILON.
The other option is, as suggested by others:
sprintf("%.5f", $value1) eq sprintf("%.5f", $value2) or warn;
Or use an arbitrary precision math library.
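A fixed EPSILON works poorly when the values are very large or very small, so a common variation combines a relative tolerance with an absolute floor. The following is a sketch under that assumption; nearly_equal and its default tolerances are hypothetical and should be tuned to the computation at hand.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $p and $q agree to within a relative
# tolerance, with an absolute floor for values near zero.
sub nearly_equal {
    my ($p, $q, $rel_tol, $abs_tol) = @_;
    $rel_tol = 1e-9  unless defined $rel_tol;
    $abs_tol = 1e-12 unless defined $abs_tol;
    my $diff  = abs($p - $q);
    my $scale = abs($p) > abs($q) ? abs($p) : abs($q);
    return $diff <= $abs_tol || $diff <= $rel_tol * $scale;
}

print nearly_equal(0.1 + 0.2, 0.3) ? "equal\n" : "different\n";
print nearly_equal(1, 1.001)       ? "equal\n" : "different\n";
```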

Related

Numerical convergence and minimum number size

I have a program which calculates probability values (p-values), but it is entering a very large negative number into the exp function: exp(-626294.830), which evaluates to zero instead of the very small positive number that it should be.
How can I get this to evaluate as a very small floating point number?
I have tried Math::BigFloat, bignum, and bigrat, but all have failed.
Wolfram Alpha says that exp(-626294.830) is 4.08589×10^-271997... zero is a pretty close approximation to that ;-) Although you've edited and removed the context from your question, do you really need to work with such tiny numbers, or perhaps there is some way you could optimize your algorithm or scale your numbers?
Anyway, you are correct that code like Math::BigFloat->new("-626294.830")->bexp seems to take quite some time, even with the support of use Math::BigFloat lib => 'GMP';.
The only alternative I can offer at the moment is Math::Prime::Util::GMP's expreal, although you need to specify a precision to it.
use Math::Prime::Util::GMP qw/expreal/;
use Math::BigFloat;
my $e = Math::BigFloat->new(expreal(-626294.830,272000));
print $e->bnstr,"\n";
__END__
4.086e-271997
But on my machine, even that still takes ~20s to run, which brings us back to the question of potential optimization in other places.
Floating point numbers do not have infinite precision. Assuming the number is represented as an IEEE 754 double, we have 52 bits for the fraction, 11 bits for the exponent, and one bit for the sign. Due to the way exponents are encoded, the smallest positive normal number that can be represented is 2^-1022 (subnormal numbers reach down to 2^-1074).
If we look at your number e^-626294.830, we can do a change of base and see that it equals 2^(log_2 e · -626294.830) = 2^-903552.445, which is significantly smaller than 2^-1022. Approximating your number as zero is therefore correct.
Instead of calculating this value using arbitrary-precision numerics, you are likely better off solving the necessary equations by hand, then coding this in a way that does not require extreme precision. For example, it is unlikely that you need the exact value of e^-626294.830, but perhaps just the magnitude. Then, you can calculate the logarithm instead of using exp().
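To make the "work with the logarithm" idea concrete, here is a sketch that recovers the decimal mantissa and exponent of e^-626294.830 from its logarithm alone, using ordinary doubles. The helper name sci_from_log is hypothetical, and the mantissa is only good to roughly ten significant digits because of the size of the exponent.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Split e**$log_value into a decimal mantissa and exponent without
# ever calling exp() on the huge negative argument.
sub sci_from_log {
    my ($log_value) = @_;
    my $log10    = $log_value / log(10);   # natural log -> base-10 log
    my $exponent = floor($log10);
    my $mantissa = 10 ** ($log10 - $exponent);
    return ($mantissa, $exponent);
}

my ($m, $e) = sci_from_log(-626294.830);
printf "%.3fe%d\n", $m, $e;
```

Products and ratios of such probabilities become sums and differences of their logarithms, which stay comfortably within double range.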

Is it possible to predict when Perl's decimal/float math will be wrong? [duplicate]

In one respect, I understand that Perl's floats are inexact binary representations, which causes Perl's math to sometimes be wrong. What I don't understand is why these floats sometimes seem to give exact answers, and other times not. Is it possible to predict when Perl's float math will give the wrong (i.e. inexact) answer?
For instance, in the below code, Perl's math is wrong 1 time when the subtraction is "16.12 - 15.13", wrong 2 times when the problem is "26.12 - 25.13", and wrong 20 times when the problem is "36.12 - 35.13". Furthermore, for some reason, in all of the above mentioned test cases, the result of our subtraction problem (i.e. $subtraction_problem) starts out as being wrong, but will tend to become more correct, the more we add or subtract from it (with $x). This makes no sense, why is it that the more we add to or subtract from our arithmetic problem, the more likely it becomes that the value is correct (i.e. exact)?
my $subtraction_problem = 16.12 - 15.13;
my $perl_math_failures = 0;
for (my $x = -25; $x< 25; $x++){
my $result = $subtraction_problem +$x;
print "$result\n";
$perl_math_failures++ if length $result > 6;
}
print "There were $perl_math_failures perl math failures!\n";
None of this is Perl specific. See Goldberg:
Rounding Error
Squeezing infinitely many real numbers into a finite number of bits requires an approximate representation. Although there are infinitely many integers, in most programs the result of integer computations can be stored in 32 bits. In contrast, given any fixed number of bits, most calculations with real numbers will produce quantities that cannot be exactly represented using that many bits. Therefore the result of a floating-point calculation must often be rounded in order to fit back into its finite representation. This rounding error is the characteristic feature of floating-point computation. The section Relative Error and Ulps describes how it is measured.
Since most floating-point calculations have rounding error anyway, does it matter if the basic arithmetic operations introduce a little bit more rounding error than necessary? That question is a main theme throughout this section. The section Guard Digits discusses guard digits, a means of reducing the error when subtracting two nearby numbers. Guard digits were considered sufficiently important by IBM that in 1968 it added a guard digit to the double precision format in the System/360 architecture (single precision already had a guard digit), and retrofitted all existing machines in the field. Two examples are given to illustrate the utility of guard digits.
The IEEE standard goes further than just requiring the use of a guard digit. It gives an algorithm for addition, subtraction, multiplication, division and square root, and requires that implementations produce the same result as that algorithm. Thus, when a program is moved from one machine to another, the results of the basic operations will be the same in every bit if both machines support the IEEE standard. This greatly simplifies the porting of programs. Other uses of this precise specification are given in Exactly Rounded Operations.
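One part of the question can be predicted mechanically: whether a decimal literal is stored exactly in the first place. A fraction n/10^k has a finite binary expansion only when the factor 5^k cancels. The sketch below checks that for short literals; exactly_representable is a hypothetical helper, and it ignores the 53-bit limit of the significand as well as any rounding that later arithmetic introduces.

```perl
use strict;
use warnings;

# Hypothetical helper: can this decimal literal be stored exactly?
# A fraction n / 10**k is exact in binary only if n is divisible
# by 5**k. Only suitable for short literals (see the lead-in).
sub exactly_representable {
    my ($literal) = @_;
    my (undef, $frac) = split /\./, $literal, 2;
    return 1 unless defined $frac && length $frac;
    return ($frac % (5 ** length($frac))) == 0;
}

for my $literal (qw(0.5 0.375 0.1 16.12 15.13)) {
    printf "%-6s %s\n", $literal,
        exactly_representable($literal) ? "exact" : "approximate";
}
```

Whether the error of an inexact operand shows up in a printed result depends on how the later additions happen to round, which is why the failures in the question look irregular.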

Rounding Pi in Perl to the 100 millionth decimal place?

For a Science Fair project, I am testing how your choice of programming language could affect performance. I am doing this by making scripts in Java, Ruby, Perl, and Python to calculate Pi to the 100 millionth decimal place. I'm starting with Perl, since I'm most familiar with Perl. However, this brings an interesting problem to the table. I need to round Pi to the 100 millionth digit in Perl, but as far as I can see, Perl has no good rounding method for this situation. There's only stuff like
use Math::Round;
$rounded = nearest(0.1, $numb);
And that's a bit of a problem, since I don't want to sit at my computer typing 100 million zeros. As far as I know, sprintf and printf aren't any better; plus, they have that annoying half to even thing. Can anyone help out?
P.S. I'm planning to use the Chudnovsky Formula, if it matters to anyone.
I don't think any programming language can natively do what you are asking. Even bignum libraries like Math::BigRat (default 40 digits) and Math::Bignum cannot do 100 million digits.
To make it happen, you will have to create your own custom way to represent such big numbers and how to round them.
Think about the problem in another way. You need to round to 100 million (1E8) digits but you don't need to process all 1E8 digits in one go to do that.
Instead:
1. Use the Chudnovsky Formula to calculate 1E8 + 1 digits.
2. Store the digits in a string (if you have the memory) or a file.
3. Select the last n (something small like 8 or even 2) digits.
4. If they aren't all 9, round to n-1 digits.
5. If they are, then convert them to (n-1) * 0 digits. Then read the next n digits from the end and repeat 4 and 5.
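The carry handling in the steps above can be sketched on a plain digit string. This is an illustration only: round_last_digit is a hypothetical helper, it processes one digit at a time rather than blocks of n, and it assumes round-half-up on a single extra guard digit.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the final (guard) digit of a digit string
# and round the rest half-up, propagating the carry through any 9s.
# Returns the rounded digits and a flag for a carry out of the string.
sub round_last_digit {
    my ($digits) = @_;
    my $guard = substr($digits, -1, 1, '');    # remove the guard digit
    return ($digits, 0) if $guard < 5;

    my $i = length($digits) - 1;
    while ($i >= 0 && substr($digits, $i, 1) eq '9') {
        substr($digits, $i, 1) = '0';          # 9 rolls over to 0
        $i--;
    }
    return ($digits, 1) if $i < 0;             # every digit was a 9
    substr($digits, $i, 1) = substr($digits, $i, 1) + 1;
    return ($digits, 0);
}

my ($rounded, $carry) = round_last_digit("14159");
print "$rounded (carry $carry)\n";
```

For a file of 1E8 digits the same loop would walk backwards from the end of the file, which in practice touches only a handful of digits.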
However, if the goal is to test relative performance of languages by generating 1E8 digits of Pi, then why bother focusing on the rather artificial constraint of rounding that number? If you use the same algorithm then any language should produce the same result. And you have a 50% chance of generating a rounded number anyway.
This is one step closer (though I haven't tested whether it can handle 100 million zeros). You'll need to use bignum to handle those sorts of numbers.
use bignum;
use Math::Round;
$rounded = nearest(1e-100_000_001, $numb);
Also, bignum has its own pi function with an accuracy parameter:
$rounded = bignum::bpi(100_000_001);

comparing float and double and printing them

I have a quick question. Say I have a really big number, up to about 15 digits, and I take the input and assign it to two variables, one float and one double. If I were to compare the two numbers, how would you compare them? I think double has precision up to about 15 digits, and float has 8? So, do I simply compare them while the float only contains 8 digits and pad the rest, or do I have the float print out all 15 digits and then make the comparison? Also, if I were asked to print out the float number, is the standard way of doing it just printing it up to 8 digits, which is its max precision?
thanks
Most languages will do some form of type promotion to let you compare types that are not identical, but reasonably similar. For details, you would have to indicate what language you are referring to.
Of course, the real problem with comparing floating point numbers is that the results might be unexpected due to rounding errors. Most mathematical equivalences don't hold for floating point arithmetic, so two sequences of operations which SHOULD yield the same value might actually yield slightly different values (or even very different values if you aren't careful).
EDIT: as for printing, the "standard way" is based on what you need. If, for some reason, you are doing monetary computations in floating point, chances are that you'll only want to print 2 decimal digits.
Thinking in terms of digits may be a problem here. Floats can have a range from negative infinity to positive infinity. In C# for example the range is ±1.5 × 10^−45 to ±3.4 × 10^38 with a precision of 7 digits.
Also, IEEE 754 defines floats and doubles.
Here is a link that might help http://en.wikipedia.org/wiki/IEEE_floating_point
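The effect of storing the same input in a float and a double can be seen from Perl, whose numbers are doubles, by round-tripping a value through pack's native single-precision format. This is only a sketch; as_single is a hypothetical helper name.

```perl
use strict;
use warnings;

# pack 'f' stores a value as a native single-precision float;
# unpacking widens it back to Perl's double-precision number.
sub as_single {
    my ($value) = @_;
    return unpack('f', pack('f', $value));
}

for my $value (0.5, 0.1) {
    my $single = as_single($value);
    printf "%s: float %.17g, double %.17g, %s\n",
        $value, $single, $value,
        $single == $value ? "equal" : "not equal";
}
```

A value such as 0.5 survives the narrowing unchanged, while 0.1 does not, so a direct equality test between the float and the double succeeds only for values that both formats can hold exactly.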
Your question is the right one. You want to consider your approach, though.
Whether at 32 or 64 bits, the floating-point representation is not meant to compare numbers for equality. For example, the assertion 2.0/7.0 == 60.0/210.0 may or may not be true in the CPU's view. Conceptually, the floating-point is inherently meant to be imprecise.
If you wish to compare numbers for equality, use integers. Consider again the ratios of the last paragraph. The assertion that 2*210 == 7*60 is always true -- noting that those are the integral versions of the same four numbers as before, only related using multiplication rather than division. One suspects that what you are really looking for is something like this.
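The cross-multiplication idea from the paragraph above can be illustrated as follows. This is not the original poster's code; same_ratio is a hypothetical helper, and it assumes integer inputs small enough that the products do not overflow.

```perl
use strict;
use warnings;

# Hypothetical helper: compare n1/d1 with n2/d2 using only integer
# multiplication, so no rounding can occur.
sub same_ratio {
    my ($n1, $d1, $n2, $d2) = @_;
    return $n1 * $d2 == $n2 * $d1;
}

print same_ratio(2, 7, 60, 210) ? "equal\n" : "different\n";
print same_ratio(1, 3, 33, 100) ? "equal\n" : "different\n";
```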

Matlab precision: simple subtraction is not zero

I compute this simple sum on Matlab:
2*0.04-0.5*0.4^2 = -1.387778780781446e-017
but the result is not zero. What can I do?
Aabaz and Jim Clay have good explanations of what's going on.
It's often the case that, rather than exactly calculating the value of 2*0.04 - 0.5*0.4^2, what you really want is to check whether 2*0.04 and 0.5*0.4^2 differ by an amount that is small enough to be within the relevant numerical precision. If that's the case, then rather than checking whether 2*0.04 - 0.5*0.4^2 == 0, you can check whether abs(2*0.04 - 0.5*0.4^2) < thresh. Here thresh can either be some arbitrary smallish number, or an expression involving eps, which gives the precision of the numerical type you're working with.
EDIT:
Thanks to Jim and Tal for suggested improvement. Altered to compare the absolute value of the difference to a threshold, rather than the difference.
Matlab uses double-precision floating-point numbers to store real numbers. These are numbers of the form m*2^e where m is an integer between 2^52 and 2^53 (the mantissa) and e is the exponent. Let's call a number a floating-point number if it is of this form.
All numbers used in calculations must be floating-point numbers. Often, this can be done exactly, as with 2 and 0.5 in your expression. But for other numbers, most notably most numbers with digits after the decimal point, this is not possible, and an approximation has to be used. What happens in this case is that the number is rounded to the nearest floating-point number.
So, whenever you write something like 0.04 in Matlab, you're really saying "Get me the floating-point number that is closest to 0.04." In your expression, there are 2 numbers that need to be approximated: 0.04 and 0.4.
In addition, the exact result of operations like addition and multiplication on floating-point numbers may not be a floating-point number. Although it is always of the form m*2^e the mantissa may be too large. So you get an additional error from rounding the results of operations.
At the end of the day, a simple expression like yours will be off by about 2^-52 times the size of the operands, or about 10^-17.
In summary: the reason your expression does not evaluate to zero is two-fold:
Some of the numbers you start out with are different (approximations) to the exact numbers you provided.
The intermediate results may also be approximations of the exact results.
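The same behaviour can be reproduced outside Matlab, since it comes from IEEE 754 doubles rather than from the language. The sketch below is written in Perl; it finds the machine epsilon by halving and then tests the expression against a threshold scaled to the operands. The factor of 4 in the threshold is an arbitrary safety margin chosen for this example.

```perl
use strict;
use warnings;

# Find machine epsilon: the gap between 1 and the next larger float.
my $eps = 1;
$eps /= 2 while 1 + $eps / 2 > 1;

my $result = 2 * 0.04 - 0.5 * 0.4 ** 2;

# Allow a few rounding errors, scaled by the size of the operands.
my $thresh = 4 * $eps * 0.08;

printf "eps = %g, result = %g\n", $eps, $result;
print abs($result) < $thresh ? "zero within tolerance\n" : "nonzero\n";
```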
What you are seeing is quantization error. Matlab uses doubles to represent numbers, and while they are capable of a lot of precision, they still cannot represent all real numbers because there are an infinite number of real numbers. I'm not sure about Aabaz's trick, but in general I would say there isn't anything you can do, other than perhaps massaging your inputs to be double-friendly numbers.
I do not know if it is applicable to your problem but often the simplest solution is to scale your data.
For example:
a=0.04;
b=0.2;
a-0.2*b
ans=-6.9389e-018
c=a/min(abs([a b]));
d=b/min(abs([a b]));
c-0.2*d
ans=0
EDIT: of course I did not mean to give a universal solution to these kind of problems but it is still a good practice that can make you avoid a few problems in numerical computation (curve fitting, etc ...). See Jim Clay's answer for the reason why you are experiencing these problems.
I'm pretty sure this is a case of ye olde floating point accuracy issues.
Do you need 1e-17 accuracy? Is this merely a case of wanting 'pretty' output?
In that case, you can just use a formatted sprintf to display the number of significant digits you want.
Realize that this is not a matlab problem, but a fundamental limitation of how numbers are represented in binary.
For fun, work out what .1 is in binary...
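For anyone who wants to check their working, here is a small sketch that expands a fraction in binary by repeated doubling, using integer arithmetic only so that the expansion itself is exact. binary_fraction is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: first $bits binary digits of $num/$den
# (with 0 <= $num < $den), found by repeated doubling.
sub binary_fraction {
    my ($num, $den, $bits) = @_;
    my $out = '';
    for (1 .. $bits) {
        $num *= 2;
        if ($num >= $den) {
            $out .= '1';
            $num -= $den;
        }
        else {
            $out .= '0';
        }
    }
    return $out;
}

print "0.", binary_fraction(1, 10, 24), "...\n";
```

The digits settle into a repeating 0011 pattern that never terminates, which is why .1 cannot be stored exactly in any finite number of bits.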
Some references:
http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems
http://www.mathworks.com/support/tech-notes/1100/1108.html