Why does 0.1 * 10.0 == 1 - matlab

In MATLAB the following is true
0.1 * 10.0 == 1
But 0.1 is not represented exactly in floating point, so I expected it to not be true. Did I just get lucky and the error happened to be smaller than eps, so it got rounded to 1?
MATLAB implements IEEE 754, so I think it should apply to all languages. But this post makes me think it might be something specific to MATLAB.

Your specific example is true for any language which uses IEEE 754 floating point arithmetic (well, 64-bit at least).
The literal 0.1 is exactly
0.1000000000000000055511151231257827021181583404541015625
10.0 is exactly 10
Their product is therefore
1.000000000000000055511151231257827021181583404541015625
The two closest floating point values are:
1.0
1.0000000000000002220446049250313080847263336181640625
of which the first is closest, so the result is rounded to that.
(I'm not 100% sure what is going on in that example you link to: I suspect it has to do with C# using higher intermediate precision)
In general, however, this sort of thing isn't true. e.g. 0.51255*1e5 isn't 51255 (though MATLAB may lie when printing, try 0.51255*1e5-51255).
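The rounding argument above can be checked in any IEEE 754 language; here is a sketch in Python (whose floats are also 64-bit doubles, so it mirrors the MATLAB behavior):

```python
from decimal import Decimal

# Decimal(0.1) shows the exact value of the double nearest to 0.1,
# matching the long expansion quoted above:
print(Decimal(0.1))

# The exact product rounds to the double 1.0, so the comparison holds:
print(0.1 * 10.0 == 1)         # True

# But the same luck is not guaranteed in general:
print(0.51255 * 1e5 == 51255)  # False on IEEE 754 doubles
```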

Related

Why does the multiplicative inverse of 0 result in infinity?

I am writing a program in Swift that takes the multiplicative inverse of random bytes. Sometimes, the byte is 0, and when the multiplicative inverse is taken, it results in inf.
The multiplicative inverse is being determined using
powf(Float(byte), -1.0)
byte is of type UInt8. If byte is equal to 0, the result is inf as mentioned earlier. How would the multiplicative inverse of 0 be infinity? Wouldn't the multiplicative inverse also be 0 since 0/0's multiplicative inverse is 0/0?
Short answer: By definition. In Swift (and many other languages), floating point numbers are backed by IEEE-754 definition of floats, which is directly implemented by the underlying hardware in most cases and thus quite fast. And according to that standard, division by 0 for floats is defined to be Infinity, and Swift is merely returning that result back to you. (To be precise, 0/0 is defined to be NaN, any positive number divided by 0 is defined to be Infinity, and any negative number divided by 0 is defined to be -Infinity.)
An interesting question to ask might be "why?" Why does IEEE-754 define division by 0 to be Infinity for floats, where one can reasonably also expect the machine to throw an error, or maybe define it as NaN (not-a-number), or perhaps maybe even 0? For an analysis of this, you should really read Kahan's (the designer of the semantics behind IEEE-754) own notes regarding this matter. Starting on page 10 of the linked document, he discusses why the choice of Infinity is preferable for division-by-zero, which essentially boils down to efficient implementation of numerical algorithms since this convention allows skipping of expensive tests in iterative numerical analysis. Start reading on page 10, and go through the examples he discusses, which ends on top of page 14.
To sum up: Floating point division by 0 is defined to be Infinity by the IEEE-754 standard, and there are good reasons for making this choice. Of course, one can imagine different systems adopting a different answer as well, depending on their particular need or application area; but then they wouldn't be IEEE-754 compliant.
Plugging in 0 just means it is 0 divided by some positive number. Then, the multiplicative inverse will be dividing by 0. As you probably know, this is undefined in mathematics, but Swift tries to calculate it. Essentially, it keeps subtracting 0 from the number, but never gets a result, so it will output infinity.
Edit: As Alias pointed out, Swift is not actually going through that process of continually subtracting 0. It will just return infinity anytime it is supposed to divide by 0.
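For contrast, Python raises ZeroDivisionError on a literal float division by zero (a language-level choice layered on top of IEEE 754), but once an infinity or NaN exists, arithmetic follows the standard exactly. A small sketch:

```python
import math

inf = float("inf")
nan = float("nan")

# Overflow past the largest finite double also produces infinity:
print(1e308 * 10)             # inf
# Infinity propagates through arithmetic as IEEE 754 specifies:
print(1 / inf)                # 0.0
print(inf - inf)              # nan: infinity minus infinity is undefined
# NaN compares unequal to everything, including itself:
print(nan == nan)             # False
print(math.isnan(inf - inf))  # True
```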

Result of subtracting two float numbers in C#

Consider two numbers like 1 and 0.99; I want to subtract them in C#:
float s = 0.99f - 1f;
Console.WriteLine(s.ToString());
The result is: -0.0099999
What can I do so that the result equals -0.01?
Try this:
decimal d = 0.99m - 1m;
Console.WriteLine(Math.Round(d, 2));
Computers aren't able to perfectly represent fractional numbers. They can only approximate them which is why you're seeing -0.0099999 instead of the expected -0.01.
For anything where you require close approximations you'd typically use an arbitrary-precision type and round where appropriate. The most common rounding for currency is banker's rounding, as it doesn't skew results heavily in either direction.
See also:
What is the best data type to use for money in c#?
http://wiki.c2.com/?BankersRounding
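The same idea can be sketched in Python, whose decimal module is analogous to C#'s decimal type: exact base-10 digits, with round-half-to-even (banker's rounding) as the default, just like .NET's Math.Round:

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Decimal stores base-10 digits, so 0.99 and 1 are represented exactly
d = Decimal("0.99") - Decimal("1")
print(d)  # -0.01, exactly

# Banker's rounding: ties go to the nearest even digit, so results
# don't drift up or down on average:
print(Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))  # 2
print(Decimal("3.5").quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))  # 4
```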
Floating point numbers are often an approximation. There is a whole field of study about how to use floating numbers in a responsible way in computers and believe me, it is not trivial!
What most programmers do is live with it and make sure their code is 'robust' against the small deviations you get from using floating point numbers.
Wrong:
if (my_float == -0.01)
Right:
if (my_float >= -0.01001 && my_float <= -0.00999)
(The numbers are just an example.)
If you need exact numbers you can e.g. use integers. You can use rounding, but it should not be done halfway through calculations, as that is likely to make the result more unreliable; rounding normally happens when you print the end result. After all, as a good engineer you should know how many digits are relevant at the end.
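The tolerance-checking pattern above can be sketched in Python (where floats are doubles, so the residue differs from the C# float case, but the principle is identical):

```python
import math

s = 0.99 - 1.0    # stored as roughly -0.010000000000000009

# Never compare floats with == ...
print(s == -0.01)              # False
# ... compare against a tolerance instead:
print(math.isclose(s, -0.01))  # True (default rel_tol is 1e-09)
print(abs(s - (-0.01)) < 1e-9) # the hand-rolled equivalent
```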

When can I compare NetLogo values as integers?

Is there official documentation to resolve the apparent conflict between these two statements from the NetLogo 5.0.5 Programming Guide:
"A patch's coordinates are always integers" (from the Agents section)
"All numbers in NetLogo are stored internally as double precision floating point numbers" (from the Math section on the same page.)
Here's why I ask: if the integer patch coordinates are stored as floating point numbers that are very close to integer values then I should avoid comparisons for equality. For example, if there are really no integers, instead of
if pxcor = pycor...
I should use the usual tolerance-checking, like
if abs (pxcor - pycor) < 0.1 ...
Is there some official word that the more complicated code is unnecessary?
The Math section also seems to imply the absence of integer literals: "No distinction is made between 3 and 3.0". So is the official policy to avoid comparisons for equality with constants? For example, is there official sanction for writing code like
if pxcor = 3...
?
Are sliders defined somewhere to produce floating point values? If so, it seems invalid to compare slider values for equality, also. That is, if so, one should avoid writing code like
if pxcor = slider-value
even when the minimum, maximum, and increment values for the slider look like integers.
The focus on official sources in this question arises because I'm not just trying to write a working program. Rather, I'm seeking to tell students how they should program. I'd hate to mislead them, so thanks for any good advice.
NetLogo isn't the only language that works this way, with all numbers stored internally as double precision floating point. The best known other such language is JavaScript.
Math in NetLogo follows IEEE 754, so what follows isn't actually specific to NetLogo, but applies to IEEE 754 generally.
There's no contradiction in the User Manual because mathematically, some floating point numbers are integers, exactly. If the fractional part is exactly zero, then mathematically, it's an integer, and IEEE 754 guarantees that arithmetic and comparison operations will behave as you would expect. If you add 2 and 2 you'll always get 4, never 3.999... or 4.00...01. Integers in, integers out. That holds for comparison, addition, subtraction, multiplication, and divisions that divide evenly. (It may not hold for other operations, so e.g. log 1000 10 isn't exactly 3, and cos 90 isn't exactly 0.)
Therefore if pxcor = 3 is completely valid, correct code. pxcor never has a fractional part, and 3 doesn't have one, either, so no issue of floating point imprecision can arise.
As for NetLogo sliders, if the slider's min, max, and increment are all integers, then there's nothing to worry about; the value of the slider is also always an integer.
(Note: I am the lead developer of NetLogo, and I wrote the sections of the User Manual that you are quoting.)
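The "integers in, integers out" guarantee is easy to verify in any IEEE 754 language; here is a Python sketch (note Python's cos takes radians where NetLogo's takes degrees):

```python
import math

# Integer-valued doubles behave exactly, up to 2**53:
print(2.0 + 2.0 == 4.0)           # True, always; never 3.999...
print(7.0 * 6.0 == 42.0)          # True
print(float(2**53) == 2**53)      # True: still exact at the edge
print(float(2**53 + 1) == 2**53)  # True: 2**53 + 1 is NOT representable,
                                  # so it rounds back down to 2**53

# Other operations make no such promise:
print(math.cos(math.radians(90)) == 0.0)  # False: tiny nonzero residue
```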
Just to stress what Seth writes:
Integers in, integers out. That holds for comparison, addition,
subtraction, multiplication, and divisions that divide evenly (emphasis added).
Here's a classic instance of floating point imprecision:
observer> show (2 + 1) / 10
observer: 0.3
observer> show 2 / 10 + 1 / 10
observer: 0.30000000000000004
For nice links that explain why, check out http://0.30000000000000004.com/
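The same pair of expressions reproduces identically in any other double-based language; for example, in Python:

```python
# Integer division that lands exactly on a representable double:
print((2 + 1) / 10)     # 0.3
# Summing two inexact intermediates accumulates the error:
print(2 / 10 + 1 / 10)  # 0.30000000000000004
```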

Losing accuracy with double division

I am having a problem with a simple division of two integers. I need it to be as accurate as possible, but for some reason the double type is behaving strangely.
For example, if I execute the following code:
double res = (29970.0/1000.0);
The result is 29.969999999999999, when it should be 29.970.
Any idea why this is happening?
Thanks
Any idea why this is happening?
Because double representation is finite. The IEEE 754 double-precision format, for example, has 52 bits for the fraction, so not all real numbers are covered and some values cannot be perfectly precise. In your case the result is about 10^-15 away from the ideal.
I need it to be as accurate as possible
You shouldn't use doubles, then. In Java, for example, you would use BigDecimal instead (most languages provide a similar facility). double operations are intrinsically inaccurate to some degree. This is due to the internal representation of floating point numbers.
Floating point numbers of type float and double are stored in binary format. Therefore most numbers can't have precise decimal values; the values are instead quantized. If you hypothetically had a number type with only 2 fraction bits, you could represent only multiples of the quantum 2^-2: 0.00, 0.25, 0.50, 0.75, and nothing in between.
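The two-bit-fraction thought experiment can be simulated directly. This Python toy is not a real float encoding, just the quantization idea: every value snaps to the nearest multiple of 2^-2:

```python
def quantize_2bit_fraction(x):
    # Snap x to the nearest multiple of 0.25 (a 2-fraction-bit grid)
    return round(x * 4) / 4

for x in (0.1, 0.3, 0.6, 0.75):
    print(x, "->", quantize_2bit_fraction(x))
```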
I need it to be as accurate as possible
There is no silver bullet, but if you want only basic arithmetic operations (which map ℚ to ℚ), and you REALLY want exact results, then your best bet is rational type composed of two unlimited integers (a.k.a. BigInteger, BigInt, etc.) - but even then, memory is not infinite, and you must think about it.
For the rest of the question, please read about fixed-size floating-point numbers; there are plenty of good sources.
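The suggested rational type built from two unlimited integers exists off the shelf in some languages; for example, Python's fractions.Fraction handles the question's division exactly:

```python
from fractions import Fraction

# Exact rational arithmetic: 29970/1000 is stored as a reduced
# pair of unlimited integers, with no rounding anywhere:
res = Fraction(29970, 1000)
print(res)                       # 2997/100
print(res == Fraction("29.97"))  # True: exactly 29.97
print(float(res))                # rounding only happens here, on conversion
```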

iOS - rounding a float with roundf() is not working properly

I am having an issue with rounding a float in an iPhone application.
float f = 4.845;
float s = roundf(f * 100.0) / 100;
NSLog(@"Output-1: %.2f", s);
s = roundf(484.5) / 100;
NSLog(@"Output-2: %.2f", s);
Output-1: 4.84
Output-2: 4.85
Let me know what the problem is here and how to solve it.
The problem is that you don't yet realise one of the inherent problems with floating point: the fact that most numbers cannot be represented exactly (a).
This means that 4.845 is likely to be, in reality, something like 4.8449999999999 which, when you round it, gives you 4.84 rather than what you expect, 4.85.
And what value you end up with also depends on how you calculate it, which is why you're getting a different result.
And, of course, no floating point "inaccuracy" answer would be complete on SO without the authoritative What Every Computer Scientist Should Know About Floating-Point Arithmetic.
(a) Only sums of exact powers of two, within a certain range, can be exactly represented in IEEE 754. So, for example, 484.5 is
256 + 128 + 64 + 32 + 4 + 0.5 (2^8 + 2^7 + 2^6 + 2^5 + 2^2 + 2^-1).
See this answer for a more detailed look into the IEEE754 format.
As to solving it, you have a few choices. One is to use double instead of float. That gives you more precision and a greater range of numbers, but only moves the problem further away rather than really solving it. Since 0.1 is a repeating fraction in binary, no amount of bits (short of infinity) can exactly represent it.
Another choice is to use a custom library like a big decimal type, which can represent decimals of arbitrary precision (that's not infinite precision as some people are wont to suggest, since it's limited by memory). This will reduce the errors caused by the binary/decimal mismatch.
You may also want to look into NSDecimalNumber - this doesn't give you arbitrary precision but it does give a large range with accurate decimal representation.
There'll still be numbers you can't represent, like PI or the square root of 2 or any other irrational number, but it should cover most cases. If you really need to handle those other values, you need to switch to symbolic numeric representations.
Unlike 484.5, which can be represented exactly as a float*, 4.845 is represented as 4.8449998 (see this calculator if you wish to try other numbers). Multiplying by one hundred gives approximately 484.49998, which correctly rounds to 484.
* An exact representation is possible because its fractional part 0.5 is a power of two (i.e. 2^-1).
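You can inspect what a 32-bit float actually stores by round-tripping a value through its bit pattern; a Python sketch using the struct module (Python's own floats are doubles, so the pack/unpack forces single precision):

```python
import struct

def as_float32(x):
    # Pack to a 32-bit IEEE 754 float and back, discarding extra precision
    return struct.unpack("<f", struct.pack("<f", x))[0]

f = as_float32(4.845)
print(f)                 # ~4.8449998, slightly below 4.845
print(round(f * 100.0))  # 484, not 485: the rounding in the question
print(as_float32(484.5)) # 484.5 exactly: its fraction 0.5 is a power of two
```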