Why NumberLong(9007199254740993) matches NumberLong(9007199254740992) in MongoDB from mongo shell? - mongodb

This situation happens when the given number is big enough (greater than 9007199254740992), along with more tests, I even found many adjacent numbers could match a single number.
Not only NumberLong(9007199254740996) would match NumberLong("9007199254740996"), but also NumberLong(9007199254740995) and NumberLong(9007199254740997).
When I want to act upon a record using its number, I could actually use three different adjacent numbers to get back the same record.

The accepted answer from here makes sense, I quote the most relevant part below:
Caveat: Don't try to invoke the constructor with a too large number, i.e. don't try db.foo.insert({"t" : NumberLong(1234657890132456789)}); Since that number is way too large for a double, it will cause roundoff errors. Above number would be converted to NumberLong("1234657890132456704"), which is wrong, obviously.
Here are some supplements to make things more clear:
Firstly, Mongo shell is a JavaScript shell. And JS does not distinguish between integer and floating-point values. All numbers in JS are represented as floating point values. This means mongo shell uses 64 bit floating point number by default. If shell sees "9007199254740995", it will treat this as a string and convert it to long long. But when we omit the double quotes, mongo shell will see unquoted 9007199254740995 and treat it as a floating-point number.
Secondly, JS uses the 64 bit floating-point format defined in IEEE 754 standard to represent numbers, the maximum it can represent is:
, and the minimum is:
There are an infinite number of real numbers, but only a limited number of real numbers can be accurately represented in the JS floating point format. This means that when you deal with real numbers in JS, the representation of the numbers will usually be an approximation of the actual numbers.
This brings the so-called rounding error issue. Because integers are also represented in binary floating-point format, the reason for the loss of trailing digits precision is actually the same as that of decimals.
The JS number format allows you to accurately represent all integers between
and
Here, since the numbers are bigger than 9007199254740992, the rounding error certainly occurs. The binary representation of NumberLong(9007199254740995), NumberLong(9007199254740996) and NumberLong(9007199254740997) are the same. So when we query with these three numbers in this way, we are practically asking for the same thing. As a result, we will get back the same record.
I think understanding that this problem is not specific to JS is important: it affects any programming language that uses binary floating point numbers.

You are misusing the NumberLong constructor.
The correct usage is to give it a string argument, as stated in the relevant documentation.
NumberLong("2090845886852")

Related

comparing float and double and printing them

I have a quick question. So, say I have a really big number up to like 15 digits, and I would take the input and assign it to two variables, one float and one double if I were to compare two numbers, how would you compare them? I think double has the precision up to like 15 digits? and float has 8? So, do I simply compare them while the float only contains 8 digits and pad the rest or do I have the float to print out all 15 digits and then make the comparison? Also, if I were asked to print out the float number, is the standard way of doing it is just printing it up to 8 digits? which is its max precision
thanks
Most languages will do some form of type promotion to let you compare types that are not identical, but reasonably similar. For details, you would have to indicate what language you are referring to.
Of course, the real problem with comparing floating point numbers is that the results might be unexpected due to rounding errors. Most mathematical equivalences don't hold for floating point artihmetic, so two sequences of operations which SHOULD yield the same value might actually yield slightly different values (or even very different values if you aren't careful).
EDIT: as for printing, the "standard way" is based on what you need. If, for some reason, you are doing monetary computations in floating point, chances are that you'll only want to print 2 decimal digits.
Thinking in terms of digits may be a problem here. Floats can have a range from negative infinity to positive infinity. In C# for example the range is ±1.5 × 10^−45 to ±3.4 × 10^38 with a precision of 7 digits.
Also, IEEE 754 defines floats and doubles.
Here is a link that might help http://en.wikipedia.org/wiki/IEEE_floating_point
Your question is the right one. You want to consider your approach, though.
Whether at 32 or 64 bits, the floating-point representation is not meant to compare numbers for equality. For example, the assertion 2.0/7.0 == 60.0/210.0 may or may not be true in the CPU's view. Conceptually, the floating-point is inherently meant to be imprecise.
If you wish to compare numbers for equality, use integers. Consider again the ratios of the last paragraph. The assertion that 2*210 == 7*60 is always true -- noting that those are the integral versions of the same four numbers as before, only related using multiplication rather than division. One suspects that what you are really looking for is something like this.

Irrational number representation in computer

We can write a simple Rational Number class using two integers representing A/B with B != 0.
If we want to represent an irrational number class (storing and computing), the first thing came to my mind is to use floating point, which means use IEEE 754 standard (binary fraction). This is because irrational number must be approximated.
Is there another way to write irrational number class other than using binary fraction (whether they conserve memory space or not) ?
I studied jsbeuno's solution using Python: Irrational number representation in any programming language?
He's still using the built-in floating point to store.
This is not homework.
Thank you for your time.
With a cardinality argument, there are much more irrational numbers than rational ones. (and the number of IEEE754 floating point numbers is finite, probably less than 2^64).
You can represent numbers with something else than fractions (e.g. logarithmically).
jsbeuno is storing the number as a base and a radix and using those when doing calcs with other irrational numbers; he's only using the float representation for output.
If you want to get fancier, you can define the base and the radix as rational numbers (with two integers) as described above, or make them themselves irrational numbers.
To make something thoroughly useful, though, you'll end up replicating a symbolic math package.
You can always use symbolic math, where items are stored exactly as they are and calculations are deferred until they can be performed with precision above some threshold.
For example, say you performed two operations on a non-irrational number like 2, one to take the square root and then one to square that. With limited precision, you may get something like:
(√2)²
= 1.414213562²
= 1.999999999
However, storing symbolic math would allow you to store the result of √2 as √2 rather than an approximation of it, then realise that (√x)² is equivalent to x, removing the possibility of error.
Now that obviously involves a more complicated encoding that simple IEEE754 but it's not impossible to achieve.

Are there any real-world uses for converting numbers between different bases?

I know that we need to convert decimal, octal, and hexadecimal into binary, but I am confused about conversion of decimal to octal or octal to hexadecimal or decimal to hexadecimal.
Why and where we need these types of conversion?
Different bases are good for different purposes.
Decimal is obviously what most people know how to deal with, so is good for output of real quantities to end users.
Hex is short and has an even ratio of exactly 2 characters per byte, so it's good for expressing large numbers like SHA1 hashes or private keys and the like in a type-able format, particularly since those numbers don't really represent a quantity, so users don't need to be able to understand them as numbers.
Octal is mostly for legacy reasons -- UNIX file permission codes are traditionally expressed as octal numbers, for example, because three bits per digit corresponds nicely to the three bits per user-category of the UNIX permission encoding scheme.
One sometimes will want to use numbers in one base for a purpose where another base is desired. Thus, the various conversion functions available. In truth, however, my experience is that in practice you almost never convert from one base to another much, except to convert numbers from some non-binary base into binary (in the form of your language of choice's native integral type) and back out into whatever base you need to output. Most of the time one goes from one non-binary base to another is when learning about bases and getting a feel for what numbers in different bases look like, or when debugging using hexadecimal output. Even then, if a computer does it the main method is to convert to binary and then back out, because current computers are just inherently good at dealing with base-2 numbers and not-so-good at anything else.
One important place you see numbers actually stored and operated on in decimal is in some financial applications or others where it's important that "number-of-decimal-place" level precision be preserved. Sometimes fixed-point arithmetic can work for currency, but not always, and if it doesn't using binary-floating-point is a bad idea. Older systems actually had built in support for this in the form of binary-coded-decimal arithmetic. In BCD, each 4 bits acts as a decimal digit, so you give up a chunk of every 4 bits of storage in exchange for maintaining your level of precision in the base-of-choice of the non-computing world.
Oddly enough, there is one common use case for other bases that's a bit hidden. Modern languages with large number support (e.g. Python 2.x's long type or Java's BigInteger and BigDecimal type) will usually store the numbers internally in an array with each element being a digit in some base. Then they implement the math they support on strings of digits of that base. Really efficient bigint implementations may actually use use a base approaching 2^(bits in machine native word size); a base 2^64 number is obviously impossible to usefully output in that form, but doing the calculations in chunks of that size ends up making the best use of space and the CPU. (I don't know if that's the best base; it may be best to use a base of half that number of bits to simplify overflow handling from one digit to the next. It's been awhile since I wrote my own bigint and I never implemented the faster/more-complicated versions of multiplication and division.)
MIME uses hexadecimal system for Quoted Printable encoding (e.g. mail subject in Unicode) abd 64-based system for Base64 encoding.
If your workplace is stuck in IPv4 CIDR - you'll be doing quite a lot of bin -> hex -> decimal conversions managing most of the networking equipment until you get them memorized (or just use some random, simple tool).
Even that usage is a bit few-and-far-between - most businesses just adopt the lazy "/24 everything" approach.
If you do a lot of graphics work - there's the chance you'll want to convert colors between systems and need to convert from hex -> dec... most tools have this built in to the color picker, though.
I suppose there's no practical reason to be able to do other than it's really simple and there's no point not learning how to do it. :)
... unless, for some reason, you're trying to do mantissa binary math in your head.
All of these bases have their uses. Hexadecimal in particular is useful as a shorthand for binary. Every hexadecimal digit is equivalent to 4 bits, so you can write a full 32-bit value as a string of 8 hex digits. Likewise, octal digits are equivalent to 3 bits, and are used frequently as a shorthand for things like Unix file permissions (777 = set read, write, execute bits for user/group/other).
No one base is special--they all have their (obscure) uses. Decimal is special to us because it reflects human experience (10 fingers) but that's really the only reason.
A real world use case: a program prints error code in decimal, to get info from a database or the internet you need the hexadecimal format, because the bits of the error 'number' convey extra info you need to look at it in binary.
I'm there are occasional uses for this. One use case would be a little app that allows user who wants to convert decimal to octal ... like you can with lots of calculators.
But I'm not sure I understand the point of the question. Standard libraries typically don't provide methods like String toOctal(String decimal). Instead, you would normally convert from a decimal String to a primitive integer and then from the primitive integer to (say) an octal String.

Arbitrary precision Float numbers on JavaScript

I have some inputs on my site representing floating point numbers with up to ten precision digits (in decimal). At some point, in the client side validation code, I need to compare a couple of those values to see if they are equal or not, and here, as you would expect, the intrinsics of IEEE754 make that simple check fails with things like (2.0000000000==2.0000000001) = true.
I may break the floating point number in two longs for each side of the dot, make each side a 64 bit long and do my comparisons manually, but it looks so ugly!
Any decent Javascript library to handle arbitrary (or at least guaranteed) precision float numbers on Javascript?
Thanks in advance!
PS: A GWT based solution has a ++
There is the GWT-MATH library at http://code.google.com/p/gwt-math/.
However, I warn you, it's a GWT jsni overlay of a java->javascript automated conversion of java.BigDecimal (actually the old com.ibm.math.BigDecimal).
It works, but speedy it is not. (Nor lean. It will pad on a good 70k into your project).
At my workplace, we are working on a fixed point simple decimal, but nothing worth releasing yet. :(
Use an arbitrary precision integer library such as silentmatt’s javascript-biginteger, which can store and calculate with integers of any arbitrary size.
Since you want ten decimal places, you’ll need to store the value n as n×10^10. For example, store 1 as 10000000000 (ten zeroes), 1.5 as 15000000000 (nine zeroes), etc. To display the value to the user, simply place a decimal point in front of the tenth-last character (and then cut off any trailing zeroes if you want).
Alternatively you could store a numerator and a denominator as bigintegers, which would then allow you arbitrarily precise fractional values (but beware – fractional values tend to get very big very quickly).

Problem with very small numbers?

I tried to assign a very small number to a double value, like so:
double verySmall = 0.000000001;
9 fractional digits. For some reason, when I multiplicate this value by 10, I get something like 0.000000007. I slighly remember there were problems writing big numbers like this in plain text into source code. Do I have to wrap it in some function or a directive in order to feed it correctly to the compiler? Or is it fine to type in such small numbers in text?
The problem is with floating point arithmetic not with writing literals in source code. It is not designed to be exact. The best way around is to not use the built in double - use integers only (if possible) with power of 10 coefficients, sum everything up and display the final useful figure after rounding.
Standard floating point numbers are not stored in a perfect format, they're stored in a format that's fairly compact and fairly easy to perform math on. They are imprecise at surprisingly small precision levels. But fast. More here.
If you're dealing with very small numbers, you'll want to see if Objective-C or Cocoa provides something analagous to the java.math.BigDecimal class in Java. This is precisely for dealing with numbers where precision is more important than speed. If there isn't one, you may need to port it (the source to BigDecimal is available and fairly straightforward).
EDIT: iKenndac points out the NSDecimalNumber class, which is the analogue for java.math.BigDecimal. No port required.
As usual, you need to read stuff like this in order to learn more about how floating-point numbers work on computers. You cannot expect to be able to store any random fraction with perfect results, just as you can't expect to store any random integer. There are bits at the bottom, and their numbers are limited.