As a general rule, numeric operations are faster than string operations: computing 2 < 5 is faster than computing "hello" < "world".
Also, the _id has a length of 12. If it were stored as a string in which each character can take 2 bytes, the whole string would take 24 bytes, whereas an 8-byte number can easily hold a 12-digit value. In short, a numeric type would take less space.
Given this reasoning, why does MongoDB store _id as a string-type ObjectId rather than a numeric ObjectId?
Your understanding is incorrect: an ObjectId is not stored as a string.
It is much more compact than a string. It takes just 12 bytes in memory, as mentioned in the documentation.
See the Java implementation of the same.
Refer to toHexString in the GitHub link; that is what is returned when you ask for a String. The hex string is 24 characters long, so it takes at least 24 bytes, whereas the ObjectId itself takes only 12 bytes.
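To make the size difference concrete, here is a minimal Node.js sketch. It assumes an illustrative id, 5813eed6e6893b80c9ae5bba, and uses only the built-in Buffer and BigInt rather than the MongoDB driver, so the variable names are purely for demonstration.

```javascript
// Illustrative ObjectId in its 24-character hex form (sample value, assumed).
const hex = "5813eed6e6893b80c9ae5bba";

// Decoding the hex string gives the raw bytes that the binary form holds.
const raw = Buffer.from(hex, "hex");

const hexLength = hex.length;   // characters in the string form
const rawLength = raw.length;   // bytes in the binary form
const bitWidth = rawLength * 8; // width of the binary form in bits

// Interpreted as an integer, the value is too wide for a signed 64-bit number.
const asInteger = BigInt("0x" + hex);
const fitsInInt64 = asInteger <= 2n ** 63n - 1n;

console.log({ hexLength, rawLength, bitWidth, fitsInInt64 });
```

The point of the sketch is that the 12 in the question refers to bytes, not decimal digits, so an 8-byte number cannot hold the value.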
I'm having a problem handling integers in MongoDB that are too large for the standard Int64 format. To give an example of the sort of numbers we're talking about: 10683171315666225459389678813960790592
Now, I've considered turning this into a Decimal128 and that seems to work. The problem I run into is that I think MongoDB is truncating these values when I convert them. I'm using:
db.NuAIS.find().forEach( function (x) { x.HilbertVal = parseFloat(x.HilbertVal); db.NuAIS.save(x); });
So I read the db, convert each of those string values to a decimal, and get 1.0683171315666225e+37. My assumption was that MongoDB was storing the actual value despite what it displayed.
Here's the thing though: these values seem truncated. When I search for the original string in MongoDB Compass, I get multiple values returned for a supposedly unique string. I assume that MongoDB has truncated the values after X digits and so returns every value that matches the first X digits of the search string, which means I get multiple matches instead of one.
Is my assumption here right? Is there a way to avoid this with decimals, or to skip the whole problem altogether by using ints larger than 64 bits? I'd leave it as a string, but I need to work with the values as numbers, for example to find those that fall within a range.
Thank you
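The truncation described above can be reproduced without a database. The following is a minimal sketch in plain JavaScript: parseFloat produces a 64-bit double, which keeps only about 15 to 17 significant digits, so two distinct 38-digit values can collapse into the same number. Note also that Decimal128 carries 34 significant decimal digits, so a 38-digit value is not guaranteed to survive there either. The neighbour value, the WIDTH constant and the pad helper are assumptions made for illustration only.

```javascript
// The value from the question, kept as a string so no digits are lost.
const original = "10683171315666225459389678813960790592";
// A second, different value that differs only in the last digit (made up).
const neighbour = "10683171315666225459389678813960790593";

// Both strings parse to the same double, so they become indistinguishable.
const collide = parseFloat(original) === parseFloat(neighbour);

// BigInt keeps every digit, so the two values stay distinct and ordered.
const distinct = BigInt(original) !== BigInt(neighbour);
const ordered = BigInt(original) < BigInt(neighbour);

// One possible workaround (an assumption, not a MongoDB feature): zero-pad the
// digits to a fixed width so plain string comparison agrees with numeric order.
const WIDTH = 40; // arbitrary width, must be at least the longest value
const pad = (digits) => digits.padStart(WIDTH, "0");
const stringOrderMatches =
  pad("999") < pad(original) && pad(original) < pad(neighbour);

console.log({ digits: original.length, collide, distinct, ordered, stringOrderMatches });
```

Whether fixed-width strings are acceptable depends on what "working with them as digits" requires; they support equality and range comparisons but not arithmetic.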
An ObjectId in MongoDB is a 12-byte BSON type. In the 12-byte
structure, the first 4 bytes of the ObjectId represent the time in
seconds since the UNIX epoch. The next 3 bytes of the ObjectId
represent the machine identifier. The next 2 bytes of the ObjectId
represent the process ID. And the last 3 bytes of the ObjectId
represent a random counter value.
These 12 bytes altogether uniquely identifies a document within the
MongoDB collection and serves as a primary key for that document.
ObjectId is the default value of _id field of each document and its
complexity helps to fetch a unique _id field for a particular document
in the MongoDB.
A client wants to use the same identifier, but for some reason they can't use the 24-character identifier 5813eed6e6893b80c9ae5bba, so in a moment of desperation the tech lead suggested cutting off the first 4 digits:
5813 -- eed6e6893b80c9ae5bba. I don't know how good an idea this is, or what the probability of collision would be if it were applied. We are assuming the first 4 digits correspond to the "random counter value" part described in the docs.
So my question is: what is the probability of collision for the shortened id eed6e6893b80c9ae5bba? If it is a bad idea, how can the id be converted into an equivalent that is 20 characters long?
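Before estimating collisions, it helps to check which field the first 4 hex digits actually belong to. The sketch below splits the id from the question according to the layout quoted above (4-byte timestamp, 3-byte machine id, 2-byte process id, 3-byte counter). The function name splitObjectIdHex is hypothetical and no driver is involved.

```javascript
// Split a 24-character ObjectId hex string using the layout quoted above:
// 4-byte timestamp, 3-byte machine id, 2-byte process id, 3-byte counter.
function splitObjectIdHex(hex) {
  return {
    timestampHex: hex.slice(0, 8),
    machineHex: hex.slice(8, 14),
    processHex: hex.slice(14, 18),
    counterHex: hex.slice(18, 24),
  };
}

const parts = splitObjectIdHex("5813eed6e6893b80c9ae5bba");
const seconds = parseInt(parts.timestampHex, 16);
const createdAt = new Date(seconds * 1000).toISOString();

// The 4 dropped characters are the top 16 bits of the timestamp, not the counter.
const droppedPrefix = parts.timestampHex.slice(0, 4);
// The remaining low 16 bits of the timestamp wrap around after this many seconds.
const wrapSeconds = 2 ** 16;

console.log(parts, seconds, createdAt, droppedPrefix, wrapSeconds);
```

Under this layout, the leading digits are the high-order part of the timestamp, so the assumption that they are the counter does not hold. Two shortened ids could coincide only if the remaining timestamp bits, machine, process and counter fields all matched, which would require ids generated a multiple of 65536 seconds (roughly 18 hours) apart with otherwise identical fields.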
I know that the _id field contains a representation of the timestamp at which the document was inserted into the collection. Here is an online utility to convert it to a timestamp: http://steveridout.github.io/mongo-object-time/
What I'm wondering is whether the ObjectId string itself is guaranteed to maintain ascending order. In other words, does this comparison always return true?
"newest object id" > "second newest object id"
No, there is no guarantee whatsoever. From the official documentation (at the time of the original answer):
The relationship between the order of ObjectId values and generation time is not strict within a single second. If multiple systems, or multiple processes or threads on a single system generate values, within a single second; ObjectId values do not represent a strict insertion order. Clock skew between clients can also result in non-strict ordering even for values, because client drivers generate ObjectId values, not the mongod process.
And from the latest docs:
While ObjectId values should increase over time, they are not necessarily monotonic. This is because they:
Only contain one second of temporal resolution, so ObjectId values created within the same second do not have a guaranteed ordering, and
Are generated by clients, which may have differing system clocks.
For MongoDB versions >= 3.4, ObjectId generation changed slightly.
Its structure is:
a 4-byte value representing the seconds since the Unix epoch,
a 5-byte random value, and
a 3-byte counter, starting with a random value.
So the first 4 bytes are still the seconds since the Unix epoch; the values are still almost ascending, but not strictly.
https://docs.mongodb.com/manual/reference/bson-types/#objectid
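The "almost ascending" behaviour can be illustrated with a small sketch. It builds ObjectId-style hex strings by hand from the three fields listed above; the helper makeIdHex, the timestamp and the random values are all made up for the demonstration and are not how a driver generates ids.

```javascript
// Hypothetical helper: build a 24-character ObjectId-style hex string from
// 4-byte seconds, a 5-byte random value (10 hex chars) and a 3-byte counter.
function makeIdHex(seconds, randomHex, counter) {
  const ts = seconds.toString(16).padStart(8, "0");
  const ctr = counter.toString(16).padStart(6, "0");
  return ts + randomHex + ctr;
}

const second = 1477701334; // arbitrary example timestamp

// Client A generates its id first; client B generates one later in the same second.
const first = makeIdHex(second, "ff00000000", 1);
const later = makeIdHex(second, "0000000001", 2);

// An id from the following second, for contrast.
const nextSecond = makeIdHex(second + 1, "0000000001", 0);

// Same second: the random bytes decide the order, not the creation order.
const sameSecondInOrder = first < later;
// Different seconds: the timestamp prefix dominates the comparison.
const acrossSecondsInOrder = first < nextSecond && later < nextSecond;

console.log({ first, later, sameSecondInOrder, acrossSecondsInOrder });
```

In this example the id created later within the same second sorts before the earlier one, while ids from different seconds still sort by time, assuming the clients' clocks agree.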
_id: ObjectId(4 bytes timestamp, 3 bytes machine id, 2 bytes process id, 3 bytes incrementer)
This is the id structure, so only the last 3 bytes increment uniquely. So the answer to your question is yes.
I am using Mongoid, which sits on top of the Ruby MongoDB driver. Even though my map's emit passes parseInt(num) and my reduce's return value is also parseInt(num), the final results are still floats.
Is that particular to MongoDB? Any way to make it integer instead?
The parseInt function officially takes a string as parameter. This string is parsed as if it were an integer, thus ignoring everything after the first non-numeric character. If you provide a floating point number, it will be converted to a string before it is parsed.
The parseInt function returns a Number, not an integer. Number is the only numeric data type in JavaScript; there is no distinction between integers and floats.
So while parseInt will remove any decimals, the data type doesn't change. Therefore Mongoid doesn't know whether to treat the result as a float or an integer. You're responsible for converting the result to an integer, as you can see in this example.
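The behaviour described above can be checked in any JavaScript runtime. This is a small sketch with made-up inputs; it shows only what parseInt does to the value and its type, not how Mongoid maps the result.

```javascript
// parseInt accepts a string and stops at the first non-numeric character.
const fromString = parseInt("42.9px", 10);

// A number argument is converted to a string first, then parsed.
const fromFloat = parseInt(3.7, 10);

// The result is still a plain JavaScript Number; there is no separate integer type.
const resultType = typeof fromFloat;
const sameAsFloatLiteral = fromFloat === 3.0;

// Number.isInteger only inspects the value; it does not indicate a distinct type.
const looksIntegral = Number.isInteger(fromFloat);

console.log({ fromString, fromFloat, resultType, sameAsFloatLiteral, looksIntegral });
```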
Update
I came across the NumberLong type, which represents a 64-bit integer. If you return new NumberLong(num) from your reduce function, Mongoid may treat it as an integer type.
Note that you'll need MongoDB 1.6 for this to work in the MongoDB Shell. I don't know whether Mongoid supports it yet.
I need to know the size of the hash that MongoDB uses. I can't find it on Wikipedia or the official site.
MongoDB uses a 12-byte binary value (an ObjectId); it can be converted to a 24-character hex string.
An ObjectId, the default value for the _id field, is a 12-byte value; it is neither a hash nor a string, and it is stored as a binary value. Many drivers show it as a hex string so that it can be printed easily.
It is composed of a timestamp (in seconds), a host id, a process id and a counter; this means that it increases with time of creation and encodes the time of creation (insertion).
http://www.mongodb.org/display/DOCS/Object+IDs
Most drivers have helper methods to convert to and from the hex string representation, as well as creating one based on just the parts you are interested in -- i.e. a timestamp you might use for a range query. You can also easily extract the timestamp portion.
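As a rough illustration of what such helpers do, here is a driver-free sketch that works on the hex string form only. The function names timestampFromObjectIdHex and objectIdHexFromDate are hypothetical, and the sample id is illustrative; real drivers ship their own equivalents (for example, the Node.js driver's ObjectId has getTimestamp and createFromTime).

```javascript
// Read the creation time from the first 4 bytes (8 hex characters) of an ObjectId.
function timestampFromObjectIdHex(hex) {
  return new Date(parseInt(hex.slice(0, 8), 16) * 1000);
}

// Build the smallest ObjectId-style hex string for a given time: the timestamp
// followed by zeros. This can serve as a lower bound in a range query on _id.
function objectIdHexFromDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

const sample = "5813eed6e6893b80c9ae5bba"; // illustrative id
const created = timestampFromObjectIdHex(sample);
const lowerBound = objectIdHexFromDate(created);

console.log(created.toISOString(), lowerBound);
```

With a driver, a string like lowerBound would first be converted to an ObjectId before being used in a range condition on _id.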