Can MongoDB's _id fields be compared? - mongodb

I've been reading about MongoDB using timestamps of object's creation to create ids. Is it valid to simply compare these and find out which object's been created earlier?

You can compare ObjectIds for equality with .equals(); see the documentation.
An ObjectId is a 12-byte value, usually displayed as a 24-character hexadecimal string. It consists of:
a 4-byte timestamp value, representing the ObjectId's creation, measured in seconds since the Unix epoch
a 5-byte random value
a 3-byte incrementing counter, initialized to a random value
Since the timestamp is the most significant part of an ObjectId, yes, you can compare them to find out which object was created earlier, down to a resolution of one second.
The timestamp is the most significant four bytes of the ObjectId.
Also see ObjectId.getTimestamp() documentation.
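As a minimal sketch of that comparison, assuming the IDs are available as 24-character hex strings (the helper names creationSeconds and compareByCreation are made up for illustration and are not part of any driver):

```javascript
// Read the leading 4 bytes (8 hex characters) as seconds since the Unix epoch.
function creationSeconds(idHex) {
  return parseInt(idHex.slice(0, 8), 16);
}

// Negative if a was created in an earlier second than b, positive if later,
// 0 if both fall in the same second (the timestamp alone cannot order them).
function compareByCreation(aHex, bHex) {
  return creationSeconds(aHex) - creationSeconds(bHex);
}

const older = "5272e0f00000000000000000";
const newer = "5272e0f10000000000000000";
console.log(compareByCreation(older, newer) < 0); // true
```

A result of 0 only says that both IDs carry the same second; it says nothing about which one was created first.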

Generally, it is possible to compare objects' creation times by ObjectId; for more information, see this link.
Quoting https://steveridout.github.io/:
Why generate an ObjectId from a timestamp?
To query documents by creation date.
e.g. to find all comments created after 2013-11-01:
db.comments.find({_id: {$gt: ObjectId("5272e0f00000000000000000")}})
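A boundary value like the one in that query can be derived from a date. The following is a rough sketch in plain JavaScript, not a driver API; objectIdHexFromDate is a hypothetical helper name:

```javascript
// Build the smallest ObjectId hex string whose timestamp equals the given date.
function objectIdHexFromDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  // 4-byte timestamp as 8 hex characters, remaining 8 bytes zeroed.
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

const boundary = objectIdHexFromDate(new Date("2013-11-01T00:00:00Z"));
console.log(boundary); // 5272ef000000000000000000
```

The value used in the query above, 5272e0f0..., decodes to 2013-10-31T23:00:00Z, so it was presumably computed for midnight in a UTC+1 time zone. In the mongo shell the resulting string can be wrapped as ObjectId(boundary) for the $gt comparison.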
-- another helpful and explanatory link:
uses for mongodb ObjectId creation time

Related

CastError: Cast to ObjectId failed for value "1225589795" at path "_id" in TypeScript REST and Mongoose

I am trying to GET data based on deviceId but I am getting this error. The API has been designed in TypeScript using TypeScript REST.
The error is self-explanatory: the _id path has to be in a specific format, and 1225589795 doesn't match it.
You can check here a case that produces the error, and here one that does not.
By default, a Mongo _id is an ObjectId, which is composed (as the docs explain) of:
a 4-byte timestamp value, representing the ObjectId’s creation, measured in seconds since the Unix epoch
a 5-byte random value
a 3-byte incrementing counter, initialized to a random value
So the value you pass has to match this format, and your number doesn't.
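One way to catch such a value before it reaches the query is to check its shape first. This is only a sketch that tests the 24-character hex form; looksLikeObjectIdHex is a made-up helper, and a library's own validation may accept other forms as well:

```javascript
// True only for a 24-character hexadecimal string, the text form of a 12-byte ObjectId.
function looksLikeObjectIdHex(value) {
  return typeof value === "string" && /^[0-9a-fA-F]{24}$/.test(value);
}

console.log(looksLikeObjectIdHex("1225589795")); // false
console.log(looksLikeObjectIdHex("5272e0f00000000000000000")); // true
```

If the value is really a device identifier rather than a document _id, querying on that field instead of _id may be what is intended.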

Is MongoDB _id (ObjectId) generated in an ascending order?

I know that the _id field contains a representation of the timestamp of when the document was inserted into the collection. Here is an online utility to convert it to a timestamp: http://steveridout.github.io/mongo-object-time/
What I'm wondering is whether the object id string itself is guaranteed to maintain ascending order. I.e., does this comparison always return true?
"newest object id" > "second newest object id"
No, there is no guarantee whatsoever. From the official documentation (at the time of the original answer):
The relationship between the order of ObjectId values and generation time is not strict within a single second. If multiple systems, or multiple processes or threads on a single system generate values, within a single second; ObjectId values do not represent a strict insertion order. Clock skew between clients can also result in non-strict ordering even for values, because client drivers generate ObjectId values, not the mongod process.
And from the latest docs:
While ObjectId values should increase over time, they are not necessarily monotonic. This is because they:
Only contain one second of temporal resolution, so ObjectId values created within the same second do not have a guaranteed ordering, and
Are generated by clients, which may have differing system clocks.
For MongoDB versions >= 3.4, ObjectId generation changed slightly.
Its structure is:
a 4-byte value representing the seconds since the Unix epoch,
a 5-byte random value, and
a 3-byte counter, starting with a random value.
So the first 4 bytes are still the seconds since the Unix epoch; ObjectIds are still almost ascending, but not strictly.
https://docs.mongodb.com/manual/reference/bson-types/#objectid
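To illustrate why the order is not strict, here is a small sketch with two invented IDs that share the same second but come from different clients (the random and counter bytes are made-up values):

```javascript
// Two hypothetical IDs created in the same second (timestamp 5272e0f0) by different clients.
// Layout: 8 hex chars timestamp, 10 hex chars random value, 6 hex chars counter.
const insertedFirst = "5272e0f0" + "ffeeddccbb" + "000001";
const insertedSecond = "5272e0f0" + "0011223344" + "000002";

// Plain string comparison falls through to the random bytes once the timestamps tie,
// so the ID inserted second sorts before the one inserted first.
const sorted = [insertedFirst, insertedSecond].sort();
console.log(sorted[0] === insertedSecond); // true
```

Within one second the sort order is decided by the random part, which has nothing to do with insertion order.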
_id: ObjectId(4 bytes timestamp, 3 bytes machine id, 2 bytes process id, 3 bytes incrementer)
This is the ID structure in older versions. Only the last 3 bytes increment within a process. So the answer to your question is yes, at least for IDs generated by a single process.

When was a document added to a MongoDB collection

I have an existing mongodb collection, which doesn't have any information about when the document was created.
Is it possible to get this information somehow? I've had a look through the docs but can't see it anywhere.
If you are using the default ObjectId value for your _id attribute, the creation time is encoded inside it.
As stated in the ObjectID documentation:
ObjectId is a 12-byte BSON type, constructed using:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier
a 2-byte process id, and a 3-byte counter, starting with a random value.
You can call the getTimestamp() function on an ObjectId object to get an ISODate object containing the creation time of the object:
In the mongo shell:
ObjectId().getTimestamp()
ISODate("2014-05-14T14:29:12Z")
The creation time is most likely encoded in the actual _id value of the document, unless you have replaced it with something else.
In plain JavaScript syntax (equivalent methods are available in other languages) you simply access it as:
var id = new ObjectId();
id.getTimestamp();
Various language implementations have a way of retrieving the "timestamp" from an ObjectId value so you can just use that.
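When only the hex string is at hand (for example in an exported file) and no driver is loaded, the same value can be decoded by hand. A minimal sketch, where timestampFromObjectIdHex is a hypothetical helper and not a driver function:

```javascript
// Decode the creation time from an ObjectId given as a 24-character hex string.
function timestampFromObjectIdHex(idHex) {
  const seconds = parseInt(idHex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

console.log(timestampFromObjectIdHex("5272e0f00000000000000000").toISOString());
// 2013-10-31T23:00:00.000Z
```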

How are MongoDB's ObjectIds generated?

Are they somewhat random?
I mean....would people be able to break them apart?
They are not random and can be easily predicted:
A BSON ObjectID is a 12-byte value consisting of a 4-byte timestamp (seconds since epoch), a 3-byte machine id, a 2-byte process id, and a 3-byte counter
http://www.mongodb.org/display/DOCS/Object+IDs
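Here's a JavaScript function that splits a MongoDB ObjectID hex string into its parts (http://jsfiddle.net/icodeforlove/rN3zb/):
// Splits a 24-character hex ObjectID into the fields of the older
// timestamp / machine id / process id / counter layout.
function ObjectIdDetails (id) {
  return {
    seconds: parseInt(id.slice(0, 8), 16),             // bytes 0-3
    machineIdentifier: parseInt(id.slice(8, 14), 16),  // bytes 4-6
    processId: parseInt(id.slice(14, 18), 16),         // bytes 7-8
    counter: parseInt(id.slice(18, 24), 16)            // bytes 9-11
  };
}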
So if you have enough of them, they leak quite a bit of information about your infrastructure, i.e. how many servers you have and how many processes each server is running. You also know the object creation dates for everything.
Generation
They are usually generated on the client side by the driver itself. For example, in ruby, BSON::ObjectID can be used:
https://github.com/mongodb/bson-ruby/blob/master/lib/bson/object_id.rb#L369
You can also generate your own ObjectIds. This is particularly useful if you want to use business identifiers.
Breakability
When using driver-generated ObjectIds, breakability is low.
When using your own business IDs, it is slightly higher, depending on their predictability (logins, consecutive identifiers...).
MongoDB database drivers by default generate an ObjectID identifier that is assigned to the _id field of each document. In many cases the ObjectID may be used as a unique identifier in an application.
ObjectID is a 96-bit number which is composed as follows:
a 4-byte value representing the seconds since the Unix epoch (which will not run out of seconds until the year 2106)
a 3-byte machine identifier (usually derived from the MAC address),
a 2-byte process id, and
a 3-byte counter, starting with a random value.
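The year 2106 mentioned above follows from the 4-byte field size, assuming the timestamp is read as an unsigned 32-bit number of seconds. A quick check:

```javascript
// Largest value a 4-byte unsigned seconds field can hold.
const maxSeconds = 0xffffffff;
console.log(new Date(maxSeconds * 1000).toISOString());
// 2106-02-07T06:28:15.000Z
```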
The official MongoDB documentation says:
ObjectId
ObjectIds are small, likely unique, fast to generate, and ordered. ObjectId values consist of 12 bytes, where the first four bytes are a timestamp that reflect the ObjectId’s creation. Specifically:
a 4-byte value representing the seconds since the Unix epoch,
a 5-byte random value, and
a 3-byte counter, starting with a random value.
In MongoDB, each document stored in a collection requires a unique _id field that acts as a primary key. If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field.
MongoDB database drivers by default generate an ObjectID identifier that is assigned to the _id field of each document. In many cases the ObjectID may be used as a unique identifier in an application.
Total 12 bytes:
4-byte timestamp value representing the seconds since the Unix epoch (which will not run out of seconds until the year 2106)
5-byte random value, and
3-byte incrementing counter, starting with a random value.
Example from mongo-go-driver:
var objectId [12]byte
// 4 bytes unix time-stamp second (big endian)
binary.BigEndian.PutUint32(objectId[0:4], uint32(timestamp.Unix()))
// global random number generated by driver
copy(objectId[4:9], processUnique[:])
// global counter by driver
putUint24(objectId[9:12], atomic.AddUint32(&objectIDCounter, 1))
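The same 4 + 5 + 3 byte layout can be sketched in Node.js. This is an illustration of the structure only, not the code of any official driver; generateObjectIdHex, processUnique and counter are names chosen here to echo the Go excerpt:

```javascript
const nodeCrypto = require("crypto");

// 5 random bytes chosen once per process, and a 3-byte counter with a random start.
const processUnique = nodeCrypto.randomBytes(5);
let counter = nodeCrypto.randomBytes(3).readUIntBE(0, 3);

// Hypothetical helper: assemble timestamp, random value and counter into 12 bytes.
function generateObjectIdHex(now = new Date()) {
  const id = Buffer.alloc(12);
  id.writeUInt32BE(Math.floor(now.getTime() / 1000), 0); // bytes 0-3, big endian
  processUnique.copy(id, 4);                              // bytes 4-8
  counter = (counter + 1) % 0x1000000;                    // wrap at 3 bytes
  id.writeUIntBE(counter, 9, 3);                          // bytes 9-11
  return id.toString("hex");
}

console.log(generateObjectIdHex().length); // 24
```

Two IDs generated by the same process share the middle 10 hex characters and differ in the counter, which is why IDs from one process tend to sort in generation order while IDs from different processes do not.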

What's MongoDB hash's size?

I need to know the size of MongoDB's hash. I can't find it on Wikipedia or the official site.
MongoDB uses a 12-byte binary value (an ObjectId); it can be converted to a 24-character hex string.
An ObjectId, the default value for the _id field, is a 12-byte value; it is not a hash nor a string -- it is stored as binary value. Many drivers will show it as a hex string, so it can be easily printed.
It is composed of a timestamp (in seconds), a host id, a process id and a counter; this means that it increases with creation time and encodes the time of creation (insertion).
http://www.mongodb.org/display/DOCS/Object+IDs
Most drivers have helper methods to convert to and from the hex string representation, as well as creating one based on just the parts you are interested in -- i.e. a timestamp you might use for a range query. You can also easily extract the timestamp portion.
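The relationship between the 12 bytes and the 24-character hex form can be seen without any driver. A small sketch using Node's built-in Buffer (the sample ID is just an example value):

```javascript
// Convert between the 24-character hex form and the raw 12 bytes.
const hex = "5272e0f00000000000000000";
const bytes = Buffer.from(hex, "hex");

console.log(bytes.length);          // 12
console.log(bytes.toString("hex")); // 5272e0f00000000000000000
```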