I need to know the size of the hash that MongoDB uses. I can't find it on Wikipedia or the official site.
MongoDB uses a 12-byte binary value (an ObjectId) -- it can be converted to a 24-character hex string.
An ObjectId, the default value for the _id field, is a 12-byte value; it is neither a hash nor a string -- it is stored as a binary value. Many drivers will show it as a hex string, so it can be easily printed.
It is composed of a timestamp (in seconds), a host id, a process id, and a counter; this means that its value increases with creation time and that it encodes the time of creation (insertion).
http://www.mongodb.org/display/DOCS/Object+IDs
Most drivers have helper methods to convert to and from the hex string representation, as well as creating one based on just the parts you are interested in -- i.e. a timestamp you might use for a range query. You can also easily extract the timestamp portion.
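As a rough illustration of the conversion between the two forms, here is a minimal sketch in plain Node.js that does not use any MongoDB driver. The function names hexToBytes and bytesToHex are made up for this example; real drivers ship their own helpers.

```javascript
// Convert between the 24-character hex form and the raw 12 bytes.
// Plain Node.js, no MongoDB driver; the function names are illustrative.
function hexToBytes(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected exactly 24 hex characters");
  }
  return Buffer.from(hex, "hex"); // 2 hex characters per byte -> 12 bytes
}

function bytesToHex(bytes) {
  if (bytes.length !== 12) {
    throw new Error("expected exactly 12 bytes");
  }
  return Buffer.from(bytes).toString("hex"); // always lower-case
}

const raw = hexToBytes("5fdedb7c25ab1352eef88f60");
console.log(raw.length);      // 12
console.log(bytesToHex(raw)); // 5fdedb7c25ab1352eef88f60
```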
Related
I've been reading about MongoDB using the timestamp of an object's creation to create ids. Is it valid to simply compare these to find out which object was created earlier?
You can compare ObjectIds for equality with the .equals() method. See the documentation.
An ObjectId is a 12-byte value, usually displayed as a 24-character hexadecimal string. It consists of:
a 4-byte timestamp value, representing the ObjectId's creation, measured in seconds since the Unix epoch
a 5-byte random value
a 3-byte incrementing counter, initialized to a random value
Since the timestamp is the most significant part of an ObjectId, yes, you can, down to the timestamp's one-second resolution.
The most significant four bytes of the ObjectId are the timestamp.
Also see ObjectId.getTimestamp() documentation.
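To make the comparison concrete, here is a small sketch in plain Node.js that works on the hex form of two ids. It assumes both values are valid 24-character hex strings; compareObjectIdHex is a name invented for this example, not a driver function.

```javascript
// Compare two ObjectId hex strings by creation order.
// Both strings have a fixed width of 24 hex characters, so once they are
// lower-cased a plain string comparison matches a byte-wise comparison,
// and the leading timestamp bytes decide the result first.
function compareObjectIdHex(a, b) {
  const x = a.toLowerCase();
  const y = b.toLowerCase();
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}

const ids = ["5fdedb7c25ab1352eef88f60", "5272e0f00000000000000000"];
ids.sort(compareObjectIdHex);
console.log(ids[0]); // 5272e0f00000000000000000 (the earlier timestamp)
```

Ids created within the same second differ only in their remaining bytes, so their relative order says nothing reliable about which one was created first.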
Generally, it is possible to compare objects' creation times by ObjectId; for more info, refer to this link.
-- citing this link: https://steveridout.github.io/
Why generate an ObjectId from a timestamp?
To query documents by creation date.
e.g. to find all comments created after 2013-11-01:
db.comments.find({_id: {$gt: ObjectId("5272e0f00000000000000000")}})
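A boundary id like the one in the query above can be derived from a date by hand. The sketch below is plain Node.js without a driver, and objectIdHexFromDate is a hypothetical helper name. It assumes a date between 1970 and early 2106, so that the seconds fit into 4 bytes.

```javascript
// Build the smallest ObjectId hex string for a given point in time:
// 4 timestamp bytes (seconds since the Unix epoch) followed by 8 zero bytes.
function objectIdHexFromDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

// 1383260400 is the number of seconds encoded in the example id above.
console.log(objectIdHexFromDate(new Date(1383260400 * 1000)));
// 5272e0f00000000000000000
```

That value corresponds to 2013-10-31T23:00:00Z, which is midnight at the start of 2013-11-01 in a UTC+1 time zone, so the example id was apparently built from a local date.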
-- another helpful and explanatory link:
uses for mongodb ObjectId creation time
best regards
I have read through the MongoDB manual but still couldn't find what I need.
Does an autogenerated ObjectId (or "id") consist only of 24 characters drawn from letters and the digits 0123456789? Is there a chance that it will generate something like "jkfdfak-123kjsd?", and exactly which symbols are never used?
By default, ObjectId is a 12-byte BSON type, constructed using this data:
4-byte value representing the seconds since the Unix epoch
3-byte machine identifier
2-byte process id
3-byte counter, starting with a random value.
And the string representation is in hexadecimal.
If you want to create your own ObjectId, you must provide a unique 24-character hexadecimal string ([0-9a-fA-F]{24}).
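As a quick check of which characters can appear, here is a sketch of a validator for the hex string form. The name looksLikeObjectIdHex is invented for this example; it only checks the format and does not tell you whether such an id exists in any collection.

```javascript
// The hex string form uses only the characters 0-9 and a-f (or A-F),
// and is always exactly 24 characters long.
const OBJECT_ID_HEX = /^[0-9a-fA-F]{24}$/;

function looksLikeObjectIdHex(value) {
  return typeof value === "string" && OBJECT_ID_HEX.test(value);
}

console.log(looksLikeObjectIdHex("5fdedb7c25ab1352eef88f60")); // true
console.log(looksLikeObjectIdHex("jkfdfak-123kjsd?"));         // false
console.log(looksLikeObjectIdHex("5fdedb7c"));                 // false (too short)
```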
The MongoDB _id field is defined as:
ObjectId is a 12-byte BSON type, constructed using:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.
What would be the most efficient representation of this field in PostgreSQL?
I've used char(24) with a constraint CHECK decode(mongo_id::text, 'hex'::text) > '\x30'::bytea. While this constraint doesn't check the sanity of the ObjectId, it only allows values in a valid format to be stored. This stores the ObjectId in plain text, which keeps the values easily readable.
Another option is to use the bytea type for the column and input the data as "\xOBJECT_ID", where the \x prefix causes the text form of OBJECT_ID to be converted to a byte array. This consumes less space than char(24) (which might be relevant if you have millions of rows), but reading the values in a non-binary format requires e.g. encode(mongo_id::bytea, 'hex') (which might be burdensome).
Also, some platforms such as Redshift might have problems with the bytea data type.
If you need an easy access to the metadata in the ObjectId, you could parse and store it separately (eg. in a jsonb column or a separate column for each relevant attribute). Possibly the "created at" part of the metadata is the only interesting attribute.
I am new to MongoDB and Stack Overflow.
I want to know why the ID in a MongoDB collection is 24 hex characters long.
What is the importance of that?
Why is the default _id a 24 character hex string?
The default unique identifier generated as the primary key (_id) for a MongoDB document is an ObjectId. This is a 12 byte binary value which is often represented as a 24 character hex string, and one of the standard field types supported by the MongoDB BSON specification.
The 12 bytes of an ObjectId are constructed using:
a 4 byte value representing the seconds since the Unix epoch
a 3 byte machine identifier
a 2 byte process id
a 3 byte counter (starting with a random value)
What is the importance of an ObjectId?
ObjectIds (or similar identifiers generated according to a GUID formula) allow unique identifiers to be independently generated in a distributed system.
The ability to independently generate a unique ID becomes very important as you scale up to multiple application servers (or perhaps multiple database nodes in a sharded cluster). You do not want to have a central coordination bottleneck like a sequence counter (eg. as you might have for an auto-incrementing primary key), and you will want to insert new documents without risk that a new identifier will turn out to be a duplicate.
An ObjectId is typically generated by your MongoDB client driver, but can also be generated on the MongoDB server if your client driver or application code hasn't already added an _id field.
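To illustrate how an id can be produced without central coordination, here is a sketch of a generator in plain Node.js. It is an assumption-laden illustration, not the code of any MongoDB driver: it follows the layout documented for newer MongoDB versions (4-byte timestamp, 5-byte random value, 3-byte counter), and the name generateObjectIdHex is made up.

```javascript
const crypto = require("crypto");

// Illustrative generator following a 4 + 5 + 3 byte layout.
// Not the code of any MongoDB driver.
const processRandom = crypto.randomBytes(5);          // chosen once per process
let counter = crypto.randomBytes(3).readUIntBE(0, 3); // random starting point

function generateObjectIdHex(now = Date.now()) {
  const id = Buffer.alloc(12);
  id.writeUInt32BE(Math.floor(now / 1000), 0); // bytes 0-3: seconds since epoch
  processRandom.copy(id, 4);                   // bytes 4-8: per-process random value
  id.writeUIntBE(counter, 9, 3);               // bytes 9-11: counter
  counter = (counter + 1) % 0x1000000;         // wrap around after 3 bytes
  return id.toString("hex");
}

const first = generateObjectIdHex();
const second = generateObjectIdHex();
console.log(first, second); // same process value, counter differs by one
```

Each process only needs its own clock, random value, and counter, which is why no shared sequence is required.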
Do I have to use the default ObjectId?
No. If you have a more suitable unique identifier to use, you can always provide your own value for _id. This can either be a single value or a composite value using multiple fields.
The main constraints on _id values are that they have to be unique for a collection and you cannot update or remove the _id for an existing document.
The current MongoDB version is now 4.2. An ObjectId is still 12 bytes in size, but it now consists of 3 parts.
ObjectIds are small, likely unique, fast to generate, and ordered.
ObjectId values are 12 bytes in length, consisting of:
a 4-byte timestamp value, representing the ObjectId’s creation, measured in seconds since the Unix epoch
a 5-byte random value
a 3-byte incrementing counter, initialized to a random value
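The three parts listed above can be pulled out of the hex form with simple slicing. The sketch below is plain Node.js with no driver, splitObjectIdHex is a name invented for this example, and it assumes a valid 24-character hex string as input.

```javascript
// Split a 24-character ObjectId hex string into the three parts listed above.
function splitObjectIdHex(hex) {
  return {
    timestampSeconds: parseInt(hex.slice(0, 8), 16), // 4 bytes
    randomHex: hex.slice(8, 18),                     // 5 bytes
    counter: parseInt(hex.slice(18, 24), 16),        // 3 bytes
  };
}

const parts = splitObjectIdHex("5fdedb7c25ab1352eef88f60");
console.log(parts.timestampSeconds); // 1608440700
console.log(parts.randomHex);        // 25ab1352ee
console.log(parts.counter);          // 16289632
```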
Create ObjectId and get timestamp from it
> x = ObjectId()
ObjectId("5fdedb7c25ab1352eef88f60")
> x.getTimestamp()
ISODate("2020-12-20T05:05:00Z")
Reference
Read MongoDB official doc
I have an existing MongoDB collection which doesn't have any information about when each document was created.
Is it possible to get this information somehow? I've had a look through the docs but can't see it anywhere.
If you are using the default ObjectId value for your _id attribute, the creation time is encoded inside it.
As stated in the ObjectID documentation:
ObjectId is a 12-byte BSON type, constructed using:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.
You can call the getTimestamp() function on an ObjectId object to get an ISODate object containing the creation time of the object:
In the mongo shell:
ObjectId().getTimestamp()
ISODate("2014-05-14T14:29:12Z")
The creation time is most likely available from the actual _id value of the document, unless you have replaced it with something else.
In simple JavaScript syntax (equivalent methods are available in other languages), you access it like this:
var id = new ObjectId();
id.getTimestamp();
Various language implementations have a way of retrieving the "timestamp" from an ObjectId value so you can just use that.
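If only the hex string of an _id is at hand (for example in an export), the same creation time can be recovered without a driver. This is a sketch in plain Node.js; creationDateFromObjectIdHex is a hypothetical helper name, and the input is assumed to be a default, driver-generated ObjectId.

```javascript
// Driver-free way to read the creation time from an id's hex form.
function creationDateFromObjectIdHex(hex) {
  const seconds = parseInt(hex.slice(0, 8), 16); // first 4 bytes
  return new Date(seconds * 1000);               // Date expects milliseconds
}

console.log(creationDateFromObjectIdHex("5fdedb7c25ab1352eef88f60").toISOString());
// 2020-12-20T05:05:00.000Z
```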