What is the meaning of each byte in the _id property of documents in MongoDB collections? MongoDB's site lists three meaningful values:
a 4-byte timestamp value representing the seconds since the Unix epoch (which will not run out of seconds until the year 2106)
a 5-byte random value, and
a 3-byte incrementing counter, starting with a random value.
However, Mosh Hamedani and ChatGPT say:
It is a 12-byte BSON type, which consists of a 4-byte timestamp, a 3-byte machine id, a 2-byte process id, and a 3-byte counter.
Which of them is true?
The official documentation says:
ObjectId()
Returns a new ObjectId. The 12-byte ObjectId consists of:
A 4-byte timestamp, representing the ObjectId's creation, measured in seconds since the Unix epoch.
A 5-byte random value generated once per process. This random value is unique to the machine and process.
A 3-byte incrementing counter, initialized to a random value.
When you go back to documentation of MongoDB version 3.2, it is a little more detailed:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.
If you would like to know the details, have a look at the source code.
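To make the current layout concrete, here is a minimal sketch that splits an ObjectId hex string into the three documented fields. The helper name parseObjectId is hypothetical and not part of any MongoDB driver; under the older 3.2 layout, the middle five bytes would instead be read as a 3-byte machine identifier followed by a 2-byte process id.

```javascript
// Split a 24-character ObjectId hex string into its three documented fields.
// parseObjectId is a hypothetical helper, not a MongoDB driver function.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected 24 hex characters');
  }
  // Bytes 0-3: big-endian seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    seconds,
    date: new Date(seconds * 1000),
    // Bytes 4-8: per-process random value (machine id + process id before 3.4).
    random: hex.slice(8, 18),
    // Bytes 9-11: incrementing counter.
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

const parts = parseObjectId('5a682326bf8380e6e6584ba5');
console.log(parts.date.toISOString()); // 2018-01-24T06:09:42.000Z
console.log(parts.random, parts.counter);
```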
I'm working on tables obtained from a PervasiveSQL database and I have some trouble managing dates.
In some of the fields, dates are recorded in the format we use in Italy, dd/mm/yyyy, but in others they are recorded in a format I can't understand, something like this:
Start_Date 132384788
Last_Tx_Date 132385052
Last_Tx_Time 252711936
What kind of format is it?
How can I convert it into a human-readable one?
I think that Start_Date could be August 8 2020 but I'm not sure.
Thanks for any help!
I tried to copy and paste tables in an Excel file but automatic dates conversion did not work.
The Start_Date and Last_Tx_Date fields look to be Btrieve Date fields. If you set the data type for that field in the DDFs to Date, it should show a human readable field. However the Last_Tx_Time field is a Btrieve Time (not timestamp) type.
From the Actian Zen v15.10 documentation (https://docs.actian.com/zen/v15/#page/sqlref/sqldtype.htm#ww136646):
Date:
The DATE key type is stored internally as a 4-byte value. The day and the month are each stored in 1-byte binary format. The year is a 2-byte binary number that represents the entire year value. The MicroKernel places the day into the first byte, the month into the second byte, and the year into a two-byte word following the month.
An example of C structure used for date fields would be:
TYPE dateField {
char day;
char month;
integer year;
}
The year portion of a date field is expected to be set to the integer representation of the entire year. For example, 2,001 for the year 2001.
Time:
The TIME key type is stored internally as a 4-byte value. Hundredths of a second, second, minute, and hour values are each stored in 1-byte binary format. The MicroKernel places the hundredths of a second value in the first byte, followed respectively by the second, minute, and hour values. The data format is hh:mm:ss.nn. Supported values range from 00:00:00.00 to 23:59:59.99.
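As a rough check of the layouts quoted above, the sketch below decodes the three numbers from the question. It assumes that each number is the field's 4 raw bytes read as a little-endian 32-bit integer, which is an assumption about how the values were exported, and the function names are hypothetical.

```javascript
// Decode a Btrieve DATE read as a 32-bit little-endian integer (assumed export format).
// Layout per the quoted docs: byte 0 = day, byte 1 = month, bytes 2-3 = year.
function decodeBtrieveDate(n) {
  return {
    day: n & 0xff,
    month: (n >>> 8) & 0xff,
    year: (n >>> 16) & 0xffff,
  };
}

// Decode a Btrieve TIME under the same assumption.
// Layout: byte 0 = hundredths, byte 1 = second, byte 2 = minute, byte 3 = hour.
function decodeBtrieveTime(n) {
  return {
    hundredths: n & 0xff,
    second: (n >>> 8) & 0xff,
    minute: (n >>> 16) & 0xff,
    hour: (n >>> 24) & 0xff,
  };
}

const startDate = decodeBtrieveDate(132384788);  // { day: 20, month: 8, year: 2020 }
const lastTxDate = decodeBtrieveDate(132385052); // { day: 28, month: 9, year: 2020 }
const lastTxTime = decodeBtrieveTime(252711936); // 15:16:20.00
console.log(startDate, lastTxDate, lastTxTime);
```

Under that assumption Start_Date decodes to 20 August 2020 rather than 8 August 2020, and the plausible results for all three values suggest the byte order guess is reasonable.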
If at some point the timestamp is ffffffff, then the ObjectId created at that moment is something like:
ffffffff15580625bcb65364
Then, what could be the ObjectId created after 1 second?
Then, what could be the ObjectId created after [the Unix epoch rolls over in 32 bits]?
This would depend on the specific implementation, its programming language and their handling of math calculations.
It is possible that some implementations and languages would error when they retrieve the number of seconds since the Unix epoch as a 64-bit integer (which is quite common today) and then try to use a value which exceeds 32 bits in size for ObjectId generation. If this happens the driver will cease to be able to generate ObjectIds, consequently it may be unable to insert documents without _id values being provided by the application using some other generation strategy.
In other implementations the timestamp itself may roll over to zero, at which point the ObjectId generation will succeed with a very small timestamp value.
Yet other implementations may truncate (from either most or least significant side) the timestamp to coerce it into the 32 available bits of an ObjectId.
The ObjectId value itself doesn't actually have to have an accurate timestamp - it is required to be unique within the collection and it is "generally increasing" but MongoDB-the-database wouldn't care if ObjectId values wrapped to around zero at some point.
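The behaviours described above can be sketched as follows. This is an illustration only, with hypothetical function names; it does not reproduce what any particular driver does.

```javascript
const MAX_U32 = 0xffffffff;

// Behaviour 1: refuse to generate an ObjectId once the seconds value no longer fits.
function timestampOrThrow(seconds) {
  if (seconds > MAX_U32) {
    throw new RangeError('timestamp exceeds 32 bits');
  }
  return seconds;
}

// Behaviour 2: keep only the low 32 bits, so the value wraps around to zero.
function timestampWrapped(seconds) {
  return seconds % (MAX_U32 + 1);
}

// Format a 32-bit value as the 8 hex characters that lead an ObjectId.
function toHex8(n) {
  return n.toString(16).padStart(8, '0');
}

const oneSecondLater = MAX_U32 + 1;
console.log(toHex8(timestampWrapped(MAX_U32)));        // ffffffff
console.log(toHex8(timestampWrapped(oneSecondLater))); // 00000000
```

With wrapping, an ObjectId generated one second after ffffffff... would start with 00000000 and therefore sort before every earlier ObjectId, which is why the "generally increasing" property breaks at that point.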
As the docs say, the timestamp is represented by 4 bytes:
4-byte timestamp value, representing the ObjectId’s creation, measured in seconds since the Unix epoch
As a signed integer, 4 bytes cover the values from -2,147,483,648 to 2,147,483,647; read as unsigned, the range is 0 to 4,294,967,295.
The date for 4,294,967,295 as a Unix timestamp is Sunday, 7 February 2106, 06:28:15 GMT.
After this date, ObjectId won't be able to store the timestamp.
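The arithmetic can be checked with a few lines of JavaScript. The variable names are only illustrative; the signed limit is included for comparison with the unsigned one that applies here.

```javascript
// Largest value of an unsigned 32-bit seconds counter, and the date it maps to.
const maxUnsignedSeconds = 2 ** 32 - 1; // 4,294,967,295
const lastUnsignedDate = new Date(maxUnsignedSeconds * 1000);
console.log(lastUnsignedDate.toISOString()); // 2106-02-07T06:28:15.000Z

// For comparison: a signed 32-bit counter runs out much earlier.
const maxSignedSeconds = 2 ** 31 - 1; // 2,147,483,647
const lastSignedDate = new Date(maxSignedSeconds * 1000);
console.log(lastSignedDate.toISOString()); // 2038-01-19T03:14:07.000Z
```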
So, can ObjectId overflow? In 85 years every new ObjectId created will fail because it won't be able to create the timestamp with only 4 bytes.
I already have a SO question and answer on how to convert BSON Timestamp in a MongoDB aggregation, but now I have a situation where I would like to convert in node.js.
So just to repeat. My goal is to convert a "Timestamp" datatype to a javascript date, without doing it in an aggregation - is this possible?
If the BSON library you are using provides Timestamp type, you can use its getTime method to return the seconds, and create a date from that:
new Date(object.clusterTime.getTime() * 1000)
If you don't have that function available, the timestamp is a 64-bit value where the high 32-bits are the seconds since epoch, and the low 32-bits are a counter.
Bitshift the value 32 bits right or divide by 2^32 to get the seconds:
new Date(Math.floor(object.clusterTime / Math.pow(2, 32)) * 1000)
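If the raw 64-bit value is available, the split can also be done exactly with BigInt, which avoids precision loss in floating-point division. This is a sketch under the assumption that the Timestamp has already been obtained as a BigInt; timestampToDate is a hypothetical helper, not a driver API.

```javascript
// raw is the 64-bit BSON Timestamp as a BigInt:
// high 32 bits = seconds since the Unix epoch, low 32 bits = counter.
function timestampToDate(raw) {
  const seconds = Number(raw >> 32n);
  const increment = Number(raw & 0xffffffffn);
  // JavaScript Date expects milliseconds, so scale the seconds by 1000.
  return { date: new Date(seconds * 1000), increment };
}

// Example value built by hand: 1516774182 seconds, counter 5.
const rawTimestamp = (1516774182n << 32n) | 5n;
const decoded = timestampToDate(rawTimestamp);
console.log(decoded.date.toISOString(), decoded.increment); // 2018-01-24T06:09:42.000Z 5
```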
For the MongoDB BSON type Timestamp, there is a constructor: BsonTimestamp(final int seconds, final int increment). How should the increment be understood, and what is the design consideration behind it?
Timestamp is an internal BSON type used by MongoDB to reflect the operation time (ts) for entries in the replication oplog.
The BSON Timestamp type is designed for the specific use case of logging ordered batches of time-based operations:
the first 32 bits (time_t) are an integer value representing seconds since the Unix epoch
the second 32 bits are an integer value (ordinal) indicating ordering within a given second
The design requirement for Timestamps is based on preserving strict ordering for oplog entries rather than time precision (e.g. milliseconds or microseconds). The leading time component gives a coarse granularity of seconds; appending an incrementing ordinal value ensures strict ordering of unique Timestamp values within a given second. Using an ordered sequence instead of time precision avoids pushing down the potential conflict of two operations that might occur in the same millisecond (or microsecond).
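The ordering property can be illustrated with a small sketch that packs (seconds, increment) pairs into 64-bit values and sorts them. makeTimestamp is a hypothetical helper written for this illustration; it is not the BsonTimestamp constructor.

```javascript
// Pack seconds into the high 32 bits and the ordinal into the low 32 bits.
function makeTimestamp(seconds, increment) {
  return (BigInt(seconds) << 32n) | BigInt(increment);
}

const first = makeTimestamp(1516774182, 1);      // first operation in a given second
const second = makeTimestamp(1516774182, 2);     // second operation in the same second
const nextSecond = makeTimestamp(1516774183, 1); // first operation one second later

// Comparing the packed values orders by seconds first, then by ordinal.
const sorted = [nextSecond, second, first].sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
console.log(sorted.map((t) => `${t >> 32n}:${t & 0xffffffffn}`));
// [ '1516774182:1', '1516774182:2', '1516774183:1' ]
```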
For application use cases you should use the BSON Date type instead of a Timestamp. A BSON Date is the same size (in bits) as a Timestamp, but provides more granularity for time:
BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This results in a representable date range of about 290 million years into the past and future.
I know that we can use getTimestamp() to retrieve the timestamp from the ObjectId, but is there any way to generate an ObjectId from a timestamp?
More specifically, if I have an input of month and year, then I want to convert it into Mongo ObjectID to query in db, how should I do this?
Try this:
> ObjectId("5a682326bf8380e6e6584ba5").getTimestamp()
ISODate("2018-01-24T06:09:42Z")
> ObjectId.fromDate(ISODate("2018-01-24T06:09:42Z"))
ObjectId("5a6823260000000000000000")
Works from mongo shell.
If you pass a number to the bson ObjectId constructor it will take that as a timestamp and pass it to the generate method.
You can get a Date from a month and year per this answer (months start at zero).
So:
timestamp = ~~(new Date(2016, 11, 17) / 1000)
new ObjectId(timestamp)
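For the month-and-year case, one possible approach is to build the lower and upper boundary values by hand: take the seconds at the start of the month and of the following month, write each as 8 hex characters, and pad the remaining 16 characters with zeros. The sketch below only produces hex strings; objectIdHexFromDate and monthRange are hypothetical helpers, and the use of UTC month boundaries is an assumption.

```javascript
// Build the smallest ObjectId hex string whose timestamp equals the given date.
function objectIdHexFromDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, '0') + '0'.repeat(16);
}

// Boundaries for a month (month is 1-12), assuming UTC month boundaries.
function monthRange(year, month) {
  return {
    start: objectIdHexFromDate(new Date(Date.UTC(year, month - 1, 1))),
    end: objectIdHexFromDate(new Date(Date.UTC(year, month, 1))),
  };
}

const january2018 = monthRange(2018, 1);
console.log(january2018.start); // 5a497a000000000000000000
console.log(january2018.end);   // 5a7258800000000000000000
```

The two strings could then be wrapped in ObjectId(...) and used in a range query such as { _id: { $gte: ObjectId(start), $lt: ObjectId(end) } }.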
Yes you can:
dummy_id = ObjectId.from_datetime(gen_time)
where gen_time is a datetime object.
An ObjectId() is a 12-byte BSON type and consists of:
The first 4 bytes representing the seconds since the unix epoch
The next 3 bytes are the machine identifier
The next 2 bytes consist of the process id
The last 3 bytes are a counter, starting with a random value
Clearly, you will not be able to create ObjectId() only from timestamp.