What is the best way to store dates in MongoDB? - mongodb

I am just starting to learn about MongoDB and hoping to slowly migrate from MySQL.
In MySQL, there are two different data types - DATE ('0000-00-00') and DATETIME ('0000-00-00 00:00:00'). In my MySQL, I use the DATE type, but I am not sure how to transfer them into MongoDB. In MongoDB, there is a Date object, which is comparable to DATETIME. It seems it would be most appropriate to use Date objects, but that would be wasting space, since hours, min, sec are not utilized. On the other hand, storing dates as strings seems wrong.
Is there a gold standard for storing dates ('0000-00-00') in MongoDB?

I'm actually in the process of converting a MongoDB database where dates are stored as proper Date() types to instead store them as strings in the form yyyy-mm-dd. Why, considering that every other answerer says that this is a horrible idea? Simply put, because of the neverending pain I've been suffering trying to work with dates in JavaScript, which has no (real) concept of timezones. I had been storing UTC dates in MongoDB, i.e. a Date() object with my desired date and the time set as midnight UTC, but it's unexpectedly complicated and error-prone to get a user-submitted date correctly converted to that from whatever timezone they happen to be in. I've been struggling to get my JavaScript "whatever local timezone to UTC" code to work (and yes, I'm aware of Sugar.js and Moment.js) and I've decided that simple strings like the good old MySQL standard yyyy-mm-dd is the way to go, and I'll parse into Date() objects as needed at runtime on the client side.
Incidentally, I'm also trying to sync this MongoDB database with a FileMaker database, which also has no concept of timezones. For me the simplicity of simply not storing time data, especially when it's meaningless like UTC midnight, helps ensure less-buggy code even if I have to parse to and from the string dates now and then.
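A minimal sketch of the "parse at runtime" step described above, assuming the strings are strictly yyyy-mm-dd and that midnight UTC is an acceptable anchor for the resulting Date. The helper names parseDateString and formatDateString are hypothetical and not part of any library.

```javascript
// Hypothetical helpers: convert between 'yyyy-mm-dd' strings and Date objects.
// Assumption: the Date is anchored at midnight UTC so no local offset leaks in.
function parseDateString(s) {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(s);
  if (!m) throw new Error("expected yyyy-mm-dd, got: " + s);
  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  const d = new Date(Date.UTC(year, month - 1, day)); // month is zero-based
  // Date.UTC silently rolls over invalid days (e.g. Feb 30), so reject those.
  if (d.getUTCMonth() !== month - 1 || d.getUTCDate() !== day) {
    throw new Error("invalid calendar date: " + s);
  }
  return d;
}

function formatDateString(d) {
  // toISOString() is always UTC, so the first 10 characters are yyyy-mm-dd.
  return d.toISOString().slice(0, 10);
}
```

One property worth noting: zero-padded yyyy-mm-dd strings sort lexicographically in chronological order, so range comparisons on the stored strings still work.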

BSON (the storage data format used by mongo natively) has a dedicated date type UTC datetime which is a 64 bit (so, 8 byte) signed integer denoting milliseconds since Unix time epoch. There are very few valid reasons why you would use any other type for storing dates and timestamps.
If you're desperate to save a few bytes per date (again, with mongo's padding and minimum block size and everything this is only worth the trouble in very rare cases) you can store dates as a 3 byte binary blob by packing year, month and day into an unsigned integer (a plain decimal YYYYMMDD value such as 20110720 exceeds the 3 byte maximum of 16,777,215, so the fields would need to be bit-packed), or as a 2 byte binary blob denoting "days since January 1st of year X" where X must be chosen appropriately since that only supports a date range spanning 179 years.
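For illustration only, the 2 byte "days since January 1st of year X" idea could be sketched as follows. The function names, the baseYear parameter and the big-endian byte order are assumptions made for this example; 65,536 days is roughly 179 years, which is where the range limit comes from.

```javascript
const MS_PER_DAY = 86400000;

// Hypothetical encoder: whole days since January 1st (UTC) of baseYear,
// stored as an unsigned 16-bit integer (big-endian chosen arbitrarily).
function encodeDays(date, baseYear) {
  const days = Math.floor((date.getTime() - Date.UTC(baseYear, 0, 1)) / MS_PER_DAY);
  if (days < 0 || days > 0xffff) {
    throw new RangeError("date outside the ~179 year window starting at " + baseYear);
  }
  const bytes = new Uint8Array(2);
  new DataView(bytes.buffer).setUint16(0, days, false);
  return bytes;
}

// Hypothetical decoder: the inverse of encodeDays, returning midnight UTC.
function decodeDays(bytes, baseYear) {
  const days = new DataView(bytes.buffer, bytes.byteOffset, 2).getUint16(0, false);
  return new Date(Date.UTC(baseYear, 0, 1) + days * MS_PER_DAY);
}
```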
EDIT: As the discussion below demonstrates this is only a viable approach in very rare circumstances. Basically; use mongo's native date type ;)

If you really care about saving 4 bytes per field (in case you have many DATE fields per document) you can store dates as int32 fields in the form 20110720 (note that MySQL DATE occupies 3 bytes, so the storage will be greater in any case). Otherwise I'd stick to the standard datetime type.
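A rough sketch of the int32 form mentioned above, assuming the date is taken from the UTC fields of a Date. The helper names toYmdInt and fromYmdInt are made up for this example.

```javascript
// Hypothetical helpers for the integer form YYYYMMDD (e.g. 20110720).
function toYmdInt(date) {
  return date.getUTCFullYear() * 10000 +
    (date.getUTCMonth() + 1) * 100 + // getUTCMonth() is zero-based
    date.getUTCDate();
}

function fromYmdInt(n) {
  const year = Math.floor(n / 10000);
  const month = Math.floor(n / 100) % 100;
  const day = n % 100;
  return new Date(Date.UTC(year, month - 1, day));
}
```

Because the digits run from most to least significant, plain numeric comparison of these integers matches chronological order, so range queries on the field keep working.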

Related

How does Swift manage dates in CoreData

I have been including a date in my CoreData, and was curious to see what happens behind the scenes with the underlying SQLite database.
I noticed that while integer and varchar are used for other CoreData attributes, a CoreData Date attribute appears in SQLite as timestamp.
Now I know that SQLite has its quirks, notably:
Columns have a type affinity rather than a fixed type, so you can put whatever you like wherever you like
You can make up your own column types
There is no distinct date/time type; instead, SQLite has conversion functions to work with string, integer or real representations of the date/time
I also notice that the date is stored in a human-readable format. However, it appears to be a non-standard format which is also off by a few years.
For example:
The SQLite version is 1989/10/05 13:15:45
The ISO8601 formatted version is 2020-10-05T14:15:45+11:00
The ISO8601 exported version above is correct. The SQLite version above is off by 31 years, ignores our DST offset, and doesn’t conform to any standards that I’m aware of.
How does CoreData manage its dates?
See the documentation for Swift Date. Apple use a different reference date (1/1/2001) from Unix (1/1/1970), which explains the bulk, if not all, of the difference.
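To make the 31-year gap concrete, here is a small sketch, assuming the stored value is a count of seconds since 2001-01-01 00:00:00 UTC and that the SQLite viewer interpreted it as Unix time. The function names are hypothetical.

```javascript
// Seconds between the Unix epoch (1970-01-01) and the 2001-01-01 reference date, both UTC.
const REFERENCE_OFFSET_SECONDS = Date.UTC(2001, 0, 1) / 1000;

// Hypothetical helpers, assuming the stored value counts seconds since 2001-01-01 UTC.
function coreDataToDate(secondsSinceReference) {
  return new Date((secondsSinceReference + REFERENCE_OFFSET_SECONDS) * 1000);
}

function dateToCoreData(date) {
  return date.getTime() / 1000 - REFERENCE_OFFSET_SECONDS;
}

// The date from the question, as it would be stored under that assumption.
const stored = dateToCoreData(new Date("2020-10-05T14:15:45+11:00"));

// A tool that reads the same number as Unix time lands 31 years earlier.
const misread = new Date(stored * 1000);
console.log(misread.toISOString());                // 1989-10-05T03:15:45.000Z
console.log(coreDataToDate(stored).toISOString()); // 2020-10-05T03:15:45.000Z
```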

What is up with this weird BSON date saving?

I'm currently writing a driver for MongoDB, so I have to dig a little deeper and so I find this:
BSON spec for DateTimeUTC:
"\x09" e_name int64
BSON spec for int64:
"\x12" e_name int64
BSON spec for timeStamp (although I know it's almost always used internally, it's just to show BSON makes use of unsigned integers):
"\x11" e_name uint64
It seems a bit inconsistent to me. Why are int64 and UTC millis even separate types? Does MongoDB use different ways to compare BSON dateTimeUTC values?
And why is dateTimeUTC NOT a uint64 but a signed integer? Millis are always > 0. Is there a reason behind this? Am I missing something?
DateTimeUTC is used to represent a point in time. It predates BSON, and has historically used a signed integer. This makes it possible for DateTimeUTC to point to a date before the epoch. Otherwise, it wouldn't be possible to represent dates before 1970-01-01 using DateTimeUTC.
In contrast, timestamp is for mostly internal use, and is expected to be used for mostly current dates that have little need to represent a time before the epoch (e.g. the timestamp of an operation).
There's a related question in UNIX StackExchange regarding this: Why does Unix store timestamps in a signed integer?
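As a rough illustration of why the sign matters for a driver, the sketch below writes and reads only the 8 byte value part of a datetime element (not the "\x09" type byte or e_name) as a little-endian signed 64-bit integer. The function names are invented for this example and do not come from any existing driver.

```javascript
// Hypothetical helpers: the 8-byte payload of a BSON UTC datetime,
// i.e. milliseconds since the Unix epoch as a little-endian signed int64.
function encodeDateTimeValue(date) {
  const buf = new ArrayBuffer(8);
  new DataView(buf).setBigInt64(0, BigInt(date.getTime()), true);
  return new Uint8Array(buf);
}

function decodeDateTimeValue(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, 8);
  return new Date(Number(view.getBigInt64(0, true)));
}

// A date before 1970 has a negative millisecond count.
const moonLanding = new Date("1969-07-20T20:17:00Z");
console.log(moonLanding.getTime() < 0); // true
console.log(decodeDateTimeValue(encodeDateTimeValue(moonLanding)).toISOString());
// 1969-07-20T20:17:00.000Z
```

Reading the same bytes as an unsigned integer would turn that pre-epoch date into an enormous positive value, which is the practical reason a driver should keep the two types distinct.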

Datetime format to determine records order

We need to send objects of various types from a database to clients via long polling over REST. Each record that is sent contains a timestamp. When the client receives new data from the server, it should issue another poll request with the record's timestamp as a parameter, which helps the server select the records that follow.
I am considering Unix epoch time: storing this value in each record in the database for filtering, and sending it with each poll request.
What do you think about this solution? Is that usage fine or should I worry about something? Thanks.
EDIT:
I forgot to mention that this data will be added by clients in different time zones. This is another reason why I am considering Unix time.
Any format for storing the timestamp is fine, as long as users are able to interpret it unambiguously. There is no reason for the timestamp format in the API to be the same as in the database. The idea of an API is to decouple the model from the database.
Personally I would choose one format from the ISO 8601 Basic and Extended Notations. Example: 2008-09-15T15:53:00. Virtually all programming languages have methods to handle this format (converting to a Unix timestamp or to internal date/time classes). For Java you would use java.time.LocalDateTime#parse
Unix timestamps have some issues (they may or may not be issues for you):
unable to represent dates before January 1st, 1970 if stored as an unsigned integer
unable to represent dates after January 19, 2038 if stored as a signed 32-bit integer
not human-readable
does not contain a timezone (a timestamp itself has no concept of timezone, but it may be useful to send the client timezone along with the timestamp; the server may always normalise the value to UTC)
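The conversions and the 2038 limit mentioned above can be sketched like this. The helper names are hypothetical, and the example assumes timestamps in whole seconds and ISO 8601 strings that carry an explicit offset such as Z.

```javascript
// Hypothetical helpers: convert between Unix seconds and ISO 8601 strings.
function toIso(epochSeconds) {
  return new Date(epochSeconds * 1000).toISOString();
}

function toEpochSeconds(iso) {
  // Caveat: a date-time string without an offset (e.g. "2008-09-15T15:53:00")
  // is parsed as local time by JavaScript, so include "Z" or an offset.
  return Math.floor(Date.parse(iso) / 1000);
}

// Largest value a signed 32-bit integer can hold.
const INT32_MAX = 2 ** 31 - 1;

console.log(toIso(0));         // 1970-01-01T00:00:00.000Z
console.log(toIso(INT32_MAX)); // 2038-01-19T03:14:07.000Z
console.log(toIso(-86400));    // 1969-12-31T00:00:00.000Z (signed values reach before 1970)
```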

MongoDB - Storing date without timezone

We have a simple application in which all users are in the same timezone, and therefore we are not interested in storing timezone information in the mongo date object.
The reason for such an extreme step is that we have multiple microservices using a common database, managed by different developers. Each of them has to explicitly set timezone-related options in queries, and forgetting to do so results in an invalid dataset.
Currently MongoDB (see Mongo Data Types) doesn't support storing dates without a timezone.
Is there any alternative approach to represent a date without a timezone in mongo, by which we can still take advantage of MongoDB's date-based query syntax, like date ranges?
At the same time it would be convenient for DBAs to read and manage records.
Look at this answer: https://stackoverflow.com/a/6776273/6105830
You can use two types of long representation (milliseconds or the format yyyyMMddHHmmss). These are the only ways to not store a timezone and still be able to make range queries.
Unfortunately you lose some aggregation properties. But you can do something like keeping two representations and using them at opportune times.
UPDATE:
Do not store dates as I said before. You will lose many features of MongoDB, and it will also be hard to use the major operators on date fields.
Newer versions of MongoDB have operators to deal with timezones, and that should be enough to work with ISOTime formats. My application was using my own suggestion to store dates. Now I have to let my users select their TimeZone (the company has grown and we need to expand to other countries). We are struggling to change all models to use timestamps instead of a normalized date format.
For more information, see: https://docs.mongodb.com/manual/reference/method/Date/
You can also ask questions on the official MongoDB community forums.
Here is the link: https://developer.mongodb.com/community/forums/
You could consider storing all the dates in UTC and presenting to the users in UTC, so you don't have the problem of silent conversion by either client JavaScript, server or MongoDB and therefore confusion.
You can do it like this: new Date(Date.UTC(2000, 1, 28)). Note that the month argument of Date.UTC is zero-based, so this is 28 February 2000.
Here's MDN link on the subject.
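A small sketch of the UTC-only approach, using a hypothetical utcMidnight helper that takes a 1-based month to sidestep the zero-based month argument of Date.UTC:

```javascript
// Hypothetical helper: build a Date at midnight UTC from a calendar date.
// The month parameter here is 1-based (1 = January), unlike Date.UTC itself.
function utcMidnight(year, month, day) {
  return new Date(Date.UTC(year, month - 1, day));
}

const d = utcMidnight(2000, 2, 28);

// toISOString() always renders in UTC, so no silent local conversion occurs.
console.log(d.toISOString());  // 2000-02-28T00:00:00.000Z
console.log(d.getUTCMonth());  // 1 (February, zero-based)
```

Reading the value back with the getUTC* accessors or toISOString(), rather than the local-time accessors, keeps presentation in UTC as well.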

Mongodb timestamp in milliseconds

I have stored dates in Unix timestamp format in MongoDB, e.g. 1449060622.
Now I want to add milliseconds as well, so that records inserted in the same second can be sorted properly.
Can someone suggest whether using JS new Date() is better, or simply (new Date).getTime()?
Whenever you store times in MongoDB you should really consider using the native Date type instead. Not only does it provide you with millisecond precision, it also unlocks a lot of features which are unavailable for simple integers, such as the date aggregation operators.
If you really don't want to use native dates for some obscure reason (I couldn't think of a good one) or don't want to convert your whole database (really, you should) and need a higher precision, you might consider adding new values as floating point values. This ensures interoperability with the old data because integers and floating point values usually can be converted and compared between each other easily.
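Both options can be sketched in a few lines, assuming the existing values are whole seconds since the Unix epoch as in the question. The helper names are made up for this example.

```javascript
// Hypothetical helper: turn an existing seconds-based value into a native Date.
function secondsToDate(epochSeconds) {
  return new Date(epochSeconds * 1000); // Date expects milliseconds
}

// Hypothetical helper: the fallback of fractional seconds as a floating point value.
function toFractionalSeconds(date) {
  return date.getTime() / 1000;
}

console.log(secondsToDate(1449060622).toISOString()); // 2015-12-02T12:50:22.000Z

// Old integer values and new fractional values still compare correctly.
const oldValue = 1449060622;
const newValue = toFractionalSeconds(new Date(1449060622250));
console.log(oldValue < newValue); // true
```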