mongodb sort order on _id

I wonder how mongodb compares the "_id" field when doing a query like the following:
db.data.find({"_id":{$gt:ObjectId("502aa46c0674d23e3cee6152")}}).sort({"_id":1}).limit(10);
Is it purely based on the timestamp portion of the id?

To expand slightly on what Andre said:
Since the ObjectID timestamp is only to the second, two (or more) ObjectIDs could easily be created with the same value for the timestamp (the first 4 bytes). If these were created on the same machine (machine ID - the next 3 bytes), by the same process (PID - the next 2 bytes), then the only thing to differentiate them would be the "inc" field, the last 3 bytes at the end.
Update: Jan 2020
This answer continues to be popular so it is worth updating a little. The ObjectID spec has evolved since this answer was written 8 years ago and the 5 bytes after the timestamp are now simply random, which will greatly decrease the likelihood of any collisions. The last three bytes are still incremental, but initialised at a random value to start, again making collisions less likely. The ObjectID now contains less context (you can't easily tell where it was generated and by what process) but I would guess that the information was not being used in any meaningful way and has been deprecated in favor of better randomisation of the ID.
End Update
See here for the full spec:
https://docs.mongodb.com/manual/reference/method/ObjectId/#ObjectIDs-BSONObjectIDSpecification
That "inc" field is either an ever incrementing field (then you can reasonably expect the sort to be in the insert/create order) or a random value (then likely unique, but not ordered), assuming the spec is implemented correctly of course. Note that the ObjectIDs may be generated by the driver, or the application (or indeed manually) rather than by MongoDB itself, so unless you have full control over how they are generated, then any or all of the above may apply.

In a way you are correct: if you sort by the _id you will sort by the insertion time. This does not mean that the only comparison is done on the timestamp portion. ObjectIDs are a BSON type in their own right; they can be directly compared with each other. As they start with a timestamp, it follows logically that those in the past will be less than those in the future.
You can find more detail in the documentation.
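A quick mongosh sketch of what that means in practice (the second id here is just the first one with its last byte bumped, to imitate two ids minted in the same second):
const a = ObjectId("502aa46c0674d23e3cee6152")
const b = ObjectId("502aa46c0674d23e3cee6153")               // same 4-byte timestamp prefix
a.getTimestamp()                                              // same second for both ids
a.toHexString() < b.toHexString()                             // true: the trailing bytes break the tie
db.data.find({ _id: { $gt: a } }).sort({ _id: 1 }).limit(10)  // the query from the question, paging in _id order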

Copied from the MongoDB docs:
https://docs.mongodb.com/manual/reference/bson-types/#objectid
The relationship between the order of ObjectId values and generation time is not strict within a single second. If multiple systems, or multiple processes or threads on a single system generate values, within a single second; ObjectId values do not represent a strict insertion order. Clock skew between clients can also result in non-strict ordering even for values, because client drivers generate ObjectId values, not the mongod process.

Related

Using UUIDs instead of ObjectIDs in MongoDB

We are migrating a database from MySQL to MongoDB for performance reasons and considering what to use for IDs of the MongoDB documents. We are debating between using ObjectIDs, which is the MongoDB default, or using UUIDs instead (which is what we have been using up until now in MySQL). So far, the arguments we have to support any of these options are the following:
ObjectIDs:
ObjectIDs are the MongoDB default and I assume (although I'm not sure) that this is for a reason, meaning that I expect that MongoDB can handle them more efficiently than UUIDs or has another reason for preferring them. I also found this stackoverflow answer that mentions that usage of ObjectIDs makes indexing more efficient, it would be nice however to have some metrics on how much this "more efficient" is.
UUIDs:
Our basic argument in favour of using UUIDs (and it is a quite important one) is that they are supported, one way or another, by virtually any database. This means that if some way down the road we decide to switch from MongoDB to something else for whatever reason and we already have an API that retrieves documents from the DB based on their IDs nothing changes for the clients of this API since the IDs can continue to be exactly the same. If we were to use ObjectIDs I'm not really sure how we would go about migrating them to another DB.
Does anyone have any insights on whether one of these options may be better than the other and why? Have you ever used UUIDs in MongoDB instead of ObjectIDs and if yes what were the advantages / problems you came across?
Using UUIDs in Mongo is certainly possible and reasonably well supported. For example, the Mongo docs list UUIDs as one of the common options for the _id field.
Considerations
Performance – As other answers mention, benchmarks show UUIDs cause a performance drop for inserts. In the worst case measured (going from 10M to 20M docs in a collection) they're about 2-3x slower – the difference between inserting 2,000 (UUID) and 7,500 (ObjectID) docs per second. This is a large difference but its significance depends entirely on your use case. Will you be batch inserting millions of docs at a time? For most apps I've built, the common case is inserting individual documents. The same benchmarks show that, for that usage pattern, the difference is much smaller (6,250 vs 7,500; ~20%). Not insignificant, but not earth-shattering either.
Portability – Many other DB platforms have good UUID support so portability would be improved. Alternatively, since UUIDs are larger (more bits) it is possible to repack an ObjectID into the "shape" of a UUID. This approach isn't as nice as direct portability but it does give you a way to "map" between existing ObjectIDs and UUIDs.
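As a rough illustration of that repacking idea, here is a sketch that left-pads a 24-hex-character ObjectId into the 8-4-4-4-12 "shape" of a UUID; the zero-padding scheme is an assumption for the example, not a standard mapping:
function objectIdToUuidShape(objectIdHex) {
  const padded = objectIdHex.padStart(32, "0"); // 8 zero nibbles + the 24 ObjectId nibbles
  return [
    padded.slice(0, 8),
    padded.slice(8, 12),
    padded.slice(12, 16),
    padded.slice(16, 20),
    padded.slice(20, 32),
  ].join("-");
}
objectIdToUuidShape("502aa46c0674d23e3cee6152"); // => "00000000-502a-a46c-0674-d23e3cee6152"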
Decentralisation – One of the big selling points of UUIDs is that they're universally unique. This makes it practical to generate them anywhere, in a decentralised fashion (in contrast to, for example, an auto-incrementing value that requires a centralised source of truth to determine the "next" value). Of course, Mongo Object IDs profess this benefit too. The difference is, UUIDs are based on a 15+ year old standard and supported on (nearly?) all platforms, languages, etc. This makes them very useful if you ever need to create entities (or specifically, sets of related entities) in disjointed systems, without interacting with the database. You can create a dataset with IDs and foreign keys in place, then write the whole graph into the database at some point in the future without conflict. Although this is also possible with Mongo ObjectIDs, finding code to generate them/work with the format will often be harder.
Corrections
Contrary to some of the other answers:
UUIDs do have native Mongo support – You can use the UUID() function in the Mongo Shell exactly the same way you'd use ObjectID(), to convert a UUID string into the equivalent BSON object (a small example follows this list).
UUIDs are not especially large – When encoded using binary subtype 0x04 they're 128 bits, compared to 96 bits for ObjectIDs. (If encoded as strings they will be pretty wasteful, taking around 288 bits.)
UUIDs can include a timestamp – Specifically, UUIDv1 encodes a timestamp with 60 bits of precision, compared to 32 bits in ObjectIDs. In decimal, this is roughly 7 orders of magnitude more precision – 100-nanosecond intervals instead of seconds. It can actually be a decent way of storing create timestamps with more accuracy than Mongo/JS Date objects support, however...
The built-in UUID() function only generates v4 (random) UUIDs so, to leverage this, you'd have to lean on your app or Mongo driver for ID creation.
Unlike ObjectIDs, because of the way UUIDs are chunked, the timestamp doesn't give you a natural order. This can be good or bad depending on your use case. (New standards may change this; see 2021 update below.)
Including timestamps in your IDs is sometimes a Bad Idea. You end up leaking the created time of documents anywhere an ID is exposed. (Of course ObjectIDs also encode a timestamp so this is partly true for them too.)
If you do this with (spec compliant) v1 UUIDs, you're also encoding part of the server's MAC address, which can potentially be used to identify the machine. Probably not an issue for most systems but also not ideal. (New standards may change this; see 2021 update below.)
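To make the first correction above concrete, a minimal mongosh sketch of storing binary UUIDs as _id (the orders collection name is just for the example):
const myId = UUID()                                  // random v4 UUID, stored as BSON binary subtype 4
db.orders.insertOne({ _id: myId, item: "example" })
db.orders.findOne({ _id: myId })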
Conclusion
If you think about your Mongo DB in isolation, ObjectIDs are the obvious choice. They work well out of the box and are a perfectly capable default. Using UUIDs instead does add some friction, both when working with the values (needing to convert to binary types, etc.) and in terms of performance. Whether this slight inconvenience is worth having a standardised ID format really depends on the importance you place on portability and your architectural choices.
Will you be syncing data between different database platforms? Will you migrate your data to a different platform in the future? Do you need to generate IDs outside the database, in other systems or in the browser? If not now, then at some point in the future? UUIDs might be worth the hassle.
Aug 2021 Update
The IETF recently published a draft update to the UUID spec that would introduce some new versions of the format.
Specifically, UUIDv6 and UUIDv7 are based on UUIDv1 but flip the timestamp chunks so the bits are arranged from most significant to least significant. This gives the resultant values a natural order that (more or less) reflects the order in which they were created. The new versions also exclude data derived from the server's MAC address, addressing a long-standing criticism of v1 UUIDs.
It'll take time for these changes to flow through to implementations but (IMHO) they significantly modernise and improve the format.
The _id field of MongoDB can have any value you want as long as you can guarantee that it is unique for the collection. When your data already has a natural key, there is no reason not to use this in place of the auto-generated ObjectIDs.
ObjectIDs are provided as a reasonable default solution to save you the time of generating your own unique key (and to discourage beginners from trying to copy SQL's AUTO INCREMENT, which is a bad idea in a distributed database).
By not using ObjectIDs you also miss out on another convenience feature: an ObjectID also includes a Unix timestamp of when it was generated, and many drivers provide a function to extract it and convert it to a date. This can sometimes make a separate create-date field redundant.
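For example, in mongosh (the same method exists in most drivers), the creation time can be read straight off the id, and "most recent" queries can lean on _id alone:
const id = ObjectId("502aa46c0674d23e3cee6152")
id.getTimestamp()                              // Date built from the 4-byte timestamp prefix (second precision)
db.data.find().sort({ _id: -1 }).limit(10)     // ten most recently created docs, no separate date field needed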
But when neither is a concern for you, you are free to use your UUIDs as _id field.
Consider the amount of data you would store in each case.
A MongoDB ObjectID is 12 bytes in size, is packed for storage, and its parts are organized for performance (i.e. the timestamp is stored first, which is a logical ordering criterion).
Conversely, a standard UUID is 36 bytes, contains dashes and is typically stored as a string. Further, even if you strip non-numeric characters and intend to store it numerically, you must still contend with the fact that its "indexy" portion (the part of a UUID v1 that is timestamp-based) is in the middle of the UUID and doesn't lend itself well to sorting. There are studies which allow for performant UUID storage, and I even wrote a Node.js library to assist in its management.
If you intend to use a UUID, consider reorganizing it for optimal indexing and sorting; otherwise you'll likely hit a performance wall.
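A sketch of one such reorganisation, assuming v1 UUIDs: move the high-order timestamp group to the front so that lexicographic (and index) order roughly follows creation order. The field split shown follows the standard v1 layout, but treat this as an illustration rather than a drop-in library:
function reorderV1Uuid(uuid) {
  const [timeLow, timeMid, timeHiAndVersion, clockSeq, node] = uuid.split("-");
  return [timeHiAndVersion, timeMid, timeLow, clockSeq, node].join("-");
}
reorderV1Uuid("58e0a7d7-eebc-11d8-9669-0800200c9a66"); // => "11d8-eebc-58e0a7d7-9669-0800200c9a66"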
We must be careful to distinguish the cost of MongoDB inserting a thing vs. the cost to generate the thing in the first place, plus that cost relative to the size of the payload. Below is a little matrix that shows the method of generating the _id crossed against the size of an optional extra payload. Tests are in javascript only, conducted on a MacBook Pro against localhost, for 100,000 inserts using insertMany in batches of 100 without transactions, to try to remove network and other chatty factors. Two runs with batch = 1 were also done just to highlight the dramatic difference.
Method
A : Simple int: _id:0, _id:1, ...
B : ObjectId: _id:ObjectId("5e0e6a804888946fa61a1976"), ...
C : Simple string: _id:"A0", _id:"A1", ...
D : UUID-length string (but not actually generated by UUID()): _id:"9575edcc-cb70-4d63-97ed-ee5d624de87b0", ...
E : Real generated UUID, stored as the UUID() object: _id:UUID("35992974-21ea-4f61-b715-2dfaed663b73"), ...
F : Real generated UUID, stored as a string, e.g. UUID().toString().substr(6,36): _id:"6b16f733-ff24-4172-83f9-e4f96ace6775", ...
Time in milliseconds to perform 100,000 inserts on a fresh (empty) collection.
Extra M E T H O D (Batch = 100)
Payload A B C D E F % drop A to F
-------- ---- ---- ---- ---- ---- ---- ------------
None 2379 2386 2418 2492 3472 4267 80%
512 2934 2928 3048 3128 4151 4870 66%
1024 3249 3309 3375 3390 4847 5237 61%
2048 3953 3832 3987 4342 5448 5888 49%
4096 6299 6343 6199 6449 7634 8640 37%
8192 9716 9292 9397 10816 11212 11321 16%
Extra M E T H O D (Batch = 1)
Payload A B C D E F % drop A to F
-------- ----- ----- ----- ----- ----- -----
None 48006 48419 49136 48757 50649 51280 6.8%
1024 50986 50894 49383 49373 51200 51821 1.2%
This was a quick test but it seems clear that basic strings and ints as _id are roughly the same speed, while actually generating a UUID adds time -- especially if you take the string version of the UUID() object, e.g. UUID().toString().substr(6,36). It is also worth noting that constructing an ObjectId appears to be just as quick.
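For anyone who wants to reproduce something like this, here is a minimal sketch of such a harness using the Node.js driver against a local mongod; the database/collection names and the three strategies shown are illustrative, not the exact code behind the table above:
const { MongoClient, ObjectId } = require("mongodb");
const { randomUUID } = require("crypto");

async function runBenchmark(makeId, label) {
  const client = new MongoClient("mongodb://localhost:27017");
  await client.connect();
  const coll = client.db("bench").collection(label);
  await coll.deleteMany({});                        // start from an empty collection
  const start = Date.now();
  for (let i = 0; i < 100000; i += 100) {           // 100,000 inserts in batches of 100
    const batch = [];
    for (let j = 0; j < 100; j++) batch.push({ _id: makeId(i + j) });
    await coll.insertMany(batch);
  }
  console.log(label, Date.now() - start, "ms");
  await client.close();
}

(async () => {
  await runBenchmark((n) => n, "simple_int");              // method A
  await runBenchmark(() => new ObjectId(), "objectid");    // method B
  await runBenchmark(() => randomUUID(), "uuid_string");   // roughly method F
})();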
I have been thinking about this for the last several weeks. What I found is that ObjectId and UUID are both unique; in fact, at the collection level you cannot have a duplicate _id whatever type you use. Some of the answers talk about insertion performance, but the important thing is not insertion performance, it is indexing performance, which comes down to how much RAM you are going to use indexing the _ids. We know that an ObjectId is 12 bytes where a UUID string is 36 bytes, so for the same number of index entries you will need roughly three times as much RAM if you use UUIDs instead of ObjectIds.
So from that point of view it is better to use ObjectId over UUID in mongodb.
UUIDs are 128 bits (16 bytes) and are globally unique. See RFC 4122.
ObjectIds are a MongoDB-specific construct and are 96 bits (12 bytes). Although that is generally enough to provide uniqueness globally, there are some edge conditions. MongoDB has this official document comparing the two.
We prefer to not be tied down with MongoDB specific ID generation and prefer to do it on the client side. We also use multiple kinds of databases. The bottom line is that choosing UUID over ObjectId is a decision one can take based on their specific use cases.
I found these benchmarks some time ago when I had the same question.
They basically show that using a Guid instead of an ObjectId causes a drop in index performance.
In any case, I would recommend that you customize the benchmarks to imitate your specific real-life scenario and see what the numbers look like; one cannot rely 100% on generic benchmarks.
Try this (a Mongoose schema that uses a dash-free v4 UUID string as _id):
const uuid = require('uuid')
const mongoose = require('mongoose')

const YourSchema = new mongoose.Schema({
  _id: {
    type: String,
    // generate a random v4 UUID and strip the dashes for a compact 32-character id
    default: () => uuid.v4().replace(/-/g, ''),
  },
})

mongodb part of objectid most likely to be unique

In my app I'm letting mongo generate order id's via its ObjectId method.
But in user testing we've had some concerns that the order id's are humanly 'intimidating', i.e. if you need to discuss your order with someone over the telephone, reading out 24 alphanumeric characters is a bit tedious.
At the same time, I don't really want to have to store two different id's, one 'human-accessible' and one used by mongo internally.
So my question is this - is there a way to choose a substring of length 6 or even 8 of the mongo objectId string that I could be fairly sure would be unique?
For example if I have a mongo objectid like this
id = '4b28dcb61083ed3c809e0416'
maybe I could take out
human_id = id.substr(0,7);
and be sure that I'd always get unique id's for my orders...
The advantage of course is that these are orders, and so are human-created, and so there aren't millions of them per millisecond. On the other hand, it would really be a problem if two orders had the same shortened id...
--- clearer explanation ---
I guess a better way to ask my question would be this :
If I decide for example to just use the last 6 characters of a mongo id, is there some kind of measure of 'probability' that just these 6 characters would repeat in a given week ?
Given a certain number of mongo's running in parallel, a certain number of users during the week, etc.
If you have multiple web servers, with multiple processes, then there really isn't anything you can remove without losing uniqueness.
If you look at the nature of the ObjectId:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.
You'll see there's not much there that you could safely remove. As the first 4 bytes are time, it would be challenging to implement an algorithm that removed portions of the time stamp in a clean and safe way.
The machine identifier and process identifier are used in cases where there are multiple servers and/or processes acting as clients to the database server. If you dropped either of those, you could end up with duplicates again. The counter (seeded with a random value) in the last 3 bytes is used to make sure that two identifiers, on the same machine, within the same process, are unique, even when requested frequently.
If you were using it as an order id, and you want assured uniqueness, I wouldn't trim anything away from the 12 byte number as it was carefully designed to provide a robust and efficient distributed mechanism for generating unique numbers when there are many connected database clients.
If you took the last 5 characters of the ObjectId, in a given period, what's the probability of conflict? Those come from the:
process id
counter
The probability of conflict is high. The process id may remain the same through the entire period, and the other number is just an incrementing number that would repeat after 4095 orders. But, if the process recycles, then you also have the chance that there will be a conflict with older orders, etc. And if you're talking multiple database clients, the chances increase as well. I just wouldn't try to trim the number. It's not worth the unhappy customers trying to place orders.
Even the timestamp and the random seed value aren't sufficient when there are multiple database clients generating ObjectIds. As you start to look at the various pieces, especially in the context of a farm of database clients, you should see why the pieces are there, and why removing them could lead to a meltdown in ObjectId generation.
I'd suggest you implement an algorithm to create a unique number and store it in the database. It's simple enough to do. It does impact performance a bit, but it's safe.
I wrote this answer a while ago about the challenges of using an ObjectId in a Url. It includes a link to how to create a unique auto incrementing number using MongoDB.
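One common way to do that is the counter-collection pattern; a mongosh sketch (the counters/orders collection names and the orderId key are illustrative):
db.counters.insertOne({ _id: "orderId", seq: 0 })   // run once to seed the counter

function nextOrderId() {
  return db.counters.findOneAndUpdate(
    { _id: "orderId" },
    { $inc: { seq: 1 } },
    { returnNewDocument: true }
  ).seq
}

db.orders.insertOne({ _id: nextOrderId(), item: "example" })   // _id becomes 1, 2, 3, ...
This adds a round trip per insert, which is the performance cost mentioned above, but the resulting numbers are short, human-friendly and guaranteed unique.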
Actually, what you choose for an Id (actually _id in MongoDB storage) is totally up to you. If there is some useful data you can keep in _id, as long as you keep it unique, then do so. If it has to be something valid for url encoding, then do so.
By default, if you do not specify an _id then that field will be populated with the value you have come to love and hate. But if you explicitly use it, then you will get what you want.
The extra thing to keep in mind is that even if you specify an additional unique index field, let's say order_id, then MongoDB would actually have to check through that and other indexes on a query plan to see which one was best to use. But if _id was your key, the plan would give up and go straight for the 'Primary Key', and this is going to be a lot faster.
So make your own Id just as long as you can ensure it will be unique.

Mongo: How to enable strict FIFO data retrieval?

I have a mongo collection I'd like to retrieve in first in, first out (FIFO) order. We're batch-importing a few hundred tasks each sec and from what I understand, documents imported within the same second are not necessarily retrieved in the same order they were inserted.
To quote from http://docs.mongodb.org/manual/reference/object-id/:
The relationship between the order of ObjectId values and
generation time is not strict within a single second. If multiple
systems, or multiple processes or threads on a single system generate
values, within a single second; ObjectId values do not represent a
strict insertion order. Clock skew between clients can also result in
non-strict ordering even for values, because client drivers generate
ObjectId values, not the mongod process.
My question is: is there a common practice for ensuring strict FIFO in mongo? At the moment we're tempted to add a new key w/nanoseconds, but adding an entire column just to ensure FIFO seems a little excessive. Any thoughts appreciated

Are Mongodb ObjectIds GUIDs?

The autogenerated BSON ID that is stored in the _id field of every document, is it a GUID?
The documentation says its 'most likely unique', so I am a little confused. Why would they use an id that is not guaranteed to be unique?
Its uniqueness is based upon probability. Unlike what #mattexx's answer says:
It's not "guaranteed" to be unique because MongoDB does not enforce uniqueness to save time.
MongoDB DOES enforce uniqueness on the ObjectId; it in fact has a unique index on the _id field. As for saving time, the ObjectId is historical in that manner since it was designed in the days when MongoDB did not ack any writes and needed a 99% chance of being able to insert a new unique record without the client waiting for an ack (ObjectIds are generated client side).
They are not GUIDs; however they, as #Asya says, are guaranteed to have a high level of uniqueness.
So long as time never moves backwards there is still a 99% chance it will be unique forever. Okay, as #Devesh says, there is a roughly 1 in 1 trillion (I haven't done the math) chance of even a GUID being duplicated but, again, I do not think you will reach that probability anytime soon.
It is unique for most requirements, and it consists of a timestamp, a unique identifier for the machine (a hash of the machine host), a process identifier, and finally an incrementing number. http://docs.mongodb.org/manual/reference/object-id/
The chance of an ID collision is theoretically close enough to zero that it can be presumed for typical web apps. Many real-world systems (Mongo or not) do rely on this property of GUIDs, though it would not be a good assumption for safety/mission-critical systems.
In practical terms, there are indeed scenarios where it could go wrong if there's a misconfiguration or a third-party library bug. Those shouldn't rule out the concept, but it's important to be aware of those risks and avoid them where possible.
Some good analysis here of practical issues that could cause a collision. In particular:
Some Mongo drivers use random numbers instead of incrementing numbers for the counter bytes. In these cases, there is a 1/16,777,216 chance of generating a non-unique ID, but only if those two IDs are generated in the same second (i.e. before the time section of the ID updates to the next second), on the same machine, in the same process.
ObjectId is explained in the doc here. It's not "guaranteed" to be unique because MongoDB does not enforce uniqueness to save time. It simply trusts that the complicated generation algorithm will probably never produce two identical ObjectIds in the same datastore. So technically it is not a GUID, but pretty much as good.

Best NoSql for querying date ranges?

Given a store which is a collection of JSON documents in the (approximate) form of:
{
PeriodStart: 18/04/2011 17:10:49
PeriodEnd: 18/04/2011 17:15:54
Count: 12902
Max: 23041 Min: 0
Mean: 102.86 StdDev: 560.97
},
{
PeriodStart: 18/04/2011 17:15:49
PeriodEnd: 18/04/2011 17:20:54
Count: 10000
Max: 23041 Min: 0
Mean: 102.86 StdDev: 560.97
}... etc
If I want to query the collection for given date range (say all documents from last 24 hours), which would give me the easiest querying operations to do this?
To further elaborate on requirements:
It's for an application monitoring service, so strict CAP/ACID isn't necessarily required
Performance isn't a primary consideration either. Reads/writes would be at most in the tens per second, which could be handled by an RDBMS anyway
Ability to handle changing document schemas would be desirable
Ease of querying ability of lists/sets is important (ad-hoc queries an advantage)
I may not have your query requirements down exactly, as you didn't specify. However, if you need to find any documents that start or end in a particular range, then you can apply most of what is written below. If that isn't quite what you're after, I can be more helpful with a bit more direction. :)
If you use CouchDB, you can create your indexes by splitting up the parts of your date into an array. ([year, month, day, hour, minute, second, ...])
Your map function would probably look similar to:
function (doc) {
  var date = new Date(doc.PeriodStart);
  // getMonth() is zero-based, so add 1 to match the [YYYY, MM, DD, ...] keys used in the queries below
  emit([date.getFullYear(), date.getMonth() + 1, date.getDate(), date.getHours(), date.getMinutes()], null);
}
To perform any sort of range query, you'd need to convert your start and end times into this same array structure. From there, your view query would have params called startkey and endkey. They would receive the array parameters for start and end respectively.
So, to find the documents that started in the past 24 hours, you would send a querystring like this in addition to the full URI for the view itself:
// start: Apr 17, 2011 12:30pm ("24 hours ago")
// end: Apr 18, 2011 12:30pm ("today")
startkey=[2011,04,17,12,30]&endkey=[2011,04,18,12,30]
Or if you want everything from this current year:
startkey=[2011]&endkey=[2011,{}]
Note the {}. When used as an endkey: [2011,{}] is identical to [2012] when the view is collated. (either format will work)
The extra components of the array will simply be ignored, but the further specificity you add to your arrays, the more specific your range can be. Adding reduce functions can be really powerful here, if you add in the group_level parameter, but that's beyond the scope of your question.
[Update edited to match edit to original question]
Short answer, (almost) any of them will work.
BigTable databases are a great platform for monitoring services (log analysis, etc). I prefer Cassandra (Super Column Families, secondary indexes, atomic increment coming soon), but HBase will work for you too. Structure the date value so that its lexicographic ordering is the same as the date ordering. Fixed-length strings following the format "YYYYMMDDHHmmss" work nicely for this. If you use this string as your key, range queries will be very simple to perform.
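A small sketch of building such a key, assuming the source value is a javascript Date (the helper name is just for the example):
function dateKey(d) {
  const pad = (n) => String(n).padStart(2, "0");
  return d.getUTCFullYear() +
    pad(d.getUTCMonth() + 1) +   // months are zero-based
    pad(d.getUTCDate()) +
    pad(d.getUTCHours()) +
    pad(d.getUTCMinutes()) +
    pad(d.getUTCSeconds());
}
dateKey(new Date(Date.UTC(2011, 3, 18, 17, 10, 49))); // => "20110418171049"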
Handling changing schema is a breeze - just add more columns to the column family. They don't need to be defined ahead of time.
I probably wouldn't use graph databases for this problem, as it'll probably summarize to traversing a linked list. However, I don't have a ton of experience with graph databases, so take this advice with a grain of salt.
[Update: some of this is moot since the question was edited, but I'm keeping it for posterity]
Is this all you're doing with this database? The big problem with selecting a NoSQL database isn't finding one that supports one query requirement well. The problem is finding one that supports all of your query requirements well. Also, what are your operational requirements? Can you accept a single point of failure? What kind of setup/maintenance overhead are you willing to tolerate? Can you sacrifice low latency for high-throughput batch operations, or is realtime your gig?
Hope this helps!
It seems to me that the easiest way to implement what you want is performing a range query in a search engine like ElasticSearch.
I, for one, certainly would not want to write all the map/reduce code for CouchDB (because I did in the past). Also, based on my experience (YMMV), range queries will outperform CouchDB's views and use much less resources for large datasets.
Not to mention you can compute interesting statistics with „date histogram“ facets in ElasticSearch.
ElasticSearch is schema-free, JSON based, so you should be able to evaluate it for your case in a very short time.
I've decided to go with Mongo for the time being.
I found that setup/deployment was relatively easy, and the C# wrapper was adequate for what we're trying to do (and in the cases where it's not, we can resort to javascript queries easily).
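For reference, the "last 24 hours" query is a one-liner in the mongo shell, assuming PeriodStart is stored as a BSON Date and the collection is called stats (both assumptions for this sketch):
const since = new Date(Date.now() - 24 * 60 * 60 * 1000)
db.stats.find({ PeriodStart: { $gte: since } }).sort({ PeriodStart: 1 })
db.stats.createIndex({ PeriodStart: 1 })   // an index on the queried field keeps the range scan cheap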
What you want is whichever one gives you access to some kind of spatial index. Most of these work off of B-Trees and/or hashes, neither of which is particularly good for spatial indexing.
Now, if your definition of "last 24 hours" is simply "starts or ends within the last 24 hours" then a B-Tree may be fine (you do two queries, one on PeriodStart and then one on PeriodEnd, both being within range of the time window).
But if the PeriodStart to PeriodEnd is longer than the time window, then neither of these will be as much help to you.
Either way, that's what you're looking for.
This question explains how to query a date range in CouchDB. You would need your data to be in a lexicographically sortable state, in all the examples I've seen.
Since this is tagged Redis and nobody has answered that aspect I'm going to put forth a solution for it.
Step one, store your documents under a given redis key, as a hash or perhaps as a JSON string.
Step two, add the redis key (lets call it a DocID) in a sorted set, with the timestamp converted to a UNIX timestamp. For example where r is a redis Connection instance in the Python redis client library:
mydocs:Doc12 => [JSON string of the doc]
In Python:
r.set('mydocs:Doc12', JSONStringOfDocument)
timeindex:documents => sorted set of DocIDs scored by timestamp:
In Python:
r.zadd('timeindex:documents', {'Doc12': timestamp})  # redis-py 3.x+ takes a mapping of member -> score
In effect you are building an index of documents based on UNIX timestamps.
To get documents from a range of time, you use zrangebyscore (or zrevrangebyscore if you want the order reversed) with the window's start and end timestamps to get the list of Document IDs in that window. Then you can retrieve the documents from the db as normal. Sorted sets are pretty fast in Redis. Further advantages are that you can do set operations such as "documents in this window but not this window", and indeed even store the results in Redis automatically for later use.
One example of how this would be useful is that in your example documents you have a start and end time. If you made an index of each as above, you could get the intersection of the set of documents that start in a given range and the set of documents that end in a given range, and store the resulting set in a new key for later re-use. This would be done via zinterstore
Hopefully, that helps someone using Redis for this.
MongoDB is very good for queries; I think it's useful because it has a lot of functions. I use MongoDB for GPS distance queries, text search and the aggregation pipeline.