Binary Search in MongoDB array without index - mongodb

OK, so I have a very large number of documents in a sharded collection, let's say 1 million. Each document holds a SORTED array of 10,000 sub-documents.
To access the top-level documents quickly, MongoDB uses the shard order plus the index to find the document in question. Nonetheless, once I reach the document, I have to find which sub-documents in the array satisfy my query. Now, I know this array is sorted, but MongoDB doesn't. Also, creating 1 million indices is too expensive.
Therefore, my question is the following: is there a way to force MongoDB to binary search a sorted array without an index?

I think that using $where and passing in some custom JavaScript is your only hope:
https://docs.mongodb.org/manual/reference/operator/query/where/
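
The search logic itself is small. Below is a minimal sketch in Python, assuming the document has already been fetched and its sorted array is available client-side; it is not something MongoDB runs for you. The field name `ts` and the helpers `lower_bound` and `range_slice` are hypothetical and only serve the illustration.

```python
from typing import Any, Dict, List


def lower_bound(items: List[Dict[str, Any]], field: str, target: Any) -> int:
    """Return the first position whose items[pos][field] is >= target.

    Assumes `items` is sorted in ascending order by `field`.
    """
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid][field] < target:
            lo = mid + 1   # everything up to mid is too small
        else:
            hi = mid       # mid might be the answer, keep it in range
    return lo


def range_slice(items: List[Dict[str, Any]], field: str, low: Any, high: Any) -> List[Dict[str, Any]]:
    """Return the sub-documents with low <= item[field] < high."""
    start = lower_bound(items, field, low)
    end = lower_bound(items, field, high)
    return items[start:end]


# Hypothetical sorted array of 10,000 sub-documents, as in the question.
events = [{"ts": i * 10} for i in range(10_000)]
matches = range_slice(events, "ts", 250, 300)
```

Each lookup needs about 14 comparisons for 10,000 elements instead of a full scan, but the whole array still has to be transferred to wherever the search runs.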

Related

Does a find() operation on a MongoDB capped collection with an index maintain insertion order?

In a MongoDB database, I have a capped collection, where one of the fields of the documents in the collection is a boolean. Let's suppose the boolean field is called isRelevant.
In addition, there is a single partial index on isRelevant, for documents where isRelevant: false (beyond the default index on _id).
I understand from the MongoDB documentation for capped collections that in general the documents returned by a find query are guaranteed to be returned in their original insertion order, but I'm not sure this still holds when using the mentioned partial index.
My question is, if I perform a find operation on the collection with a limit of e.g. 500 and filter of {isRelevant:false} to retrieve from the collection the first 500 documents where isRelevant:false, are the documents returned guaranteed to be in their original insertion order? If so, how can this be understood from the MongoDB documentation?

How to maintain order of a Mongo collection by sorting on an indexed field efficiently

ObjectId _id <--- index
String UserName
int Points <--- Descending index
Using this document structure as a simple example, we have a collection of users, each with a name and a "points" value. The collection has the usual _id index but also a "descending index" on Points.
Problem
The sample use case is maintaining a ranking scoreboard (something like the League of Legends/DOTA ranking system or the chess Elo system). Each user's Points field changes constantly, but the scoreboard is viewed very frequently and thus needs to be kept accurate.
My current unoptimized solution
I'm not sure what "ascending/descending sort order" means in the Mongo docs, but apparently it doesn't matter for single-field indices anyway.
So currently I'm just doing a very brute force solution of sorting the collection each time a user's Points field gets updated. At least it's indexed so for a smaller userbase this shouldn't be too bad. However, sorting the entire userbase on each update/insertion just seems wrong in general.
Other things I'm considering
There are data structures traditionally used for maintaining order during insert/update such as search trees but implementing that without putting the entire collection in memory seems like a huge project in itself.
I tried to search for some built-in functionality of Mongo indices that automatically maintains order in the collection for you but I couldn't really find anything like that.
Maybe some logic to re-sort only the chunk of documents directly above and below the insertion/update? This solution seems pretty dependent on the expected spread of Points across the userbase and the use cases of this system.
You don't need to re-sort an index that already exists. When you create an index in MongoDB you specify the direction in which it is ordered (ascending (1) or descending (-1)), and MongoDB keeps it in that order as documents are inserted and updated. A query that sorts on the indexed field can therefore read its results straight from the index, already in order.
Of course, you can also explicitly ask for the results in reverse order or sorted by another field.
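
To illustrate why no re-sorting is needed after each update, here is a toy model in Python of an ordered index on Points. It is only a sketch: `ToyPointsIndex` and its methods are hypothetical, and it uses a sorted list rather than the B-tree a real database would use (so inserts here cost O(n), not O(log n)). The point is that each update touches a single entry, and the ranking can be read in either direction.

```python
import bisect
from itertools import islice


class ToyPointsIndex:
    """Toy stand-in for an ordered index on Points (not MongoDB's implementation)."""

    def __init__(self):
        self._entries = []   # list of (points, user_id), always kept sorted
        self._points = {}    # user_id -> current points

    def upsert(self, user_id, points):
        """Insert a user or change their points, keeping the entries sorted."""
        old = self._points.get(user_id)
        if old is not None:
            # Locate and remove the stale entry with a binary search.
            pos = bisect.bisect_left(self._entries, (old, user_id))
            del self._entries[pos]
        bisect.insort(self._entries, (points, user_id))
        self._points[user_id] = points

    def top(self, n):
        """Highest scores first: walk the sorted entries backwards."""
        return [(uid, pts) for pts, uid in islice(reversed(self._entries), n)]

    def bottom(self, n):
        """Lowest scores first: walk the sorted entries forwards."""
        return [(uid, pts) for pts, uid in islice(self._entries, n)]


board = ToyPointsIndex()
board.upsert("alice", 1200)
board.upsert("bob", 1500)
board.upsert("carol", 900)
board.upsert("alice", 1600)   # alice's score changes; only her entry moves
```

Reading the same structure forwards or backwards is also why the direction of a single-field index does not matter for sorting.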

Projection in MongoDb

I am learning MongoDb and a question came to my mind regarding projection.
When we do a projection for some fields, what does MongoDB do?
Does it read the whole document, drop the excluded fields and return the result, or does it skip the excluded fields entirely and read only the fields mentioned in the query?
For example, say I have a document with 4 fields and 3 arrays (each of size ~10), and I want just the 4 fields and not the arrays.
Would MongoDB read the whole document and drop the arrays, or would it read just the 4 fields?
If it's the first case, how would the execution time or latency change as the arrays in the document grow?
Documents are compressed in storage, so MongoDB has to read the whole document first, decompress it, and only then return the fields specified in the projection.
The trick is to index the fields you search by, so that the search runs against the index and MongoDB does not have to read every document one by one to check the searched field.
If you need faster access to only those fields, it is best to put all of them in a compound index and query them with a so-called "covered query": the query is then answered from the index alone (which is usually held in memory), without fetching the documents from storage, which is much faster.
Also, the same documents are often requested repeatedly, and MongoDB keeps recently used documents cached in memory so they can be accessed faster.
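
The covered-query condition can be sketched as a small check in Python. This is a simplified illustration, not MongoDB's planner: the helper `is_covered` and the field names are hypothetical, and real MongoDB has further restrictions (for example around array fields) that are ignored here. The rule it models is that every field in the filter and every field returned must be in the same index, and that `_id` is returned by default unless the projection excludes it.

```python
def is_covered(index_fields, filter_fields, projection):
    """Rough check whether a query could be answered from one index alone.

    index_fields:  field names in the (compound) index
    filter_fields: field names used in the query filter
    projection:    dict like {"a": 1, "b": 1, "_id": 0}
    """
    indexed = set(index_fields)

    # Every field we search by must be in the index.
    if not set(filter_fields) <= indexed:
        return False

    # An empty or exclusion-only projection returns arbitrary other fields,
    # so the documents themselves would have to be read.
    if not any(projection.values()):
        return False

    returned = {field for field, flag in projection.items() if flag}
    # _id comes back by default unless it is explicitly set to 0.
    if projection.get("_id", 1):
        returned.add("_id")

    # Every returned field must be in the index as well.
    return returned <= indexed


index = ["status", "price"]   # hypothetical compound index
covered = is_covered(index, ["status"], {"status": 1, "price": 1, "_id": 0})
not_covered = is_covered(index, ["status"], {"status": 1, "price": 1})
```

The second call is not covered only because `_id` is returned by default and is not part of the assumed index.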

What does nscannedObjects = 0 actually mean?

As far as I understand, the nscannedObjects entry in the explain() output is the number of documents that MongoDB had to fetch from disk.
My question is: when this value is 0, what does it actually mean beyond the explanation above? Does MongoDB keep a cache with some documents stored there?
nscannedObjects=0 means that there was no fetching or filtering to satisfy your query, the query was resolved solely based on indexes. So for example if you were to query for {_id:10} and there were no matching documents you would get nscannedObjects=0.
It has nothing to do with the data being in memory, there is no such distinction with the query plan.
Note that in MongoDB 3.0 and later nscanned and nscannedObjects are now called totalKeysExamined and totalDocsExamined, which is a little more self-explanatory.
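
The {_id: 10} case can be modelled with a toy lookup in Python. This is only a sketch of the idea: `find_by_id` and its counters are hypothetical, loosely named after the explain() fields, and they count only the matching index entry and the fetched document. Real numbers come from explain() itself.

```python
import bisect


def find_by_id(id_index, storage, wanted_id):
    """Look up one document through a sorted list of _id values.

    id_index: sorted list of _id values (toy index)
    storage:  dict mapping _id -> document (toy document store)
    """
    stats = {"keysExamined": 0, "docsExamined": 0}

    # The index alone tells us whether a matching key exists.
    pos = bisect.bisect_left(id_index, wanted_id)
    if pos == len(id_index) or id_index[pos] != wanted_id:
        return None, stats          # no match: no document was ever touched

    stats["keysExamined"] = 1
    doc = storage[wanted_id]        # only now is the document itself fetched
    stats["docsExamined"] = 1
    return doc, stats


storage = {1: {"_id": 1, "name": "a"}, 7: {"_id": 7, "name": "b"}}
id_index = sorted(storage)

missing_doc, missing_stats = find_by_id(id_index, storage, 10)
found_doc, found_stats = find_by_id(id_index, storage, 7)
```

In the miss case the document counter stays at 0 regardless of whether any data is cached, which matches the point that the value says nothing about memory versus disk.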
Mongo is a document database, which means that it can interpret the structure of the stored documents (unlike for example key-value stores).
One particular advantage of that approach is that you can build indices on the documents in the database.
An index is a data structure (usually a variant of a B-tree) that allows fast searching of documents based on some of their attributes (for example id (!= _id) or some other distinctive feature). Indexes are usually held in memory, which allows very fast access to them.
When you search for documents based on indexed attributes (let's say id > 50), Mongo doesn't need to fetch the documents from memory/disk/whatever; it can see which documents match the criteria based solely on the index (note that fetching something from disk is several orders of magnitude slower than a memory lookup). The only time it actually goes to the disk is when it needs to fetch the document for further processing (which is not covered by the statistic you cited).
Indices are crucial to achieving high performance, but they also have drawbacks (for example, a rarely used index can slow down inserts and not be worth it, since the index has to be updated after each insertion).

Can we save new records in descending order in MongoDB

Can we save new records in descending order in MongoDB, so that the first saved document is returned last by a find query? I do not want to use $sort, so the data should be stored in descending order to begin with.
Is it possible?
Based on the description above, if you do not want to use $sort, an alternative is to create a capped collection, which preserves the insertion order of documents in the collection.
For a more detailed description of capped collections in MongoDB, please refer to the documentation at the following URL:
https://docs.mongodb.org/manual/core/capped-collections/
Please note, however, that capped collections are fixed-size collections, so the oldest documents are automatically removed once the collection exceeds its configured size.
The order of the records is not guaranteed by MongoDB unless you add a $sort operator. Even if the records happen to be ordered on disk, there is no guarantee that MongoDB will always return the records in the same order. MongoDB does quite a bit of work under the hood and as your data grows in size, the query optimiser may pick a different execution plan and return the data in a different order.