My Mongo query is too large and I'm reaching a memory issue

I'm hitting some sort of RAM limit when running this query; here's the error:
The operation: #<Moped::Protocol::Query
#length=100
#request_id=962
#response_to=0
#op_code=2004
#flags=[]
#full_collection_name="test_db.cases"
#skip=1650
#limit=150
#selector={"$query"=>{}, "$orderby"=>{"created_at"=>1}}
#fields=nil>
failed with error 17144: "Runner error: Overflow sort stage buffered data usage of 33555783 bytes exceeds internal limit of 33554432 bytes"
See https://github.com/mongodb/mongo/blob/master/docs/errors.md
for details about this error.
There are two solutions I can think of:
1) Raise the buffer limit. This requires MongoDB 2.8, which is an unstable release I'd have to install manually.
2) Break the query apart and chunk it. This is what the query looks like:
upload_set = Case.all.order_by(:created_at.asc).skip(#set_skipper).limit(150).each_slice(5).to_a
#set_skipper grows by 150 every time the method is called.
Any help?

From http://docs.mongodb.org/manual/reference/limits/
Sorted Documents
MongoDB will only return sorted results on fields without an index if
the combined size of all documents in the sort operation, plus a small
overhead, is less than 32 megabytes.
Did you try using an index on created_at? That should remove that limitation.
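If upgrading MongoDB or adding the index is not an option, the chunking in option 2 of the question can be sketched in plain Python. These are hypothetical helpers, not part of any driver: the 150-document pages and the slices of 5 mirror the skip/limit/each_slice chain in the Ruby query above.

```python
def pages(total, page_size=150):
    """Yield (skip, limit) pairs that cover `total` documents in pages of `page_size`."""
    skip = 0
    while skip < total:
        yield skip, min(page_size, total - skip)
        skip += page_size

def each_slice(items, n=5):
    """Equivalent of Ruby's each_slice: split a list into chunks of at most n items."""
    return [items[i:i + n] for i in range(0, len(items), n)]
```

Each (skip, limit) pair would back one small query, so no single sort has to buffer the whole collection; note that skip-based paging still makes the server walk past the skipped documents, so a range query on an indexed created_at is usually cheaper for deep pages.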

Related

Is there a max limit to the $nin operator in MongoDB?

My sample use case is to query data about people who have not blocked the user, and there is no limit to the number of people that can block the user. So my query looks something like:
db.collection().find( { followedPersonId: { $nin: [ blockerId1, blockerId2, blockerId3, ... ] } } )
The number of array items in the $nin operator can grow to a potentially large number, so is there a limit to the size of this array in MongoDB?
The command size limit is currently set at 48 MB. If your query is bigger than that the driver should fail when trying to serialize it, and the server should fail if it was asked to parse it.
Since your query is technically a query document, I imagine the lower 16 MB BSON document size limit would also apply to it. The 48 MB limit applies to find-and-modify queries that specify a query document and an update document.
Short answer: NO!
When you try to do this the query becomes huge, and performance suffers.
Will the ids run into the crores (tens of millions)? If so, I suggest not going with this method.
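If the blocker list must be split anyway, note that chunked $nin queries have to be intersected, not concatenated: a document matches the overall filter only if it matches every chunk's $nin. A minimal pure-Python sketch, where run_query is a hypothetical stand-in for the actual find call against one chunk:

```python
def chunks(ids, size):
    """Split a large id list into $nin-sized chunks."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]

def nin_query_ids(all_ids, blocked, run_query, chunk_size=1000):
    """Emulate {field: {'$nin': blocked}} by intersecting per-chunk results.

    run_query(chunk) stands in for something like
    collection.find({'followedPersonId': {'$nin': chunk}}) and is assumed
    to return the set of matching ids for that chunk.
    """
    result = None
    for chunk in chunks(blocked, chunk_size):
        matched = run_query(chunk)
        # A doc survives only if it matched *every* chunk's $nin.
        result = matched if result is None else result & matched
    # No blockers at all: everything matches.
    return result if result is not None else set(all_ids)
```

The intersection step is why chunking $nin is much more awkward than chunking $in, and why restructuring the data (e.g. storing the block relation per follower) is often the better fix.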

DocumentTooLarge during query

I want to get a large number (1 million) of documents by their object id, stored in a list obj_ids. So I use
docs = collection.find({'_id' : {'$in' : obj_ids}})
However, when trying to access the documents (e.g. list(docs)) I get
pymongo.errors.DocumentTooLarge: BSON document too large (19889042 bytes) - the connected server supports BSON document sizes up to 16777216 bytes.
which confuses me. As I understand it, the 16 MB limit applies to a single document. But I don't think any of my documents exceeds that limit:
I did not get this error message when inserting any of the documents in the first place.
This error does not show up if I chunk the ObjectIds into 2 subsets, and recombine the results later.
So if there is not some too big document in my collection, what is the error message about?
Your query {'_id' : {'$in' : obj_ids}} is the issue: the query document itself is too large, not the stored documents.
You'll need to refactor your approach; maybe do it in batches and join the results.
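The batching that answer suggests can be sketched as follows. fetch_batch is a hypothetical stand-in for something like list(collection.find({'_id': {'$in': batch}})); unlike $nin, $in batches are simply unioned, so recombining is trivial.

```python
def find_in_batches(obj_ids, fetch_batch, batch_size=100_000):
    """Run a large $in query in batches and recombine the results.

    fetch_batch(batch) stands in for
    list(collection.find({'_id': {'$in': batch}})).
    """
    docs = []
    for i in range(0, len(obj_ids), batch_size):
        docs.extend(fetch_batch(obj_ids[i:i + batch_size]))
    return docs
```

Pick batch_size so that each query document stays comfortably under the 16 MB BSON limit; with 12-byte ObjectIds, 100,000 ids per batch leaves a wide margin.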

Change MongoDB Document Max Size

I am using MongoDB for one of my applications.
We fetch a large number of records from the database, and we hit the following error when we do:
aggregation result exceeds maximum document size
Is there any option to raise this max limit?
The documentation states that:
If you do not specify the cursor option or store the results in a
collection, the aggregate command returns a single BSON document that
contains a field with the result set. As such, the command will
produce an error if the total size of the result set exceeds the BSON
Document Size limit.
Earlier versions of the aggregate command can only return a single
BSON document that contains the result set and will produce an error
if the total size of the result set exceeds the BSON Document
Size limit.
The maximum BSON document size is 16 megabytes.
That is exactly the problem you are hitting now.
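The 16 MB cap itself is not configurable, but, as the quoted documentation says, specifying the cursor option makes the aggregate command stream its results in batches instead of returning one giant document. A sketch of the command shape only; the db.command-style invocation is illustrative, and with a recent PyMongo, collection.aggregate already returns a cursor for you:

```python
def aggregate_command(collection_name, pipeline, batch_size=100):
    """Build an aggregate command document that requests a cursor,
    so the reply is streamed in batches rather than returned as a
    single BSON document subject to the 16 MB limit."""
    return {
        "aggregate": collection_name,
        "pipeline": pipeline,
        # Asking for a cursor is what avoids the single-document reply.
        "cursor": {"batchSize": batch_size},
        # Optional: let large sort/group stages spill to disk.
        "allowDiskUse": True,
    }
```

The other escape hatch the documentation mentions is storing the results in a collection (a $out stage at the end of the pipeline), which also sidesteps the single-document reply.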

maximum limit for MongoDB 2.6 sort query

Is it possible to sort 2 million records using MongoDB's sort?
From the MongoDB documentation, it is clearly stated that "When the sort operation consumes more than 32 megabytes, MongoDB returns an error."
But I have a requirement to sort a huge number of records. How can I do it?
It's possible. The documentation states that the 32 MB limit applies only when MongoDB sorts data in memory, i.e. without using an index.
When the sort operation consumes more than 32 megabytes, MongoDB
returns an error. To avoid this error, either create an index to
support the sort operation or use sort() in conjunction with limit().
The specified limit must result in a number of documents that fall
within the 32 megabyte limit.
I suggest that you add an index on the field you want to sort by, with the ensureIndex command (createIndex in MongoDB 3.0+):
db.coll.ensureIndex({ sortFieldName : 1 });
If you're sorting on multiple fields, you will need to add a compound index on the fields you're sorting on (the order of the fields in the index matters):
db.coll.ensureIndex({ sortFieldName1 : 1, sortFieldName2 : 1 });

In Mongodb, how do I get the count of the total results returned, without the limit?

Let's say I put a limit and skip on the MongoDB query... I want to know the total number of results as if there were no limit on there.
Of course, I could do this the shitty way, which is to query twice.
In MongoDB the default behavior of count() is to ignore skip and limit and count the number of results in the entire original query. So running count will give you exactly what you want.
Passing a Boolean true to count or calling size instead would give you a count WITH skip or limit.
There is no way to get the count without executing the query twice.