How many BSON ObjectIds can a MongoDB array field hold?

I am planning to save a huge number of foreign-key ids in this array, so I am checking what the maximum number of BSON::ObjectIds I can save in the array field would be. Let's say, for example:
department_ids: [BSON::ObjectId('57cf6d6e8315292136000001'), BSON::ObjectId('57cf6d6e8315292136000002'), ...]

16 MB is big enough to hold a really large number of ObjectIds. ObjectIds aren't that heavy: each one is 12 bytes, and when you divide 16 MB by 12 bytes you get well beyond 1 million.
But in case you still aren't reassured, you can benefit from Mongo's flexible schema design and create a follow-up document to hold further arrays, storing the _id of that document in the main document under a field named "followedBy" or something similar.
The downside is that you will have to execute a follow-up query (or maybe not).
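As a rough sketch of that idea in PyMongo (the collection name, field names, and the 500,000-entry cap are just assumptions for illustration, not anything prescribed by MongoDB):

from pymongo import MongoClient

client = MongoClient()
db = client["example_db"]

def add_department_id(doc_id, dept_id, max_per_doc=500_000):
    # Try the main document first; assumes it was created with dept_count: 0.
    result = db.employees.update_one(
        {"_id": doc_id, "dept_count": {"$lt": max_per_doc}},
        {"$push": {"department_ids": dept_id}, "$inc": {"dept_count": 1}},
    )
    if result.modified_count:
        return
    # Main array is full: push into (or create) an overflow document
    # that the main document points to via "followedBy".
    main = db.employees.find_one({"_id": doc_id}, {"followedBy": 1})
    overflow_id = main.get("followedBy")
    if overflow_id is None:
        overflow_id = db.employees.insert_one(
            {"department_ids": [], "dept_count": 0}
        ).inserted_id
        db.employees.update_one({"_id": doc_id}, {"$set": {"followedBy": overflow_id}})
    db.employees.update_one(
        {"_id": overflow_id},
        {"$push": {"department_ids": dept_id}, "$inc": {"dept_count": 1}},
    )

Reading all ids then takes one extra find_one on the "followedBy" document, which is the follow-up query mentioned above.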
Hope that helps.

No such limit is mentioned in the MongoDB documentation, but a JavaScript array can have up to 2^32 - 1 = 4,294,967,295 ≈ 4.29 billion elements.
And a MongoDB document can be at most 16 MB.

Every ObjectId uses 12 bytes, so if the limit is 16 MB you could have approximately 1,398,101 ObjectIds in an array per document (a quick check of that number is sketched below).
Maybe DBRefs could help you, or you could use a GridFS collection.
If you can avoid the joins, that would be the best solution in Mongo.
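A quick check of that figure, plus a caveat (pure Python, nothing MongoDB-specific assumed):

MAX_DOC_BYTES = 16 * 1024 * 1024   # 16 MB BSON document limit
OBJECT_ID_BYTES = 12               # every ObjectId value is 12 bytes

print(MAX_DOC_BYTES // OBJECT_ID_BYTES)   # 1398101

In practice each array element also carries a little BSON overhead (a type byte plus the element's index stored as a string key), so the real ceiling is lower, roughly 800,000-900,000 ObjectIds rather than 1.4 million.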

Edited 1st July 2021
Since the document that contains the array cannot exceed 16 megabytes, the number of objects in an array is limited.
The other answers explain how to calculate this.
MongoDB Limits and Thresholds

Related

Projection in MongoDB

I am learning MongoDB and a question came to my mind regarding projection.
When we do a projection for some fields, what does MongoDB do?
Would it read the whole document, drop some fields, and return the result, or would it skip the excluded fields and read only the fields mentioned in the query?
For example, if I have a document with 4 fields and 3 arrays (each of size ~10) and I just want the 4 fields and not the arrays:
Would MongoDB read the whole document and drop the arrays, or would it just read the 4 fields?
If it's the first case, how would the execution time or latency differ as the arrays in the document become big?
The document is compressed on storage, so MongoDB needs to read the whole document first, uncompress it, and only then keep the fields specified in the projection.
The trick here is that when you filter by some fields, you should index them so the search happens in memory and MongoDB avoids reading all documents one by one to check the searched field.
And if you need faster access to just those fields, it is best to put all of them in a compound index and query them via a so-called "covered query"; then both the lookup and the fetch are served entirely from the index in memory, without accessing storage, which is much faster.
Also, in many cases the same documents are searched multiple times, so MongoDB caches those documents in memory to serve them faster.
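For example, with PyMongo (collection and field names are invented for illustration), a compound index plus a projection restricted to indexed fields produces a covered query:

from pymongo import MongoClient, ASCENDING

people = MongoClient()["example_db"]["people"]

# Compound index on the fields we filter by and return.
people.create_index([("last_name", ASCENDING), ("city", ASCENDING)])

# Covered query: the filter and the projection use only indexed fields,
# and _id is excluded, so MongoDB can answer from the index alone.
cursor = people.find(
    {"last_name": "Smith"},
    {"_id": 0, "last_name": 1, "city": 1},
)
for doc in cursor:
    print(doc)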

How to handle MongoDB documents with an array larger than 16 MB

There is a document with an array whose size is more than 16 MB. How can this document be stored so that some data from the array can still be queried?
When you have documents which exceed the 16 MB limit, you are very likely taking MongoDB's denormalization approach too far and should consider creating another collection with one document for each array entry (or one document for each sensible grouping of array entries); a minimal sketch follows this answer.
Another option is to treat the content as binary data and store it as a file in GridFS, but then you won't be able to do any meaningful queries on its content (only on the metadata you write for it separately).
The 16 MB limit is hardcoded; you cannot change it through configuration. There was a bug-tracker ticket for that, and it was closed as "Won't fix". But considering that MongoDB is open source, you could always change it in the source code. Just keep the license conditions in mind when you do that.
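A minimal sketch of the first suggestion (one document per array entry) with PyMongo; the collection, field names, and sample values are made up:

from pymongo import MongoClient

db = MongoClient()["example_db"]

# A parent document whose embedded array has grown too large.
big_doc = {"_id": "sensor-42", "readings": [3, 17, 105, 256]}

# One small document per array entry, keyed back to the parent.
db.readings.insert_many(
    [{"parent_id": big_doc["_id"], "value": v} for v in big_doc["readings"]]
)
db.readings.create_index([("parent_id", 1)])

# Querying "some data from the array" becomes an ordinary indexed query.
for doc in db.readings.find({"parent_id": "sensor-42", "value": {"$gt": 100}}):
    print(doc)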

16 MB size of aggregation pipeline in MongoDB

I am looking for a recommendation on MongoDB. I have a collection that keeps growing row by row (I mean the count of documents); it is about 5 billion now. When I run a request on this collection I sometimes get an error about the 16 MB size.
The first thing I want to ask is which structure is the best way of creating collections whose document count grows hugely. What is the best approach? What should I do for this kind of structure and its performance?
Just to clarify, the 16 MB limitation is on documents, not collections. Specifically, it's the maximum BSON document size, as specified on this page in the documentation.
The maximum BSON document size is 16 megabytes.
If you're running into the 16 MB limit in aggregation, it is because you are using MongoDB version 2.4 or older. In those versions, the aggregate() method returned a single document, which is subject to the same limitation as all other documents. Starting in 2.6, the aggregate() method returns a cursor, which is not subject to the 16 MB limit. For more information, you should consult this page in the documentation. Note that each stage in the aggregation pipeline is still limited to 100 MB of RAM.
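A minimal PyMongo illustration of the cursor behaviour (collection and pipeline are placeholders); allowDiskUse additionally lets a single stage spill to disk when it would exceed the 100 MB RAM limit:

from pymongo import MongoClient

events = MongoClient()["example_db"]["events"]

pipeline = [
    {"$match": {"status": "active"}},
    {"$group": {"_id": "$user_id", "count": {"$sum": 1}}},
]

# On MongoDB 2.6+ aggregate() returns a cursor, so the result set is
# never materialized as a single document bounded by the 16 MB limit.
cursor = events.aggregate(pipeline, allowDiskUse=True)
for doc in cursor:
    print(doc)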

How does MongoDB calculate the size of a document?

In MongoDB there is a maximum size of 16 MByte per document. Does this size limit include sub-documents?
In other words: Are the 16 MByte per document including its sub-documents, or is it 16 MByte per document and each sub-document counts as an own document?
Yes, the 16 MB limit applies to the whole structure, including sub-documents.
Keep in mind that what you call sub-documents, MongoDB sees as regular values. From its perspective, they are no different than, say, strings. Just values.
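One way to see this (assuming the bson package that ships with PyMongo 3.9+) is to encode documents and measure them; the sub-document's bytes simply count toward the parent's total:

import bson

address = {"street": "Main St 1", "zip": "12345"}
person = {"name": "Alice", "address": address}

print(len(bson.encode(address)))  # size of the sub-document on its own
print(len(bson.encode(person)))   # parent size already includes the sub-document

It is the second number that has to stay under 16 MB.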

Mapping datasets to NoSQL (MongoDB) collections

What I have:
I have data for 'n' departments.
Each department has more than 1,000 datasets.
Each dataset has more than 10,000 CSV files (each larger than 10 MB), each with a different schema.
This data will grow even more in the future.
What I want to do:
I want to map this data into MongoDB.
What approaches have I tried?
I can't map each dataset to a single document in Mongo, since a document is limited to 4-16 MB (depending on version).
I cannot create a collection for each dataset, as the maximum number of collections is also limited (< 24,000).
So finally I thought of creating a collection for each department and, in that collection, one document for each record in the CSV files belonging to that department.
I want to know from you:
will there be a performance issue if we map each record to a document?
is there any max limit on the number of documents?
is there any other design I can do?
will there be a performance issue if we map each record to a document?
Mapping each record to a document in MongoDB is not a bad design. You can have a look at the FAQ on the MongoDB site:
http://docs.mongodb.org/manual/faq/fundamentals/#do-mongodb-databases-have-tables
It says,
...Instead of tables, a MongoDB database stores its data in collections, which are the rough equivalent of RDBMS tables. A collection holds one or more documents, which corresponds to a record or a row in a relational database table....
Along with the limitation on BSON document size (16 MB), there is also a maximum of 100 levels of document nesting:
http://docs.mongodb.org/manual/reference/limits/#BSON Document Size
...Nested Depth for BSON Documents. Changed in version 2.2. MongoDB supports no more than 100 levels of nesting for BSON documents...
So it's better to go with one document for each record.
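A minimal sketch of that mapping with PyMongo (the file path, collection name, and the extra "dataset" field are invented for illustration):

import csv
from pymongo import MongoClient

dept = MongoClient()["example_db"]["department_a"]

with open("dataset_001.csv", newline="") as f:
    reader = csv.DictReader(f)            # one dict per CSV row
    batch = []
    for row in reader:
        row["dataset"] = "dataset_001"    # remember which dataset the row came from
        batch.append(row)
        if len(batch) == 1000:            # insert in batches rather than row by row
            dept.insert_many(batch)
            batch = []
    if batch:
        dept.insert_many(batch)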
is there any max limit on the number of documents?
No, it's mentioned in the MongoDB reference manual:
...Maximum Number of Documents in a Capped Collection. Changed in version 2.4. If you specify a maximum number of documents for a capped collection using the max parameter to create, the limit must be less than 2^32 documents. If you do not specify a maximum number of documents when creating a capped collection, there is no limit on the number of documents...
is there any other design I can do?
If your document is too large, you can think about partitioning the document at the application level, but that will add a high computation requirement in the application layer.
will there be a performance issue if we map each record to a document?
That depends entirely on how you search them. When you use a lot of queries which affect only one document, it is likely even faster that way. When a finer document granularity results in a lot of document-spanning queries, it will get slower because MongoDB can't join documents for you by itself.
is there any max limit on the number of documents?
No.
is there any other design I can do?
Maybe, but that depends on how you want to query your data. When you are content with treating files as a BLOB which is retrieved as a whole but not searched or analyzed at the database level, you could consider storing them in GridFS. It's a way to store files larger than 16 MB in MongoDB.
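A minimal GridFS sketch with PyMongo (file name and metadata are placeholders):

import gridfs
from pymongo import MongoClient

db = MongoClient()["example_db"]
fs = gridfs.GridFS(db)

# Store a file larger than 16 MB; GridFS splits it into 255 kB chunks.
with open("huge_dataset.csv", "rb") as f:
    file_id = fs.put(f, filename="huge_dataset.csv", department="dept_a")

# Retrieve it as a whole; queries can target the metadata, not the content.
data = fs.get(file_id).read()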
In general, MongoDB database design doesn't depend so much on what and how much data you have, but rather on how you want to work with it.