MongoDB insertMany returned _id order - mongodb

After using insertMany I need to get the returned _ids from the new documents in the same order as they were inserted.
As stated by the documentation if the ordered parameter is equal to true, mongoDB will perform an ordered insert, but that does not imply that the returned _ids are also ordered in the same way as the new inserted documents.
I did some quick tests to check this and it seems that the order is preserved, but I need to be 100% sure that this is always the behavior.

Related

Is resulting `insertedIds` unordered when insertMany() with ordered=False?

Official documentation of insertMany() https://docs.mongodb.com/manual/reference/method/db.collection.insertMany/ as per Jan 2022 states about using ordered=false
If ordered is set to false, documents are inserted in an unordered format and may be
reordered by mongod to increase performance. Applications should not depend on ordering of inserts if using an unordered insertMany().
All text in that page is explicit about the documents will be inserted in an unordered format which can have forseable effects on the generated ids distribution, but it is not explicit about the insertedIds field in the return value. So, assuming a successful operation are they unordered too? or can I reliably assign the resulting Id to each inserted document?

Fundamental misunderstanding of MongoDB indices

So, I read the following definition of indexes from [MongoDB Docs][1].
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
Indexes are special data structures that store a small portion of the
collection’s data set in an easy to traverse form. The index stores
the value of a specific field or set of fields, ordered by the value
of the field. The ordering of the index entries supports efficient
equality matches and range-based query operations. In addition,
MongoDB can return sorted results by using the ordering in the index.
I have a sample database with a collection called pets. Pets have the following structure.
{
"_id": ObjectId(123abc123abc)
"name": "My pet's name"
}
I created an index on the name field using the following code.
db.pets.createIndex({"name":1})
What I expect is that the documents in the collection, pets, will be indexed in ascending order based on the name field during queries. The result of this index can potentially reduce the overall query time, especially if a query is strategically structured with available indices in mind. Under that assumption, the following query should return all pets sorted by name in ascending order, but it doesn't.
db.pets.find({},{"_id":0})
Instead, it returns the pets in the order that they were inserted. My conclusion is that I lack a fundamental understanding of how indices work. Can someone please help me to understand?
Yes, it is misunderstanding about how indexes work.
Indexes don't change the output of a query but the way query is processed by the database engine. So db.pets.find({},{"_id":0}) will always return the documents in natural order irrespective of whether there is an index or not.
Indexes will be used only when you make use of them in your query. Thus,
db.pets.find({name : "My pet's name"},{"_id":0}) and db.pets.find({}, {_id : 0}).sort({name : 1}) will use the {name : 1} index.
You should run explain on your queries to check if indexes are being used or not.
You may want to refer the documentation on how indexes work.
https://docs.mongodb.com/manual/indexes/
https://docs.mongodb.com/manual/tutorial/sort-results-with-indexes/

Remove obsolete collection in mongodb

I want to delete all the collections from my db which are not used for long time. Is there any why i can check when the particular collection was last used?
It depends what you mean by 'last used'. If you mean the last time a document was inserted into the collection then you could do this by converting the ObjectId of the last inserted document into a date. The following query should return the date the last document was inserted:
db.<collection_name>.findOne({},{_id:1})._id.getTimestamp()
the findOne query will return documents in natural order, therefore if you input no query criteria ('{}') then it will return the most recently inserted document. You can then get the _id field and call the getTimestamp() function
I'm not sure if there is any way to reliably tell when a collection was last queried. If you're running your database with profiling enabled then there might be entries in the db.system.profile collection, or in the oplog.

MongoDB: findOne with $or condition

Such request as db.collection.findOne({$or: [{email: 'email#example.com'},{'linkedIn.id': 'profile.id'}]}); may return an array with two records.
Is it possible to specify to return only the first occurrence so that I always have a model as a response, not an array?
E.g. if there is a record with a specified email, return it and do not return another record, if any, matching profile.id?
Another question is if the order of the params 'email' and 'linkedIn.id' matters.
All this hazel is about LinkedIn strategy, which never returns an email (at least for me) but I have to cater for case when it may return an email. So I construct my query depending on email presence and if it is present, the query is with $or operator. But I would like to avoid checking for whether the response is an object or an array and then perform additional operation on array values to figure out which of the values to use.
According to documentation of mongo DB
findOne()
always returns a single document irrespective of matches it found.
And regarding order of retrieval it will always return the first match except capped collection which maintains order of insertion of documents into collection.
For more detailed description about findOne please refer the documentation as mentioned in following URL
https://docs.mongodb.org/manual/reference/method/db.collection.findOne/
According to the MongoDB docs for db.collection.findOne():
Returns one document that satisfies the specified query criteria. If multiple documents satisfy the query, this method returns the first document according to the natural order which reflects the order of documents on the disk. In capped collections, natural order is the same as insertion order. If no document satisfies the query, the method returns null.
You can't recieve multiple records from db.collection.findOne(). Are you sure you're not using db.collection.find()?

How does MongoDB order their docs in one collection? [duplicate]

This question already has answers here:
How does MongoDB sort records when no sort order is specified?
(2 answers)
Closed 7 years ago.
In my User collection, MongoDB usually orders each new doc in the same order I create them: the last one created is the last one in the collection. But I have detected another collection where the last one I created has the 6 position between 27 docs.
Why is that?
Which order follows each doc in MongoDB collection?
It's called natural order:
natural order
The order in which the database refers to documents on disk. This is the default sort order. See $natural and Return in Natural Order.
This confirms that in general you get them in the same order you inserted, but that's not guaranteed–as you noticed.
Return in Natural Order
The $natural parameter returns items according to their natural order within the database. This ordering is an internal implementation feature, and you should not rely on any particular structure within it.
Index Use
Queries that include a sort by $natural order do not use indexes to fulfill the query predicate with the following exception: If the query predicate is an equality condition on the _id field { _id: <value> }, then the query with the sort by $natural order can use the _id index.
MMAPv1
Typically, the natural order reflects insertion order with the following exception for the MMAPv1 storage engine. For the MMAPv1 storage engine, the natural order does not reflect insertion order if the documents relocate because of document growth or remove operations free up space which are then taken up by newly inserted documents.
Obviously, like the docs mentioned, you should not rely on this default order (This ordering is an internal implementation feature, and you should not rely on any particular structure within it.).
If you need to sort the things, use the sort solutions.
Basically, the following two calls should return documents in the same order (since the default order is $natural):
db.mycollection.find().sort({ "$natural": 1 })
db.mycollection.find()
If you want to sort by another field (e.g. name) you can do that:
db.mycollection.find().sort({ "name": 1 })
For performance reasons, MongoDB never splits a document on the hard drive.
When you start with an empty collection and start inserting document after document into it, mongoDB will place them consecutively on the disk.
But what happens when you update a document and it now takes more space and doesn't fit into its old position anymore without overlapping the next? In that case MongoDB will delete it and re-append it as a new one at the end of the collection file.
Your collection file now has a hole of unused space. This is quite a waste, isn't it? That's why the next document which is inserted and small enough to fit into that hole will be inserted in that hole. That's likely what happened in the case of your second collection.
Bottom line: Never rely on documents being returned in insertion order. When you care about the order, always sort your results.
MongoDB does not "order" the documents at all, unless you ask it to.
The basic insertion will create an ObjectId in the _id primary key value unless you tell it to do otherwise. This ObjectId value is a special value with "monotonic" or "ever increasing" properties, which means each value created is guaranteed to be larger than the last.
If you want "sorted" then do an explicit "sort":
db.collection.find().sort({ "_id": 1 })
Or a "natural" sort means in the order stored on disk:
db.collection.find().sort({ "$natural": 1 })
Which is pretty much the standard unless stated otherwise or an "index" is selected by the query criteria that will determine the sort order. But you can use that to "force" that order if query criteria selected an index that sorted otherwise.
MongoDB documents "move" when grown, and therefore the _id order is not always explicitly the same order as documents are retrieved.
I could find out more about it thanks to the link Return in Natural Order provided by Ionică Bizău.
"The $natural parameter returns items according to their natural order within the database.This ordering is an internal implementation feature, and you should not rely on any particular structure within it.
Typically, the natural order reflects insertion order with the following exception for the MMAPv1 storage engine. For the MMAPv1 storage engine, the natural order does not reflect insertion order if the documents relocate because of document growth or remove operations free up space which are then taken up by newly inserted documents."