What's the difference between Drop and Remove in MongoDB? [duplicate]

This question already has answers here:
MongoDB: Trade-offs of dropping a collection vs. removing all of its documents
I see there are two ways to remove collections in MongoDB. I essentially just want to delete the collection.
Should I use:
db.collection.drop()
Or
db.collection.remove()

Once we have documents stored in a collection, we can remove all of them in two ways. Choosing one over the other depends entirely on your requirements.
1. Using drop():
Invoking drop() on a collection removes all of its documents, deletes all of its indexes, and finally deletes the collection itself.
2. Using remove():
remove() has two overloaded forms: one takes criteria and removes all documents matching it; the other is the default form, where we either pass no criteria (prior to 2.6) or pass an empty document (version 2.6 and later), and it removes every document in the collection. Here we are more interested in the second form, since our intention is to clear all documents from the collection.
Remark: To remove all documents from a collection, it may be more efficient to use the drop() method to drop the entire collection, including the indexes, and then recreate the collection and rebuild the indexes.
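For what it's worth, here is a minimal pymongo sketch of the two options, assuming a local MongoDB instance; the database, collection and index names are just placeholders (in pymongo, remove({}) corresponds to delete_many({})):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
coll = client["test"]["people"]        # hypothetical database/collection
coll.create_index("name")              # an index to compare against

# Option 1: remove every document but keep the collection and its indexes
coll.delete_many({})
print(coll.index_information())        # the "name" index is still there

# Option 2: drop the collection, its documents and its indexes in one step
coll.drop()                            # indexes must be recreated afterwards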

Based on the documentation, remove()
Removes documents from a collection.
but doesn't get rid of the collection (or associated indexes). remove() can take parameters to specify deletion criteria as well.
drop() gets rid of the collection and associated indexes:
Removes a collection or view from the database. The method also removes any indexes associated with the dropped collection. The method provides a wrapper around the drop command.
https://docs.mongodb.com/manual/reference/method/db.collection.remove/
https://docs.mongodb.com/manual/reference/method/db.collection.drop/
It's a little bit like the difference between DELETE (or maybe TRUNCATE) and DROP TABLE/VIEW in relational SQL.
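To make the SQL analogy concrete, here is a rough pymongo sketch; the collection and field names are invented for illustration only:

from pymongo import MongoClient

coll = MongoClient()["test"]["orders"]       # hypothetical collection
coll.delete_many({"status": "archived"})     # like DELETE ... WHERE status = 'archived'
coll.delete_many({})                         # like DELETE with no WHERE clause (indexes survive)
coll.drop()                                  # like DROP TABLE: collection and indexes are gone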

Related

How can I bulk update in mongodb?

I am trying to update many documents in a single query. How can I do that so I don't have to loop over a list and update each one individually?
You can create an array of the operations that you want and pass it to bulkWrite (see the bulkWrite documentation).
That way you don't need to make a lot of requests, and all the updates get done in one call. You can choose whether you want the operations to be ordered or unordered, and each type of operation has its own behavior. You can also choose which write concern level you want.
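A minimal pymongo sketch of the same idea; the collection name, field name, and list of ids are stand-ins for your own data:

from pymongo import MongoClient, UpdateOne

coll = MongoClient()["test"]["users"]        # hypothetical collection

# Build one UpdateOne operation per change instead of issuing separate requests
ops = [
    UpdateOne({"_id": doc_id}, {"$set": {"processed": True}})
    for doc_id in [1, 2, 3]                  # stand-in for your list of ids
]

# ordered=False lets the server keep going if one operation fails
result = coll.bulk_write(ops, ordered=False)
print(result.modified_count)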

Can an ordered array inside a document with MongoDB be guaranteed safe to maintain order in production [duplicate]

Simple question, do arrays keep their order when stored in MongoDB?
Yep, MongoDB keeps the order of the array, just like JavaScript engines do.
Yes, in fact from a quick google search on the subject, it seems that it's rather difficult to re-order them: http://groups.google.com/group/mongodb-user/browse_thread/thread/1df1654889e664c1
I realise this is an old question, but the Mongo docs do now specify that all document properties retain their order as they are inserted. This naturally extends to arrays, too.
Document Field Order
MongoDB preserves the order of the document fields following write operations except for the following cases:
The _id field is always the first field in the document.
Updates that include renaming of field names may result in the reordering of fields in the document.
Changed in version 2.6: Starting in version 2.6, MongoDB actively attempts to preserve the field order in a document. Before version 2.6, MongoDB did not actively preserve the order of the fields in a document.
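A quick pymongo sketch of this behavior, with a made-up document just for illustration:

from pymongo import MongoClient

coll = MongoClient()["test"]["demo"]                      # hypothetical collection
coll.insert_one({"_id": 1, "tags": ["b", "a", "c"]})      # array stored in this exact order
print(coll.find_one({"_id": 1})["tags"])                  # -> ['b', 'a', 'c'], order preserved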

MONGODB - Add duplicate field with different value

Is there a way to write a script that updates a document by adding a duplicate field with a different value? I cannot use $set as that replaces the existing value. I cannot use $push as the field is in an object, not an array. I even tried creating the new field with a different name and renaming it, which also replaces the existing field.
You cannot have duplicate fields in a Mongo record. A Mongo collection is a collection of documents, otherwise known as objects. You cannot have a duplicate field in an object and Mongo is no different.
MongoDB (and any other database that I have come across so far) is built around the idea that individual fields are identifiable so they can be filtered by, grouped by, sorted by, etc... That also explains why MongoDB does not provide support for the scenario you're facing. That being said, MongoDB can be used as a dumb datastore for arbitrary JSON data. And the JSON specification does not say anything about duplicate field names which is probably why you can actually store such a document in MongoDB in the first place.
Anyway, there is no way to achieve what you want without loading the entire document, changing it (by adding the duplicate field(s)) and then replacing the whole document. That, however, will work.
I personally cannot think of a reasonable scenario where this sort of document could make sense, though. So I would strongly suggest you revisit your document structure.

What is the preferred way to add many fields to all documents in a MongoDB collection?

I have a Python application that iteratively goes through every document in a MongoDB (3.0.2) collection (typically between 10K and 1M documents) and adds new fields (probably doubling or tripling the number of fields in each document).
My initial thought was that I would upsert the entire revised documents (using pyMongo) - now I'm questioning that:
Given that the revised documents are significantly bigger should I be inserting only the new fields, or just replacing the document?
Also, is it better to perform a write to the collection on a document by document basis or in bulk?
This is actually a great question that can be solved a few different ways depending on how you are managing your data.
If you are upserting additional fields, does this mean your data is appending additional fields at a later point in time, with the only change being the addition of those fields? If so, you could set a TTL on your documents so that the old ones drop off over time. Keep in mind that if you do this you will want an index that sorts your results by descending _id so that the most recent additions are selected before the older ones.
The benefit of doing it this way is that you are continually writing data as opposed to seeking and updating data, so it is faster.
Regarding upserts vs. bulk inserts: bulk inserts are always faster than upserts, since upserting requires finding the original document first.
Given that the revised documents are significantly bigger should I be inserting only the new fields, or just replacing the document?
You really need to understand your data fully to determine what is best, but if the only change to the data is additional fields, or changes that only need to be considered from that point forward, then bulk inserting and setting a TTL on your older data is the better method from the standpoint of write operations, as opposed to seek, find and update. When using this method you will want db.document.find_one() as opposed to db.document.find() so that only your current record is returned.
Also, is it better to perform a write to the collection on a document by document basis or in bulk?
Bulk inserts will be faster than inserting each document sequentially.
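If you do go the update route rather than re-inserting, here is a hedged pymongo sketch of adding new fields in batches; the collection name, the computed field, and the batch size are all placeholder assumptions, not part of the original question:

from pymongo import MongoClient, UpdateOne

coll = MongoClient()["test"]["records"]          # hypothetical collection

ops = []
for doc in coll.find({}, {"_id": 1, "value": 1}):
    new_fields = {"value_squared": doc.get("value", 0) ** 2}   # stand-in for your computed fields
    ops.append(UpdateOne({"_id": doc["_id"]}, {"$set": new_fields}))
    if len(ops) == 1000:                         # flush in batches to bound memory and round trips
        coll.bulk_write(ops, ordered=False)
        ops = []
if ops:
    coll.bulk_write(ops, ordered=False)

Using $set here only sends the new fields over the wire, rather than replacing each whole document.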

Efficient way to remove all entries from mongodb

Which is the better way to remove all entries from a mongodb collection?
db.collection.remove({})
or
db.collection.drop()
Remove all documents in a collection
db.collection.remove({})
Will only remove all the data in the collection but leave any indexes intact. If new documents are added to the collection they will populate the existing indexes.
Drop collection and all attached indexes
db.collection.drop()
Will drop the collection and all indexes. If the collection is recreated then the indexes will also need to be re-created.
From your question, if you only want to remove all the entries from a collection, then using db.collection.remove({}) would be better, as that keeps the collection intact with all indexes still in place.
From a speed perspective, the drop() command is faster.
db.collection.drop() will delete the whole collection (very fast) and all indexes on the collection.
db.collection.remove({}) will delete all entries, but the collection and the indexes will still exist.
So there is a difference between the two operations. The first one is faster, but it deletes more metadata. If you want to keep the indexes, you should not use it.
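If you do choose drop() for speed, the usual pattern is to rebuild the indexes afterwards. A rough pymongo sketch, with placeholder collection and index names:

from pymongo import MongoClient

coll = MongoClient()["test"]["logs"]          # hypothetical collection
coll.create_index("timestamp")

# Slower, but the "timestamp" index survives and keeps serving new documents
coll.delete_many({})

# Faster, but the index is gone and has to be rebuilt before it can be used again
coll.drop()
coll.create_index("timestamp")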