Does updating a MongoDB record rewrite the whole record or only the updated fields? - mongodb

I have a MongoDB collection as follows:
comment_id (number)
comment_title (text)
score (number)
time_score (number)
final_score (number)
created_time (timestamp)
Score is an integer that's usually updated using $inc with 1 or -1 whenever someone votes that record up or down.
But time_score is updated using a function of the timestamp, the current time, and other factors such as how many whole days and whole weeks have passed.
So I apply $inc (with 1 or -1) to the db directly, but for time_score I retrieve the data from the db, calculate the new score, and write it back. What I'm worried about is that if many users increment the "score" field while I'm calculating time_score, then writing time_score back to the db will clobber the latest value of score.
To be clear: does updating specific fields in a record in Mongo rewrite the whole record or only the updated fields? (Assume that all these fields are indexed.)

By default, whole documents are rewritten. To specify the fields that are changed without modifying anything else, use the $set operator.
Edit: The comments on this answer are correct - any of the update modifiers will cause only relevant fields to be rewritten rather than the whole document. By "default", I meant a case where no special modifiers are used (a vanilla document is provided).

The algorithm you are describing is definitely not thread-safe.
When you read the entire document, change one field and then write back the entire document, you are creating a race condition - any field in the document that is modified after your read but before your write will be overwritten by your update.
That's one of many reasons to use $set or $inc operators to atomically set individual fields rather than updating the entire document based on possibly stale values in it.
Another reason is that setting/updating a single field "in-place" is much more efficient than writing the entire document. In addition, you put less load on your network when you pass a smaller update document ({$set:{field:value}}) rather than an entire new version of the document.
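To make the race concrete, here is a minimal sketch in plain JavaScript. It does not talk to MongoDB at all: the stored document is an ordinary object, and replaceDocument, applySet and applyInc are hypothetical helpers that only imitate a whole-document write and the $set / $inc operators. The recomputed time_score of 7 is a made-up value.

```javascript
// In-memory stand-in for one stored comment; no real MongoDB is involved.
const stored = { comment_id: 1, score: 10, time_score: 0 };

// Whole-document write: everything in the stored document is overwritten.
function replaceDocument(target, replacement) {
  for (const key of Object.keys(target)) delete target[key];
  Object.assign(target, replacement);
}

// Field-level writes, modelled on the $set and $inc update operators.
function applySet(target, fields) {
  Object.assign(target, fields);
}
function applyInc(target, fields) {
  for (const [key, delta] of Object.entries(fields)) {
    target[key] = (target[key] || 0) + delta;
  }
}

// Scenario A: read, recompute, write the whole document back.
const staleCopy = { ...stored };          // read
applyInc(stored, { score: 1 });           // a vote arrives meanwhile
staleCopy.time_score = 7;                 // recomputed value (made up)
replaceDocument(stored, staleCopy);       // the vote is overwritten
const scoreAfterReplace = stored.score;   // the increment was lost

// Scenario B: same interleaving, but only time_score is written.
stored.score = 10;
applyInc(stored, { score: 1 });
applySet(stored, { time_score: 7 });
const scoreAfterSet = stored.score;       // the increment survived

console.log({ scoreAfterReplace, scoreAfterSet });
```

In the shell, scenario B would look something like db.comments.updateOne({comment_id: 1}, {$set: {time_score: 7}}), where the collection name comments is an assumption.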

Related

MongoDB: How to get the last updated timestamp of the last updated document in a collection

Is there a simple OR elegant method (or query that I can write) to retrieve the last updated timestamp (of the last updated document) in a collection. I can write a query like this to find the last inserted document
db.collection.find().limit(1).sort({$natural:-1})
but I need information about the last updated document (it could be an insert or an update).
I know that one way is to query the oplog collection for the last record from a collection. But it seems like an expensive operation given the fact that oplog could be of very large size (also not trustworthy as it is a capped collection). Is there a better way to do this?
Thanks!
You could get the last insert time same way you mentioned in the question:
db.collection.find().sort({'_id': -1}).limit(1)
But there isn't any good way to see the last update/delete time. If you are using replica sets, you could get that from the oplog.
Or you could add a new field to the document, such as 'lastModified'.
You can also check out collection-hooks. I hope this helps.
One way to go about it is to have a field that holds the time of last update. You can name it updatedAt. Every time you make an update to the document, you'll just update the value to the current time. If you use the ISO format to store the time, you'll be able to sort without issues (that's what I use).
The other way is the _id field.
Method 1
db.collection.find().limit(1).sort({updatedAt: -1})
Method 2
db.collection.find().limit(1).sort({_id: -1})
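The claim that ISO-formatted times sort without issues can be checked with a small sketch. This is plain JavaScript over hypothetical in-memory documents, not a query against a real collection, and it assumes every updatedAt is a UTC string produced by toISOString(), so all values have the same fixed-width layout.

```javascript
// Hypothetical documents; updatedAt is stored as an ISO 8601 UTC string.
const docs = [
  { _id: 1, updatedAt: new Date(Date.UTC(2015, 0, 5, 9, 30)).toISOString() },
  { _id: 2, updatedAt: new Date(Date.UTC(2015, 10, 20, 18, 0)).toISOString() },
  { _id: 3, updatedAt: new Date(Date.UTC(2015, 2, 14, 12, 0)).toISOString() },
];

// Plain string comparison, newest first: the in-memory analogue of
// sort({updatedAt: -1}).limit(1).
const newestFirst = [...docs].sort((a, b) =>
  a.updatedAt < b.updatedAt ? 1 : a.updatedAt > b.updatedAt ? -1 : 0
);
const lastUpdated = newestFirst[0];

console.log(lastUpdated._id, lastUpdated.updatedAt);
```

Because the strings are fixed-width and all in UTC, comparing them character by character gives the same result as comparing the underlying times. Mixing time zones or formats would break that property.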
You can try:
db.collection.find().sort({$natural:-1}).limit(1);

How does MongoDB order their docs in one collection? [duplicate]

This question already has answers here:
How does MongoDB sort records when no sort order is specified?
(2 answers)
Closed 7 years ago.
In my User collection, MongoDB usually orders each new doc in the same order I create them: the last one created is the last one in the collection. But I have found another collection where the last doc I created is in 6th position among 27 docs.
Why is that?
What order do docs follow in a MongoDB collection?
It's called natural order:
natural order
The order in which the database refers to documents on disk. This is the default sort order. See $natural and Return in Natural Order.
This confirms that in general you get them in the same order you inserted, but that's not guaranteed–as you noticed.
Return in Natural Order
The $natural parameter returns items according to their natural order within the database. This ordering is an internal implementation feature, and you should not rely on any particular structure within it.
Index Use
Queries that include a sort by $natural order do not use indexes to fulfill the query predicate with the following exception: If the query predicate is an equality condition on the _id field { _id: <value> }, then the query with the sort by $natural order can use the _id index.
MMAPv1
Typically, the natural order reflects insertion order with the following exception for the MMAPv1 storage engine. For the MMAPv1 storage engine, the natural order does not reflect insertion order if the documents relocate because of document growth or remove operations free up space which are then taken up by newly inserted documents.
Obviously, like the docs mentioned, you should not rely on this default order (This ordering is an internal implementation feature, and you should not rely on any particular structure within it.).
If you need to sort the things, use the sort solutions.
Basically, the following two calls should return documents in the same order (since the default order is $natural):
db.mycollection.find().sort({ "$natural": 1 })
db.mycollection.find()
If you want to sort by another field (e.g. name) you can do that:
db.mycollection.find().sort({ "name": 1 })
For performance reasons, MongoDB never splits a document on the hard drive.
When you start with an empty collection and insert document after document into it, MongoDB will place them consecutively on the disk.
But what happens when you update a document and it now takes more space and doesn't fit into its old position anymore without overlapping the next? In that case MongoDB will delete it and re-append it as a new one at the end of the collection file.
Your collection file now has a hole of unused space. This is quite a waste, isn't it? That's why the next document which is inserted and small enough to fit into that hole will be inserted in that hole. That's likely what happened in the case of your second collection.
Bottom line: Never rely on documents being returned in insertion order. When you care about the order, always sort your results.
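The move-and-fill-the-hole behaviour described above can be imitated with a toy model. This is only a sketch: the file array, the slot sizes, and the insert and grow helpers are all invented for illustration and say nothing about how MongoDB's storage engine actually allocates space.

```javascript
// Toy model only: a "file" is a list of slots, each with a size and either
// a document name or null (a hole). Not MongoDB's real allocator.
const file = [];

function insert(name, size) {
  // Reuse the first hole that is large enough, otherwise append.
  const hole = file.find((slot) => slot.doc === null && slot.size >= size);
  if (hole) {
    hole.doc = name;
  } else {
    file.push({ doc: name, size });
  }
}

function grow(name, newSize) {
  const slot = file.find((s) => s.doc === name);
  if (newSize <= slot.size) return;          // still fits in place
  slot.doc = null;                           // leave a hole behind
  file.push({ doc: name, size: newSize });   // re-append at the end
}

insert("A", 10);
insert("B", 10);
insert("C", 10);
grow("A", 25);      // A no longer fits and moves to the end
insert("D", 8);     // D is small enough to land in A's old slot

const diskOrder = file.filter((s) => s.doc !== null).map((s) => s.doc);
console.log(diskOrder.join(","));
```

The documents were inserted as A, B, C, D, but reading the toy file front to back gives a different order, which is the effect the asker noticed.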
MongoDB does not "order" the documents at all, unless you ask it to.
The basic insertion will create an ObjectId in the _id primary key value unless you tell it to do otherwise. This ObjectId value is a special value with "monotonic" or "ever increasing" properties, which means each value created is guaranteed to be larger than the last.
If you want "sorted" then do an explicit "sort":
db.collection.find().sort({ "_id": 1 })
Or a "natural" sort means in the order stored on disk:
db.collection.find().sort({ "$natural": 1 })
Which is pretty much the standard unless stated otherwise or an "index" is selected by the query criteria that will determine the sort order. But you can use that to "force" that order if query criteria selected an index that sorted otherwise.
MongoDB documents "move" when grown, and therefore the _id order is not always explicitly the same order as documents are retrieved.
I could find out more about it thanks to the link Return in Natural Order provided by Ionică Bizău.
"The $natural parameter returns items according to their natural order within the database. This ordering is an internal implementation feature, and you should not rely on any particular structure within it.
Typically, the natural order reflects insertion order with the following exception for the MMAPv1 storage engine. For the MMAPv1 storage engine, the natural order does not reflect insertion order if the documents relocate because of document growth or remove operations free up space which are then taken up by newly inserted documents."

Mongodb store and select order

Basic question. Will the mongodb find command always return documents in the order they were added to the collection? If not, how can I select docs in the right order?
Sort? But what if docs were added simultaneously and, say, the created date is the same, but there was still an order?
Well, yes and ... not exactly.
Documents are sorted by natural order by default. This is initially the order in which the documents are stored on disk, which is indeed the order in which they were added to the collection.
This order, however, is not deterministic, as documents may be moved on disk once they grow after update operations and no longer fit into their current space. This way the initial (insert) order may change.
The way to guarantee sorting by insert order is to sort by {_id : 1}, as long as _id is of type ObjectId. This will return your documents sorted in ascending order.
Write operations do not take place simultaneously: write locks are imposed at the database level (v2.4 and on). The first four bytes of _id are the insert timestamp, and the last three bytes are a counter (started from a random value) used to distinguish (and sort) ObjectId instances with the same timestamp.
_id field is indexed by default
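As a small illustration of the timestamp prefix, the sketch below pulls the first four bytes out of an ObjectId written as a 24-character hex string. The function name objectIdSeconds and the sample id are only for illustration; no driver is needed to run it.

```javascript
// Returns the creation time (seconds since the Unix epoch) encoded in the
// first 4 bytes (8 hex characters) of a 24-character ObjectId string.
function objectIdSeconds(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  return parseInt(hex.slice(0, 8), 16);
}

// Sample id used purely for illustration.
const sampleId = "507f1f77bcf86cd799439011";
const seconds = objectIdSeconds(sampleId);
const createdAt = new Date(seconds * 1000);

console.log(seconds, createdAt.toISOString());
```

In the mongo shell the same information is available through the ObjectId's getTimestamp() method. Note that the timestamp has one-second resolution, so ids created within the same second are ordered only by their remaining bytes.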

Does a newly inserted document in MongoDB surely have a "bigger" _id than an older document?

What algorithm does MongoDB use to calculate the "_id" field? It looks like it is incremental.
I'm wondering whether it is safe to sort by the "_id" field as a way of sorting by the time the document was inserted.
The way ids are generated is described here. Turns out leading bytes are given to the timestamp, so probably the order of ids corresponds to the order of insertion (if we don't consider deviations in time between different machines).
If you need to sort by order of insertion then you need to add your own field for timestamp or incremental counter. In a sharded set-up sorting by _id might not work.
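The caveat about time deviations between machines can be shown with a rough sketch. Here makeId is a hypothetical helper that builds an ObjectId-shaped hex string from a Unix time plus filler characters; the 5-second clock skew and the two machines are assumptions chosen for the example.

```javascript
// Hypothetical helper: builds an ObjectId-shaped hex string from a Unix
// time in seconds plus 16 filler hex characters. Real drivers fill the
// remaining 8 bytes with a per-process random value and a counter.
function makeId(seconds, filler) {
  return seconds.toString(16).padStart(8, "0") + filler;
}

// Two machines insert one after the other; machine B's clock runs 5 s slow.
const realTime = 1500000000;
const inserts = [
  { order: 1, _id: makeId(realTime, "aaaaaaaaaa000001") },         // machine A
  { order: 2, _id: makeId(realTime + 1 - 5, "bbbbbbbbbb000001") }, // machine B
];

// Sort by _id; equal-length lowercase hex strings compare like the bytes.
const byId = [...inserts].sort((a, b) =>
  a._id < b._id ? -1 : a._id > b._id ? 1 : 0
);
const orderSeenById = byId.map((d) => d.order);

console.log(orderSeenById);
```

The second insert sorts first because its id carries an earlier timestamp, which is why an explicit timestamp or counter field is the safer choice when exact insertion order matters.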

The fastest way to show Documents with certain property first in MongoDB

I have collections with a huge number of documents, on which I need to run custom searches with various queries.
Each document has a boolean property. Let's call it "isInTop".
I need to show documents which have this property first in all queries.
Yes, I can easily sort on this field like this:
.sort( { isInTop: -1 } );
I can also create a proper index with "isInTop" as the last field in it. But this will be slow, as indexes in Mongo work best with unique fields.
So is there a way to show documents with the field "isInTop" at the top of each query?
I see two solutions here.
First: give the documents which need to be on top an _id from the "future". As you know, ObjectId contains a timestamp. So I can create an ObjectId with a timestamp from the future and use natural order.
Second: create a separate collection for the documents which need to be on top, and query it first.
Are there any other solutions to this problem? Which will work faster?
UPDATE
I solved this issue by sorting on a custom field which represents rank.
Using the _id field trick you mention has the problem that at some point in time you will reach the special time, and you can't change the _id field (without inserting a new document and removing the old one).
Creating a special collection which just holds the ones you care about is probably the best option. It gives you the ability to logically (and to some extent, physically) separate the documents.
MongoDB has also recently introduced support for a "sparse" index, which may fulfill your needs as well. You could set the "isInTop" field only when you want a document to be special, and then create a sparse index on it, which would not have the problems you would normally have with a single indexed boolean field (in btrees).
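For reference, the ordering being asked for (promoted documents first, then everything else) can be sketched in plain JavaScript. The documents, the createdAt tie-breaker and the topFirst comparator are all hypothetical; isInTop is present only on promoted documents, as suggested above.

```javascript
// Hypothetical search results; isInTop is set only on promoted documents.
const results = [
  { title: "b", createdAt: 3 },
  { title: "a", createdAt: 1, isInTop: true },
  { title: "c", createdAt: 2 },
  { title: "d", createdAt: 4, isInTop: true },
];

// Promoted documents first, then newest first within each group.
// A missing isInTop is treated the same as false.
function topFirst(a, b) {
  const topA = a.isInTop === true ? 1 : 0;
  const topB = b.isInTop === true ? 1 : 0;
  if (topA !== topB) return topB - topA;
  return b.createdAt - a.createdAt;
}

const ordered = [...results].sort(topFirst).map((d) => d.title);
console.log(ordered.join(","));
```

This is the in-memory analogue of sorting on { isInTop: -1, createdAt: -1 }. Whether a given index can serve that sort efficiently is a separate question and depends on the queries involved.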