I have a MongoDB collection called users.
Each user has a field called rating, which is between 1 and 5. When a user votes on another user, he 'gives' him a vote, which is a number between 1 and 5. I have a problem storing this data in a Mongo document because I have to query the users collection by the rating field and I have to update it atomically.
If I store both the rating and the number of votes, then I can update votes_number with the $inc operator, but I can't atomically set rating = ((rating*votes) + vote_val)/(votes+1).
I could just keep the sum of votes and the number of votes in the document and update both using $inc, but then I can't query like WHERE votes_sum/votes_num > 3.
Is there any solution to this problem?
What you can do is use option two from above and then combine it with a cached result field. You can set up the data flow so the result field remains consistent with the rest of the document by using the filter predicate on your update.
Step one is to add a new field to your schema which will be your cached rating field. This will allow you to perform your range query without having to do the dynamic division. The problem you'll run into there is that you can't atomically increment the votes_sum & votes_num fields AND in the same atomic operation set the cached rating fields. So here's what you do.
1) Atomically increment the votes_sum and votes_num fields
2) Grab the _id, votes_sum & votes_num for the updated document
3) Update the rating but, as part of the filter predicate, include the _id, expected votes_sum and expected votes_num fields.
db.collection.update({_id: $id, votes_sum: $votes_sum, votes_num: $votes_num}, {$set: {rating: $votes_sum / $votes_num}});
This will ensure that nothing has changed since you updated the doc. If someone else comes along and updates those fields between your increment and your rating update, the doc will not match the filter part of the update statement, and thus it will not be overwritten with stale data.
This pattern takes advantage of the fact that writes are atomic at the document level in MongoDB so you don't have to worry about the consistency of data within a document. The nice thing is that the rating will be set correctly because every operation to update the votes_sum and votes_num fields is followed by an update to rating.
See here for some sample code: http://docs.mongodb.org/manual/tutorial/isolate-sequence-of-operations/
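The three steps above can be sketched as a small helper that builds the filter and update for step 3. This is only a sketch under assumptions: the name buildRatingUpdate and the collection name users are made up for illustration, and the database calls appear as comments because they need a live connection. The commented usage merges steps 1 and 2 by asking findOneAndUpdate to return the updated document.

```javascript
// Hypothetical helper: given the document read back after the $inc,
// build the guarded filter and the $set update for step 3.
function buildRatingUpdate(doc) {
  if (!doc.votes_num) {
    throw new Error("votes_num must be a positive number");
  }
  return {
    // The filter only matches while votes_sum and votes_num still hold the
    // values we read, so a rating computed from stale values is never written.
    filter: { _id: doc._id, votes_sum: doc.votes_sum, votes_num: doc.votes_num },
    update: { $set: { rating: doc.votes_sum / doc.votes_num } },
  };
}

// Usage in the mongo shell (collection name "users" is an assumption):
//   const doc = db.users.findOneAndUpdate(
//     { _id: id },
//     { $inc: { votes_sum: voteVal, votes_num: 1 } },
//     { returnNewDocument: true }
//   );
//   const { filter, update } = buildRatingUpdate(doc);
//   db.users.updateOne(filter, update);
```

With the cached field in place, the range query from the question becomes a plain filter such as {rating: {$gt: 3}}.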
Related
I need to fetch the document in a mongodb collection using its position. I know the position of the document inside the collection exactly but could not figure out a way to pull those documents from collection. Is there any way to achieve this?
db.daily.find({'_id': {'$in': [0, 5, 8]}})
This is what I tried, but _id is not inserted as 1, 2, 3...; instead it holds a seemingly random value, e.g. 57d8fd62f2a9d913ba0d006d. Thanks in advance.
You can use skip and limit to query based on the position in the natural order:
db.collection.find().skip(10).limit(1) // skips the first 10 documents and returns the 11th in natural order
As the natural order link points out, the document order need not match the order in which documents were inserted (with an exception for capped collections). If you use the default ObjectId as the _id field for your documents, you can sort by _id to order them by insertion time (up to the one-second resolution of the timestamp in the ObjectId).
db.collection.find().sort({_id: 1}).skip(10).limit(1) // skips the first 10 documents and returns the 11th in _id order
You may also consider using your own _id or adding a field to be able to sort on in order to query based on the position you define.
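To illustrate why sorting by a default ObjectId approximates insertion order: the first four bytes of an ObjectId encode its creation time in seconds since the Unix epoch. The helper below is a hypothetical sketch that decodes this from the hex string; objectIdSeconds is an illustrative name, not a driver function.

```javascript
// Hypothetical helper: read the creation time (seconds since the Unix epoch)
// from the first 8 hex characters (4 bytes) of an ObjectId string.
function objectIdSeconds(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex ObjectId string");
  }
  return parseInt(hex.slice(0, 8), 16);
}

// The _id quoted in the question above.
const seconds = objectIdSeconds("57d8fd62f2a9d913ba0d006d");
const createdAt = new Date(seconds * 1000);
console.log(createdAt.toISOString());
```

Because the timestamp has one-second granularity, documents created within the same second are not guaranteed to sort in strict insertion order.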
Is there a way I can find the last inserted document and the field, i.e. _id or id such that I can increment and use when inserting a new document?
The issue is that I generate my own id count but do not store it anywhere. Now that I have deleted records, I cannot add new ones because I end up reusing an id that already exists.
There is no way to check insertion order in MongoDB, because the database does not keep any metadata in the collections regarding the documents.
If your _id field is generated server-side then you need to have a very good algorithm for this value in order to provide collision avoidance and uniqueness while at the same time following any sequential constraints that you might have.
If I want to update multiple documents based on multiple filter criteria, which is the better approach?
1. Filter and get the documents (only the _id field) that need to be updated, then supply the array of _id values to UpdateManyAsync (via $in) as a parameter and update. (See reason 1 below.)
2. Update the documents by supplying the filter criteria directly. (See reason 2 below.)
Reason for this doubt:
1. MongoDB searches only for _id matches and updates them.
2. MongoDB checks each document against the supplied multiple criteria (multiple fields) and updates it.
What is the performance difference between these two approaches, given that the first splits the update into two steps?
I am interested in performance with respect to timeouts, locks, and document availability after the update.
Please share your suggestions and views.
I have a problem with long Mongo find results. How can I start a query from a given _id X and continue forward?
For example, I have a collection with 1000 users' details, and I know there is a user called Peter in it. I can query Users.find({userName: "Peter"}) and get this user's _id, but how can I get all the users after him as well, without having to return the JSON for the documents that come before "Peter"?
With the little information you have given, you need to do this in two steps:
Get the id of the first record that matches the name "Peter".
db.test.findOne({"userName":"Peter"},{"_id":1});
Returns one document that satisfies the specified query criteria. If
multiple documents satisfy the query, this method returns the first
document according to the natural order which reflects the order of
documents on the disk. In capped collections, natural order is the
same as insertion order.
Once you have the id of the record with Peter, you can retrieve the records whose id is greater than or equal to the id of this record (so the matched record itself is included).
db.test.find({"_id":{$gte:x}});
Where, x is the id of the first record returned by the first query.
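The two steps can be tied together with a small helper that turns the document from the first query into the filter for the second. This is a sketch only: fromAnchorFilter and its inclusive flag are hypothetical names, and the shell calls are shown as comments since they need a running database.

```javascript
// Hypothetical helper: build the step-two filter from the document found in step one.
// inclusive = true uses $gte (keeps the anchor), false uses $gt (skips it).
function fromAnchorFilter(anchor, inclusive = true) {
  if (!anchor || anchor._id === undefined) {
    throw new Error("anchor document with an _id is required");
  }
  return { _id: inclusive ? { $gte: anchor._id } : { $gt: anchor._id } };
}

// Usage in the mongo shell:
//   const anchor = db.test.findOne({ userName: "Peter" }, { _id: 1 });
//   db.test.find(fromAnchorFilter(anchor)).sort({ _id: 1 });
```

The explicit sort on _id is there because, without it, results come back in natural order, which need not follow _id order.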
I have a MongoDB collection as follows:
comment_id (number)
comment_title (text)
score (number)
time_score (number)
final_score (number)
created_time (timestamp)
Score is an integer that's usually updated using $inc with 1 or -1 whenever someone votes up or down for that record.
But time_score is updated using a function of the timestamp, the current time, and other factors such as how many whole days and how many whole weeks have passed, etc.
So I apply the increments and decrements on the db directly with $inc, but for time_score I retrieve the data from the db, calculate the new score, and write it back. What I'm worried about is that if many users increment the "score" field during my calculation of time_score, then writing time_score to the db will overwrite the latest value of score.
To be clear: does updating specific fields of a record in Mongo rewrite the whole record or only the updated fields? (Assume that all these fields are indexed.)
By default, whole documents are rewritten. To specify the fields that are changed without modifying anything else, use the $set operator.
Edit: The comments on this answer are correct - any of the update modifiers will cause only relevant fields to be rewritten rather than the whole document. By "default", I meant a case where no special modifiers are used (a vanilla document is provided).
The algorithm you are describing is definitely not thread-safe.
When you read the entire document, change one field and then write back the entire document, you are creating a race condition - any field in the document that is modified after your read but before your write will be overwritten by your update.
That's one of many reasons to use $set or $inc operators to atomically set individual fields rather than updating the entire document based on possibly stale values in it.
Another reason is that setting or updating a single field "in-place" is much more efficient than writing the entire document. In addition, you put less load on your network when you pass a smaller update document ({$set:{field:value}}) rather than an entire new version of the document.
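The race described above can be reproduced without a database. The following is an in-memory simulation only, not MongoDB code: a plain object stands in for the stored document, and the function names are made up for illustration. It compares writing back a whole stale copy with changing just one field, which is what {$set: {time_score: ...}} does.

```javascript
// In-memory stand-in for a stored document (not a MongoDB call).
function simulate(writeBack) {
  const stored = { score: 10, time_score: 1 };
  const snapshot = { ...stored };   // 1. read the whole document
  stored.score += 1;                // 2. a concurrent vote arrives ($inc: {score: 1})
  writeBack(stored, snapshot, 5);   // 3. write the newly computed time_score
  return stored;
}

// Whole-document write: every field is replaced from the stale snapshot.
function replaceWhole(stored, snapshot, newTimeScore) {
  Object.assign(stored, snapshot, { time_score: newTimeScore });
}

// Field-level write: the equivalent of {$set: {time_score: newTimeScore}}.
function setField(stored, _snapshot, newTimeScore) {
  stored.time_score = newTimeScore;
}

console.log(simulate(replaceWhole)); // the concurrent vote is lost
console.log(simulate(setField));     // the concurrent vote survives
```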