MongoDB findAndModify atomicity - mongodb

I want to know how to reference attributes of the document returned by the find step
and use them within the modify step. For example:
var totalNoOfSubjects = 5;
db.people.findAndModify( {
query: { name: "Tom", state: "active", rating: { $gt: 10 } },
sort: { rating: 1 },
update: { $set: { average: <reference score value returned by find>/totalNoOfSubjects} }
} );
My understanding is that findAndModify locks the document, so I want the update in the modify step to use the attributes found in the find step. This would make the operation atomic.
Is this supported by MongoDB?

No, you cannot refer to values in the found document during the update portion of a findAndModify. It's the same as update in this respect.
As such, you cannot do this atomically: you need to first fetch the document and then craft the update or findAndModify to contain the value computed from your fetched doc.
See https://jira.mongodb.org/browse/SERVER-458 for one way this may be addressed in the future.
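The fetch-then-update approach described above can be made safe against concurrent writers by putting the value that was read into the query of the second step, so the write only applies if that value is unchanged. The following is a minimal sketch under stated assumptions, not a definitive implementation: `setAverage` and `maxRetries` are hypothetical names, `score` is the field name assumed from the question, `people` is assumed to expose `findOne()` and `findAndModify()` the way a mongo shell collection does, and the `sort` from the question is omitted for brevity.

```javascript
// Sketch: read the document, compute the new value on the client, then apply
// the update only if the score is still the one we read (compare-and-set).
// `people` is any object with findOne() and findAndModify(), e.g. db.people.
function setAverage(people, totalNoOfSubjects, maxRetries) {
  var filter = { name: "Tom", state: "active", rating: { $gt: 10 } };
  for (var attempt = 0; attempt < maxRetries; attempt++) {
    var found = people.findOne(filter);
    if (!found) {
      return null; // nothing matched the original query
    }
    var updated = people.findAndModify({
      // Matching on the score we read means a concurrent change makes
      // this call match nothing instead of writing a stale average.
      query: { _id: found._id, score: found.score },
      update: { $set: { average: found.score / totalNoOfSubjects } },
      new: true
    });
    if (updated) {
      return updated; // our computed value was applied
    }
    // The score changed between the read and the write; read again.
  }
  throw new Error("gave up after " + maxRetries + " attempts");
}
```

In the shell this would be called as `setAverage(db.people, 5, 3)`. The two steps are still two round trips; only the second one is atomic, which is why the retry loop is needed.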

Atomicity is exactly the reason for findAndModify.
As stated in the docs, Mongo will find the documents matching the specified query and modify one of them using the specified update. The whole process is atomic. By default, Mongo returns the found document in its unchanged state. This can be changed using the new option, which returns the modified document instead.

Related

When doing an upsert to MongoDb is it possible to set a field with a timestamp only if other data in the record has changed?

We need to cache records for a service with a terrible API.
This service provides us with API to query for data about our employees, but does not inform us whether employees are new or have been updated. Nor can we filter our queries to them for this information.
Our proposed solution to the problems this creates for us is to periodically (e.g. every 15 minutes) query all our employee data and upsert it into a Mongo database. Then, when we write to the MongoDb, we would like to include an additional property which indicates whether the record is new or whether the record has any changes since the last time it was upserted (obviously not including the field we are using for the timestamp).
The idea is, instead of querying the source directly, which we can't filter by such timestamps, we would instead query our cache which would include said timestamp and use it for a filter.
(Ideally, we'd like to write this in C# using the MongoDb driver, but more important right now is whether we can do this in an upsert call or whether we'd need to load all the records into memory, do comparisons, and then add the timestamps before upserting them....)
There may be a way to do this, though how efficient it is remains to be seen. The update command in MongoDB can take an aggregation pipeline to perform the update. We can use the $addFields stage to add a new field denoting the update status, and $function to compute its value. A short example:
db.collection.update(
    { key: 1 },
    [
        {
            "$addFields": {
                changed: {
                    "$function": {
                        lang: "js",
                        "args": [
                            "$$ROOT",
                            {
                                "key": 1,
                                data: "somedata"
                            }
                        ],
                        "body": "function(originalDoc, newDoc) { return JSON.stringify(originalDoc) !== JSON.stringify(newDoc) }"
                    }
                }
            }
        }
    ],
    { upsert: true }
)
Some points to consider here:
If the fields in the old and new versions of the doc are not in the same order, the JSON.stringify comparison will report a change even when the values are identical.
The function specified in $function runs on the server, so ideally it should be lightweight. If a large number of users are upserted, it may become a bottleneck.
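The field-order caveat above can be avoided by serializing both documents with their keys sorted before comparing them. The sketch below is illustrative only: `stableStringify` and `hasChanged` are hypothetical helper names, and the code assumes plain JSON-like values (numbers, strings, booleans, null, arrays, nested objects). BSON-specific types such as ObjectId or dates would need extra handling, and a field present on only one side (for example `_id`) still counts as a difference.

```javascript
// Serialize a value with object keys in sorted order, so that two documents
// with the same fields in a different order produce the same string.
function stableStringify(value) {
  if (Array.isArray(value)) {
    // Array order is significant, so elements keep their positions.
    return "[" + value.map(stableStringify).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    var keys = Object.keys(value).sort();
    return "{" + keys.map(function (key) {
      return JSON.stringify(key) + ":" + stableStringify(value[key]);
    }).join(",") + "}";
  }
  return JSON.stringify(value); // primitives and null
}

// True when the two documents differ in any field or value.
function hasChanged(originalDoc, newDoc) {
  return stableStringify(originalDoc) !== stableStringify(newDoc);
}
```

To use this inside `$function`, the helper would have to be inlined into the function passed as `body`, since that function runs on the server and cannot see client-side definitions.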

MongoDB: Is it possible to make sure a field is added to a document before inserting it into a collection

Let's say I have a collection named products. I want to make sure whenever a document in this collection is inserted or updated, I check if there is a viewCount field present or not. If it is, I let the create/update operation complete. Else, I want to add this field and set the value to zero.
The challenge is, there are a lot of such operations in the application code. So, I am looking for a way to accomplish this at DB level. Is this possible ?
Use findAndModify:
db.products.findAndModify({
query:{ yourQuery},
update:{ fieldsToCreate, $inc: { viewCount:1} },
new: true,
upsert: true
})
where fieldsToCreate is a partial document of the values you want to create if the document does not exist. The new document will be returned with viewCount set to 1, which is correct, since it was viewed 1 time when returned.

How to create an index in MongoDB which calls a JS function via system.js?

I have two collections viz. whitelist (id, count, expiry) and blacklist (id).
Now I would like to create an index such that when count>=200 it calls a JS function which removes the document from whitelist and adds the id to blacklist.
So can I do this in Mongo using db.collection.createindex({"count":1}, ???);
or do I need to write a daemon to scan the entire collection? Or is there a better method for this?
You seem to be asking for what in a SQL relational database we would call a "trigger", which is something completely different from an "index" even in that world.
In the NoSQL world typically and especially with MongoDB, that sort of "server logic" is relegated to the "client" code operations rather than the server. Think of it as another part of the "scalability" philosophy of these products, where certain functions like "triggers" are taken away due to the stance that these "cost" a lot with distributed data.
So in order to do what you want you do it in "code" instead of defining a database "trigger". The process is simple enough, via .findAndModify() and other wrapping variants available to language APIs:
// Increment while count is below 200 and return the modified document
var doc = db.whitelist.findAndModify({
    "query": { "_id": myId, "count": { "$lt": 200 } },
    "update": { "$inc": { "count": 1 } },
    "new": true
});
// Then move the id to the blacklist once the threshold is reached
// (doc is null when no document matched the query)
if ( doc && doc.count >= 200 ) {
    db.whitelist.remove({ "_id": myId });
    db.blacklist.insert({ "_id": myId });
}
Be careful with the actual language API method variant, as the structure typically differs from the "query/update" keys provided in the shell method.
The basic principles remain the same. Modify and fetch, then remove from the other collection if your conditions are met. But it is "two" trips to the server, and there is no way to make the server "trigger" when such a condition is met by itself.
db.whitelist.insert(doc);
if (db.whitelist.find(criterion).count() >= 200) {
    var bulkRemove = db.whitelist.initializeUnorderedBulkOp();
    var bulkInsert = db.blacklist.initializeUnorderedBulkOp();
    db.whitelist.find(criterion).forEach(
        function(doc) {
            bulkInsert.insert({ _id: doc._id });
            bulkRemove.find({ _id: doc._id }).removeOne();
        }
    );
    bulkInsert.execute();
    bulkRemove.execute();
}
First, you insert the document as usual. Since criterion is going to use an index, the if clause should be evaluated quickly and efficiently.
In case we have 200 or more documents matching that criterion, we use bulk operations to insert the ids into the blacklist and remove the documents from the whitelist, which will be executed in parallel.
The problem with only writing the _id to the blacklist is that you need to check whether the criterion for being blacklisted is matched, so the _id needs to contain that criterion.
A better solution IMHO is to flag entries of a single collection using a field named blacklisted for individual entries, or to use the aggregation framework to find blacklisted documents and write them to a collection using the $out pipeline stage. Sadly, you didn't give example data or a proper description of your use case, so you get an unspecified answer.

Does Mongo have versioning on objects (like ElasticSearch)

Does Mongo automatically track a version which is incremented for each update and can be used for optimistic locking?
So something that would correspond to functionality described here in ElasticSearch http://www.elasticsearch.org/blog/elasticsearch-versioning-support/
No, MongoDB does not store versions of documents. There is only the "current" version of the document as far as the database API is concerned.
It would be necessary to create your own scheme if you wanted such a thing. You could use a version field in your document and use $inc for example to increment it as needed and also verify that the version value matched before applying an update.
Example:
db.myCollection.update(
{ _id: "abc123", _version: 5 },
{
$set: { 'fieldA' : 'some-value' },
$inc: { '_version' : 1 }
}
)
The above example would find a document with the specific _id and _version fields. If matched, a field called fieldA is set to a new value and the _version field is incremented by 1. If another update was attempted on the same document and version, it would match no document and change nothing, as the _version would already have been updated to 6.
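The version check above can be wrapped in a small helper that reports whether the write went through, so the caller knows when to re-read the document and retry. This is a minimal sketch under stated assumptions: `updateIfUnchanged` is a hypothetical name, `coll` is assumed to be a collection whose `update()` returns a result with an `nMatched` count (as the legacy mongo shell's WriteResult does), and other shells or drivers may expose that count under a different name.

```javascript
// Apply `changes` only if the stored _version still equals the one in
// `readDoc`, the copy the caller read earlier. Returns true when the update
// was applied and false when the caller's copy was stale.
function updateIfUnchanged(coll, readDoc, changes) {
  var result = coll.update(
    { _id: readDoc._id, _version: readDoc._version },
    { $set: changes, $inc: { _version: 1 } }
  );
  // No match means another writer already bumped _version.
  return result.nMatched === 1;
}
```

A caller would typically fetch the document, compute `changes` from it, call `updateIfUnchanged(db.myCollection, doc, changes)`, and repeat from the fetch whenever it returns false.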

Manually change MongoID

Because of the PHP problem when inserting into MongoDB, explained in the answer by Denver Matt in this question, I created duplicate IDs in a dataset in MongoDB. Fixing the PHP code is easy, but to be able to keep using my dataset I wonder:
Can I change the MongoId manually without breaking something? Or can I just reset this ID somehow into a new one?
The _id field of a document is immutable, as discussed in this documentation. Attempting to modify its value would result in an error/exception, as in:
> db.foo.drop()
> db.foo.insert({ _id: 1 })
> db.foo.update({ _id: 1 }, { $set: { _id: 3 }})
Mod on _id not allowed
> db.foo.find()
{ "_id" : 1 }
If you do need to alter the identifier of a document, you could fetch it, modify the _id value, and then re-persist the document using insert() or save(). insert() may be safer on the off chance that your new _id value conflicts and you'd rather see a uniqueness error than overwrite the existing document (as save() would do). Afterwards, you'll need to go back and remove the original document.
Since you can't do all of this in a single atomic transaction, I would suggest the following order of operations:
1. findOne() existing document by its _id
2. Modify the returned document's _id property
3. insert() the modified document back into the collection
4. If the insert() succeeded, remove() the old document by the original _id
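The order of operations above could be sketched as a single helper. This is an illustration under stated assumptions rather than a definitive implementation: `changeId` is a hypothetical name, and it uses `insertOne()` and `deleteOne()`, the newer counterparts of `insert()` and `remove()`, on the assumption that `insertOne()` throws when the new _id already exists, so that the delete is never reached after a failed insert.

```javascript
// Re-create a document under a new _id, then delete the original.
// `coll` is any object with findOne(), insertOne() and deleteOne(),
// e.g. a mongo shell collection such as db.foo.
function changeId(coll, oldId, newId) {
  var doc = coll.findOne({ _id: oldId });
  if (!doc) {
    return false; // nothing to move
  }
  doc._id = newId;
  // Assumed to throw on a duplicate _id, which leaves the original intact.
  coll.insertOne(doc);
  // Only reached when the insert succeeded.
  coll.deleteOne({ _id: oldId });
  return true;
}
```

Because the steps are not atomic, a crash between the insert and the delete leaves both copies in the collection; running the delete again for the old _id finishes the move.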