Is there a function in MongoDB that returns the number of documents updated by an update statement?
I have tried to use count(), but apparently the update statement only returns true or false, so I think I'm getting the count of a string.
Thanks
Use the getLastError command to get information about the result of your operation.
I don't know the Ruby driver, but most drivers do this automatically in 'safe mode'. In safe mode, each write examines the result of getLastError to make sure the write was successful. The update operation should then return an object like the JSON shown below, which includes the number of updated documents (n). You can fine-tune the safe-mode settings, but be warned that the default mode is "fire and forget", so safe mode is a good idea for many use cases.
In the shell,
> db.customers.update({}, {$set : {"Test" : "13232"}}, true, true);
> db.runCommand( "getlasterror" )
{
"updatedExisting" : true,
"n" : 3,
"connectionId" : 18,
"err" : null,
"ok" : 1
}
Here, I updated n = 3 documents. Note that by default, update operations in MongoDB only apply to the first matched document. In the shell, the fourth parameter is used to indicate that we want to update multiple documents.
For future reference, in recent versions of MongoDB you can save the result of the update command and look at its modifiedCount property. Note that if the update results in no change to a document, such as setting a field to the value it already has, modifiedCount may be less than the number of documents matched by the query.
For example (in JS):
var updateResult = await collection.updateOne( .... )
if(updateResult.modifiedCount < 1) throw new Error("No docs updated");
There are also two fields, result.n and result.nModified, in the response object that tell you the number of documents matched and the number updated, respectively. Here's a portion of the response object that MongoDB returns:
result: {
n: 1,
nModified: 1,
...
},
...
modifiedCount: 1
More info here: https://docs.mongodb.com/manual/reference/command/update/
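To make the matched-versus-modified distinction concrete, here is a minimal sketch; the result shape mirrors the driver's UpdateResult, and the helper name is my own:

```javascript
// Decide whether an update actually changed anything, given a
// driver-style result object with matchedCount and modifiedCount.
function summarizeUpdate(result) {
  if (result.matchedCount === 0) return "no document matched the filter";
  if (result.modifiedCount === 0) return "matched, but values were already set";
  return `modified ${result.modifiedCount} of ${result.matchedCount} matched`;
}

// A $set to a value the document already holds matches but modifies nothing:
console.log(summarizeUpdate({ matchedCount: 1, modifiedCount: 0 }));
// A real change reports both counts:
console.log(summarizeUpdate({ matchedCount: 3, modifiedCount: 3 }));
```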
Related
My Mongodb dataset is like this
{
"_id" : ObjectId("5a27cc4783800a0b284c7f62"),
"action" : "1",
"silent" : "0",
"createdate" : ISODate("2017-12-06T10:53:59.664Z"),
"__v" : 0
}
Now I have to find the documents whose action value is 1 and whose silent value is 0. One more thing: all the data should be returned in descending order.
My Mongodb Query is
db.collection.find({'action': 1, 'silent': 0}).sort({createdate: -1}).exec(function(err, post) {
console.log(post.length);
});
Earlier it worked fine for me, but now I have 121,000 entries in this collection and it returns null.
I know there is some confusion around .sort().
If I remove the sort then everything is fine. Example:
db.collection.find({'action': 1, 'silent': 0}).exec(function(err, post) {
console.log(post.length);// Now it returns data but not on Descending order
});
MongoDB limits the amount of data it will attempt to sort without an index.
This is because Mongo has to sort the data in memory or on disk, both of which can be expensive operations, particularly for queries run frequently.
In most cases, this can be alleviated by creating indexes on the fields you sort on.
You can create the index with:
db.collection.createIndex({ createdate: 1 })
A single-field index supports sorting in either direction, so this also serves your descending sort.
I have a document in my Mongo collection which has a field with the following structure:
{
    "_id" : "F7WNvjwnFZZ7HoKSF",
    "process" : [
        {
            "process_id" : "wTGqVk5By32mpXadZ",
            "stages" : [
                {
                    "stage_id" : "D6Huk89DGFsd29ds7",
                    "completed" : "N"
                },
                {
                    "stage_id" : "Msd390vekn09nvL23",
                    "completed" : "N"
                }
            ]
        }
    ]
}
I need to update the value of completed where the stage_id is equal to 'D6Huk89DGFsd29ds7' - the update query will not know which object in the stages array this value of stage_id will be in.
How do I do this?
Since you have nested arrays in your object, this is a bit tricky and I'm not sure this problem can be solved with just one update query.
However, if you happen to know the index of the matching object in the first array (in your case process[0]), you can write your update query like this:
db.collection.update(
{"process.stages.stage_id":"D6Huk89DGFsd29ds7"},
{$set:{"process.0.stages.$.completed":"Y"}}
);
The query above will work perfectly with your test case. However, there may be multiple objects in the process array, and there is no guarantee that the matching object will always be at index 0.
The solution proposed above will fail if process has multiple children and the matching object's index is not zero.
However, you can achieve your goal with client-side programming: find the matching document, modify it on the client side, and replace the whole document with the new content.
Since this approach is inefficient, I suggest altering your document structure to avoid nesting: create another collection and move the contents of the process array there.
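The client-side approach described above can be sketched in plain JavaScript (the helper name is my own); you would fetch the document, run something like this, then write the whole document back:

```javascript
// Walk the nested process/stages arrays and flip `completed` for the
// matching stage_id, wherever it sits. Returns true if a stage was found.
function completeStage(doc, stageId) {
  for (const proc of doc.process) {
    for (const stage of proc.stages) {
      if (stage.stage_id === stageId) {
        stage.completed = "Y";
        return true;
      }
    }
  }
  return false;
}

const doc = {
  _id: "F7WNvjwnFZZ7HoKSF",
  process: [{
    process_id: "wTGqVk5By32mpXadZ",
    stages: [
      { stage_id: "D6Huk89DGFsd29ds7", completed: "N" },
      { stage_id: "Msd390vekn09nvL23", completed: "N" }
    ]
  }]
};

completeStage(doc, "D6Huk89DGFsd29ds7");
// doc.process[0].stages[0].completed is now "Y"
```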
In the end, I removed the outer process block, so that process_id and stages sat at the root of the document, which made updating easier using:
MyColl.update(
{
_id: 'F7WNvjwnFZZ7HoKSF',
"stages.stage_id": 'D6Huk89DGFsd29ds7'
},
{
$set: {"stages.$.completed": 'Y'}
}
);
What's a good way to store a set of documents in MongoDB where order is important? I need to easily insert documents at an arbitrary position and possibly reorder them later.
I could assign each item an increasing number and sort by that, or I could sort by _id, but I don't know how I could then insert another document in between other documents. Say I want to insert something between an element with a sequence of 5 and an element with a sequence of 6?
My first guess would be to increment the sequence of all of the following elements so that there would be space for the new element using a query something like db.items.update({"sequence":{$gte:6}}, {$inc:{"sequence":1}}). My limited understanding of Database Administration tells me that a query like that would be slow and generally a bad idea, but I'm happy to be corrected.
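In memory, that renumbering looks like the following sketch of what the $inc update does to the sequence values (the helper is illustrative only):

```javascript
// Make room at `position` by shifting every sequence >= position up by one,
// then insert the new item; this mirrors
// db.items.update({sequence: {$gte: position}}, {$inc: {sequence: 1}}, {multi: true})
function insertAtSequence(items, position, newItem) {
  for (const item of items) {
    if (item.sequence >= position) item.sequence += 1;
  }
  items.push({ ...newItem, sequence: position });
  return items.sort((a, b) => a.sequence - b.sequence);
}

const items = [
  { name: "a", sequence: 5 },
  { name: "b", sequence: 6 }
];
// Insert "c" between the elements with sequence 5 and 6:
insertAtSequence(items, 6, { name: "c" });
// order is now a (5), c (6), b (7)
```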
I guess I could set the new element's sequence to 5.5, but I think that would get messy rather quickly. (Again, correct me if I'm wrong.)
I could use a capped collection, which has a guaranteed order, but then I'd run into issues if I needed to grow the collection. (Yet again, I might be wrong about that one too.)
I could have each document contain a reference to the next document, but that would require a query for each item in the list. (You'd get an item, push it onto the results array, and get another item based on the next field of the current item.) Aside from the obvious performance issues, I would also not be able to pass a sorted mongo cursor to my {#each} spacebars block expression and let it live update as the database changed. (I'm using the Meteor full-stack javascript framework.)
I know that everything has its advantages and disadvantages, and I might just have to use one of the options listed above, but I'd like to know if there is a better way to do things.
Based on your requirements, one approach could be to design your schema such that each document can hold more than one document, acting in itself as a capped container.
{
"_id":Number,
"doc":Array
}
Each document in the collection will act as a capped container, and the documents will be stored as an array in the doc field. Since doc is an array, it will maintain the order of insertion.
You can limit the number of documents per container to n. The _id field of each container document then increases in steps of n, indicating the number of documents a container document can hold.
By doing these you avoid adding extra fields to the document, extra indices, unnecessary sorts.
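With containers of capacity n and container _ids that are multiples of n, the container and in-array position of the k-th document (0-based) follow from simple arithmetic; a sketch under those assumptions:

```javascript
// For a 0-based global position k and container capacity n, the document
// lives in the container whose _id is the largest multiple of n <= k,
// at offset k % n inside that container's doc array.
function locate(k, n) {
  return { containerId: Math.floor(k / n) * n, position: k % n };
}

// With n = 5, the document at global position 7 sits in container _id=5
// at array index 2:
console.log(locate(7, 5)); // { containerId: 5, position: 2 }
```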
Inserting the very first record
i.e when the collection is empty.
var record = {"name" : "first"};
db.col.insert({"_id":0,"doc":[record]});
Inserting subsequent records
Identify the last container document's _id, and the number of
documents it holds.
If the number of documents it holds is less than n, then update the
container document with the new document, else create a new container
document.
Say that each container document can hold at most 5 documents, and we want to insert a new document.
var record = {"name" : "newlyAdded"};
// using aggregation, get the _id of the last inserted container, and the
// number of records it currently holds
db.col.aggregate( [ {
    $sort : { "_id" : 1 } // ensure $last picks the container with the max _id
}, {
    $group : {
        "_id" : null,
        "max" : { $max : "$_id" },
        "lastDoc" : { $last : "$doc" }
    }
}, {
    $project : {
        "_id" : 0,
        "currentMaxId" : "$max",
        "capSize" : { $size : "$lastDoc" }
    }
} ] ).forEach( function(check) {
    // once obtained, check if you need to update the last container or
    // create a new container and insert the document in it
    if (check.capSize < 5) {
        print("updating");
        // UPDATE
        db.col.update(
            { "_id" : check.currentMaxId },
            { $push : { "doc" : record } }
        );
    } else {
        print("inserting");
        // INSERT
        db.col.insert(
            { "_id" : check.currentMaxId + 5, "doc" : [ record ] }
        );
    }
});
Note that the aggregation runs on the server side and is very efficient. Also note that in versions before 2.6 the aggregation returned a single document rather than a cursor, so you would need to modify the above code to read from that document rather than iterating a cursor.
Inserting a new document in between documents
Now, if you would like to insert a new document between documents 1 and 2, we know that the document should fall inside the container with _id=0 and should be placed in the second position in the doc array of that container.
So we make use of the $each and $position operators to insert at a specific position.
var record = {"name" : "insertInMiddle"};
db.col.update(
{
"_id" : 0
}, {
$push : {
"doc" : {
$each : [record],
$position : 1
}
}
}
);
Handling Over Flow
Now we need to take care of documents overflowing each container. Say we insert a new document in the middle of the container with _id=0. If that container already has 5 documents, we need to move its last document to the next container, and so on until all containers hold documents within their capacity; if required, we create a new container at the end to hold the overflow.
This complex operation should be done on the server side. To handle it, we can create a script such as the one below and register it with MongoDB.
db.system.js.save( {
    "_id" : "handleOverFlow",
    "value" : function handleOverFlow(id) {
        var currDocArr = db.col.findOne({ "_id" : id }).doc;
        var count = currDocArr.length;
        var nextColId = id + 5;
        // container still within capacity, nothing to do
        if (count <= 5)
            return;
        // take the last doc and push it to the front of the next
        // container's array, creating that container if needed
        print("updating container: " + id);
        var record = currDocArr.splice(currDocArr.length - 1, 1);
        db.col.update(
            { "_id" : nextColId },
            { $push : { "doc" : { $each : record, $position : 0 } } },
            { upsert : true }
        );
        // write the trimmed array back to the original container
        db.col.update(
            { "_id" : id },
            { $set : { "doc" : currDocArr } }
        );
        // check overflow for the subsequent containers, recursively
        handleOverFlow(nextColId);
    }
});
So after every in-between insertion, we can invoke this function by passing the container id: handleOverFlow(containerId).
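The same cascade can be reasoned about in plain JavaScript; the sketch below models the containers as an in-memory array (collection access and container creation are simplified, and the helper name is my own):

```javascript
// containers: array of { _id, doc } ordered by _id, each holding at most
// `capacity` docs. After an insertion pushes one container past capacity,
// ripple the last document of each over-full container into the next one.
function handleOverflow(containers, startIndex, capacity) {
  for (let i = startIndex; i < containers.length; i++) {
    const c = containers[i];
    if (c.doc.length <= capacity) return containers; // cascade stops here
    const overflow = c.doc.pop();
    if (i + 1 === containers.length) {
      // no next container yet: create one with the next step-n _id
      containers.push({ _id: c._id + capacity, doc: [] });
    }
    containers[i + 1].doc.unshift(overflow);
  }
  return containers;
}

const containers = [
  { _id: 0, doc: ["a", "b", "c", "d", "e", "f"] }, // one over capacity 5
  { _id: 5, doc: ["g"] }
];
handleOverflow(containers, 0, 5);
// container 0 holds a..e, container 5 now starts with "f"
```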
Fetching all the records in order
Just use the $unwind operator in the aggregation pipeline.
db.col.aggregate([{$unwind:"$doc"},{$project:{"_id":0,"doc":1}}]);
Re-Ordering Documents
You can store each document in a capped container with an "_id" field:
.."doc":[{"_id":0,"name":"xyz",...}..]..
Get hold of the "doc" array of the capped container of which you want
to reorder items.
var docArray = db.col.findOne({"_id":0}).doc;
Update their ids so that after sorting the order of the item will change.
Sort the array based on their _ids.
docArray.sort( function(a, b) {
return a._id - b._id;
});
Update the capped container with the new doc array.
But then again, everything boils down to which approach is feasible and suits your requirement best.
Coming to your questions:
What's a good way to store a set of documents in MongoDB where order is important? I need to easily insert documents at an arbitrary
position and possibly reorder them later.
Documents as Arrays.
Say I want to insert something between an element with a sequence of 5 and an element with a sequence of 6?
Use the $each and $position operators in the db.collection.update() function as depicted in my answer.
My limited understanding of Database Administration tells me that a
query like that would be slow and generally a bad idea, but I'm happy
to be corrected.
Yes. It would impact performance, unless the collection holds very little data.
I could use a capped collection, which has a guaranteed order, but then I'd run into issues if I needed to grow the collection. (Yet
again, I might be wrong about that one too.)
Yes. With Capped Collections, you may lose data.
An _id field in MongoDB is a unique, indexed key similar to a primary key in relational databases. If there is an inherent order in your documents, ideally you should be able to associate a unique key to each document, with the key value reflecting the order. So while preparing your document for insertion, explicitly add an _id field as this key (if you do not, mongo creates it automatically with a BSON objectid).
As far as retrieving the results is concerned, MongoDB does not guarantee the order of returned documents unless you explicitly use .sort(). If you do not use .sort(), the results are usually returned in natural order (order of insertion). Again, there is no guarantee of this behavior.
I'd advise you to override _id with your order while inserting, and use a sort while retrieving. Since _id is a necessary and auto-indexed entity, you will not be wasting any space defining a sort key, and storing the index for it.
For arbitrary sorting of any collection, you'll need a field to sort it on. I call mine "sequence".
schema:
{
_id: ObjectID,
sequence: Number,
...
}
db.items.createIndex({sequence:1});
db.items.find().sort({sequence:1})
Here is a link to some general sorting database answers that may be relevant:
https://softwareengineering.stackexchange.com/questions/195308/storing-a-re-orderable-list-in-a-database/369754
I suggest going with Floating point solution - adding a position column:
Use a floating-point number for the position column.
You can then reorder the list changing only the position column in the "moved" row.
If your user wants to position "red" after "blue" but before "yellow", then you just need to calculate
red.position = ((yellow.position - blue.position) / 2) + blue.position
After a few re-positionings in the same place (cutting the gap in half every time) you might hit a wall, so it's better to re-sort the list once the gap shrinks below a certain threshold.
When retrieving, you can simply sort on the position field to get the list in order, with no need for any client-side code (as in the linked-list solution).
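The midpoint calculation above can be wrapped in a small helper; the re-sort threshold here is an arbitrary choice of mine:

```javascript
// Position for an item moved between `before` and `after`; when the gap
// gets too small (after repeated halving), signal that a renumber is due.
function positionBetween(before, after, minGap = 1e-9) {
  const pos = (after - before) / 2 + before;
  const needsRenumber = (after - before) < minGap;
  return { pos, needsRenumber };
}

// Moving "red" between blue (1.0) and yellow (2.0):
console.log(positionBetween(1.0, 2.0).pos); // 1.5
```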
In our product we have a core collection which can be accessed from a distributed set of workers.
I want to be able to get a document from the collection without any of the workers accidentally picking up the same document.
The best way I've come up with so far for managing to prevent duplicate records being loaded is the following:
Having 2 separate collections with the following basic structure:
core: { _id: '{mongoGeneratedId}', locked: false, lockTimeout: 0}
lock: { _id: null, lockTimeout: 0}
(lockTimeout would have a TTL index)
A worker would run a query that looks something like this:
db.core.findOne({
    $or: [
        {locked: false},
        {lockTimeout: {$lt: currentTime}}
    ]
})
and would have a record returned to it.
To test whether the record has been grabbed by another worker and lock it, the worker would then try to insert a record into lock with a lockTimeout 5 minutes in the future and an _id equal to the _id of the document from the core collection.
If this fails, then we know that another worker pipped us to the post, and we want to try running the query again. If it succeeds, then we update core to have locked set to true and lockTimeout set to the same value as in the lock collection.
Apart from adding some form of slightly more complicated ordering to reduce the chances of two workers picking up the same record, I believe this should work.
However, it doesn't feel elegant and I feel like there should be a better way that doesn't require me to create a secondary collection just to keep track of locking.
Does such a thing exist? Kind regards!
Try using the findAndModify command. This command atomically updates a document and returns the document (default pre-, optionally post-update). You can use the atomic update to lock the document as you grab it:
> db.queue.insert({ "x" : 1, "locked" : false })
> db.queue.findAndModify({
"query" : { "locked" : false },
"update" : { "$set" : { "locked" : true } }
})
{ "_id" : ObjectId("53ea6f0ef9b63e0dd3ca1a1f"), "x" : 1, "locked" : false }
You can also remove the document atomically. Check out the link for all of the features that could help for your queue-like use case and to read more about the command's behavior.
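The claim condition from the question (locked is false, or the lock has timed out) combined with the atomic grab can be simulated in memory; a sketch, with a plain array standing in for the collection and the helper name my own:

```javascript
// Claim the first document that is either unlocked or whose lock has
// expired, mirroring the findAndModify query
// {$or: [{locked: false}, {lockTimeout: {$lt: now}}]} with a $set lock.
// (Single-threaded here; findAndModify gives you this atomically.)
function claimOne(docs, now, leaseMs) {
  for (const doc of docs) {
    if (!doc.locked || doc.lockTimeout < now) {
      doc.locked = true;
      doc.lockTimeout = now + leaseMs;
      return doc;
    }
  }
  return null; // nothing claimable right now
}

const docs = [
  { _id: 1, locked: true, lockTimeout: 50 },  // stale lock, expired at 50
  { _id: 2, locked: false, lockTimeout: 0 }
];
const claimed = claimOne(docs, 100, 5 * 60 * 1000);
// _id 1 is claimed because its lock expired before now=100
```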
I am attempting to use Map/Reduce to accomplish partial merges into an existing collection. I have the MR working correctly but am having troubles returning the merged results.
Here are the stats on the MR with the reduce output mode:
{
"result" : "calculation",
"timeMillis" : 222,
"counts" : {
"input" : 492,
"emit" : 920,
"reduce" : 64,
"output" : 435078
},
"ok" : 1.0
}
I would expect output to be the number of docs actually merged, not the entire collection. Is there any way to do this?
I tried to merge a modified:true flag into the target docs. This way a query could be made that returns only the documents that were modified in the target collection. After the query, I then set flag back to false.
While this works correctly, it starts thrashing the index because of the massive number of changes being made and then flipped back, so disk I/O shoots up and MR performance plummets.
Ideally, calling result.GetResults() from the C# driver would naturally return the documents that were modified by the MR without the need to use flags.
Update:
Specifically, I have one collection that is "write only" which the MR runs on to merge into a "read" collection.
If there was a document set like
{
"_id":BsonId,
"key":"key1",
"valarray":["one"]
},
{
"_id":BsonId,
"key":"key2",
"valarray":["one"]
}
then MR into the blank query collection would yield
{
"_id":"key1",
"value":
{
"valarray":["one"]
}
},
{
"_id":"key2",
"value":
{
"valarray":["one"]
}
}
and I would expect that the counts would be: input = 2, emit = 2, reduce = 0, output = 2
If then there was a new document inserted into the write collection
{
"_id":BsonId,
"key":"key1",
"valarray":["two"],
}
then the map-reduce collection would be
{
"_id":"key1",
"value":
{
"valarray":["one", "two"]
}
},
{
"_id":"key2",
"value":
{
"valarray":["one"]
}
}
The counts are then: input = 1, emit = 1, reduce = 1, output = 2
And through the C# driver, calling result.GetResults() would iterate over the whole target collection. The issue is that I do not want to iterate over the collection, I only want to iterate over the documents in the target collection that were modified by the MR. In this case, it should return "_id":"key1" but not "_id":"key2".
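The merge that the reduce step performs in this example can be expressed in plain JavaScript, which also makes it easy to see which keys were actually touched (a sketch, not the actual mapReduce; the helper name is my own):

```javascript
// Fold write-collection docs into a read-collection map keyed by `key`,
// concatenating valarray values, the same merge the reduce performs.
function mergeIntoRead(read, writes) {
  const modified = [];
  for (const w of writes) {
    if (!read[w.key]) {
      read[w.key] = { valarray: [] };
    }
    read[w.key].valarray.push(...w.valarray);
    modified.push(w.key); // only the touched keys: the set the question
                          // wants GetResults() to return
  }
  return modified;
}

const read = { key1: { valarray: ["one"] }, key2: { valarray: ["one"] } };
const modified = mergeIntoRead(read, [{ key: "key1", valarray: ["two"] }]);
// modified is ["key1"]; read.key1.valarray is ["one", "two"]
```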
Nutshell of the problem: you have a relatively small number of documents to merge, but the merge is thrashing against the whole collection, and you don't want that.
The thing here is that the reduce function is applied not only to the output documents resulting from the input stage but, of course, to the documents that already exist; so the implementation runs the reduce over the whole output collection in order to merge the results.
So what you want is a targeted result, where only the documents being updated are actually modified. There is a way I can see to achieve this but it is going to take some steps. And a bit more code.
1. Run your regular mapReduce operation, but instead of directing the output to your target collection, output to a temporary input collection.
2. Using the keys from that output, get the matching documents from your target and insert them into a temporary target collection.
3. Run a modified mapReduce that takes the temporary input and applies your reduce function across it and the temporary target collection items. This part does the work you want, but only on the items to be updated and in a smaller collection.
4. Once merged, take that output and apply it with update operations on your main target.
Once you think of it that way, you have a workaround that gets the results you want into the target without the output stage thrashing over all of the collection's documents. The trade-off is the extra steps, but the gains would seem to outweigh the performance problems incurred by doing this in one step.