MongoDB: update a key on all documents using forEach

I want to update in Mongo the 'order' field to all of my documents so they will be 1..2..3..4....34.
After running this, they all have "order": "34".
What am I doing wrong?
var i = 1;
db.images.find().forEach(function() {
    db.images.update(
        {},
        { "$set": { "order": NumberInt(i) } },
        { multi: true }
    );
    i++;
})

multi: true means all documents matching the query will be updated, and your query is {}, which matches every document. So on each iteration you are setting order on all the documents, and after the final iteration they all hold the last value of i (34).
Also, snapshot mode has to be enabled on the cursor to ensure that the same document isn't returned more than once. (Note that cursor.snapshot() was removed in MongoDB 4.0; on modern versions, hinting or sorting on the _id index gives a similarly stable iteration order.)
You could try this:
var i = 1;
db.images.find().snapshot().forEach(function(image) {
    db.images.update(
        { "_id": image._id },
        { "$set": { "order": NumberInt(i) } }
    );
    i++;
})
From a performance standpoint, it is better to use the bulk write API (bulkWrite), which sends all the updates to the server in a single batch instead of one round trip per document.
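For example, here is a minimal bulkWrite sketch of the same sequential numbering (the images collection and order field come from the question; the rest is illustrative):
var i = 1;
var ops = [];
db.images.find().sort({ _id: 1 }).forEach(function(image) {
    ops.push({
        updateOne: {
            filter: { _id: image._id },
            update: { $set: { order: NumberInt(i) } }
        }
    });
    i++;
});
// All updates go to the server in a single round trip
if (ops.length > 0) {
    db.images.bulkWrite(ops);
}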

Related

Remove empty field's name from document

I've managed to add a large number of empty keys to a MongoDB collection. As empty keys are not allowed in MongoDB, I'm having a hard time unsetting them using the regular methods. Is there a workaround for this kind of problem using the mongo shell?
{
    "foo": "bar",
    "": "I should not exist, I have no key..."
}
We can delete the empty-string field using bulk operations.
We need to iterate over the cursor snapshot and then use the bulkWrite method to bulk-replace the documents. Note that we need to replace each document because we can't $unset or rename the empty field, so an update operation is not possible here.
let requests = [];
db.coll.find( { "": { "$exists": true } } ).snapshot().forEach( document => {
    delete document[""];
    requests.push( {
        "replaceOne": {
            "filter": { "_id": document._id },
            "replacement": document
        }
    });
    if ( requests.length === 1000 ) {
        // Execute per 1000 operations and re-init
        db.coll.bulkWrite(requests);
        requests = [];
    }
});
// Clean up queues
if ( requests.length > 0 ) {
    db.coll.bulkWrite(requests);
}
Give this a try:
db.collection.find().forEach(function(doc) {
    delete doc[''];
    db.collection.save(doc);
});
Newer versions of MongoDB have built-in abilities to manage documents that have errors in them, such as empty keys.
If you are on version 5.0 or later, you can use the $setField expression with an expressive pipeline update:
db.c.updateMany(
    { "": { "$exists": true } },
    [ { $replaceWith: { $setField: { field: "", input: "$$ROOT", value: "$$REMOVE" } } } ]
)
If you are on 4.2 through 4.4 (pre-5.0), you can do it with a different pipeline update:
db.c.updateMany(
    { "": { "$exists": true } },
    [ { $replaceWith: { $arrayToObject: { $filter: {
        input: { $objectToArray: "$$ROOT" },
        cond: { $ne: [ "$$this.k", "" ] }
    } } } } ]
)
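Whichever variant you use, a quick sanity check confirms the cleanup worked (countDocuments is available in modern shells; coll is the collection from above):
// Should print 0 once all empty keys are gone
db.coll.countDocuments({ "": { "$exists": true } })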

MongoDB - Remove Folder

I'm trying to delete all fields in MongoDB whose names contain a number higher than 10. Can you tell me how to do that?
I've been trying desperately for hours...
Thanks very much!
You need a mechanism to get a list of the keys in the collection first, filter that list for the ones containing a number greater than 10, and then generate the query you will use with the $unset operator in your update. Your update document should have this structure:
var update = {
    "$unset": {
        "p11": "",
        "p12": "",
        ...
    }
}
which you will use in your update as
db.collection.update({}, update, {multi: true});
You can use the mapReduce() command to generate that update document. The following map-reduce operation will populate a separate collection with the update document as the value:
db.collection.mapReduce(
    function() {
        var map = this;
        for (var key in map) {
            if (map.hasOwnProperty(key)) {
                // Strip everything but digits (and dots) from the key name
                var num = parseInt(key.replace(/[^\d.]/g, ''), 10);
                if (num > 10) emit(null, key);
            }
        }
    },
    function(key, values) {
        // Fold the matching key names into a single { key: "" } document for $unset
        return values.reduce(function(o, v) {
            o[v] = "";
            return o;
        }, {});
    },
    { "out": "filtered_keys" }
);
You can then run a query on the resultant collection to get the update document and do the actual update:
var update = {
        "$unset": db.filtered_keys.findOne({ "_id": null }).value
    },
    options = { "multi": true };
db.collection.update({}, update, options);
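On MongoDB 4.4 or newer you could skip mapReduce entirely and drop the fields in a single pipeline update. This is only a sketch, assuming the key names look like p11, p12, ... so the numeric part can be isolated by stripping the p prefix; $convert's onError keeps non-matching keys such as _id:
db.collection.updateMany({}, [
    { $replaceWith: {
        $arrayToObject: {
            $filter: {
                input: { $objectToArray: "$$ROOT" },
                // Keep a field only if its numeric part is <= 10
                cond: {
                    $lte: [
                        { $convert: {
                            input: { $replaceAll: { input: "$$this.k", find: "p", replacement: "" } },
                            to: "int",
                            onError: 0,   // non-numeric keys (e.g. _id) fall back to 0 and are kept
                            onNull: 0
                        } },
                        10
                    ]
                }
            }
        }
    } }
])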

MongoDB - change simple field into an object

In MongoDB, I want to change the structure of my documents from:
{
    discount: 10,
    discountType: "AMOUNT"
}
to:
{
    discount: {
        value: 10,
        type: "AMOUNT"
    }
}
so I tried following query in mongo shell:
db.discounts.update({},
    {
        $rename: {
            discount: "discount.value",
            discountType: "discount.type"
        }
    },
    { multi: true }
)
but it throws an error:
"writeError" : {
"code" : 2,
"errmsg" : "The source and target field for $rename must not be on the same path: discount: \"discount.value\""
}
A workaround that comes to my mind is to do it in 2 steps: first assign the new structure to a new field (let's say discount2) and then rename it to discount. But maybe there is a way to do it in one step?
The simplest way is to do it in two steps, as you allude to in your question: initially rename discount to a temporary field name so that it can be reused in the second step:
db.discounts.update({}, { $rename: { discount: 'temp' } }, { multi: true })
db.discounts.update({},
    { $rename: { temp: 'discount.value', discountType: 'discount.type' } },
    { multi: true })
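On MongoDB 4.2 or newer, the same restructuring can be done in one step with a pipeline update, because the right-hand side of the $set stage is evaluated against the existing document (a sketch):
db.discounts.updateMany({}, [
    // Build the sub-document from the old top-level values
    { $set: { discount: { value: "$discount", type: "$discountType" } } },
    // Drop the now-redundant field
    { $unset: "discountType" }
])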
The reason you are getting this error is because as mentioned in the documentation:
The $rename operator logically performs an $unset of both the old name and the new name, and then performs a $set operation with the new name. As such, the operation may not preserve the order of the fields in the document; i.e. the renamed field may move within the document.
And the problem with this is that you can't $set and $unset same field at the same time in MongoDB.
The solution is to update your documents using a field name that doesn't already exist in your collection (discounts below instead of discount), and the best way to do this is with "Bulk" operations, for maximum efficiency.
MongoDB 3.2 or newer
MongoDB 3.2 deprecates Bulk() and its associated methods. You need to use the .bulkWrite() method.
var operations = [];
db.discounts.find().forEach(function(doc) {
    var discount = doc.discount;
    var discountType = doc.discountType;
    var operation = { 'updateOne': {
        'filter': { '_id': doc._id },
        'update': {
            '$unset': { 'discount': '', 'discountType': '' },
            '$set': { 'discounts.value': discount, 'discounts.type': discountType }
        }
    }};
    operations.push(operation);
});
// Options go in the second argument to bulkWrite, not into the operations array
db.discounts.bulkWrite(operations, {
    ordered: true,
    writeConcern: { w: "majority", wtimeout: 5000 }
});
Which yields:
{
    "_id" : ObjectId("56682a02e6a2321d88f6d078"),
    "discounts" : {
        "value" : 10,
        "type" : "AMOUNT"
    }
}
MongoDB 2.6
On versions prior to MongoDB 3.2 but at least 2.6, you can use the "Bulk" API.
var bulk = db.discounts.initializeOrderedBulkOp();
var count = 0;
db.discounts.find().forEach(function(doc) {
    var discount = doc.discount;
    var discountType = doc.discountType;
    bulk.find( { '_id': doc._id } ).updateOne( {
        '$unset': { 'discount': '', 'discountType': '' },
        '$set': { 'discounts.value': discount, 'discounts.type': discountType } });
    count++;
    if (count % 500 === 0) {
        // Execute every 500 operations and re-initialize
        bulk.execute();
        bulk = db.discounts.initializeOrderedBulkOp();
    }
})
// Only execute if operations are still queued; executing an empty
// bulk (when count is an exact multiple of 500) would throw
if (count % 500 !== 0)
    bulk.execute();
This query yields the same result as the previous one.
Thanks to the answers from Update MongoDB field using value of another field, I figured out the following solution:
db.discounts.find().snapshot().forEach(
    function(elem) {
        elem.discount = {
            value: elem.discount,
            type: elem.discountType
        }
        delete elem.discountType;
        db.discounts.save(elem);
    }
)
Which I quite like because the source code reads nicely, but performance suffers for large numbers of documents.

Rename a sub-document field within an Array

Considering the document below how can I rename 'techId1' to 'techId'. I've tried different ways and can't get it to work.
{
    "_id" : ObjectId("55840f49e0b"),
    "__v" : 0,
    "accessCard" : "123456789",
    "checkouts" : [
        {
            "user" : ObjectId("5571e7619f"),
            "_id" : ObjectId("55840f49e0bf"),
            "date" : ISODate("2015-06-19T12:45:52.339Z"),
            "techId1" : ObjectId("553d9cbcaf")
        },
        {
            "user" : ObjectId("5571e7619f15"),
            "_id" : ObjectId("55880e8ee0bf"),
            "date" : ISODate("2015-06-22T13:01:51.672Z"),
            "techId1" : ObjectId("55b7db39989")
        }
    ],
    "created" : ISODate("2015-06-19T12:47:05.422Z"),
    "date" : ISODate("2015-06-19T12:45:52.339Z"),
    "location" : ObjectId("55743c8ddbda"),
    "model" : "model1",
    "order" : ObjectId("55840f49e0bf"),
    "rid" : "987654321",
    "serialNumber" : "AHSJSHSKSK",
    "user" : ObjectId("5571e7619f1"),
    "techId" : ObjectId("55b7db399")
}
In the mongo console I tried the following, which returns ok but nothing is actually updated:
collection.update({"checkouts._id":ObjectId("55840f49e0b")},{ $rename: { "techId1": "techId" } });
I also tried this, which gives me the error "cannot use the part (checkouts of checkouts.techId1) to traverse the element":
collection.update({"checkouts._id":ObjectId("55856609e0b")},{ $rename: { "checkouts.techId1": "checkouts.techId" } })
In mongoose I have tried the following.
collection.findByIdAndUpdate(id, { $rename: { "checkouts.techId1": "checkouts.techId" } }, function (err, data) {});
and
collection.update({'checkouts._id': n1._id}, { $rename: { "checkouts.$.techId1": "checkouts.$.techId" } }, function (err, data) {});
Thanks in advance.
You were close at the end, but there are a few things missing. You cannot $rename when using the positional operator; instead you need to $set the new name and $unset the old one. And there is another restriction here: since both fields share "checkouts" as a parent path, you cannot do both in the same update.
The other core line in your question is "traverse the element" and that is the one thing you cannot do in updating "all" of the array elements at once. Well, not safely and without possibly overwriting new data coming in anyway.
What you need to do is "iterate" each document and similarly iterate each array member in order to "safely" update. You cannot really iterate just the document and "save" the whole array back with alterations. Certainly not in the case where anything else is actively using the data.
I personally would run this sort of operation in the MongoDB shell if you can, as it is a "one off" ( hopefully ) thing and this saves the overhead of writing other API code. Also we're using the Bulk Operations API here to make this as efficient as possible. With mongoose it takes a bit more digging to implement, but still can be done. But here is the shell listing:
var bulk = db.collection.initializeOrderedBulkOp(),
    count = 0;
db.collection.find({ "checkouts.techId1": { "$exists": true } }).forEach(function(doc) {
    doc.checkouts.forEach(function(checkout) {
        if ( checkout.hasOwnProperty("techId1") ) {
            bulk.find({ "_id": doc._id, "checkouts._id": checkout._id }).updateOne({
                "$set": { "checkouts.$.techId": checkout.techId1 }
            });
            bulk.find({ "_id": doc._id, "checkouts._id": checkout._id }).updateOne({
                "$unset": { "checkouts.$.techId1": 1 }
            });
            count += 2;
            if ( count % 500 == 0 ) {
                bulk.execute();
                bulk = db.collection.initializeOrderedBulkOp();
            }
        }
    });
});
if ( count % 500 !== 0 )
    bulk.execute();
Since the $set and $unset operations are happening in pairs, we are keeping the batch size to 500 operations per execution just to keep memory usage on the client down.
The loop simply looks for documents where the field to be renamed "exists" and then iterates each array element of each document and commits the two changes. As Bulk Operations, these are not sent to the server until .execute() is called, and a single response is returned for each such call. This saves a lot of traffic.
If you insist on coding with mongoose, be aware that a .collection accessor is required to get to the Bulk API methods from the core driver, like this:
var bulk = Model.collection.initializeOrderedBulkOp();
And the only thing that actually sends anything to the server is the .execute() method, so this is your only execution callback:
bulk.execute(function(err, response) {
    // code body and async iterator callback here
});
And use async flow control instead of .forEach(), such as async.each, as in the sketch below.
Also, if you do that, then be aware that as a raw driver method not governed by mongoose, you do not get the same database connection awareness as you do with mongoose methods. Unless you know for sure the database connection is already established, it is safer to put this code within an event callback for the server connection:
mongoose.connection.on("open", function(err) {
    // body of code
});
But otherwise those are the only real ( apart from call syntax ) alterations you really need.
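Putting those pieces together, a minimal sketch of the mongoose variant (the Model name, query, and batch size are assumptions carried over from the shell listing; async is the async flow-control library):
var async = require('async');

mongoose.connection.on("open", function() {
    var bulk = Model.collection.initializeOrderedBulkOp();
    var count = 0;

    Model.find({ "checkouts.techId1": { "$exists": true } }).lean().exec(function(err, docs) {
        if (err) throw err;
        async.eachSeries(docs, function(doc, callback) {
            doc.checkouts.forEach(function(checkout) {
                if (checkout.hasOwnProperty("techId1")) {
                    bulk.find({ "_id": doc._id, "checkouts._id": checkout._id })
                        .updateOne({ "$set": { "checkouts.$.techId": checkout.techId1 } });
                    bulk.find({ "_id": doc._id, "checkouts._id": checkout._id })
                        .updateOne({ "$unset": { "checkouts.$.techId1": 1 } });
                    count += 2;
                }
            });
            if (count >= 500) {
                // Flush the batch before moving on to the next document
                count = 0;
                bulk.execute(function(err) {
                    bulk = Model.collection.initializeOrderedBulkOp();
                    callback(err);
                });
            } else {
                callback();
            }
        }, function(err) {
            if (err) throw err;
            if (count > 0) bulk.execute(function(err) {
                if (err) throw err;
            });
        });
    });
});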
This worked for me. I created this query to perform the procedure and I am sharing it (although I know it is not the most optimized way):
First, run an aggregation that (1) $matches the documents that have the checkouts array field with techId1 as one of the keys of a sub-document, (2) $unwinds the checkouts field (deconstructing the array field from the input documents to output a document for each element), (3) adds the techId field (with $addFields), (4) removes the old techId1 field (with $project), (5) $groups the documents by _id to gather the checkout sub-documents back into an array, and (6) writes the result of this aggregation to a temporary collection (with $out).
const collection = 'yourCollection'
db[collection].aggregate([
    {
        $match: {
            'checkouts.techId1': { '$exists': true }
        }
    },
    {
        $unwind: {
            path: '$checkouts'
        }
    },
    {
        $addFields: {
            'checkouts.techId': '$checkouts.techId1'
        }
    },
    {
        $project: {
            'checkouts.techId1': 0
        }
    },
    {
        $group: {
            '_id': '$_id',
            // Push the whole sub-document back so its other fields
            // (user, _id, date) are not lost in the regrouping
            'checkouts': { $push: '$checkouts' }
        }
    },
    {
        $out: 'temporal'
    }
])
Then, you can run another aggregation from this temporal collection to $merge the documents with the modified checkouts field back into your original collection (note that $merge requires MongoDB 4.2 or newer).
db.temporal.aggregate([
    {
        $merge: {
            into: collection,
            on: "_id",
            whenMatched: "merge",
            whenNotMatched: "insert"
        }
    }
])
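For what it's worth, on MongoDB 4.2 or newer the same rename can also be done in place with a single pipeline update and $map, avoiding the temporary collection entirely (a sketch, not tested against the original data):
db.collection.updateMany(
    { "checkouts.techId1": { "$exists": true } },
    [ { "$set": {
        "checkouts": {
            "$map": {
                "input": "$checkouts",
                "as": "co",
                "in": {
                    "$mergeObjects": [
                        // The sub-document minus the old key...
                        { "$arrayToObject": {
                            "$filter": {
                                "input": { "$objectToArray": "$$co" },
                                "as": "kv",
                                "cond": { "$ne": [ "$$kv.k", "techId1" ] }
                            }
                        } },
                        // ...plus the same value under the new key
                        { "techId": "$$co.techId1" }
                    ]
                }
            }
        }
    } } ]
)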

How to limit number of updating documents in mongodb

How can I implement something similar to db.collection.find().limit(10), but while updating documents?
Right now I'm using something really crappy: getting documents with db.collection.find().limit() and then updating them.
In general, I want to return a given number of records and change one field in each of them.
Thanks.
You can use:
db.collection.find().limit(NUMBER_OF_ITEMS_YOU_WANT_TO_UPDATE).forEach(
    function (e) {
        e.fieldToChange = "blah";
        // ....
        db.collection.save(e);
    }
);
(Credits for forEach code: MongoDB: Updating documents using data from the same document)
What this will do is change only the number of entries you specify. So if you want to add a field called "newField" with value 1 to only half of the entries in "collection", for example, you can put in:
db.collection.find().limit(db.collection.count() / 2).forEach(
    function (e) {
        e.newField = 1;
        db.collection.save(e);
    }
);
If you then want to make the other half also have "newField" but with value 2, you can do an update with the condition that newField doesn't exist:
db.collection.update( { newField : { $exists : false } }, { $set : { newField : 2 } }, {multi : true} );
Using forEach to individually update each document is slow. You can update the documents in bulk using
ids = db.collection.find(<condition>).limit(<limit>).map(
    function(doc) {
        return doc._id;
    }
);
db.collection.updateMany({ _id: { $in: ids } }, <update>)
The solutions that iterate over all objects then update them individually are very slow.
Retrieving them all then updating simultaneously using $in is more efficient.
ids = People.where(firstname: 'Pablo').limit(10000).only(:_id).to_a.map(&:id)
People.in(_id: ids).update_all(lastname: 'Cantero')
The query is written using Mongoid, but can be easily rewritten in Mongo Shell as well.
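For reference, a rough mongo shell equivalent of that Mongoid snippet (assuming a people collection with the same fields):
var ids = db.people.find({ firstname: "Pablo" }, { _id: 1 }).limit(10000)
    .toArray().map(function(doc) { return doc._id; });
db.people.updateMany({ _id: { $in: ids } }, { $set: { lastname: "Cantero" } });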
Unfortunately the workaround you have is the only way to do it AFAIK. There is a boolean flag multi which will either update all the matches (when true) or update the 1st match (when false).
As the answer states, there is still no way to limit the number of documents to update (or delete) to a value > 1. A workaround is to use something like:
db.collection.find(<condition>).limit(<limit>).forEach(function(doc) {
    db.collection.update({ _id: doc._id }, { <your update> })
})
If your _id is a sequence number and not an ObjectId, you can do this in a for loop:
let batchSize = 10;
for (let i = 0; i <= 1000000; i += batchSize) {
    db.collection.update(
        { $and: [ { "_id": { $lte: i + batchSize } }, { "_id": { $gt: i } } ] },
        { <your update> },
        { multi: true }
    )
}
let fetchStandby = await db.model.distinct("key", {});
fetchStandby = fetchStandby.slice(0, no_of_docs_to_be_updated)
let fetch = await db.model.updateMany({
    key: { $in: fetchStandby }
}, {
    $set: { "qc.status": "pending" }
})
I also recently wanted something like this. I think querying for a long list of _ids just to update them with $in is perhaps slow too, so I tried to use an aggregation + $merge:
while (true) {
    const record = db.records.findOne({ isArchived: false }, { _id: 1 })
    if (!record) {
        print("No more records")
        break
    }
    db.records.aggregate([
        { $match: { isArchived: false } },
        { $limit: 100 },
        {
            $project: {
                _id: 1,
                isArchived: { $literal: true },
                updatedAt: { $literal: new Date() }
            }
        },
        {
            $merge: {
                into: "records",
                on: "_id",
                whenMatched: "merge"
            }
        }
    ])
    print("Done update")
}
But feel free to comment if this is better or worse than a bulk update with $in.
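A middle ground worth trying (a sketch along the same lines, using the records collection from above) is to stream the limited cursor straight into one bulkWrite, which avoids a separate round trip for the _id list while still applying the updates as a single batch:
const ops = db.records.find({ isArchived: false }, { _id: 1 }).limit(100)
    .toArray()
    .map(doc => ({
        updateOne: {
            filter: { _id: doc._id },
            update: { $set: { isArchived: true, updatedAt: new Date() } }
        }
    }));
if (ops.length > 0) {
    db.records.bulkWrite(ops);
}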