In MongoDB, is it possible to update the value of a field using the value from another field? The equivalent SQL would be something like:
UPDATE Person SET Name = FirstName + ' ' + LastName
And the MongoDB pseudo-code would be:
db.person.update( {}, { $set : { name : firstName + ' ' + lastName } } );
The best way to do this is with version 4.2+, which allows using an aggregation pipeline in the update document with the updateOne, updateMany, or update (deprecated in most, if not all, language drivers) collection methods.
MongoDB 4.2+
Version 4.2 also introduced the $set pipeline stage operator, which is an alias for $addFields. I will use $set here as it maps directly to what we are trying to achieve.
db.collection.<update method>(
{},
[
{"$set": {"name": { "$concat": ["$firstName", " ", "$lastName"]}}}
]
)
Note the square brackets in the second argument to the method: they specify an aggregation pipeline instead of a plain update document. A plain document will not work correctly here, because field references such as "$firstName" are only evaluated inside a pipeline.
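To see the difference, compare a plain update document with a pipeline on the same data (a minimal sketch; db.person and the field names come from the question, the sample values are made up):
// plain update document: "$firstName" is stored as a literal string
db.person.updateMany({}, { "$set": { "name": "$firstName" } })
// result: { firstName: "Alice", lastName: "Doe", name: "$firstName" }
// aggregation pipeline (square brackets): "$firstName" is evaluated as a field path
db.person.updateMany({}, [ { "$set": { "name": { "$concat": [ "$firstName", " ", "$lastName" ] } } } ])
// result: { firstName: "Alice", lastName: "Doe", name: "Alice Doe" }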
MongoDB 3.4+
In 3.4+, you can use $addFields and the $out aggregation pipeline operators.
db.collection.aggregate(
[
{ "$addFields": {
"name": { "$concat": [ "$firstName", " ", "$lastName" ] }
}},
{ "$out": <output collection name> }
]
)
Note that this does not update your collection in place; it replaces the existing collection or creates a new one. Also, for update operations that require "typecasting", you will need client-side processing, and depending on the operation, you may need to use the find() method instead of the .aggregate() method.
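If you would rather not overwrite the original while testing, you can $out to a scratch collection first and inspect the result (a sketch; person_with_name is a hypothetical collection name):
db.person.aggregate([
    { "$addFields": { "name": { "$concat": [ "$firstName", " ", "$lastName" ] } } },
    { "$out": "person_with_name" } // writes a transformed copy; the original person collection is untouched
])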
MongoDB 3.2 and 3.0
The way we do this is by $projecting our documents and using the $concat string aggregation operator to return the concatenated string.
You then iterate the cursor and use the $set update operator to add the new field to your documents using bulk operations for maximum efficiency.
Aggregation query:
var cursor = db.collection.aggregate([
{ "$project": {
"name": { "$concat": [ "$firstName", " ", "$lastName" ] }
}}
])
MongoDB 3.2 or newer
You need to use the bulkWrite method.
var requests = [];
cursor.forEach(document => {
requests.push( {
'updateOne': {
'filter': { '_id': document._id },
'update': { '$set': { 'name': document.name } }
}
});
if (requests.length === 500) {
//Execute per 500 operations and re-init
db.collection.bulkWrite(requests);
requests = [];
}
});
if(requests.length > 0) {
db.collection.bulkWrite(requests);
}
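If the individual updates are independent of each other, you can also pass { ordered: false } so a single failed operation does not abort the rest of the batch (a sketch of a standard bulkWrite option, not something the original answer used):
db.collection.bulkWrite(requests, { ordered: false });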
MongoDB 2.6 and 3.0
For these versions, you need to use the now deprecated Bulk API and its associated methods.
var bulk = db.collection.initializeUnorderedBulkOp();
var count = 0;
cursor.forEach(function(document) { // aggregation cursors do not support snapshot()
bulk.find({ '_id': document._id }).updateOne( {
'$set': { 'name': document.name }
});
count++;
if(count%500 === 0) {
// Execute per 500 operations and re-init
bulk.execute();
bulk = db.collection.initializeUnorderedBulkOp();
}
})
// clean up queues
if(count > 0) {
bulk.execute();
}
MongoDB 2.4
cursor["result"].forEach(function(document) {
db.collection.update(
{ "_id": document._id },
{ "$set": { "name": document.name } }
);
})
You should iterate through the documents. For your specific case:
db.person.find().snapshot().forEach(
function (elem) {
db.person.update(
{
_id: elem._id
},
{
$set: {
name: elem.firstName + ' ' + elem.lastName
}
}
);
}
);
Apparently there is a way to do this efficiently since MongoDB 3.4; see styvane's answer.
Obsolete answer below
You cannot refer to the document itself in an update (yet). You'll need to iterate through the documents and update each document using a function. See this answer for an example, or this one for server-side eval().
For a database with high activity, you may run into issues where your updates affect actively changing records; for this reason, I recommend using snapshot():
db.person.find().snapshot().forEach( function (hombre) {
hombre.name = hombre.firstName + ' ' + hombre.lastName;
db.person.save(hombre);
});
http://docs.mongodb.org/manual/reference/method/cursor.snapshot/
Starting in Mongo 4.2, db.collection.update() can accept an aggregation pipeline, finally allowing the update/creation of a field based on the value of another field:
// { firstName: "Hello", lastName: "World" }
db.collection.updateMany(
{},
[{ $set: { name: { $concat: [ "$firstName", " ", "$lastName" ] } } }]
)
// { "firstName" : "Hello", "lastName" : "World", "name" : "Hello World" }
The first part {} is the match query, filtering which documents to update (in our case all documents).
The second part [{ $set: { name: { ... } } }] is the update aggregation pipeline (note the square brackets signifying the use of an aggregation pipeline). $set is a new aggregation operator and an alias of $addFields.
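The match query can of course be narrowed. For example, to backfill only documents that are still missing the field (a sketch built from the same pipeline):
db.collection.updateMany(
  { name: { $exists: false } }, // only documents without a name yet
  [{ $set: { name: { $concat: [ "$firstName", " ", "$lastName" ] } } }]
)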
Regarding this answer, the snapshot function is deprecated in version 3.6, according to this update. So, on version 3.6 and above, it is possible to perform the operation this way:
db.person.find().forEach(
function (elem) {
db.person.update(
{
_id: elem._id
},
{
$set: {
name: elem.firstName + ' ' + elem.lastName
}
}
);
}
);
I tried the above solution but I found it unsuitable for large amounts of data. I then discovered the stream feature:
MongoClient.connect("...", function(err, db){
var c = db.collection('yourCollection');
var s = c.find({/* your query */}).stream();
s.on('data', function(doc){
c.update({_id: doc._id}, {$set: {name: doc.firstName + ' ' + doc.lastName}}, function(err, result) { /* result == true? */ });
});
s.on('end', function(){
// stream can end before all your updates do if you have a lot
})
})
The update() method also takes an aggregation pipeline as a parameter:
db.collection_name.update(
{
// Query
},
[
// Aggregation pipeline
{ "$set": { "id": "$_id" } }
],
{
// Options
"multi": true // false when a single doc has to be updated
}
)
A field can be set or unset using existing values via the aggregation pipeline.
Note: use $ with the field name to reference the field that has to be read.
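For the unset case, the pipeline form of $unset takes a field name or an array of field names (a small sketch, removing the id field copied above):
db.collection_name.updateMany(
  {},
  [
    { "$unset": [ "id" ] } // removes the copied field again
  ]
)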
Here's what we came up with for copying one field to another for ~150,000 records. It took about 6 minutes, but is still significantly less resource-intensive than instantiating and iterating over the same number of Ruby objects would have been.
js_query = %({
$or : [
  { 'settings.mobile_notifications' : { $exists : false } },
  { 'settings.mobile_admin_notifications' : { $exists : false } }
]
})
js_for_each = %(function(user) {
if (!user.settings.hasOwnProperty('mobile_notifications')) {
user.settings.mobile_notifications = user.settings.email_notifications;
}
if (!user.settings.hasOwnProperty('mobile_admin_notifications')) {
user.settings.mobile_admin_notifications = user.settings.email_admin_notifications;
}
db.users.save(user);
})
js = "db.users.find(#{js_query}).forEach(#{js_for_each});"
Mongoid::Sessions.default.command('$eval' => js)
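For what it's worth, the $eval command was deprecated long ago and removed in MongoDB 4.2; on 4.2+ the same copy can be done entirely server-side with pipeline updates (a sketch using the same field names, untested against this schema):
db.users.updateMany(
  { 'settings.mobile_notifications': { $exists: false } },
  [{ $set: { 'settings.mobile_notifications': '$settings.email_notifications' } }]
)
db.users.updateMany(
  { 'settings.mobile_admin_notifications': { $exists: false } },
  [{ $set: { 'settings.mobile_admin_notifications': '$settings.email_admin_notifications' } }]
)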
With MongoDB version 4.2+, updates are more flexible, as the update, updateOne, and updateMany methods all accept an aggregation pipeline. You can transform your documents using aggregation operators and then update, without needing to explicitly state a $set stage (here we instead use $replaceRoot: {newRoot: "$$ROOT"}).
Here we use an aggregation query to extract the timestamp from MongoDB's ObjectId "_id" field and update the documents. (I am not an expert in SQL, but I think SQL does not provide an auto-generated ObjectId with a timestamp built in; you would have to create that date yourself.)
var collection = "person"
agg_query = [
{
"$addFields" : {
"_last_updated" : {
"$toDate" : "$_id"
}
}
},
{
$replaceRoot: {
newRoot: "$$ROOT"
}
}
]
db.getCollection(collection).updateMany({}, agg_query, {upsert: true})
(I would have posted this as a comment, but couldn't)
For anyone who lands here trying to update one field using another in the same document with the C# driver...
I could not figure out how to use any of the UpdateXXX methods and their associated overloads since they take an UpdateDefinition as an argument.
// we want to set Prop1 to Prop2
class Foo { public string Prop1 { get; set; } public string Prop2 { get; set;} }
void Test()
{
var update = new UpdateDefinitionBuilder<Foo>();
update.Set(x => x.Prop1, <new value; no way to get a hold of the object that I can find>)
}
As a workaround, I found that you can use the RunCommand method on an IMongoDatabase (https://docs.mongodb.com/manual/reference/command/update/#dbcmd.update).
var command = new BsonDocument
{
{ "update", "CollectionToUpdate" },
{ "updates", new BsonArray
{
new BsonDocument
{
// Any filter; here the check is if Prop1 does not exist
{ "q", new BsonDocument{ ["Prop1"] = new BsonDocument("$exists", false) }},
// set it to the value of Prop2
{ "u", new BsonArray { new BsonDocument { ["$set"] = new BsonDocument("Prop1", "$Prop2") }}},
{ "multi", true }
}
}
}
};
database.RunCommand<BsonDocument>(command);
MongoDB 4.2+ Golang
result, err := collection.UpdateMany(ctx, bson.M{},
	mongo.Pipeline{
		bson.D{{"$set",
			bson.M{"name": bson.M{"$concat": []string{"$lastName", " ", "$firstName"}}},
		}},
	},
)
I have a collection with multiple date type fields. I know I can change them based on their key, but is there a way to find all fields that have date as a type and change all of them in one script?
UPDATE
Many thanks to chridam for helping me out. Based upon his code, I came up with this solution. (Note: I have Mongo 3.2.9, and some code snippets from chridam's answer just wouldn't run. They might be valid, but they didn't work for me.)
map = function() {
for (var key in this) {
if (key != null && this[key] != null && this[key] instanceof Date){
emit(key, null);
}
}
}
collectionName = "testcollection_copy";
mr = db.runCommand({
"mapreduce": collectionName,
"map": map,
"reduce": function() {},
"out": "map_reduce_test" // out is required
})
dateFields = db[mr.result].distinct("_id")
printjson(dateFields)
//updating documents
db[collectionName].find().forEach(function (document){
for(var i=0;i<dateFields.length;i++){
document[dateFields[i]] = new NumberLong(document[dateFields[i]].getTime());
}
db[collectionName].save(document);
});
Since projection didn't work, I used the above code for updating the documents.
My only question is: why use bulkWrite?
(Also, getTime() seemed better than subtracting dates.)
An operation like this involves two tasks: one to get the list of fields with the date type via MapReduce, and another to update the collection via aggregation or bulk write operations.
NB: The following methodology assumes all the date fields are at the root level of the document, not embedded in subdocuments.
MapReduce
The first thing you need to do is run the following mapReduce operation. This checks whether each property of every document in the collection is of Date type and returns a distinct list of the date fields:
// define helper function to determine if a key is of Date type
isDate = function(dt) {
return dt && dt instanceof Date && !isNaN(dt.valueOf());
}
// map function
map = function() {
for (var key in this) {
if (isDate(this[key]))
emit(key, null);
}
}
// variable with collection name
collectionName = "yourCollectionName";
mr = db.runCommand({
"mapreduce": collectionName,
"map": map,
"reduce": function() {}
})
dateFields = db[mr.result].distinct("_id")
printjson(dateFields)
//output: [ "validFrom", "validTo", "registerDate" ]
Option 1: Update collection via aggregation framework
You can use the aggregation framework to update your collection, in particular the $addFields operator available in MongoDB version 3.4 and newer. If your MongoDB server version does not support this, you can update your collection with the other workaround (as described in the next option).
The timestamp is calculated by using the $subtract arithmetic aggregation operator with the date field as minuend and the date since epoch new Date("1970-01-01") as subtrahend.
The resulting documents of the aggregation pipeline are then written to the same collection via the $out operator thus updating the collection with the new fields.
In essence, you'd want to end up running the following aggregation pipeline which converts the date fields to timestamps using the above algorithm:
pipeline = [
{
"$addFields": {
"validFrom": { "$subtract": [ "$validFrom", new Date("1970-01-01") ] },
"validTo": { "$subtract": [ "$validTo", new Date("1970-01-01") ] },
"registerDate": { "$subtract": [ "$registerDate", new Date("1970-01-01") ] }
}
},
{ "$out": collectionName }
]
db[collectionName].aggregate(pipeline)
You can dynamically create the above pipeline array given the list of the date fields as follows:
var addFields = { "$addFields": { } },
output = { "$out": collectionName };
dateFields.forEach(function(key){
var subtr = ["$"+key, new Date("1970-01-01")];
addFields["$addFields"][key] = { "$subtract": subtr };
});
db[collectionName].aggregate([addFields, output])
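As an aside, on MongoDB 4.2+ the same dynamically built stage can be passed straight to updateMany as an update pipeline, avoiding the $out collection replacement entirely (a sketch reusing the addFields variable from above):
db[collectionName].updateMany({}, [ addFields ])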
Option 2: Update collection via Bulk
Since this option is a workaround for when the $addFields operator above is not supported, you can use a $project pipeline to create the new timestamp fields with the same $subtract implementation. Instead of writing the results to the same collection, you iterate the cursor from the aggregate results using the forEach() method and, for each document, update the collection using the bulkWrite() method.
The following example shows this approach:
ops = []
pipeline = [
{
"$project": {
"validFrom": { "$subtract": [ "$validFrom", new Date("1970-01-01") ] },
"validTo": { "$subtract": [ "$validTo", new Date("1970-01-01") ] },
"registerDate": { "$subtract": [ "$registerDate", new Date("1970-01-01") ] }
}
}
]
db[collectionName].aggregate(pipeline).forEach(function(doc) {
ops.push({
"updateOne": {
"filter": { "_id": doc._id },
"update": {
"$set": {
"validFrom": doc.validFrom,
"validTo": doc.validTo,
"registerDate": doc.registerDate
}
}
}
});
if (ops.length === 500 ) {
db[collectionName].bulkWrite(ops);
ops = [];
}
})
if (ops.length > 0)
db[collectionName].bulkWrite(ops);
Using the same method as Option 1 above to create the pipeline and the bulk method objects dynamically:
var ops = [],
project = { "$project": { } },
dateFields.forEach(function(key){
var subtr = ["$"+key, new Date("1970-01-01")];
project["$project"][key] = { "$subtract": subtr };
});
setDocFields = function(doc, keysList) {
setObj = { "$set": { } };
return keysList.reduce(function(obj, key) {
obj["$set"][key] = doc[key];
return obj;
}, setObj )
}
db[collectionName].aggregate([project]).forEach(function(doc) {
ops.push({
"updateOne": {
"filter": { "_id": doc._id },
"update": setDocFields(doc, dateFields)
}
});
if (ops.length === 500 ) {
db[collectionName].bulkWrite(ops);
ops = [];
}
})
if (ops.length > 0)
db[collectionName].bulkWrite(ops);
I have this MongoDB query:
var array=[]; //some string values
collection.aggregate(
{ $match: { '_id': { $in : array } } }
)
But this is not returning any results. How do I perform this?
As noted in the comments, your array variable is an array of hex string values, e.g. ["57f36e94517f72bc09ee761e"], and in the mongo shell you need to first cast those string values to ObjectIds. Use the JavaScript map() method to cast the whole list.
For example:
mongo shell
var array = ["585808969e39db5196444c07", "585808969e39db5196444c06"];
var ids = array.map(function(id){ return ObjectId(id); });
which you can then query using the aggregate function as follows:
db.collection.aggregate([
{ "$match": { "_id": { "$in" : ids } } }
])
The above is essentially the same as
db.collection.find({ "_id": { "$in": ids } })
Node.js
var {ObjectId} = require('mongodb'); // or ObjectID
var ids = array.map(id => ObjectId.isValid(id) ? new ObjectId(id) : null);
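Since invalid strings map to null above, it may be safer to drop them before querying (a sketch):
var ids = array.filter(id => ObjectId.isValid(id)).map(id => new ObjectId(id));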
I have the following query on mongo console
db.photos.find({'creation_date': {$gte: <somedate>}})
Is there a way to pluck the ids from the query result just by using the mongo shell?
Try using the map() cursor method
var ids = db.photos.find({'creation_date': { '$gte': <somedate> } }, {'_id': 1})
.map(function (doc){ return doc._id; })
You can also use the distinct() method as
var ids = db.photos.distinct('_id', {'creation_date': { '$gte': <somedate> } })
or with the toArray() cursor method on aggregate() as
var ids = db.photos.aggregate([
{ '$match': { 'creation_date': { '$gte': <somedate> } } },
{ '$group': { '_id': 0, 'ids': { '$push': '$_id' } } }
]).toArray()[0].ids
MongoDB provides a way to limit the fields returned from query results. See more at the link below:
http://docs.mongodb.org/manual/tutorial/project-fields-from-query-results/
db.photos.find({'creation_date': {$gte: <somedate>}}, {"_id": 1});
returns
{
"_id":1
}
{
"-id":2
}
...
You can use distinct instead of find:
collection.distinct('_id', {'creation_date': {'$gte': some_date}})