I have a MongoDB document with a structure like this:
{
testValue: 10,
maxValue: 20
}
I want to set testValue to an arbitrary integer value between 0 and maxValue. Ideally, I would do something like this:
db.collection.update(
{},
{
$set: { 'testValue': newValue },
$min: { 'testValue': 0 },
$max: { 'textValue': $maxValue }
}
)
But that obviously doesn't work. There are a handful of threads that relate to this question (e.g. Update MongoDB field using value of another field), but they're all a few years old, and I can't find pertinent information in the official documentation. Is there a way to do what I want, or do I have to use find() to get maxValue then do a separate call to the database using update()?
This is still not possible in Mongo 3.2 or lower, check for example these JIRA tickets:
Self referential updates
Allow update to compute expressions using referenced fields
Related
{
questions: {
q1: "",
}
}
result after updating:
{
questions: {
q1: "",
q2: ""
}
}
I want to add q2 inside questions, without overwritting what's already inside it (q1).
One solution I found is to get the whole document, and modify it on my backend, and send the whole document to replace the current one. But it seems really in-effiecient as I have other fields in the document as well.
Is there a query that does it more efficiently? I looked at the mongodb docs, didn't seem to find a query that does it.
As turivishal said, you can use $set like this
But you also can use $addFields in this way:
db.collection.aggregate([
{
"$match": {
"questions.q1": "q1"
}
},
{
"$addFields": {
"questions.q2": "q2"
}
}
])
Example here
Also, reading your comment where you say Cannot create field 'q2' in element {questions: "q1"}. It seems your original schema is not the same you have into your DB.
Your schema says that questions is an object with field q1. But your error says that questions is the field and q1 the value.
I have a database like this:
{
"universe":"comics",
"saga":[
{
"name":"x-men",
"characters":[
{
"character":"wolverine",
"picture":"618035022351.png"
},
{
"character":"wolverine",
"picture":"618035022352.png"
}
]
}
]
},
{
"universe":"dc",
"saga":[
{
"name":"spiderman",
"characters":[
{
"character":"venom",
"picture":"618035022353.png"
}
]
}
]
}
And with this code, I update the field where name: wolverine:
db.getCollection('collection').findOneAndUpdate(
{
"universe": "comics"
},
{
$set: {
"saga.$[outer].characters.$[inner].character": "lobezno",
"saga.$[outer].characters.$[inner].picture": "618035022354.png"
}
},
/*{
"saga.characters": 1
},*/
{
"arrayFilters": [
{
"outer.name": "x-men"
},
{
"inner.character": "wolverine"
}
],
"multi":false
}
)
I want to just update the first object where there is a match, and stop it.
For example, if I have an array of 100,000 elements and the object where the match is, is in the tenth position, he will update that record, but he will continue going through the entire array and this seems ineffective to me even though he already did the update.
Note: if I did the update using an _id inside of universe.saga.characters instead of doing the update using the name, it would still loop through the rest of the elements.
How can I do it?
Update using arrayFilters conditions
I don't think it will find and update through loop, and It does not matter if collection have 100,000 sub documents, because here is nice explanation in $[<identifier>] and has mentioned:
The $[<identifier>] to define an identifier to update only those array elements that match the corresponding filter document in the arrayFilters
In the update document, use the $[<identifier>] filtered positional operator to define an identifier, which you then reference in the array filter documents. But make sure you cannot have an array filter document for an identifier if the identifier is not included in the update document.
Update using _id
Your point,
Note: if I did the update using an _id inside of universe.saga.characters instead of doing the update using the name, it would still loop through the rest of the elements.
MongoDB will certainly use the _id index. Here is the nice answer on question MongoDB Update Query Performance, from this you will get an better idea on above point
Update using indexed fields
You can create index according to your query section of update command, Here MongoDB Indexes and Indexing Strategies has explained why index is important,
In your example, lets see with examples:
Example 1: If document have 2 sub documents and when you update and check with explain("executionStats"), assume it will take 1 second to update,
quick use Mongo Playground (this platform will not support update query)
Example 2: If document have 1000 sub documents and when you update and check with explain("executionStats"), might be it will take more then 1 second,
If provide index on fields (universe, saga.characters.character and saga.characters.picture) then definitely it will take less time then usual without index, main benefit of index it will direct point to indexed fields.
quick use Mongo Playground (this platform will not support update query)
Create Index for your fields
db.maxData.createIndex({
"universe": 1,
"saga.characters.character": 1,
"saga.characters.picture": 1
})
For more experiment use above 2 examples data with index and without index and check executionStats you will get more clarity.
I am calling findAndModify() using the $max function to set the value of a field to the largest value.
For example, as shown in the MongoDB documentation.
db.scores.update( { _id: 1 }, { $max: { highScore: 950 } } )
I'd like to also set a lastUpdatedTimestamp only if the document is updated. I can't just perform a $set because that will always change the last updated timestamp. Is there a good mechanism within MongoDB to set another value only if the document is updated? Something similar to $setOnInsert but for any update.
If there isn't what might be a good approach here? Right now I'm thinking I could perform a regular find. Then do a local comparison. If the new value is greater than the old, then there is a good possibility that the update will update the document. So I just include the $set for the lastUpdatedTimestamp.
You can first make a query to find records having highScore less than your input value and then update. This will only set lastUpdatedTimestamp on updating the record.
db.scores.findAndModify({
query: { highScore: { $lt: 950 } },
update: { $set: { "highScore" : 950, "lastUpdatedTimestamp" : new Date() } },
})
as I see you wanted to . update your document only if your highScore can be updated .
only the documents's score is lower than new score value ,it will be updated with score field and lastUpdatedTimestamp
the best way is put your new score in the filter to find the documents match old score < new score
do it like this
db.scores.update(
{_id :4,highScore:{$lt:900}},
{$set:{highScore:900},
$currentDate: { lastModified: true }})
or set the modify time like
{$set:{highScore:900 ,lastupdatetime: new_time},
I have a collection in MongoDB where there are around (~3 million records). My sample record would look like,
{ "_id" = ObjectId("50731xxxxxxxxxxxxxxxxxxxx"),
"source_references" : [
"_id" : ObjectId("5045xxxxxxxxxxxxxx"),
"name" : "xxx",
"key" : 123
]
}
I am having a lot of duplicate records in the collection having same source_references.key. (By Duplicate I mean, source_references.key not the _id).
I want to remove duplicate records based on source_references.key, I'm thinking of writing some PHP code to traverse each record and remove the record if exists.
Is there a way to remove the duplicates in Mongo Internal command line?
This answer is obsolete : the dropDups option was removed in MongoDB 3.0, so a different approach will be required in most cases. For example, you could use aggregation as suggested on: MongoDB duplicate documents even after adding unique key.
If you are certain that the source_references.key identifies duplicate records, you can ensure a unique index with the dropDups:true index creation option in MongoDB 2.6 or older:
db.things.ensureIndex({'source_references.key' : 1}, {unique : true, dropDups : true})
This will keep the first unique document for each source_references.key value, and drop any subsequent documents that would otherwise cause a duplicate key violation.
Important Note: Any documents missing the source_references.key field will be considered as having a null value, so subsequent documents missing the key field will be deleted. You can add the sparse:true index creation option so the index only applies to documents with a source_references.key field.
Obvious caution: Take a backup of your database, and try this in a staging environment first if you are concerned about unintended data loss.
This is the easiest query I used on my MongoDB 3.2
db.myCollection.find({}, {myCustomKey:1}).sort({_id:1}).forEach(function(doc){
db.myCollection.remove({_id:{$gt:doc._id}, myCustomKey:doc.myCustomKey});
})
Index your customKey before running this to increase speed
While #Stennie's is a valid answer, it is not the only way. Infact the MongoDB manual asks you to be very cautious while doing that. There are two other options
Let the MongoDB do that for you using Map Reduce
Another way
You do programatically which is less efficient.
Here is a slightly more 'manual' way of doing it:
Essentially, first, get a list of all the unique keys you are interested.
Then perform a search using each of those keys and delete if that search returns bigger than one.
db.collection.distinct("key").forEach((num)=>{
var i = 0;
db.collection.find({key: num}).forEach((doc)=>{
if (i) db.collection.remove({key: num}, { justOne: true })
i++
})
});
I had a similar requirement but I wanted to retain the latest entry. The following query worked with my collection which had millions of records and duplicates.
/** Create a array to store all duplicate records ids*/
var duplicates = [];
/** Start Aggregation pipeline*/
db.collection.aggregate([
{
$match: { /** Add any filter here. Add index for filter keys*/
filterKey: {
$exists: false
}
}
},
{
$sort: { /** Sort it in such a way that you want to retain first element*/
createdAt: -1
}
},
{
$group: {
_id: {
key1: "$key1", key2:"$key2" /** These are the keys which define the duplicate. Here document with same value for key1 and key2 will be considered duplicate*/
},
dups: {
$push: {
_id: "$_id"
}
},
count: {
$sum: 1
}
}
},
{
$match: {
count: {
"$gt": 1
}
}
}
],
{
allowDiskUse: true
}).forEach(function(doc){
doc.dups.shift();
doc.dups.forEach(function(dupId){
duplicates.push(dupId._id);
})
})
/** Delete the duplicates*/
var i,j,temparray,chunk = 100000;
for (i=0,j=duplicates.length; i<j; i+=chunk) {
temparray = duplicates.slice(i,i+chunk);
db.collection.bulkWrite([{deleteMany:{"filter":{"_id":{"$in":temparray}}}}])
}
Expanding on Fernando's answer, I found that it was taking too long, so I modified it.
var x = 0;
db.collection.distinct("field").forEach(fieldValue => {
var i = 0;
db.collection.find({ "field": fieldValue }).forEach(doc => {
if (i) {
db.collection.remove({ _id: doc._id });
}
i++;
x += 1;
if (x % 100 === 0) {
print(x); // Every time we process 100 docs.
}
});
});
The improvement is basically using the document id for removing, which should be faster, and also adding the progress of the operation, you can change the iteration value to your desired amount.
Also, indexing the field before the operation helps.
pip install mongo_remove_duplicate_indexes
create a script in any language
iterate over your collection
create new collection and create new index in this collection with unique set to true ,remember this index has to be same as index u wish to remove duplicates from in ur original collection with same name
for ex-u have a collection gaming,and in this collection u have field genre which contains duplicates,which u wish to remove,so just create new collection
db.createCollection("cname")
create new index
db.cname.createIndex({'genre':1},unique:1)
now when u will insert document with similar genre only first will be accepted,other will be rejected with duplicae key error
now just insert the json format values u received into new collection and handle exception using exception handling
for ex pymongo.errors.DuplicateKeyError
check out the package source code for the mongo_remove_duplicate_indexes for better understanding
If you have enough memory, you can in scala do something like that:
cole.find().groupBy(_.customField).filter(_._2.size>1).map(_._2.tail).flatten.map(_.id)
.foreach(x=>cole.remove({id $eq x})
I am totally new to MongoDB... I am missing a "newbie" tag, so the experts would not have to see this question.
I am trying to update all documents in a collection using an expression. The query I was expecting to solve this was:
db.QUESTIONS.update({}, { $set: { i_pp : i_up * 100 - i_down * 20 } }, false, true);
That, however, results in the following error message:
ReferenceError: i_up is not defined (shell):1
At the same time, the database did not have any problem with eating this one:
db.QUESTIONS.update({}, { $set: { i_pp : 0 } }, false, true);
Do I have to do this one document at a time or something? That just seems excessively complicated.
Update
Thank you Sergio Tulentsev for telling me that it does not work. Now, I am really struggling with how to do this. I offer 500 Profit Points to the helpful soul, who can write this in a way that MongoDB understands. If you register on our forum I can add the Profit Points to your account there.
I just came across this while searching for the MongoDB equivalent of SQL like this:
update t
set c1 = c2
where ...
Sergio is correct that you can't reference another property as a value in a straight update. However, db.c.find(...) returns a cursor and that cursor has a forEach method:
Queries to MongoDB return a cursor, which can be iterated to retrieve
results. The exact way to query will vary with language driver.
Details below focus on queries from the MongoDB shell (i.e. the
mongo process).
The shell find() method returns a cursor object which we can then iterate to retrieve specific documents from the result. We use
hasNext() and next() methods for this purpose.
for( var c = db.parts.find(); c.hasNext(); ) {
print( c.next());
}
Additionally in the shell, forEach() may be used with a cursor:
db.users.find().forEach( function(u) { print("user: " + u.name); } );
So you can say things like this:
db.QUESTIONS.find({}, {_id: true, i_up: true, i_down: true}).forEach(function(q) {
db.QUESTIONS.update(
{ _id: q._id },
{ $set: { i_pp: q.i_up * 100 - q.i_down * 20 } }
);
});
to update them one at a time without leaving MongoDB.
If you're using a driver to connect to MongoDB then there should be some way to send a string of JavaScript into MongoDB; for example, with the Ruby driver you'd use eval:
connection.eval(%q{
db.QUESTIONS.find({}, {_id: true, i_up: true, i_down: true}).forEach(function(q) {
db.QUESTIONS.update(
{ _id: q._id },
{ $set: { i_pp: q.i_up * 100 - q.i_down * 20 } }
);
});
})
Other languages should be similar.
//the only differnce is to make it look like and aggregation pipeline
db.table.updateMany({}, [{
$set: {
col3:{"$sum":["$col1","$col2"]}
},
}]
)
You can't use expressions in updates. Or, rather, you can't use expressions that depend on fields of the document. Simple self-containing math expressions are fine (e.g. 2 * 2).
If you want to set a new field for all documents that is a function of other fields, you have to loop over them and update manually. Multi-update won't help here.
Rha7 gave a good idea, but the code above is not work without defining a temporary variable.
This sample code produces an approximate calculation of the age (leap years behinds the scene) based on 'birthday' field and inserts the value into suitable field for all documents not containing such:
db.employers.find({age: {$exists: false}}).forEach(function(doc){
var new_age = parseInt((ISODate() - doc.birthday)/(3600*1000*24*365));
db.employers.update({_id: doc._id}, {$set: {age: new_age}});
});
Example to remove "00" from the beginning of a caller id:
db.call_detail_records_201312.find(
{ destination: /^001/ },
{ "destination": true }
).forEach(function(row){
db.call_detail_records_201312.update(
{ _id: row["_id"] },
{ $set: {
destination: row["destination"].replace(/^001/, '1')
}
}
)
});