In MongoDB, how can I keep the previous value of a field in a different field while updating an object?

Say I have an object with a field state. I want to update this field while keeping the previous value of state in a previous_state field. First, I tried to make an update with unset-rename-set:
collection.update(query, {$unset: {previous_state: ""}, $rename: {state: "previous_state"}, $set: {state: value}})
No surprise, it did not work. After reading:
Update MongoDB field using value of another field
MongoDB update: Generate new field based on existing field, or update in place
Update field with another field's value in the document
I am nearly convinced that there is no way to perform this in a single query. So the question is: what is the best practice to do it?

There are various ways to do it, depending on the MongoDB version, and they are described in this answer from another thread: https://stackoverflow.com/a/37280419/5538923.
For MongoDB 3.4+, for example, there is this query that can be run in mongosh:
db.collection.aggregate([
  { "$addFields": {
    "previous_state": { "$concat": [ "$state" ] },
    "state": { "$concat": [ "$state", " customly modified" ] }
  }},
  { "$out": "collection" }
])
Also note that this query works only when the MongoDB instance is not sharded. When it is sharded (as is often the case in Microsoft Azure Cosmos DB, for example), the method described in that answer for MongoDB 3.2+ works; alternatively, use a new collection as the destination (in the $out clause), then remove all the data from the original collection and import the data from the new collection into it.
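A minimal sketch of that staging-collection workaround, assuming a source collection named mycoll and a hypothetical staging collection mycoll_tmp (both names are illustrative):
db.mycoll.aggregate([
  { "$addFields": {
    "previous_state": { "$concat": [ "$state" ] },
    "state": { "$concat": [ "$state", " customly modified" ] }
  }},
  { "$out": "mycoll_tmp" }          // write the transformed data to the staging collection
])
db.mycoll.deleteMany({})            // remove all data from the original collection
db.mycoll_tmp.find().forEach(function (doc) {
  db.mycoll.insertOne(doc)          // import the transformed data back
})
db.mycoll_tmp.drop()                // clean up the staging collection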

One solution (if you've got only one writer) could be to trigger your update in two steps:
> var previousvalue = collection.findOne(query).state;
> collection.update(query, {$set: {"previous_state": previousvalue, "state": newstatevalue}});
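If your server is on MongoDB 4.2 or newer, this can in fact be done atomically in a single query by passing an aggregation pipeline to update, since pipeline stages can read the document's current fields. A minimal sketch, assuming value holds the new state as in the question:
// MongoDB 4.2+: "$state" on the right-hand side refers to the field's current value
collection.update(query, [
  { $set: { previous_state: "$state", state: value } }
])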

Related

Is it possible to $setOnInsert with aggregation pipeline?

MongoDB has recently added an option to perform an update operation by providing an aggregation pipeline rather than the standard modifier object. Check MongoDB's docs on this topic.
The ability to use an aggregation pipeline, whose statements can refer to existing document properties, can be extremely useful in situations where certain fields need to be evaluated based on other fields, e.g. during data migration.
Moreover, most of the standard update operators like $set, $push, $inc, etc. can be successfully replicated with the aggregation expression language, so in some sense this new functionality generalizes the good old modifiers technique. Though, I must admit the pipeline can become quite verbose if one tries to do things like $addToSet. This of course brings up a whole bunch of performance-related questions, but let's ignore them for now.
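For instance, here is a rough sketch of replicating $addToSet with pipeline expressions (the field name tags and the value "newTag" are just placeholders): $setUnion deduplicates, and $ifNull covers documents where the array does not exist yet.
db.test.updateOne({ _id: 1 }, [
  { $set: {
    // union of the existing array (or [] if missing) with the new element,
    // duplicates removed -- the same end result as $addToSet
    tags: { $setUnion: [ { $ifNull: [ "$tags", [] ] }, [ "newTag" ] ] }
  }}
])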
So far, there's been just one thing which I haven't been able to fully replicate with the aggregation pipeline update, namely the $setOnInsert operator. Let's assume that I want to perform an upsert:
db.test.update(selector, pipeline, { upsert: true });
My initial intuition was that the $$ROOT variable (which I can use in the pipeline) would equal null unless there exists a document that matches selector. Unfortunately, but probably for a good reason, MongoDB developers decided that $$ROOT should be derived from selector by default. It makes sense when you think about how the normal $setOnInsert works, but it also makes it practically impossible to distinguish between an update and an insert within the pipeline.
I know what you're thinking. You can look at $$ROOT._id. This is a good idea, though if _id is part of the selector it doesn't work anymore. I have figured out that this can be bypassed by tricking MongoDB a little bit and doing things like:
selector = {
  _id: { $in: [value, 'fake'] },
}
instead of the simpler { _id: value }, but this doesn't look clean. Please note that if $in contains only one element, then Mongo is actually clever enough to figure out what the identifier should be and it populates $$ROOT accordingly (sic!).
I am wondering if anyone has a better idea of how to approach this. Maybe there's some hidden variable that I could potentially use inside the pipeline itself to distinguish between update and insert (e.g. in the $merge stage there's the $$new variable, which serves a similar purpose)?
If there are no matching documents, $$ROOT will have only the _id field. So you can transform $$ROOT into an array of its key/value pairs and check if the size of that array is equal to 1. If it is, create a new document; if it is not, do nothing.
$objectToArray and $size to convert $$ROOT into an array of its key/value pairs and to get the size of that array
$cond to check if the size of the array above is equal to 1. If it is, merge the current $$ROOT (which is only the _id field) with the update object; if it is not, return the current $$ROOT. In both scenarios, put the result in a result field.
$mergeObjects to merge $$ROOT and the update that you are sending, and put that in the result field
$replaceRoot to replace the root with the result field from the previous stage
db.collection.update(
  { _id: 1 },
  [
    {
      $set: {
        result: {
          $cond: {
            if: { "$eq": [ { $size: { $objectToArray: "$$ROOT" } }, 1 ] },
            then: { $mergeObjects: [ "$$ROOT", { key: 3 } ] },
            else: "$$ROOT"
          }
        }
      }
    },
    { $replaceRoot: { newRoot: "$result" } }
  ],
  { upsert: true }
)
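A quick sketch of the expected behavior, assuming an initially empty collection:
// First run: no document matches, $$ROOT is just { _id: 1 } (one field),
// so the "then" branch merges in { key: 3 } and the upsert inserts it.
db.collection.find({ _id: 1 })   // -> { "_id": 1, "key": 3 }
// Second run: $$ROOT now has two fields, the "else" branch returns it
// unchanged, so the document is effectively left as-is.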

How to match a trigger on a specific field in mongodb stitch?

The $match expression on MongoDB's Stitch application does not work properly.
I am trying to set up a simple update trigger that will only work on one field in a collection.
The trigger setup provides a $match aggregation which seems simple enough to set up.
For example if I want the trigger to only fire when the field "online" in a specified collection gets set to "true" I would do:
{"updateDescription.updatedFields":{"online":"true"}}
which for a Stitch trigger is the same as:
{$match: {"updateDescription.updatedFields": {"online": "true"}}}
The problem is when I try to match an update on a field that is an object (for example hours: {online: 40, offline: 120}).
For some reason $exists or $in does not work.
So doing:
{"updateDescription.updatedFields": {"hours": {"$exists": true}}}
does not work, and neither does something like:
{"updateDescription.updatedFields": {"hours.online": {"$exists": true}}}
The $match for the trigger is supposed to work exactly like a normal mongo $match.
They just provide one example:
{
"updateDescription.updatedFields": {
"status": "blocked"
}
}
The example is from here:
https://docs.mongodb.com/stitch/triggers/database-triggers/
I tried hundreds of variations but I can't seem to get it to work.
The trigger is working fine if the match is a specific value like:
{"updateDescription.updatedFields":{"hours.online":{"$numberInt\":"20"}}
and then i set the hours.online to 20 in the database.
I was able to have it match items by using an explicit $expr operator or declare it as a single field not an embedded object. ie. "updateDescription.updatedFields.statue": "blocked"
I struggled with this myself, trying to get a trigger to fire when a certain nested field was updated to any value (rather than just one specific one).
The issue appears to have to do with how change streams report updated fields.
With credit and thanks to MongoDB support, I can finally offer this as a potential solution, at least for simpler cases:
{
"$expr": {
"$not": {
"$cmp": [{
"$let": {
"vars": { "updated": { "$objectToArray": "$updateDescription.updatedFields" } },
"in": { "$arrayElemAt": [ "$$updated.k", 0 ] }
}},
"example.subdocument.nested_field"
]
}
}
}
Of course replace example.subdocument.nested_field with your own dot-notation field path.
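If more than one field can change in a single update, a variation of the same idea (a sketch, not tested against Stitch) checks whether the target path appears anywhere among the updated keys, rather than only in the first position:
{
  "$expr": {
    "$in": [
      "example.subdocument.nested_field",
      {
        "$let": {
          "vars": { "updated": { "$objectToArray": "$updateDescription.updatedFields" } },
          "in": "$$updated.k"
        }
      }
    ]
  }
}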

Set value of a key with value of another key in mongodb directly with query

Can I create a query something like the one below
db.getCollection('walkins.businesses').update(
{$and:[{"loyaltyModule.isPublished": true},{"loyaltyModule.publishAt": {"$eq":null}}]},
{$set : {"loyaltyModule.publishAt":"this.loyaltyModule.creationAt"}}, {multi:true}
)
to set value of creationAt as publishAt using update query directly where creationAt is already in collection.
Can I set the value of publishAt using another field creationAt in the same document?
With Aggregate
The best way to do this is to use the aggregation framework to compute the new field, using the $addFields and $out aggregation pipeline stages.
db.getCollection('walkins.businesses').aggregate([
  { "$match": { $and: [ { "loyaltyModule.isPublished": true }, { "loyaltyModule.publishAt": { "$eq": null } } ] } },
  { "$addFields": {
    "loyaltyModule.publishAt": "$loyaltyModule.creationAt"
  }},
  { "$out": "collection" }
])
Note that this does not update your collection in place, but instead replaces the destination collection or creates a new one. Also, for update operations that require "type casting" you will need client-side processing, and depending on the operation, you may need to use the find() method instead of the .aggregate() method.
With Iteration of cursor
You can iterate the cursor and update:
db.getCollection('walkins.businesses').find({
  $and: [
    { "loyaltyModule.isPublished": true },
    { "loyaltyModule.publishAt": { "$eq": null } }
  ]
}).forEach(function (x) {
  db.getCollection('walkins.businesses').update(
    { _id: x._id },
    { $set: { "loyaltyModule.publishAt": x.loyaltyModule.creationAt } }
  )
})
Here, you can't update multiple records with one update query, because each update matches on the _id field. A batched alternative is sketched below.
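If round-trips are a concern, the per-document updates can at least be batched into a single bulkWrite call; a sketch using the same filter as above:
var ops = [];
db.getCollection('walkins.businesses').find({
  "loyaltyModule.isPublished": true,
  "loyaltyModule.publishAt": { "$eq": null }
}).forEach(function (x) {
  ops.push({
    updateOne: {
      filter: { _id: x._id },
      update: { $set: { "loyaltyModule.publishAt": x.loyaltyModule.creationAt } }
    }
  });
});
if (ops.length > 0) {
  db.getCollection('walkins.businesses').bulkWrite(ops);  // one round-trip for all updates
}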

mongodb how to get a document which has max value of each "group with the same key" [duplicate]

This question already has answers here:
MongoDB - get documents with max attribute per group in a collection
(2 answers)
Closed 5 years ago.
I have a collection:
{'_id':'008','name':'ada','update':'1504501629','star':3.6,'desc':'ok', ...}
{'_id':'007','name':'bob','update':'1504501614','star':4.2,'desc':'gb', ...}
{'_id':'005','name':'ada','update':'1504501532','star':3.2,'desc':'ok', ...}
{'_id':'003','name':'bob','update':'1504501431','star':4.5,'desc':'bg', ...}
{'_id':'002','name':'ada','update':'1504501378','star':3.4,'desc':'no', ...}
{'_id':'001','name':'ada','update':'1504501325','star':3.6,'desc':'ok', ...}
{'_id':'000','name':'bob','update':'1504501268','star':4.3,'desc':'gg', ...}
...
What I want as a result is, for each 'name', the document with the max value of 'update' (that is, the newest document for that 'name'), returned whole:
{'_id':'008','name':'ada','update':'1504501629','star':3.6,'desc':'ok', ...}
{'_id':'007','name':'bob','update':'1504501614','star':4.2,'desc':'gb', ...}
...
How can I do this most effectively?
What I do now in Python is:
result = []
for name in db.collection.distinct('name'):
    result.append(db.collection.find({'name': name}).sort('update', -1)[0])
Doesn't this call 'find' too many times?
=====
I do this to crawl data by 'name', collecting many other keys, and every time I insert a document I set a key named 'update'.
When I use the database, I want the newest document for a specific 'name', so it looks like I cannot just use $group.
What should I do? Redesign the db structure, or find a better way to query?
=====
Improved!
I've tried creating an index on 'name' & 'update', and the process shortened from half an hour to 30 seconds!
But I still welcome a better solution ^_^
Your use case scenario suits aggregation really well. As I see in your question, you already know that, but can't figure out how to use $group and take the whole document that has the max update. If you $sort your documents before $group, you can use the $first operator, so there is no need to send a find query for each name.
db.collection.aggregate([
  { $sort: { "name": 1, "update": -1 } },
  { $group: { _id: "$name", "update": { $first: "$update" }, "doc_id": { $first: "$_id" } } }
])
I did not add an extra $project stage to the aggregation; you can just add the fields that you want in the result to $group with the $first operator.
Additionally, if you look closer at the $sort stage, you can see it uses your newly created index, so you did well to add that; otherwise I would have recommended it too :)
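As a sketch of a variant on MongoDB 3.4+, $first can also capture the whole document via $$ROOT, and $replaceRoot can promote it back, so nothing needs to be listed field by field:
db.collection.aggregate([
  { $sort: { "name": 1, "update": -1 } },
  { $group: { _id: "$name", doc: { $first: "$$ROOT" } } },  // newest whole document per name
  { $replaceRoot: { newRoot: "$doc" } }                     // make it the result document
])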
Update: For your question in comment:
You should write all keys in $group. But if you think it will look bad, or new fields will come in the future and you don't want to rewrite $group each time, I would do this:
First get all the _id fields of the desired documents in the aggregation, and then fetch those documents in one find query with the $in operator.
db.collection.find( { "_id": { $in: [<ids returned in aggregation>] } } )
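Putting the two steps together in the shell, a sketch might look like:
// Collect the winning _ids from the aggregation, then fetch the full documents
var ids = db.collection.aggregate([
  { $sort: { "name": 1, "update": -1 } },
  { $group: { _id: "$name", doc_id: { $first: "$_id" } } }
]).toArray().map(function (g) { return g.doc_id; });
db.collection.find({ "_id": { $in: ids } })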

How to not list all fields one by one in project when aggregating?

I am using Mongo 3.2.14
I have a mongo collection that looks like this:
{
'_id':...
'field1':...
'field2':...
'field3':...
etc...
}
I want to aggregate this way:
db.collection.aggregate([
  { '$match': {} },
  { '$project': {
    'field1': 1,
    'field2': 1,
    'field3': 1,
    etc...(all fields)
  }}
])
Is there a way to include all fields in the project without listing each field one by one ? (I have around 30 fields, and growing...)
I have found info on this here:
MongoDB $project: Retain previous pipeline fields
Include all existing fields and add new fields to document
how to not write every field one by one in project
However, I'm using mongo 3.2.14 and I don't need to create a new field, so I think I cannot use $addFields. But, if I could, can someone show me how to use it?
Basically, if you want all attributes of your documents to be passed to the next pipeline stage, you can skip the $project stage. But if you want all the attributes except the _id value, then you can pass
{ $project: { _id: 0 } }
which will return every field except _id.
And if by any chance you have embedded lists or nests that you want to flatten, you can use the $unwind stage.
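A minimal sketch of that exclusion form, using the collection from the question:
db.collection.aggregate([
  { '$match': {} },
  { '$project': { '_id': 0 } }   // all other fields pass through untouched
])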
You can use $replaceRoot:
db.collection.aggregate([
  { "$match": {} },
  { "$project": {
    "document": "$$ROOT"
  }},
  { "$replaceRoot": { "newRoot": "$document" } }
])
This way you get the exact document returned with all the fields in the document; you don't need to add each field one by one in $project. Try using $replaceRoot at the end of the pipeline.
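For the side question about $addFields: it requires MongoDB 3.4+, so it is not available on 3.2.14, but on newer servers it keeps every existing field and appends the new ones, so nothing needs to be enumerated. A sketch (the field name newField is purely illustrative):
db.collection.aggregate([
  { '$match': {} },
  { '$addFields': { 'newField': 'some value' } }  // field1, field2, field3, ... are all retained
])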