Ensuring exactly N items with value X remain in an array with mongodb - mongodb

Assuming we have a document in my MongoDB collection like the following:
{
"_id": "coffee",
"orders": [ "espresso", "cappuccino", "espresso", ... ],
}
How do I use a single update statement that ensures there are exactly say 2 espressos in this document, without knowing how many there are to begin with?
I know that using 2 consecutive statements I can do
db.test.update(
{ _id: "coffee" },
{ "$pull": { "orders": "espresso" } }
);
followed by
db.test.update(
{ "_id": "coffee" },
{ "$push": { "orders": { "$each": ["espresso", "espresso"] } } }
);
But when combining both into a single statement, MongoDB balks with an error 40, claiming Updating the path 'orders' would create a conflict at 'orders' (understandable enough - how does MongoDB what to do first?).
So, how can I do the above in a single statement? Please note that since I'll be using the above in the context of a larger unordered bulk operation, combining the above in an ordered bulk operation won't work.
Thanks for your help!

Related

How to upsert an element in mongodb array?

Suppose our document looks like this
{
a:1,
b:[
{c:120,d:100},
{c:121,d:110}
]
}
Now how could I upsert new objects in this array?
Suppose I want to perform update on the above document and add {c:200,d:120} to b so my expected result looks like this
{
a:1,
b:[
{c:120,d:100},
{c:121,d:110},
{c:200,d:120}
]
}
Also the update will of $inc, meaning suppose I want to increment d by 200 if c is present(lets say c is 200 and it is already present in the above document), if not present then I want to upsert the document itself.
Any help would be much appreciated.
It can be achieved using the $push update command.
db.<Collection-Name>.updateOne({"a": 1}, {"$push": {"b": {"c":200, "d":120}}})
Note: Use $addToSet if you don't want duplicates elements inside the array
UPDATE:
I don't think your precise requirement can't be achieved by a single command.
Your exact requirement can be achieved by the below script:
if (db.test9.findOne({"_id": ObjectId("5f19402abbc59a3864783fc7"), "b.c": 200}, {"_id": 1}) != null) {
db.test9.updateOne({
"_id": ObjectId("5f19402abbc59a3864783fc7"),
"b.c": 200,
}, {
"$set": {
"b.$.d": 120
}
})
} else {
db.test9.updateOne({
"_id": ObjectId("5f19402abbc59a3864783fc7"),
}, {
"$push": {
"b": {
"c":200, "d":120
}
}
});
}

Update Array Children Sorted Order

I have a collection containing objects with the following structure
{
"dep_id": "some_id",
"departament": "dep name",
"employees": [{
"name": "emp1",
"age": 31
},{
"name": "emp2",
"age": 35
}]
}
I would like to sort and save the array of employees for the object with id "some_id", by employees.age, descending. The best outcome would be to do this atomically using mongodb's query language. Is this possible?
If not, how can I rearrange the subdocuments without affecting the parent's other data or the data of the subdocuments? In case I have to download the data from the database and save back the sorted array of children, what would happen if something else performs an update to one of the children or children are added or removed in the meantime?
In the end, the data should be persisted to the database like this:
{
"dep_id": "some_id",
"departament": "dep name",
"employees": [{
"name": "emp2",
"age": 35
},{
"name": "emp1",
"age": 31
}]
}
The best way to do this is to actually apply the $sort modifier as you add items to the array. As you say in your comment "My actual objects have a "rank" and 'created_at'", which means that you really should have asked that in your question instead of writing a "contrived" case ( don't know why people do that ).
So for "sorting" by multiple properties, the following reference would adjust like this:
db.collection.update(
{ },
{ "$push": { "employees": { "$each": [], "$sort": { "rank": -1, "created_at": -1 } } } },
{ "multi": true }
)
But to update all the data you presently have "as is shown in the question", then you would sort on "age" with:
db.collection.update(
{ },
{ "$push": { "employees": { "$each": [], "$sort": { "age": -1 } } } },
{ "multi": true }
)
Which oddly uses $push to actually "modify" an array? Yes it's true, since the $each modifier says we are not actually adding anything new yet the $sort modifier is actually going to apply to the array in place and "re-order" it.
Of course this would then explain how "new" updates to the array should be written in order to apply that $sort and ensure that the "largest age" is always "first" in the array:
db.collection.update(
{ "dep_id": "some_id" },
{ "$push": {
"employees": {
"$each": [{ "name": "emp": 3, "age": 32 }],
"$sort": { "age": -1 }
}
}}
)
So what happens here is as you add the new entry to the array on update, the $sort modifier is applied and re-positions the new element between the two existing ones since that is where it would sort to.
This is a common pattern with MongoDB and is typically used in combination with the $slice modifier in order to keep arrays at a "maximum" length as new items are added, yet retain "ordered" results. And quite often "ranking" is the exact usage.
So overall, you can actually "update" your existing data and re-order it with "one simple atomic statement". No looping or collection renaming required. Furthermore, you now have a simple atomic method to "update" the data and maintain that order as you add new array items, or remove them.
In order to get what you want you can use the following query:
db.collection.aggregate({
$unwind: "$employees" // flatten employees array
}, {
$sort: {
"employees.name": -1 // sort all documents by employee name (descending)
}
}, {
$group: { // restore the previous structure
_id: "$_id",
"dep_id": {
$first: "$dep_id"
},
"departament": {
$first: "$departament"
},
"employees": {
$push: "$employees"
},
}
}, {
$out: "output" // write everything out to a separate collection
})
After this step you would want to drop your source table and rename the "output" collection to match your source table name.
This solution will, however, not deal with the concurrency issue. So you should remove write access from the collection first so nobody modifies it during the process and then restore it once you're done with the migration.
You could alternatively query all data first, then sort the employees array on the client side and then use either single update queries or - faster but more complicated - a bulk write operation with all the individual update calls in order to update the existing documents. Here, you could use the entire document that you've initially read as a filter for the update operation. So if an individual update does not modify any document you'd know straight away, that some other change must have modified the document you read before. Those cases you'd need to retry later (or straight away until the update does actually modify a document).

Trouble renaming key in array of documents [duplicate]

I have a Mongo document which holds an array of elements.
I'd like to reset the .handled attribute of all objects in the array where .profile = XX.
The document is in the following form:
{
"_id": ObjectId("4d2d8deff4e6c1d71fc29a07"),
"user_id": "714638ba-2e08-2168-2b99-00002f3d43c0",
"events": [{
"handled": 1,
"profile": 10,
"data": "....."
} {
"handled": 1,
"profile": 10,
"data": "....."
} {
"handled": 1,
"profile": 20,
"data": "....."
}
...
]
}
so, I tried the following:
.update({"events.profile":10},{$set:{"events.$.handled":0}},false,true)
However it updates only the first matched array element in each document. (That's the defined behaviour for $ - the positional operator.)
How can I update all matched array elements?
With the release of MongoDB 3.6 ( and available in the development branch from MongoDB 3.5.12 ) you can now update multiple array elements in a single request.
This uses the filtered positional $[<identifier>] update operator syntax introduced in this version:
db.collection.update(
{ "events.profile":10 },
{ "$set": { "events.$[elem].handled": 0 } },
{ "arrayFilters": [{ "elem.profile": 10 }], "multi": true }
)
The "arrayFilters" as passed to the options for .update() or even
.updateOne(), .updateMany(), .findOneAndUpdate() or .bulkWrite() method specifies the conditions to match on the identifier given in the update statement. Any elements that match the condition given will be updated.
Noting that the "multi" as given in the context of the question was used in the expectation that this would "update multiple elements" but this was not and still is not the case. It's usage here applies to "multiple documents" as has always been the case or now otherwise specified as the mandatory setting of .updateMany() in modern API versions.
NOTE Somewhat ironically, since this is specified in the "options" argument for .update() and like methods, the syntax is generally compatible with all recent release driver versions.
However this is not true of the mongo shell, since the way the method is implemented there ( "ironically for backward compatibility" ) the arrayFilters argument is not recognized and removed by an internal method that parses the options in order to deliver "backward compatibility" with prior MongoDB server versions and a "legacy" .update() API call syntax.
So if you want to use the command in the mongo shell or other "shell based" products ( notably Robo 3T ) you need a latest version from either the development branch or production release as of 3.6 or greater.
See also positional all $[] which also updates "multiple array elements" but without applying to specified conditions and applies to all elements in the array where that is the desired action.
Also see Updating a Nested Array with MongoDB for how these new positional operators apply to "nested" array structures, where "arrays are within other arrays".
IMPORTANT - Upgraded installations from previous versions "may" have not enabled MongoDB features, which can also cause statements to fail. You should ensure your upgrade procedure is complete with details such as index upgrades and then run
db.adminCommand( { setFeatureCompatibilityVersion: "3.6" } )
Or higher version as is applicable to your installed version. i.e "4.0" for version 4 and onwards at present. This enabled such features as the new positional update operators and others. You can also check with:
db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } )
To return the current setting
UPDATE:
As of Mongo version 3.6, this answer is no longer valid as the mentioned issue was fixed and there are ways to achieve this. Please check other answers.
At this moment it is not possible to use the positional operator to update all items in an array. See JIRA http://jira.mongodb.org/browse/SERVER-1243
As a work around you can:
Update each item individually
(events.0.handled events.1.handled
...) or...
Read the document, do the edits
manually and save it replacing the
older one (check "Update if
Current" if you want to ensure
atomic updates)
What worked for me was this:
db.collection.find({ _id: ObjectId('4d2d8deff4e6c1d71fc29a07') })
.forEach(function (doc) {
doc.events.forEach(function (event) {
if (event.profile === 10) {
event.handled=0;
}
});
db.collection.save(doc);
});
I think it's clearer for mongo newbies and anyone familiar with JQuery & friends.
This can also be accomplished with a while loop which checks to see if any documents remain that still have subdocuments that have not been updated. This method preserves the atomicity of your updates (which many of the other solutions here do not).
var query = {
events: {
$elemMatch: {
profile: 10,
handled: { $ne: 0 }
}
}
};
while (db.yourCollection.find(query).count() > 0) {
db.yourCollection.update(
query,
{ $set: { "events.$.handled": 0 } },
{ multi: true }
);
}
The number of times the loop is executed will equal the maximum number of times subdocuments with profile equal to 10 and handled not equal to 0 occur in any of the documents in your collection. So if you have 100 documents in your collection and one of them has three subdocuments that match query and all the other documents have fewer matching subdocuments, the loop will execute three times.
This method avoids the danger of clobbering other data that may be updated by another process while this script executes. It also minimizes the amount of data being transferred between client and server.
This does in fact relate to the long standing issue at http://jira.mongodb.org/browse/SERVER-1243 where there are in fact a number of challenges to a clear syntax that supports "all cases" where mutiple array matches are found. There are in fact methods already in place that "aid" in solutions to this problem, such as Bulk Operations which have been implemented after this original post.
It is still not possible to update more than a single matched array element in a single update statement, so even with a "multi" update all you will ever be able to update is just one mathed element in the array for each document in that single statement.
The best possible solution at present is to find and loop all matched documents and process Bulk updates which will at least allow many operations to be sent in a single request with a singular response. You can optionally use .aggregate() to reduce the array content returned in the search result to just those that match the conditions for the update selection:
db.collection.aggregate([
{ "$match": { "events.handled": 1 } },
{ "$project": {
"events": {
"$setDifference": [
{ "$map": {
"input": "$events",
"as": "event",
"in": {
"$cond": [
{ "$eq": [ "$$event.handled", 1 ] },
"$$el",
false
]
}
}},
[false]
]
}
}}
]).forEach(function(doc) {
doc.events.forEach(function(event) {
bulk.find({ "_id": doc._id, "events.handled": 1 }).updateOne({
"$set": { "events.$.handled": 0 }
});
count++;
if ( count % 1000 == 0 ) {
bulk.execute();
bulk = db.collection.initializeOrderedBulkOp();
}
});
});
if ( count % 1000 != 0 )
bulk.execute();
The .aggregate() portion there will work when there is a "unique" identifier for the array or all content for each element forms a "unique" element itself. This is due to the "set" operator in $setDifference used to filter any false values returned from the $map operation used to process the array for matches.
If your array content does not have unique elements you can try an alternate approach with $redact:
db.collection.aggregate([
{ "$match": { "events.handled": 1 } },
{ "$redact": {
"$cond": {
"if": {
"$eq": [ { "$ifNull": [ "$handled", 1 ] }, 1 ]
},
"then": "$$DESCEND",
"else": "$$PRUNE"
}
}}
])
Where it's limitation is that if "handled" was in fact a field meant to be present at other document levels then you are likely going to get unexepected results, but is fine where that field appears only in one document position and is an equality match.
Future releases ( post 3.1 MongoDB ) as of writing will have a $filter operation that is simpler:
db.collection.aggregate([
{ "$match": { "events.handled": 1 } },
{ "$project": {
"events": {
"$filter": {
"input": "$events",
"as": "event",
"cond": { "$eq": [ "$$event.handled", 1 ] }
}
}
}}
])
And all releases that support .aggregate() can use the following approach with $unwind, but the usage of that operator makes it the least efficient approach due to the array expansion in the pipeline:
db.collection.aggregate([
{ "$match": { "events.handled": 1 } },
{ "$unwind": "$events" },
{ "$match": { "events.handled": 1 } },
{ "$group": {
"_id": "$_id",
"events": { "$push": "$events" }
}}
])
In all cases where the MongoDB version supports a "cursor" from aggregate output, then this is just a matter of choosing an approach and iterating the results with the same block of code shown to process the Bulk update statements. Bulk Operations and "cursors" from aggregate output are introduced in the same version ( MongoDB 2.6 ) and therefore usually work hand in hand for processing.
In even earlier versions then it is probably best to just use .find() to return the cursor, and filter out the execution of statements to just the number of times the array element is matched for the .update() iterations:
db.collection.find({ "events.handled": 1 }).forEach(function(doc){
doc.events.filter(function(event){ return event.handled == 1 }).forEach(function(event){
db.collection.update({ "_id": doc._id },{ "$set": { "events.$.handled": 0 }});
});
});
If you are aboslutely determined to do "multi" updates or deem that to be ultimately more efficient than processing multiple updates for each matched document, then you can always determine the maximum number of possible array matches and just execute a "multi" update that many times, until basically there are no more documents to update.
A valid approach for MongoDB 2.4 and 2.2 versions could also use .aggregate() to find this value:
var result = db.collection.aggregate([
{ "$match": { "events.handled": 1 } },
{ "$unwind": "$events" },
{ "$match": { "events.handled": 1 } },
{ "$group": {
"_id": "$_id",
"count": { "$sum": 1 }
}},
{ "$group": {
"_id": null,
"count": { "$max": "$count" }
}}
]);
var max = result.result[0].count;
while ( max-- ) {
db.collection.update({ "events.handled": 1},{ "$set": { "events.$.handled": 0 }},{ "multi": true })
}
Whatever the case, there are certain things you do not want to do within the update:
Do not "one shot" update the array: Where if you think it might be more efficient to update the whole array content in code and then just $set the whole array in each document. This might seem faster to process, but there is no guarantee that the array content has not changed since it was read and the update is performed. Though $set is still an atomic operator, it will only update the array with what it "thinks" is the correct data, and thus is likely to overwrite any changes occurring between read and write.
Do not calculate index values to update: Where similar to the "one shot" approach you just work out that position 0 and position 2 ( and so on ) are the elements to update and code these in with and eventual statement like:
{ "$set": {
"events.0.handled": 0,
"events.2.handled": 0
}}
Again the problem here is the "presumption" that those index values found when the document was read are the same index values in th array at the time of update. If new items are added to the array in a way that changes the order then those positions are not longer valid and the wrong items are in fact updated.
So until there is a reasonable syntax determined for allowing multiple matched array elements to be processed in single update statement then the basic approach is to either update each matched array element in an indvidual statement ( ideally in Bulk ) or essentially work out the maximum array elements to update or keep updating until no more modified results are returned. At any rate, you should "always" be processing positional $ updates on the matched array element, even if that is only updating one element per statement.
Bulk Operations are in fact the "generalized" solution to processing any operations that work out to be "multiple operations", and since there are more applications for this than merely updating mutiple array elements with the same value, then it has of course been implemented already, and it is presently the best approach to solve this problem.
First: your code did not work because you were using the positional operator $ which only identifies an element to update in an array but does not even explicitly specify its position in the array.
What you need is the filtered positional operator $[<identifier>]. It would update all elements that match an array filter condition.
Solution:
db.collection.update({"events.profile":10}, { $set: { "events.$[elem].handled" : 0 } },
{
multi: true,
arrayFilters: [ { "elem.profile": 10 } ]
})
Visit mongodb doc here
What the code does:
{"events.profile":10} filters your collection and return the documents matching the filter
The $set update operator: modifies matching fields of documents it acts on.
{multi:true} It makes .update() modifies all documents matching the filter hence behaving like updateMany()
{ "events.$[elem].handled" : 0 } and arrayFilters: [ { "elem.profile": 10 } ]
This technique involves the use of the filtered positional array with arrayFilters. the filtered positional array here $[elem] acts as a placeholder for all elements in the array fields that match the conditions specified in the array filter.
Array filters
You can update all elements in MongoDB
db.collectioname.updateOne(
{ "key": /vikas/i },
{ $set: {
"arr.$[].status" : "completed"
} }
)
It will update all the "status" value to "completed" in the "arr" Array
If Only one document
db.collectioname.updateOne(
 { key:"someunique", "arr.key": "myuniq" },
 { $set: {
   "arr.$.status" : "completed",
   "arr.$.msgs":  {
        "result" : ""
        }
   
 } }
)
But if not one and also you don't want all the documents in the array to update then you need to loop through the element and inside the if block
db.collectioname.find({findCriteria })
.forEach(function (doc) {
doc.arr.forEach(function (singlearr) {
if (singlearr check) {
singlearr.handled =0
}
});
db.collection.save(doc);
});
I'm amazed this still hasn't been addressed in mongo. Overall mongo doesn't seem to be great when dealing with sub-arrays. You can't count sub-arrays simply for example.
I used Javier's first solution. Read the array into events then loop through and build the set exp:
var set = {}, i, l;
for(i=0,l=events.length;i<l;i++) {
if(events[i].profile == 10) {
set['events.' + i + '.handled'] = 0;
}
}
.update(objId, {$set:set});
This can be abstracted into a function using a callback for the conditional test
The thread is very old, but I came looking for answer here hence providing new solution.
With MongoDB version 3.6+, it is now possible to use the positional operator to update all items in an array. See official documentation here.
Following query would work for the question asked here. I have also verified with Java-MongoDB driver and it works successfully.
.update( // or updateMany directly, removing the flag for 'multi'
{"events.profile":10},
{$set:{"events.$[].handled":0}}, // notice the empty brackets after '$' opearor
false,
true
)
Hope this helps someone like me.
I've been looking for a solution to this using the newest driver for C# 3.6 and here's the fix I eventually settled on. The key here is using "$[]" which according to MongoDB is new as of version 3.6. See https://docs.mongodb.com/manual/reference/operator/update/positional-all/#up.S[] for more information.
Here's the code:
{
var filter = Builders<Scene>.Filter.Where(i => i.ID != null);
var update = Builders<Scene>.Update.Unset("area.$[].discoveredBy");
var result = collection.UpdateMany(filter, update, new UpdateOptions { IsUpsert = true});
}
For more context see my original post here:
Remove array element from ALL documents using MongoDB C# driver
$[] operator selects all nested array ..You can update all array items with '$[]'
.update({"events.profile":10},{$set:{"events.$[].handled":0}},false,true)
Reference
Please be aware that some answers in this thread suggesting use $[] is WRONG.
db.collection.update(
{"events.profile":10},
{$set:{"events.$[].handled":0}},
{multi:true}
)
The above code will update "handled" to 0 for all elements in "events" array, regardless of its "profile" value. The query {"events.profile":10} is only to filter the whole document, not the documents in the array. In this situation it is a must to use $[elem] with arrayFilters to specify the condition of array items so Neil Lunn's answer is correct.
Actually, The save command is only on instance of Document class.
That have a lot of methods and attribute. So you can use lean() function to reduce work load.
Refer here. https://hashnode.com/post/why-are-mongoose-mongodb-odm-lean-queries-faster-than-normal-queries-cillvawhq0062kj53asxoyn7j
Another problem with save function, that will make conflict data in with multi-save at a same time.
Model.Update will make data consistently.
So to update multi items in array of document. Use your familiar programming language and try something like this, I use mongoose in that:
User.findOne({'_id': '4d2d8deff4e6c1d71fc29a07'}).lean().exec()
.then(usr =>{
if(!usr) return
usr.events.forEach( e => {
if(e && e.profile==10 ) e.handled = 0
})
User.findOneAndUpdate(
{'_id': '4d2d8deff4e6c1d71fc29a07'},
{$set: {events: usr.events}},
{new: true}
).lean().exec().then(updatedUsr => console.log(updatedUsr))
})
Update array field in multiple documents in mongo db.
Use $pull or $push with update many query to update array elements in mongoDb.
Notification.updateMany(
{ "_id": { $in: req.body.notificationIds } },
{
$pull: { "receiversId": req.body.userId }
}, function (err) {
if (err) {
res.status(500).json({ "msg": err });
} else {
res.status(200).json({
"msg": "Notification Deleted Successfully."
});
}
});
if you want to update array inside array
await Booking.updateOne(
{
userId: req.currentUser?.id,
cart: {
$elemMatch: {
id: cartId,
date: date,
//timeSlots: {
//$elemMatch: {
//id: timeSlotId,
//},
//},
},
},
},
{
$set: {
version: booking.version + 1,
'cart.$[i].timeSlots.$[j].spots': spots,
},
},
{
arrayFilters: [
{
'i.id': cartId,
},
{
'j.id': timeSlotId,
},
],
new: true,
}
);
I tried the following and its working fine.
.update({'events.profile': 10}, { '$set': {'events.$.handled': 0 }},{ safe: true, multi:true }, callback function);
// callback function in case of nodejs

How to get (or aggregate) distinct keys of array in MongoDB

I'm trying to get MongoDB to aggregate for me over an array with different key-value pairs, without knowing keys (Just a simple sum would be ok.)
Example docs:
{data: [{a: 3}, {b: 7}]}
{data: [{a: 5}, {c: 12}, {f: 25}]}
{data: [{f: 1}]}
{data: []}
So basically each doc (or it's array really) can have 0 or many entries, and I don't know the keys for those objects, but I want to sum and average the values over those keys.
Right now I'm just loading a bunch of docs and doing it myself in Node, but I'd like to offload that work to MongoDB.
I know I can unwind those first, but how to proceed from there? How to sum/avg/min/max the values if I don't know the keys?
If you do not know the keys or cannot make a reasonable educated guess then you are basically stuck from going any further with the aggregation framework. You could supply "all of the keys" for consideration, but I supect your acutal data looks more like this:
{ "data": [{ "film": 10 }, { "televsion": 5 },{ "boardGames": 1 }] }
So there would be little point here findin out all the "key names" and then throwing that at an aggregation statement.
For the record though, "this is why you do not structure your data storage like this". Information like "film" here should not be used as a "key" name, because it is useful "data" that could be searched upon and most importantly "indexed" in a database system.
So your data should really look like this:
{
"data": [
{ "type": "film", "value": 10 },
{ "type": "televsion", "valule": 5 },
{ "type": "boardGames", "value": 1 }
]
}
Then the aggregation statement is simple, as are many other things:
db.collection.aggregate([
{ "$unwind": "$data" },
{ "$group": {
"_id": null,
"sum": { "$sum": "$data.value" },
"avg": { "$avg": "$data.value" }
}}
])
But since the key names are constantly changing in documents and do not have a uniform structure, then you need JavaScript processing on the server to traverse the keys, and that meand mapReduce:
db.collection.mapReduce(
function() {
this.data.forEach(function(data) {
Object.keys(data).forEach(function(key) {
emit(null,data[key]); // emit the value regardless of key name
});
});
},
function(key,values) {
return Array.sum(values); // Just summing for example
},
{ "out": { "inline": 1 } }
)
And of course the JavaScript execution here will work much more slowly than the native coded operators available to the aggregation framework.
So this should be an abject lesson as to why you don not use "data" as "key names" when storing data in a database. The aggregation framework works with standard structres and is fast, falling back to JavaScript processing is more flexible, but the cost is mostly in speed and other features.

Group result mongoDB

I have a collection with array countries values like this. I want to sum the values of the countries.
{
"_id": ObjectId("54cd5e7804f3b06c3c247428"),
"country_json": {
"AE": NumberLong("13"),
"RU": NumberLong("16"),
"BA": NumberLong("10"),
...
}
},
{
"_id": ObjectId("54cd5e7804f3b06c3c247429"),
"country_json": {
"RU": NumberLong("12"),
"ES": NumberLong("28"),
"DE": NumberLong("16"),
"AU": NumberLong("44"),
...
}
}
How to sum the values of countries to get a result like this?
{
"AE": 13,
"RU": 28,
..
}
This can simply be done using aggregation
> db.test.aggregate([
{$project: {
RU: "$country_json.RU",
AE: "$country_json.AE",
BA: "$country_json.BA"
}},
{$group: {
_id: null,
RU: {$sum: "$RU"},
AE: {$sum: "$AE"},
BA: {$sum: "$BA"}
}
])
Output:
{
"_id" : null,
"RU" : NumberLong(28),
"AE" : NumberLong(13),
"BA" : NumberLong(10)
}
This isn't a very good document structure if you intend to aggregate statistics across the "keys" like this. Not really a fan of "data as key names" anyway, but the main point is it does not "play well" with many MongoDB query forms due to the key names being different everywhere.
Particularly with the aggregation framework, a better form to store the data is within an actual array, like so:
{
"_id": ObjectId("54cd5e7804f3b06c3c247428"),
"countries": [
{ "key": "AE", "value": NumberLong("13"),
{ "key": "RU", "value": NumberLong("16"),
{ "key": "BA", "value": NumberLong("10")
]
}
With that you can simply use the aggregation operations:
db.collection.aggregate([
{ "$unwind": "$countries" },
{ "$group": {
"_id": "$countries.key",
"value": { "$sum": "$countries.value" }
}}
])
Which would give you results like:
{ "_id": "AE", "value": NumberLong(13) },
{ "_id": "RU", "value": NumberLong(28) }
That kind of structure does "play well" with the aggregation framework and other MongoDB query patterns because it really is how it's "expected" to be done when you want to use the data in this way.
Without changing the structure of the document you are forced to use JavaScript evaluation methods in order to traverse the keys of your documents because that is the only way to do it with MongoDB:
db.collection.mapReduce(
function() {
var country = this.country_json;
Object.keys(country).forEach(function(key) {
emit( key, country[key] );
});
},
function(key,values) {
return values.reduce(function(p,v) { return NumberLong(p+v) });
},
{ "out": { "inline": 1 } }
)
And that would produce exactly the same result as shown from the aggregation example output, but working with the current document structure. Of course, the use of JavaScript evaluation is not as efficient as the native methods used by the aggregation framework so it's not going to perform as well.
Also note the possible problems here with "large values" in your cast NumberLong fields, since the main reason they are represented that way to JavaScipt is that JavaScipt itself has limitations on the size of that value than can be represented. Likely your values are just trivial but simply "cast" that way, but for large enough numbers as per the intent, then the math will simply fail.
So it's generally a good idea to consider changing how you structure this data to make things easier. As a final note, the sort of output you were expecting with all the keys in a single document is similarly counter intuitive, as again it requires traversing keys of a "hash/map" rather than using the natural iterators of arrays or cursors.