Count Array Elements Matching Condition - mongodb

I have a collection in MongoDB and their objects look like this:
Object 1
{
"_id" : ObjectId("5afde62a91952a2980a1b751"),
"notificationName" : "Teste Agendamento Pontual",
"messages" : [
{
"timestamp" : ISODate("2018-05-17T20:29:33.045Z"),
"message" : "Teste Agendamento Pontual"
},
{
"timestamp" : ISODate("2018-05-17T20:29:33.051Z"),
"message" : "Teste"
},
{
"timestamp" : ISODate("2018-05-17T20:29:44.680Z"),
"message" : "OK"
}
]
}
Object 2
{
"_id" : ObjectId("5afde62a9194322980a1b751"),
"notificationName" : "Teste Agendamento Pontual",
"messages" : [
{
"timestamp" : ISODate("2018-05-17T20:29:33.045Z"),
"message" : "Teste Agendamento Pontual"
},
{
"timestamp" : ISODate("2018-05-17T20:29:33.051Z"),
"message" : "NOT OK"
},
{
"timestamp" : ISODate("2018-05-17T20:29:44.680Z"),
"message" : "asdsadasd"
}
]
}
...
I'm trying to get a result grouped by notificationName, with one count of objects that have OK messages and another count of objects that have NOT OK messages.
I think I need a $cond inside $group to check the message property, but I'm not sure about that, and I'm not sure that's the best way to achieve this.

If you want to count matching array elements, use $filter to select them and $size to count them:
db.collection.aggregate([
{ "$group": {
"_id": "$notificationName",
"countOk": {
"$sum": {
"$size": {
"$filter": {
"input": "$messages.message",
"cond": { "$eq": [ "$$this", "OK" ] }
}
}
}
}
}}
])
$filter has its own cond argument, a logical expression that returns a boolean determining whether each array element is kept. Here the $eq comparison keeps only the values from message that equal "OK", and $size then counts the members of the resulting array.
Because you are grouping, $group is the stage to use, and because you are accumulating, $sum adds up the $size results from every document sharing the same grouping key.
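The question also asks for a NOT OK count, which the pipeline above does not produce. A minimal sketch of one way to get both, assuming the same $filter approach is simply repeated per value: countOf is a hypothetical helper, and countInMemory is a plain-JavaScript stand-in that only illustrates what the pipeline computes. Like the answer above, it counts matching messages rather than documents.

```javascript
// Two sample documents shaped like the ones in the question (timestamps omitted).
const docs = [
  { notificationName: "Teste Agendamento Pontual",
    messages: [{ message: "Teste Agendamento Pontual" }, { message: "Teste" }, { message: "OK" }] },
  { notificationName: "Teste Agendamento Pontual",
    messages: [{ message: "Teste Agendamento Pontual" }, { message: "NOT OK" }, { message: "asdsadasd" }] }
];

// Hypothetical helper: builds the $sum/$size/$filter accumulator for one message value.
function countOf(value) {
  return {
    "$sum": {
      "$size": {
        "$filter": {
          "input": "$messages.message",
          "cond": { "$eq": ["$$this", value] }
        }
      }
    }
  };
}

// In the mongo shell this would be run as db.collection.aggregate(pipeline).
const pipeline = [
  { "$group": {
    "_id": "$notificationName",
    "countOk": countOf("OK"),
    "countNotOk": countOf("NOT OK")
  }}
];

// Plain-JavaScript stand-in for the pipeline, only to show what it computes.
function countInMemory(documents) {
  const groups = new Map();
  for (const doc of documents) {
    const group = groups.get(doc.notificationName) ||
      { _id: doc.notificationName, countOk: 0, countNotOk: 0 };
    group.countOk += doc.messages.filter(m => m.message === "OK").length;
    group.countNotOk += doc.messages.filter(m => m.message === "NOT OK").length;
    groups.set(doc.notificationName, group);
  }
  return [...groups.values()];
}

console.log(JSON.stringify(countInMemory(docs)));
```

If you need the number of documents containing at least one match instead, the $size result would have to be turned into 0 or 1 before summing.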

Related

MongoDb query to get max of field inside array

How to get the maximum of sections.Id in below document where collection._id = some parameter
{
"_id" : ObjectId("571c5c87faf473f40fd0317c"),
"name" : "test 1",
"sections" : [
{
"Id" : 1,
"name" : "first section"
},
{
"Id" : 2,
"name" : "section 2"
},
{
"Id" : 3,
"name" : "section 3"
}
]
}
I have tried the following:
db.collection.aggregate(
[
{
"$match": {
"_id": ObjectId("571c5c87faf473f40fd0317c")
}
},
{
"$group" : {
"_id" : "$_id",
"maxSectionId" : {"$max" : "$sections.Id"}
}
}
]);
But instead of returning a single maximum integer value, it returns an array of all the Ids in the sections array.
Furthermore, when the same query is executed in node.js it returns an empty array.
You can do this with a simple $project stage.
Something like this (note that it returns the whole sections element with the highest Id, not just the number):
db.collection.aggregate([
{ "$project": {
"maxSectionId": {
"$arrayElemAt": [
"$sections",
{
"$indexOfArray": [
"$sections.Id",
{ "$max": "$sections.Id" }
]
}
]
}
}}
])
Your aggregation query needs $unwind to open up the "sections" array.
Add this stage to your aggregation query:
{$unwind : "$sections"}
so that your refactored aggregation query looks like this:
db.collection.aggregate(
[
{$unwind : "$sections"},
{
"$match": {
"_id": ObjectId("571c5c87faf473f40fd0317c")
}
},
{
"$group" : {
"_id" : "$_id",
"maxSectionId" : {"$max" : "$sections.Id"}
}
}
]);
More information about $unwind: https://docs.mongodb.org/manual/reference/operator/aggregation/unwind/
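Replace $group with $project. The $max documentation describes the difference; the first statement below applies to $group and the second to $project: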
In the $group stage, if the expression resolves to an array, $max does not traverse the array and compares the array as a whole.
With a single expression as its operand, if the expression resolves to an array, $max traverses into the array to operate on the numerical elements of the array to return a single value
[sic]
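A minimal sketch of that suggestion, assuming MongoDB 3.2 or later (where $max can be used inside $project). maxSectionPipeline and maxSectionId are hypothetical names; the second is a plain-JavaScript stand-in that only shows what the $project stage computes.

```javascript
// Sample document shaped like the one in the question.
const doc = {
  name: "test 1",
  sections: [
    { Id: 1, name: "first section" },
    { Id: 2, name: "section 2" },
    { Id: 3, name: "section 3" }
  ]
};

// Hypothetical helper that builds the pipeline for a given _id value.
// In the mongo shell:
//   db.collection.aggregate(maxSectionPipeline(ObjectId("571c5c87faf473f40fd0317c")))
function maxSectionPipeline(id) {
  return [
    { "$match": { "_id": id } },
    // In $project, $max traverses the array produced by "$sections.Id".
    { "$project": { "maxSectionId": { "$max": "$sections.Id" } } }
  ];
}

// Plain-JavaScript stand-in for the $project stage.
function maxSectionId(document) {
  return Math.max(...document.sections.map(section => section.Id));
}

console.log(maxSectionId(doc));
```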

mongodb check if all subdocuments in array have the same value in one field

I have a collection of documents, each has a field which is an array of subdocuments, and all subdocuments have a common field 'status'. I want to find all documents that have the same status for all subdocuments.
collection:
{
"name" : "John",
"wives" : [
{
"name" : "Mary",
"status" : "dead"
},
{
"name" : "Anne",
"status" : "alive"
}
]
},
{
"name" : "Bill",
"wives" : [
{
"name" : "Mary",
"status" : "dead"
},
{
"name" : "Anne",
"status" : "dead"
}
]
},
{
"name" : "Mohammed",
"wives" : [
{
"name" : "Jane",
"status" : "dead"
},
{
"name" : "Sarah",
"status" : "dying"
}
]
}
I want to check if all wives are dead and find only Bill.
You can use the following aggregation query to get the records of people whose wives are all dead:
db.collection.aggregate([
{$project: {name:1, wives:1, size:{$size:'$wives'}}},
{$unwind:'$wives'},
{$match:{'wives.status':'dead'}},
{$group:{_id:'$_id',name:{$first:'$name'}, wives:{$push: '$wives'},size:{$first:'$size'},count:{$sum:1}}},
{$project:{_id:1, wives:1, name:1, cmp_value:{$cmp:['$size','$count']}}},
{$match:{cmp_value:0}}
])
Output:
{ "_id" : ObjectId("56d401de8b953f35aa92bfb8"), "name" : "Bill", "wives" : [ { "name" : "Mary", "status" : "dead" }, { "name" : "Anne", "status" : "dead" } ], "cmp_value" : 0 }
If you need to find records of users whose wives all have the same status, then you may remove the initial match stage.
The most efficient way to handle this is always going to be to "match" on the status of "dead" in the opening query, otherwise you are processing items that cannot possibly match. The logic then follows quite simply with $map and $allElementsTrue:
db.collection.aggregate([
{ "$match": { "wives.status": "dead" } },
{ "$redact": {
"$cond": {
"if": {
"$allElementsTrue": {
"$map": {
"input": "$wives",
"as": "wife",
"in": { "$eq": [ "$$wife.status", "dead" ] }
}
}
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
Or the same thing with $where:
db.collection.find({
"wives.status": "dead",
"$where": function() {
return this.wives.length
== this.wives.filter(function(el) {
return el.status == "dead";
}).length;
}
})
Both essentially test the "status" value of all elements to make sure they match in the fastest possible way. But the aggregate pipeline with just $match and $redact should be faster, and fewer pipeline stages (each essentially a pass through the data) means faster as well.
Of course keeping a property on the document is always fastest, but it would involve logic to set it only where "all elements" have the same status, which would typically mean inspecting the document by loading it from the server prior to each update.
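For comparison, the same "every element" test can also be phrased as "no element has a status other than dead" in a plain find() filter. This is an alternative technique not taken from the answers above, shown only as a sketch; allWivesDead is a hypothetical plain-JavaScript stand-in for what the filter selects.

```javascript
// Sample documents from the question.
const people = [
  { name: "John", wives: [{ name: "Mary", status: "dead" }, { name: "Anne", status: "alive" }] },
  { name: "Bill", wives: [{ name: "Mary", status: "dead" }, { name: "Anne", status: "dead" }] },
  { name: "Mohammed", wives: [{ name: "Jane", status: "dead" }, { name: "Sarah", status: "dying" }] }
];

// In the mongo shell this would be run as db.collection.find(allDeadFilter).
const allDeadFilter = {
  // At least one wife is dead (this also rules out empty arrays)...
  "wives.status": "dead",
  // ...and no wife has any other status.
  "wives": { "$not": { "$elemMatch": { "status": { "$ne": "dead" } } } }
};

// Plain-JavaScript stand-in for the filter.
function allWivesDead(person) {
  return person.wives.some(wife => wife.status === "dead") &&
    !person.wives.some(wife => wife.status !== "dead");
}

console.log(people.filter(allWivesDead).map(person => person.name));
```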

How to return array of string with mongodb aggregation

I need to return an array of strings with MongoDB aggregation. I did the following:
db.users.aggregate([{$group: {_id:"$emails.address"}}])
It returns:
{ "_id" : [ "a#a.com" ] }
{ "_id" : [ "b#a.com" ] }
{ "_id" : [ "c#a.com" ] }
Is there a way to return an array of strings like this one:
["a#a.com","b#a.com","c#a.com"]
Thank you very much to anyone taking the time to help me.
EDIT
Adding data:
{
"_id" : "ukn9MLo3hRYEpCCty",
"createdAt" : ISODate("2015-10-24T03:52:11.960Z"),
"emails" : [
{
"address" : "a#a.com",
"verified" : false
}
]
}
{
"_id" : "5SXRXraariyhRQACe",
"createdAt" : ISODate("2015-10-24T03:52:12.093Z"),
"emails" : [
{
"address" : "b#a.com",
"verified" : false
}
]
}
{
"_id" : "WMHWxeymY4ATWLXjz",
"createdAt" : ISODate("2015-10-24T03:52:12.237Z"),
"emails" : [
{
"address" : "c#a.com",
"verified" : false
}
]
}
The .aggregate() method always returns Objects no matter what you do and that cannot change.
For your purpose you are probably better off using .distinct() instead, which does just return an array of the distinct values:
db.users.distinct("emails.address");
Which is exactly your desired output:
["a#a.com","b#a.com","c#a.com"]
If you really want to use .aggregate() then the transformation to just the values needs to happen "outside" of the expression in post processing. And you should also be using $unwind when dealing with arrays like this.
You can do this with JavaScript .map() for example:
db.users.aggregate([
{ "$unwind": "$emails" },
{ "$group": { "_id": "$emails.address" } }
]).map(function(el) { return el._id })
Which gives the same output, but the .map() does the transformation client side instead of on the server.
What Blakes Seven has answered is right.
The .aggregate() method always returns Objects no matter what you do
and that cannot change.
However, that does not mean you cannot put them in an array and return the array in an object.
I believe that building the array in the database itself would be better, as your node server remains available for other requests.
db.users.aggregate([
{ "$unwind": "$emails" },
{ "$group": { "_id": null, emails:{$push:"$emails.address"} } },
{ "$project":{emails:true,_id:false}}
])
This will return:
{ "emails" : [ "a#a.com", "b#a.com", "c#a.com" ] }

How to get mongodb deeply embedded document id

I have the following mongo document, which is part of a bigger document called attributes, which also has Colour and Size
> db.attributes.find({'name': {'en-UK': 'Fabric'}}).pretty()
{
"_id" : ObjectId("543261cda14c971132fa2b91"),
"values" : [
{
"source" : [
{
"_id" : ObjectId("543261cda14c971132fa2b79"),
"name" : {
"en-UK" : "Combed Cotton"
}
},
],
"name" : [
{
"_id" : ObjectId("543261cda14c971132fa2b85"),
"name" : {
"en-UK" : "Brushed 3-ply"
}
},
{
"_id" : ObjectId("543261cda14c971132fa2b8f"),
"name" : {
"en-UK" : "Plain Weave"
}
},
{
"_id" : ObjectId("543261cda14c971132fa2b90"),
"name" : {
"en-UK" : "1x1 Rib"
}
}
]
}
],
"name" : {
"en-UK" : "Fabric"
}
}
I am trying to return the _id for a sub document and have the following:
db.attributes.aggregate([
{ '$match': {'name.en-UK': 'Fabric'} },
{ '$unwind' : '$values' },
{ '$project': { 'name' : '$values.name'} },
{ '$match': { '$and': [{"name.name.en-UK" : "1x1 Rib"} ] }}
])
What is the correct way to do this?
Also, the values of Fabric is an array with two items, source and name, but if I populate it like:
> db.attributes.find({'name': {'en-UK': 'Fabric'}}).pretty()
{
"_id" : ObjectId("543261cda14c971132fa2b91"),
"values" : {
"source" : [{ ... }]
"name": [{ ... }]
}
}
I get the following error
"errmsg" : "exception: $unwind: value at end of field path must be an array"
But if I wrap it inside square brackets then this works, so that
> db.attributes.find({'name': {'en-UK': 'Fabric'}}).pretty()
{
"_id" : ObjectId("543261cda14c971132fa2b91"),
"values" : [{
"source" : [{ ... }],
"name": [{ ... }]
}]
}
What am I missing? values is an array of two objects, source and name, each containing a list of arrays.
Any advice is much appreciated.
What you seem to be "missing" here is that "some" of your documents either do not contain a "values" property at all or, at the very least, it is "not an array". This is the basic context of the error you have been given.
Fortunately there are a couple of ways to get around this: either "testing" for the presence of an array when submitting your original query, or "substituting" some kind of array for the missing element when processing the pipeline.
Here are both approaches in what is effectively a redundant form, since the first $match condition really sorts this out:
db.attributes.aggregate([
{ "$match": {
"name.en-UK": "Fabric",
"values.0": { "$exists": true }
}},
{ "$project": {
"name": 1,
"values": { "$ifNull": [ "$values", [] ] }
}},
{ "$unwind": "$values" },
{ "$unwind": "$values.name" },
{ "$match": { "values.name.name.en-UK" : "1x1 Rib" }}
])
So, as I said, this is really redundant in that the initial $match already asks whether an "initial array element" exists, which means that there is an array there.
The second $project phase uses the $ifNull operator to "fill in" a value (basically an empty array) where the tested element does not exist. We tested for that anyway before, but this demonstrates the different approaches.
But the basic idea is either "avoiding" or "filling in" where your document does not have the expected data that you want to process, which is the cause of your error.
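To return only the sub-document _id the question asks for, a final $project can be appended. A minimal sketch, assuming the document shape from the question: subId is a hypothetical output field name, the ObjectId values are written as plain strings so the snippet runs outside the shell, and findSubIds is a plain-JavaScript stand-in for the pipeline.

```javascript
// Sample document shaped like the one in the question (ids as plain strings).
const attribute = {
  name: { "en-UK": "Fabric" },
  values: [
    { source: [{ _id: "543261cda14c971132fa2b79", name: { "en-UK": "Combed Cotton" } }],
      name: [
        { _id: "543261cda14c971132fa2b85", name: { "en-UK": "Brushed 3-ply" } },
        { _id: "543261cda14c971132fa2b8f", name: { "en-UK": "Plain Weave" } },
        { _id: "543261cda14c971132fa2b90", name: { "en-UK": "1x1 Rib" } }
      ] }
  ]
};

// In the mongo shell this would be run as db.attributes.aggregate(pipeline).
const pipeline = [
  { "$match": { "name.en-UK": "Fabric", "values.0": { "$exists": true } } },
  { "$unwind": "$values" },
  { "$unwind": "$values.name" },
  { "$match": { "values.name.name.en-UK": "1x1 Rib" } },
  // Keep only the embedded _id; "subId" is a made-up field name.
  { "$project": { "_id": 0, "subId": "$values.name._id" } }
];

// Plain-JavaScript stand-in for the pipeline.
function findSubIds(documents, attributeName, label) {
  const ids = [];
  for (const doc of documents) {
    if (doc.name["en-UK"] !== attributeName) continue;
    if (!Array.isArray(doc.values)) continue; // mirrors the "values.0" guard
    for (const value of doc.values) {
      for (const entry of value.name || []) {
        if (entry.name["en-UK"] === label) ids.push({ subId: entry._id });
      }
    }
  }
  return ids;
}

console.log(JSON.stringify(findSubIds([attribute], "Fabric", "1x1 Rib")));
```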

Aggregation framework flatten subdocument data with parent document

I am building a dashboard that rotates between different webpages. I want to pull all slides that are part of the "Test" deck and order them appropriately. After the query, my result would ideally look like this:
[
{ "url" : "http://10.0.1.187", "position": 1, "duartion": 10 },
{ "url" : "http://10.0.1.189", "position": 2, "duartion": 3 }
]
I currently have a dataset that looks like the following
{
"_id" : ObjectId("53a612043c24d08167b26f82"),
"url" : "http://10.0.1.189",
"decks" : [
{
"title" : "Test",
"position" : 2,
"duration" : 3
}
]
}
{
"_id" : ObjectId("53a6103e3c24d08167b26f81"),
"decks" : [
{
"title" : "Test",
"position" : 1,
"duration" : 2
},
{
"title" : "Other Deck",
"position" : 1,
"duration" : 10
}
],
"url" : "http://10.0.1.187"
}
My attempted query looks like:
db.slides.aggregate([
{
"$match": {
"decks.title": "Test"
}
},
{
"$sort": {
"decks.position": 1
}
},
{
"$project": {
"_id": 0,
"position": "$decks.position",
"duration": "$decks.duration",
"url": 1
}
}
]);
But it does not yield my desired results. How can I query my dataset and get my expected results in an optimal way?
Well, to truly "flatten" the document as your title suggests, $unwind is always going to be employed, as there really is no other way to do that. There are however some different approaches if you can live with the array being filtered down to the matching element.
Basically speaking, if you really only have one thing to match in the array then your fastest approach is to simply use .find() matching the required element and projecting:
db.slides.find(
{ "decks.title": "Test" },
{ "decks.$": 1 }
).sort({ "decks.position": 1 }).pretty()
That is still an array but as long as you have only one element that matches then this does work. Also the items are sorted as expected, though of course the "title" field is not dropped from the matched documents, as that is beyond the possibilities for simple projection.
{
"_id" : ObjectId("53a6103e3c24d08167b26f81"),
"decks" : [
{
"title" : "Test",
"position" : 1,
"duration" : 2
}
]
}
{
"_id" : ObjectId("53a612043c24d08167b26f82"),
"decks" : [
{
"title" : "Test",
"position" : 2,
"duration" : 3
}
]
}
Another approach, as long as you have MongoDB 2.6 or greater available, is using the $map operator and some others in order to both "filter" and re-shape the array "in-place" without actually applying $unwind:
db.slides.aggregate([
{ "$project": {
"url": 1,
"decks": {
"$setDifference": [
{
"$map": {
"input": "$decks",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.title", "Test" ] },
{
"position": "$$el.position",
"duration": "$$el.duration"
},
false
]
}
}
},
[false]
]
}
}},
{ "$sort": { "decks.position": 1 }}
])
The advantage there is that you can make the changes without "unwinding", which can reduce processing time with large arrays as you are not essentially creating new documents for every array member and then running a separate $match stage to "filter" or another $project to reshape.
{
"_id" : ObjectId("53a6103e3c24d08167b26f81"),
"decks" : [
{
"position" : 1,
"duration" : 2
}
],
"url" : "http://10.0.1.187"
}
{
"_id" : ObjectId("53a612043c24d08167b26f82"),
"url" : "http://10.0.1.189",
"decks" : [
{
"position" : 2,
"duration" : 3
}
]
}
You can again either live with the "filtered" array or if you want you can again "flatten" this truly by adding in an additional $unwind where you do not need to filter with $match as the result already contains only the matched items.
But generally speaking if you can live with it then just use .find() as it will be the fastest way. Otherwise what you are doing is fine for small data, or there is the other option for consideration.
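On MongoDB 3.2 or later, the $map and $setDifference combination can be replaced by $filter. A minimal sketch that filters first and then flattens with $unwind, assuming the document shape from the question; flattenDeck is a hypothetical plain-JavaScript stand-in that shows the result the pipeline is meant to produce.

```javascript
// Sample documents from the question (without _id).
const slides = [
  { url: "http://10.0.1.189",
    decks: [{ title: "Test", position: 2, duration: 3 }] },
  { url: "http://10.0.1.187",
    decks: [
      { title: "Test", position: 1, duration: 2 },
      { title: "Other Deck", position: 1, duration: 10 }
    ] }
];

// In the mongo shell this would be run as db.slides.aggregate(pipeline).
const pipeline = [
  { "$match": { "decks.title": "Test" } },
  { "$project": {
    "_id": 0,
    "url": 1,
    // Keep only the array elements for the wanted deck.
    "decks": {
      "$filter": {
        "input": "$decks",
        "as": "el",
        "cond": { "$eq": ["$$el.title", "Test"] }
      }
    }
  }},
  { "$unwind": "$decks" },
  { "$project": { "url": 1, "position": "$decks.position", "duration": "$decks.duration" } },
  { "$sort": { "position": 1 } }
];

// Plain-JavaScript stand-in for the pipeline.
function flattenDeck(documents, title) {
  return documents
    .flatMap(slide => slide.decks
      .filter(deck => deck.title === title)
      .map(deck => ({ url: slide.url, position: deck.position, duration: deck.duration })))
    .sort((a, b) => a.position - b.position);
}

console.log(JSON.stringify(flattenDeck(slides, "Test")));
```

Because the array is already filtered before $unwind, no second $match is needed after it.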
Well as soon as I posted I realized I should be using an $unwind. Is this query the optimal way to do it, or can it be done differently?
db.slides.aggregate([
{
"$unwind": "$decks"
},
{
"$match": {
"decks.title": "Test"
}
},
{
"$sort": {
"decks.position": 1
}
},
{
"$project": {
"_id": 0,
"position": "$decks.position",
"duration": "$decks.duration",
"url": 1
}
}
]);