MongoDB sum issue after match and group

Suppose I have a userDetails collection with these documents:
[
{
"roles": [
"author",
"reader"
],
"completed_roles": ["author", "reader"],
"address": {
"current_address": {
"city": "abc"
}
},
"is_verified": true
},
{
"roles": [
"reader"
],
"completed_roles": ["reader"],
"address": {
"current_address": {
"city": "abc"
}
},
"is_verified": true
},
{
"roles": [
"author"
],
"completed_roles": [],
"address": {
"current_address": {
"city": "xyz"
}
},
"is_verified": false
}
]
I want to count, per city, the documents whose roles include "author", and for those documents also get total_roles_completed and is_verified.
So the output should look like:
[
{
"_id": {
"city": "abc"
},
"total_author": 1,
"total_roles_completed": 1,
"is_verified": 1
},
{
"_id": {
"city": "xyz"
},
"total_author": 1,
"total_roles_completed": 0,
"is_verified": 0
}
]
Basic output required:
Filter the documents to those with author in roles (other roles may be present, but author must be present)
Count the authors per city
Count the documents whose completed_roles contains "author"
Count the documents that are verified.
For this I tried:
db.userDetails.aggregate([
{
$match: {
roles: {
$elemMatch: {
$eq: "author"
}
}
}
},
{
$unwind: "$completed_roles"
},
{
"$group": {
_id: { city: "$address.current_address.city"},
total_authors: {$sum: 1},
total_roles_completed: {
$sum: {
$cond: [
{
$eq: ["$completed_roles","author"]
},
1,
0
]
}
},
is_verified: {
$sum: {
$cond: [
{
$eq: ["$is_verified",true]
},
1,
0
]
}
}
}
}
]);
But the sums are incorrect. Please let me know where I made a mistake. If anyone needs further information, please let me know.
Edit: I figured out that the $unwind is what gives me the incorrect values; if I remove the $unwind, the sums come out correct.
Is there another way to calculate total_roles_completed for each city?

If I've understood correctly, you can try this query:
First, $match to get only the documents where roles contains author.
Then $group by the city (the posted document is not valid JSON, so I assume the structure is address: { current_address: { city: "abc" } }). This $group counts the authors for each city, adds 1 to a sum whenever "author" is in completed_roles, and collects the is_verified values.
I don't know how you want to decide whether a city counts as verified (I don't know whether is_verified can be true in one document and false in another; if it has the same value in all documents, you can use $first to take the first is_verified value). I decided to use $allElementsTrue in a $project stage, so the result is true only if is_verified is true in every document grouped by $group.
db.collection.aggregate([
{
"$match": {
"roles": "author"
}
},
{
"$group": {
"_id": "$address.current_address.city",
"total_author": {
"$sum": 1
},
"total_roles_completed": {
"$sum": {
"$cond": {
"if": {
"$in": [
"author",
"$completed_roles"
]
},
"then": 1,
"else": 0
}
}
},
"is_verified": {
"$addToSet": "$is_verified"
}
}
},
{
"$project": {
"_id": 0,
"city": "$_id",
"is_verified": {
"$allElementsTrue": "$is_verified"
},
"total_author": 1,
"total_roles_completed": 1
}
}
])
The result from this query is:
[
{
"city": "xyz",
"is_verified": false,
"total_author": 1,
"total_roles_completed": 0
},
{
"city": "abc",
"is_verified": true,
"total_author": 2,
"total_roles_completed": 2
}
]
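To see why the $unwind in the question skews the sums, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code: array methods stand in for $match, $unwind and $group, the helper name groupByCity is made up for illustration, and it runs on the three sample documents from the question. Unwinding completed_roles produces one row per array element, so a document with two completed roles is counted twice and a document with an empty array is dropped entirely.

```javascript
// In-memory sketch only: plain JavaScript standing in for the aggregation stages.
const userDetails = [
  { roles: ["author", "reader"], completed_roles: ["author", "reader"],
    address: { current_address: { city: "abc" } }, is_verified: true },
  { roles: ["reader"], completed_roles: ["reader"],
    address: { current_address: { city: "abc" } }, is_verified: true },
  { roles: ["author"], completed_roles: [],
    address: { current_address: { city: "xyz" } }, is_verified: false },
];

// $match: keep documents whose roles array contains "author".
const authors = userDetails.filter((d) => d.roles.includes("author"));

// $unwind: one row per array element; an empty array yields no rows at all.
const unwound = authors.flatMap((d) =>
  d.completed_roles.map((role) => ({ ...d, completed_roles: role }))
);

// $group by city with three sums (groupByCity is a hypothetical helper).
function groupByCity(rows, hasCompletedAuthor) {
  const out = {};
  for (const row of rows) {
    const city = row.address.current_address.city;
    if (!out[city]) {
      out[city] = { total_author: 0, total_roles_completed: 0, is_verified: 0 };
    }
    out[city].total_author += 1;
    out[city].total_roles_completed += hasCompletedAuthor(row) ? 1 : 0;
    out[city].is_verified += row.is_verified === true ? 1 : 0;
  }
  return out;
}

// With $unwind: each row carries a single role string.
const withUnwind = groupByCity(unwound, (r) => r.completed_roles === "author");
// Without $unwind: test array membership instead, like $in does.
const withoutUnwind = groupByCity(authors, (r) => r.completed_roles.includes("author"));

console.log(withUnwind);
console.log(withoutUnwind);
```

Testing membership on the array (the $in approach) keeps one row per document, which is why the per-city counts stay tied to the number of documents.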

Related

MongoDB Aggregate Query to find the documents with missing values

I have a huge collection of objects storing data for different employees.
{
"employee": "Joe",
"areAllAttributesMatched": false,
"characteristics": [
{
"step": "A",
"name": "house",
"score": "1"
},
{
"step": "B",
"name": "car"
},
{
"step": "C",
"name": "job",
"score": "3"
}
]
}
There are cases where the score for an object is completely missing, and I want to find all of these in the database.
To do this, I have written the following query, but I seem to be going wrong somewhere, because it does not display any output.
I want the data in the following format for this query, so that it is easy to find out which employee is missing the score for which step and which name.
db.collection.aggregate([
{
"$unwind": "$characteristics"
},
{
"$match": {
"characteristics.score": {
"$exists": false
}
}
},
{
"$project": {
"employee": 1,
"name": "$characteristics.name",
"step": "$characteristics.step",
_id: 0
}
}
])
You need to use $exists to check whether the field exists.
You can use $ifNull to handle both cases: 1. the score field is missing, 2. score is null.
db.collection.aggregate([
{
"$unwind": "$characteristics"
},
{
"$match": {
$expr: {
$eq: [
{
"$ifNull": [
"$characteristics.score",
null
]
},
null
]
}
}
},
{
"$group": {
_id: null,
documents: {
$push: {
"employee": "$employee",
"name": "$characteristics.name",
"step": "$characteristics.step",
}
}
}
},
{
$project: {
_id: false
}
}
])
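The difference between the two answers ($exists: false versus the $ifNull check) can be sketched in memory with plain JavaScript. This is only an illustration, not MongoDB code: the function missingScores and the second document ("Ann", with an explicit null score) are hypothetical and do not come from the question.

```javascript
// In-memory sketch; plain arrays stand in for the collection.
const employees = [
  { employee: "Joe", areAllAttributesMatched: false, characteristics: [
    { step: "A", name: "house", score: "1" },
    { step: "B", name: "car" },
    { step: "C", name: "job", score: "3" },
  ] },
  // Hypothetical extra document (not in the question) with an explicit null score.
  { employee: "Ann", areAllAttributesMatched: false, characteristics: [
    { step: "A", name: "house", score: null },
  ] },
];

// Mimics $unwind + $match + $project on the characteristics array.
// includeNull = false behaves like { $exists: false } (field absent only);
// includeNull = true behaves like the $ifNull check (field absent or null).
function missingScores(docs, includeNull) {
  return docs.flatMap((doc) =>
    doc.characteristics
      .filter((c) => (includeNull ? c.score == null : !("score" in c)))
      .map((c) => ({ employee: doc.employee, name: c.name, step: c.step }))
  );
}

const onlyAbsent = missingScores(employees, false);
const absentOrNull = missingScores(employees, true);
console.log(onlyAbsent);
console.log(absentOrNull);
```

If scores can be stored as null as well as left out, the second variant is the one that reports both kinds of gap.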

Filter documents that have id in another collection in MongoDB with aggregation framework

I have two collections, collectionA and collectionB,
with the following documents:
db={
"collectiona": [
{
"_id": "6173ddf33ed09368a094e68a",
"title": "a"
},
{
"_id": "61wefdf33ed09368a094e6dc",
"title": "b"
},
{
"_id": "61wefdfewf09368a094ezzz",
"title": "c"
},
],
"collectionb": [
{
"_id": "6173ddf33ed0wef368a094zq",
"collectionaID": "6173ddf33ed09368a094e68a",
"data": [
{
"userID": "123",
"visibility": false,
"response": false
},
{
"userID": "2345",
"visibility": true,
"response": true
}
]
},
{
"_id": "6173ddf33ed09368awef4e68g",
"collectionaID": "61wefdf33ed09368a094e6dc",
"data": [
{
"userID": "5678",
"visibility": false,
"response": false
},
{
"userID": "674",
"visibility": true,
"response": false
}
]
}
]
}
What I need is the documents from collection A that have response false in collection B,
and the documents should be sorted so that the ones with visibility false come first, followed by the ones with visibility true.
For example, userID 123 should get 3 documents:
{
"_id": "6173ddf33ed09368a094e68a",
"title": "a"
},
{
"_id": "61wefdf33ed09368a094e6dc",
"title": "b"
},
{
"_id": "61wefdfewf09368a094ezzz",
"title": "c"
},
whereas userID 2345 should get two:
{
"_id": "61wefdf33ed09368a094e6dc",
"title": "b"
},
{
"_id": "61wefdfewf09368a094ezzz",
"title": "c"
},
User 674 will receive 3 objects from collection A, but the second one will come last because it has visibility true for that document:
{
"_id": "6173ddf33ed09368a094e68a",
"title": "a"
},
{
"_id": "61wefdfewf09368a094ezzz",
"title": "c"
},
{
"_id": "61wefdf33ed09368a094e6dc",
"title": "b"
},
MongoDB Playground link : https://mongoplayground.net/p/3rLry0FPlw-
Really appreciate the help. Thanks
You can start from collectionA:
$lookup the collectionB records related to the specified user
filter out documents according to response
assign a helper sortrank field based on the visibility and whether collectionaID is a match
$sort according to sortrank
project the helper fields away to get back to the raw collection A documents
db.collectiona.aggregate([
{
"$lookup": {
"from": "collectionb",
let: {
aid: "$_id"
},
"pipeline": [
{
$unwind: "$data"
},
{
$match: {
$expr: {
$and: [
{
$eq: [
"$data.userID",
"2345"
]
},
{
$eq: [
"$collectionaID",
"$$aid"
]
}
]
}
}
}
],
"as": "collB"
}
},
{
$match: {
"collB.data.response": {
$ne: true
}
}
},
{
"$unwind": {
path: "$collB",
preserveNullAndEmptyArrays: true
}
},
{
"$addFields": {
"sortrank": {
"$cond": {
"if": {
$eq: [
"$collB.data.visibility",
false
]
},
"then": 1,
"else": {
"$cond": {
"if": {
$eq: [
"$collB.collectionaID",
"$_id"
]
},
"then": 3,
"else": 2
}
}
}
}
}
},
{
$sort: {
sortrank: 1
}
},
{
$project: {
collB: false,
sortrank: false
}
}
])
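The ranking logic of that pipeline can be checked with a small in-memory sketch in plain JavaScript. It is not MongoDB code: the function name documentsFor is made up, the _id fields of collectionb are left out, and it assumes each user has at most one entry per collectionb document (the pipeline itself would emit one row per entry after $unwind).

```javascript
// In-memory sketch of the $lookup / $match / sortrank / $sort steps.
const collectiona = [
  { _id: "6173ddf33ed09368a094e68a", title: "a" },
  { _id: "61wefdf33ed09368a094e6dc", title: "b" },
  { _id: "61wefdfewf09368a094ezzz", title: "c" },
];
const collectionb = [
  { collectionaID: "6173ddf33ed09368a094e68a", data: [
    { userID: "123", visibility: false, response: false },
    { userID: "2345", visibility: true, response: true },
  ] },
  { collectionaID: "61wefdf33ed09368a094e6dc", data: [
    { userID: "5678", visibility: false, response: false },
    { userID: "674", visibility: true, response: false },
  ] },
];

// documentsFor is a hypothetical helper name.
function documentsFor(userID) {
  return collectiona
    .map((a) => {
      // $lookup: this user's entries in collectionb rows pointing at document a.
      const entries = collectionb
        .filter((b) => b.collectionaID === a._id)
        .flatMap((b) => b.data.filter((d) => d.userID === userID));
      return { a, entries };
    })
    // $match: drop documents where the user already has response true.
    .filter(({ entries }) => !entries.some((d) => d.response === true))
    // sortrank: 1 = visibility false, 2 = no entry for this user, 3 = visibility true.
    .map(({ a, entries }) => {
      const entry = entries[0]; // assumption: at most one entry per user
      const sortrank = !entry ? 2 : entry.visibility === false ? 1 : 3;
      return { a, sortrank };
    })
    // Array.prototype.sort is stable, so ties keep the collection order.
    .sort((x, y) => x.sortrank - y.sortrank)
    .map(({ a }) => a);
}

console.log(documentsFor("123").map((d) => d.title));
console.log(documentsFor("2345").map((d) => d.title));
console.log(documentsFor("674").map((d) => d.title));
```

Note that the tie order among equal sortrank values relies on JavaScript's stable sort here; in MongoDB you would probably want a second sort key if that order matters.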

MongoDB: Assign document objects to field in '$project' stage

I have a user collection:
[
{"_id": 1,"name": "John", "age": 25, "valid_user": true}
{"_id": 2, "name": "Bob", "age": 40, "valid_user": false}
{"_id": 3, "name": "Jacob","age": 27,"valid_user": null}
{"_id": 4, "name": "Amelia","age": 29,"valid_user": true}
]
I run a '$facet' stage on this collection. Check out this MongoPlayground.
I want to talk about the first output from the facet stage. The following is the response currently:
{
"user_by_valid_status": [
{
"_id": false,
"count": 1
},
{
"_id": true,
"count": 2
},
{
"_id": null,
"count": 1
}
]
}
However, I want to restructure the output in this way:
"analytics": {
"invalid_user": {
"_id": false
"count": 1
},
"valid_user": {
"_id": true
"count": 2
},
"user_with_unknown_status": {
"_id": null
"count": 1
}
}
The problem with using a '$project' stage along with 'arrayElemAt' is that the order may not be definite for me to associate an index with an attribute like 'valid_users' or others. Also, it gets further complicated because unlike the sample documents that I have shared, my collection may not always contain all the three categories of users.
Is there some way I can do this?
You can use the $switch conditional operator:
$project puts the _id and count fields into an object v, and uses k to hold the key name chosen by the $switch condition
db.collection.aggregate([
{
"$facet": {
"user_by_valid_status": [
{
"$group": {
"_id": "$valid_user",
"count": { "$sum": 1 }
}
},
{
$project: {
_id: 0,
v: { _id: "$_id", count: "$count" },
k: {
$switch: {
branches: [
{ case: { $eq: ["$_id", null] }, then: "user_with_unknown_status" },
{ case: { $eq: ["$_id", false] }, then: "invalid_user" },
{ case: { $eq: ["$_id", true] }, then: "valid_user" }
]
}
}
}
}
],
"users_above_30": [{ "$match": { "age": { "$gt": 30 } } }]
}
},
// $project stage at the root: convert the user_by_valid_status array to an object using $arrayToObject
{
$project: {
analytics: { $arrayToObject: "$user_by_valid_status" },
users_above_30: 1
}
}
])
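What the $switch plus $arrayToObject combination does to the grouped array can be sketched in plain JavaScript. This is an in-memory illustration, not MongoDB code: keyFor and toAnalytics are hypothetical names, and the "other" fallback is an assumption added for the sketch (the pipeline above has no default branch).

```javascript
// In-memory sketch: turn the grouped array into a keyed object.
const userByValidStatus = [
  { _id: false, count: 1 },
  { _id: true, count: 2 },
  { _id: null, count: 1 },
];

// Plays the role of the $switch branches (keyFor is a hypothetical helper).
function keyFor(id) {
  if (id === null) return "user_with_unknown_status";
  if (id === false) return "invalid_user";
  if (id === true) return "valid_user";
  return "other"; // assumption: fallback not present in the pipeline
}

// Plays the role of $arrayToObject on [{ k, v }, ...] pairs.
function toAnalytics(groups) {
  return Object.fromEntries(
    groups.map((g) => [keyFor(g._id), { _id: g._id, count: g.count }])
  );
}

const analytics = toAnalytics(userByValidStatus);
console.log(analytics);

// Categories that are absent from the input are simply absent from the output.
const partial = toAnalytics([{ _id: true, count: 5 }]);
console.log(Object.keys(partial));
```

Because each key is derived from the group's own _id, the result does not depend on the order of the array or on all three categories being present.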

MongoDB multiple counts, single document, arrays

I have been searching on Stack Overflow and cannot find exactly what I am looking for, and I hope someone can help. I want to submit a single query and get multiple counts back for a single document, based on an array in that document.
My data:
db.myCollection.insertOne({
"_id": "1",
"age": 30,
"items": [
{
"id": "1",
"isSuccessful": true,
"name": null
},{
"id": "2",
"isSuccessful": true,
"name": null
},{
"id": "3",
"isSuccessful": true,
"name": "Bob"
},{
"id": "4",
"isSuccessful": null,
"name": "Todd"
}
]
});
db.myCollection.insertOne({
"_id": "2",
"age": 22,
"items": [
{
"id": "6",
"isSuccessful": true,
"name": "Jeff"
}
]
});
What I need back is the document and the counts associated to the items array for said document. In this example where the document _id = "1":
{
"_id": "1",
"age": 30,
{
"totalIsSuccessful" : 2,
"totalNotIsSuccessful": 1,
"totalSuccessfulNull": 1,
"totalNameNull": 2
}
}
I have found that I can get this in 4 queries using something like this below, but I would really like it to be one query.
db.test1.aggregate([
{ $match : { _id : "1" } },
{ "$project": {
"total": {
"$size": {
"$filter": {
"input": "$items",
"cond": { "$eq": [ "$$this.isSuccessful", true ] }
}
}
}
}}
])
Thanks in advance.
I am assuming your expected result is invalid, since you have an object literal in the middle of another object, and you also have totalIsSuccessful for id:1 as 2 where it seems it should be 3. With that said ...
you can get similar output via $unwind and then grouping with $sum and $cond:
db.collection.aggregate([
{ $match: { _id: "1" } },
{ $unwind: "$items" },
{ $group: {
_id: "$_id",
age: { $first: "$age" },
totalIsSuccessful: { $sum: { $cond: [{ "$eq": [ "$items.isSuccessful", true ] }, 1, 0 ] } },
totalNotIsSuccessful: { $sum: { $cond: [{ "$ne": [ "$items.isSuccessful", true ] }, 1, 0 ] } },
totalSuccessfulNull: { $sum: { $cond: [{ "$eq": [ "$items.isSuccessful", null ] }, 1, 0 ] } },
totalNameNull: { $sum: { $cond: [ { "$eq": [ "$items.name", null ]}, 1, 0] } } }
}
])
The output would be this:
[
{
"_id": "1",
"age": 30,
"totalIsSuccessful": 3,
"totalNameNull": 2,
"totalNotIsSuccessful": 1,
"totalSuccessfulNull": 1
}
]
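As a cross-check of those four counts, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code, and the helper name count is made up for illustration. It filters the items array once per condition, which is the same idea as the $size and $filter attempt in the question, only with all four counts computed side by side.

```javascript
// In-memory sketch: four conditional counts over one document's items array.
const doc = {
  _id: "1",
  age: 30,
  items: [
    { id: "1", isSuccessful: true, name: null },
    { id: "2", isSuccessful: true, name: null },
    { id: "3", isSuccessful: true, name: "Bob" },
    { id: "4", isSuccessful: null, name: "Todd" },
  ],
};

// count is a hypothetical helper: number of items satisfying a predicate.
const count = (items, predicate) => items.filter(predicate).length;

const summary = {
  _id: doc._id,
  age: doc.age,
  totalIsSuccessful: count(doc.items, (i) => i.isSuccessful === true),
  totalNotIsSuccessful: count(doc.items, (i) => i.isSuccessful !== true),
  totalSuccessfulNull: count(doc.items, (i) => i.isSuccessful === null),
  totalNameNull: count(doc.items, (i) => i.name === null),
};

console.log(summary);
```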

Check if embed exists - aggregation framework mongodb

This is my test collection:
>db.test.find()
{
"_id": ObjectId("54906479e89cdf95f5fb2351"),
"reports": [
{
"desc": "xxx",
"order": {"$id": ObjectId("53fbede62827b89e4f86c12e")}
}
]
},
{
"_id": ObjectId("54906515e89cdf95f5fb2352"),
"reports": [
{
"desc": "xxx"
}
]
},
{
"_id": ObjectId("549067d3e89cdf95f5fb2353"),
"reports": [
{
"desc": "xxx"
}
]
}
I want to count all documents and documents with order, so:
>db.test.aggregate({
$group: {
_id: null,
all: {
$sum: 1
},
order: {
$sum: {
"$cond": [
{
"$ifNull": ["$reports.order", false]
},
1,
0
]
}
}
}
})
and my results:
{
"result" : [
{
"_id" : null,
"all" : 3,
"order" : 3
}
],
"ok" : 1
}
but expected:
{
"result" : [
{
"_id" : null,
"all" : 3,
"order" : 1
}
],
"ok" : 1
}
It makes no difference what I put there ("$reports.order", "$reports.xxx", etc.): the aggregation framework only checks whether the field reports exists and ignores the embedded field.
Do $ifNull and $eq not work with embedded documents?
Is there any way to do something like this
db.test.find({"reports.order": {$exists: 1}})
in aggregation framework?
Sorry for my English, and I hope you understood what I want to show you :)
I think it doesn't work because the field "reports" contains an array, not an object.
I mean, your aggregation works as you expect on this collection:
>db.test.find()
{
"_id": ObjectId("54906479e89cdf95f5fb2351"),
"reports":
{
"desc": "xxx",
"order": {"$id": ObjectId("53fbede62827b89e4f86c12e")}
}
},
{
"_id": ObjectId("54906515e89cdf95f5fb2352"),
"reports":
{
"desc": "xxx"
}
},
{
"_id": ObjectId("549067d3e89cdf95f5fb2353"),
"reports":
{
"desc": "xxx"
}
}
Note that I removed the "[" and "]", so now it's an object, not an array (a one-to-one relation).
Because you have an array inside the "reports" field, you need to unwind the array to output one document for each element. I suppose that if you have two "order" fields inside the "reports" array, you only want to count the document once. I mean:
"reports": [
{
"desc": "xxx",
"order": {"$id": ObjectId("53fbede62827b89e4f86c12e")}
},
{
"order": "yyy"
}
]
This should count only once in the final "order" sum.
In this case, you need to unwind, group by _id (because the previous example outputs two documents for the same _id), and then group again to count all documents:
db.test.aggregate([
{$unwind: '$reports'},
{$group:{
_id:"$_id",
order:{$sum:{"$cond": [
{
"$ifNull": ["$reports.order", false]
},
1,
0
]
}
}
}},
{$group:{
_id:null,
all:{$sum:1},
order: {
$sum:{
"$cond": [{$eq: ['$order', 0]}, 0, 1]
}
}
}}])
Maybe there is a shorter solution, but this works.
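A short in-memory sketch in plain JavaScript may help show why the original count came out as 3. This is not MongoDB code: resolveOrders is a hypothetical helper, and the ObjectId values are written as plain strings. When reports is an array, the path "$reports.order" resolves to an array of the order values that were found, which is an empty array when no element has one. An empty array is neither null nor missing, so $ifNull returns it unchanged and $cond treats it as true.

```javascript
// In-memory sketch; ObjectId values are shown as plain strings.
const docs = [
  { _id: "54906479e89cdf95f5fb2351",
    reports: [{ desc: "xxx", order: { $id: "53fbede62827b89e4f86c12e" } }] },
  { _id: "54906515e89cdf95f5fb2352", reports: [{ desc: "xxx" }] },
  { _id: "549067d3e89cdf95f5fb2353", reports: [{ desc: "xxx" }] },
];

// Roughly what "$reports.order" yields for an array field:
// the order values of the elements that have one, possibly [].
const resolveOrders = (doc) =>
  doc.reports.filter((r) => "order" in r).map((r) => r.order);

// Naive check: an array is never null, so every document is counted.
const naiveCount = docs.filter((d) => resolveOrders(d) != null).length;

// Per-element check: count a document once if any report has an order.
const correctCount = docs.filter((d) =>
  d.reports.some((r) => r.order != null)
).length;

console.log(docs.map(resolveOrders));
console.log(naiveCount, correctCount);
```

The per-element check is the in-memory counterpart of unwinding, grouping by _id, and then counting the documents whose per-document sum is not 0.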