How to use nested grouping in MongoDB - mongodb

I need to find the total count of duplicate profiles per organization. I have documents like the ones shown below:
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "75"
},
"_id" : "1"
},
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "75"
},
"_id" : "2"
},
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "77"
},
"_id" : "3"
},
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "77"
},
"_id" : "4"
}
I have written a query that groups by ProfileId and OrganizationId. The results I am getting are shown below:
Organization Total
10 2
10 2
But I want the sum of the totals per organization, which means Org 10 should have one row with a sum of 4.
The query I am using is shown below:
db.getSiblingDB("dbName").OrgProfile.aggregate(
{ $project: { _id: 1, P: "$Profile._id", O: "$OrganizationId" } },
{ $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} },
{ $match: { c: { $gt: 1 } } });
Any ideas? Please help.

The following pipeline should give you the desired output. The final $project stage is purely cosmetic: it renames _id to Organization and is not needed for the computation itself, so you may omit it.
db.getCollection('yourCollection').aggregate([
{
$group: {
_id: { org: "$OrganizationId", profile: "$Profile._id" },
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.org",
Total: {
$sum: {
$cond: {
if: { $gte: ["$count", 2] },
then: "$count",
else: 0
}
}
}
}
},
{
$project: {
_id: 0,
Organization: "$_id",
Total: 1
}
}
])
This gives the following output:
{
"Total" : 4.0,
"Organization" : 10
}
To filter out organizations without duplicates, you can use $match, which also simplifies the second $group stage:
...aggregate([
{
$group: {
_id: { org: "$OrganizationId", profile: "$Profile._id" },
count: { $sum: 1 }
}
},
{
$match: {
count: { $gte: 2 }
}
},
{
$group: {
_id: "$_id.org",
Total: { $sum: "$count" }
}
},
{
$project: {
_id: 0,
Organization: "$_id",
Total: 1
}
}
])
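To see how the two variants differ, the sketch below re-implements the stages in plain JavaScript (no MongoDB needed). The helper names and organization 20, which has no duplicate profiles, are assumptions added for illustration; only the organization 10 rows come from the question.

```javascript
// Sample rows: organization 10 is from the question, organization 20 is hypothetical.
const docs = [
  { OrganizationId: 10, Profile: { _id: "75" } },
  { OrganizationId: 10, Profile: { _id: "75" } },
  { OrganizationId: 10, Profile: { _id: "77" } },
  { OrganizationId: 10, Profile: { _id: "77" } },
  { OrganizationId: 20, Profile: { _id: "80" } },
];

// Stage 1: count documents per (organization, profile) pair.
function countPerPair(rows) {
  const counts = new Map();
  for (const d of rows) {
    const key = JSON.stringify([d.OrganizationId, d.Profile._id]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, count]) => ({ org: JSON.parse(key)[0], count }));
}

// Stage 2: sum the counts per organization.
function totalPerOrg(pairs) {
  const totals = {};
  for (const p of pairs) totals[p.org] = (totals[p.org] || 0) + p.count;
  return totals;
}

const pairs = countPerPair(docs);

// $cond variant: non-duplicates contribute 0, so every organization is kept.
const withCond = totalPerOrg(
  pairs.map(p => ({ org: p.org, count: p.count >= 2 ? p.count : 0 }))
);

// $match variant: pairs with count < 2 are dropped before the second group.
const withMatch = totalPerOrg(pairs.filter(p => p.count >= 2));

console.log(withCond);  // organization 10 -> 4, organization 20 -> 0
console.log(withMatch); // organization 10 -> 4, organization 20 absent
```

In other words, the $cond version reports organizations without duplicates with a total of 0, while the $match version leaves them out of the result entirely.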

I think I have a solution for you. In the last step, instead of matching, I think you want another $group.
.aggregate([
{ $project: { _id: 1, P: "$Profile._id", O: "$OrganizationId" } }
,{ $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} }
,{ $group: { _id: "$_id.o" , c: { $sum: "$c" } }}
]);
You can probably work out what the last step does, but I'll explain just in case. The last step groups all documents that have the same organization id and sums the counts stored in the c field by the previous stage. After the first group, you had two documents that both had a count c of 2 but different profile ids. The next group ignores the profile id, groups the documents by organization id, and adds up their counts.
When I ran this query, I got the following result, which I think is what you're looking for:
{
"_id" : 10,
"c" : 4
}
Hope this helps. Let me know if you have any questions.
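One caveat worth checking: this pipeline no longer filters on c > 1, so a profile that appears only once still adds to the total. The plain JavaScript trace below starts from the output of the first $group for the question's data, plus one hypothetical non-duplicate (profile "80"), to show the difference. It is a sketch of the arithmetic, not of MongoDB itself, and the function name is made up.

```javascript
// Counts per (profile, organization) after the first $group.
// The first two rows match the question's data; the third is a hypothetical non-duplicate.
const afterFirstGroup = [
  { _id: { p: "75", o: 10 }, c: 2 },
  { _id: { p: "77", o: 10 }, c: 2 },
  { _id: { p: "80", o: 10 }, c: 1 },
];

// Second $group: sum c per organization, optionally keeping only duplicates.
function sumPerOrg(rows, onlyDuplicates) {
  const totals = {};
  for (const r of rows) {
    if (onlyDuplicates && r.c <= 1) continue; // mirrors { $match: { c: { $gt: 1 } } }
    totals[r._id.o] = (totals[r._id.o] || 0) + r.c;
  }
  return totals;
}

const allDocs = sumPerOrg(afterFirstGroup, false);       // organization 10 -> 5
const duplicatesOnly = sumPerOrg(afterFirstGroup, true); // organization 10 -> 4
console.log(allDocs, duplicatesOnly);
```

If only duplicated profiles should count, keeping the $match stage between the two $group stages gives the second result.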

Related

MongoDB count number of non-missing fields

I'm using the following code to calculate average and standard deviation of a field named "b" in my collection.
db.ctg.aggregate(
[
{
$group:
{
_id: "b",
avg: { $avg: "$b" },
stdev: { $stdDevPop: "$b" }
}
}
]
)
The result is:
{ "_id" : "b", "avg" : 878.4397930385701, "stdev" : 893.8744489449962 }
I need to add number of non missing elements of "b" to my result so it looks like this:
{ "_id" : "b", "avg" : 878.4397930385701, "stdev" : 893.8744489449962, "nonmissing": 2126 }
How can I do this in the query above?
The results of $avg and $stdDevPop don't change when documents without b are removed ($avg ignores all documents where the field is missing or non-numeric), so you can try the query below.
Query:
db.ctg.aggregate([
{ $match: { b: { $exists: true } } },
{
$group:
{
_id: "b",
avg: { $avg: "$b" },
stdev: { $stdDevPop: "$b" },
nonMissing: { $sum: 1 }
}
}
])
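As a quick sanity check of that reasoning, the same computation can be written in plain JavaScript. The sample values are made up, and the sketch assumes every b that is present is numeric: $exists: true also matches documents where b is null or a string, which $avg would skip but $sum: 1 would still count.

```javascript
// Hypothetical documents: two of them have no "b" field.
const docs = [{ b: 2 }, { b: 4 }, { a: 1 }, { b: 6 }, {}];

// Equivalent of { $match: { b: { $exists: true } } }.
const present = docs.filter(d => "b" in d);

const values = present.map(d => d.b);
const nonMissing = values.length;                            // { $sum: 1 }
const avg = values.reduce((s, v) => s + v, 0) / nonMissing;  // { $avg: "$b" }

// Population standard deviation, which is what $stdDevPop computes.
const stdev = Math.sqrt(
  values.reduce((s, v) => s + (v - avg) ** 2, 0) / nonMissing
);

console.log({ avg, stdev, nonMissing });
```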

Mongodb Query Aggregation and Groupby complex filter , sum , percent query

I have a complex group query.
Data is as follows:
Aggregation as follows:
match by doc_id
group by name
project: name, name_count, amount, desc as { value: identified by the max sum of amount in that list of desc, count: sum of (percent*100)^2, percent: its percentage of the amount in that list }
same with L1 and L2. But L1 and L2 are referenced fields {_id, name} from another collection, so I need to project both _id and name in addition to what I do in point 3 above.
So after execution, let's say the result would be:
...
},
"_id" : {
"name" : "abc"
},
"amount" : 45.0,
"count" : 4.0,
"desc" : {
"value" : "Laptop", // based on highest sum amount in group:'abc' i.e. 25.0 for laptop
"count" : 5061.72, // (56*100)^2 + (44*100)^2
"percent" : 25.0*100/45.0 = 56.0
},
...
Test Data Link: MongoDB Playground
Updated: 07/11/2019
Added example for calculating count
Hope I was clear. Kindly help.
I don't understand the calculation you need for count. However, here is a query you can adapt to your needs:
db.collection.aggregate([
{
$match: {
"doc_id": 1
}
},
{
$group: {
_id: {
name: "$name",
desc: "$desc"
},
amount: {
$sum: "$amount"
},
count: {
$sum: 1
},
}
},
{
$sort: {
"_id.name": 1,
"amount": -1
}
},
{
$group: {
_id: "$_id.name",
amount: {
$sum: "$amount"
},
count: {
$sum: "$count"
},
desc: {
$first: {
value: "$_id.desc",
descAmount: "$amount"
}
}
},
},
{
$addFields: {
"desc.percent": {
$multiply: [
{
$divide: [
"$desc.descAmount",
"$amount"
]
},
100
]
}
}
}
])
The trick is to group twice, with a sort in between, to get the sub-totals and the first element (the one with the biggest sub-total for each name).
Now you can adapt your count calculation as needed.
You can test it here.
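The group, sort, group pattern can also be traced in plain JavaScript. The rows below are hypothetical (the question's data is not reproduced here), chosen so that group "abc" totals 45 with Laptop at 25, matching the numbers in the expected output.

```javascript
// Hypothetical rows, already filtered to doc_id 1.
const rows = [
  { name: "abc", desc: "Laptop", amount: 10 },
  { name: "abc", desc: "Laptop", amount: 15 },
  { name: "abc", desc: "Phone", amount: 20 },
];

// First group: sub-total and row count per (name, desc).
const sub = new Map();
for (const r of rows) {
  const key = JSON.stringify([r.name, r.desc]);
  const cur = sub.get(key) || { name: r.name, desc: r.desc, amount: 0, count: 0 };
  cur.amount += r.amount;
  cur.count += 1;
  sub.set(key, cur);
}

// Sort by name ascending, then sub-total descending, so the largest sub-total comes first.
const sorted = [...sub.values()].sort(
  (a, b) => a.name.localeCompare(b.name) || b.amount - a.amount
);

// Second group: total per name; the first row seen for a name supplies desc (like $first).
const byName = {};
for (const s of sorted) {
  const g = byName[s.name] || (byName[s.name] = {
    amount: 0,
    count: 0,
    desc: { value: s.desc, descAmount: s.amount },
  });
  g.amount += s.amount;
  g.count += s.count;
}

// $addFields: share of the top desc in the group's total, in percent.
for (const g of Object.values(byName)) {
  g.desc.percent = (g.desc.descAmount / g.amount) * 100;
}

console.log(JSON.stringify(byName.abc));
```

With these rows the percent comes out at roughly 55.56, i.e. 25 * 100 / 45.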

How can I query MongoDB to count the documents where two fields are equal?

I have a collection in MongoDB for my survey results (name=surveyresults). I want a query that gives me the number of correct answers per category, for example: category "Bee", number of correct answers 10.
I tried different approaches, but none of them gave the results I want.
I searched and found the post "Group count with MongoDB using aggregation framework" useful, but it does not work for me.
This is part of the data in my surveyResults collection:
[{"_id":"0eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJfaWQiOjE5LCJpYXQiOjE1MjQwMDgzOTl9.2YvhnXtCD7-fm4B14k10m6NF7xuv7moCTbekVekkbvY","category":"Wasp","photo":"A_wasp_565","description":"","answer":"Bee","__v":0},{"_id":"1eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJfaWQiOjE5LCJpYXQiOjE1MjQwMDgzOTl9.2YvhnXtCD7-fm4B14k10m6NF7xuv7moCTbekVekkbvY","category":"Wasp","photo":"A_Pompilid_wasp_007","description":"","answer":"Wasp","__v":0},{"_id":"2eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJfaWQiOjE5LCJpYXQiOjE1MjQwMDgzOTl9.2YvhnXtCD7-fm4B14k10m6NF7xuv7moCTbekVekkbvY","category":"Wasp","photo":"wasp_248","description":"","answer":"Wasp","__v":0},{"_id":"3eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJfaWQiOjE5LCJpYXQiOjE1MjQwMDgzOTl9.2YvhnXtCD7-fm4B14k10m6NF7xuv7moCTbekVekkbvY","category":"Fly","photo":"A_butterfly_291","description":"kjlkjlkjlk","answer":"Moth/Butterfly","__v":0},
I want a result like this:
[{"category":"Fly","count":3, "correct":1},{"category":"Wasp","count":3, "correct":1},{"category":"Moth/Butterfly","count":4, "correct":2},{"category":"Bee","count":3, "correct":1}]
Right now I have these two queries, but neither gives me the correct results:
1.
SurveyResults.aggregate([
{ $group: {
_id: { answer: '$answer', category: '$category' }
}},
{ $group: {
_id: '$_id.answer',
answer_correct: { $sum: 1 }
}},
{ $project: {
_id: 0,
answer: '$_id',
answer_correct: 1
}}
]).exec(callback);
2.
SurveyResults.aggregate([
{
$group:{
_id:"$answer",
count: { $sum : {$cond : { if: { $eq: ["answer", "$category"]}, then: 1, else: 0} }}
}
}]).exec(callback);
I can also get the number of results grouped by answer with this query:
SurveyResults.aggregate([
{
$group:{
_id:"$answer",
count: { $sum : 1 }
}
}]).exec(callback);
Results:
[{"_id":"Don't know","count":2},{"_id":"Fly","count":3},{"_id":"Wasp","count":3},{"_id":"Moth/Butterfly","count":4},{"_id":"Bee","count":3}]
Here's what you want:
SurveyResults.aggregate([{
$group: {
_id: "$category",
"count": { $sum: 1 }, // simply count all questions per category
"correct": {
$sum: { // and sum up the correct ones in a field called "correct"
$cond: [ // ...where "correct ones" means
{ $eq: [ "$category", "$answer" ] }, // that "category" needs to match "answer"
1,
0
]
}
}
}
}, {
$project: { // this is just to effectively rename the "_id" field into "category" - may or may not be needed
_id: 0,
"category": "$_id",
"count": "$count",
"correct": "$correct"
}
}]).exec(callback);
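To check the logic without a database, the same conditional sum can be run in plain JavaScript over the four sample documents from the question (ids, photos, and descriptions omitted). This is only a trace of what the $group stage computes, not Mongoose code.

```javascript
// The four sample documents from the question, reduced to the relevant fields.
const results = [
  { category: "Wasp", answer: "Bee" },
  { category: "Wasp", answer: "Wasp" },
  { category: "Wasp", answer: "Wasp" },
  { category: "Fly", answer: "Moth/Butterfly" },
];

const stats = {};
for (const r of results) {
  const s = stats[r.category] ||
    (stats[r.category] = { category: r.category, count: 0, correct: 0 });
  s.count += 1;                                  // { $sum: 1 }
  s.correct += r.category === r.answer ? 1 : 0;  // $cond on $eq
}

const summary = Object.values(stats);
console.log(summary);
// Wasp: count 3, correct 2; Fly: count 1, correct 0
```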

Convert to lowercase in group aggregation

I want to return an aggregate of blog post tags and their total count. My blog posts are stored like so:
{
"_id" : ObjectId("532c323bb07ab5aace243c8e"),
"title" : "Fitframe.js - Responsive iframes made easy",
"tags" : [
"JavaScript",
"jQuery",
"RWD"
]
}
I'm then executing the following pipeline:
printjson(db.posts.aggregate(
{
$project: {
tags: 1,
count: { $add: 1 }
}
},
{
$unwind: '$tags'
},
{
$group: {
_id: '$tags',
count: {
$sum: '$count'
},
tags_lower: { $toLower: '$tags' }
}
},
{
$sort: {
_id: 1
}
}
));
So that the results are sorted correctly I need to sort on a lowercase version of each tag. However, when executing the above code I get the following error:
aggregate failed: {
"errmsg" : "exception: unknown group operator '$toLower'",
"code" : 15952,
"ok" : 0
}
Do I need to do another projection to add the lowercase tag?
Yes, you must add it to the projection. It will not work as a field of $group: only accumulator operators such as $sum ( http://docs.mongodb.org/manual/reference/operator/aggregation-group/ ) are accepted as $group operators at that level of the stage.
You don't need to add another projection ... you could fix it when you do the $group:
db.posts.aggregate(
{
$project: {
tags: 1,
count: { $add: 1 }
}
},
{
$unwind: '$tags'
},
{
$group: {
_id: { tag: '$tags', lower: { $toLower : '$tags' } },
count: {
$sum: '$count'
}
}
},
{
$sort: {
"_id.lower": 1
}
}
)
In the above example, I've preserved the original name and added the lower case version to the _id.
Add another projection step between $unwind and $group:
...
{$project: {
tags: {$toLower: '$tags'},
count: 1
}}
...
Then remove tags_lower from $group.
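The two suggestions behave differently when the same tag occurs with different capitalization. The plain JavaScript sketch below, using made-up tags and a hypothetical countBy helper, shows that grouping on { tag, lower } keeps "jQuery" and "jquery" as separate rows, while lowercasing before the group merges them.

```javascript
// Hypothetical unwound tags, including one case variant.
const tags = ["jQuery", "JavaScript", "jquery", "RWD"];

// Count items per key, like $group with { $sum: 1 }.
function countBy(items, keyOf) {
  const counts = new Map();
  for (const t of items) {
    const key = keyOf(t);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Plain string comparison, used for sorting.
const compare = (a, b) => (a < b ? -1 : a > b ? 1 : 0);

// Group on the original tag, sort on its lowercase form (the { tag, lower } approach).
const keepCase = [...countBy(tags, t => t)]
  .map(([tag, count]) => ({ tag, lower: tag.toLowerCase(), count }))
  .sort((a, b) => compare(a.lower, b.lower));

// Lowercase first, then group (the extra $project approach).
const merged = [...countBy(tags, t => t.toLowerCase())]
  .map(([tag, count]) => ({ tag, count }))
  .sort((a, b) => compare(a.tag, b.tag));

console.log(keepCase); // four rows, case variants kept apart
console.log(merged);   // three rows, "jquery" counted twice
```

Which one is preferable depends on whether case variants should be reported as one tag or as several.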

How to retrieve data from related documents by ID?

I'm trying to understand how to set up basic relations in mongoDB. I've read a bit about it in the documentation but it's a little terse.
This should be pretty simple: I'm trying to log a list of impressions and the users who are responsible for them. Here are some examples of log documents:
{type: '1', userId:'xxx-12345'}
{type: '1', userId:'xxx-12345'}
{type: '1', userId:'xxx-12345'}
{type: '2', userId:'zzz-84638'}
{type: '2', userId:'xxx-12345'}
Here's an example of a user document:
{userId: 'xxx-12345', location: 'US'}
Is there a way to count the total number of documents which "belong" to a userId of xxx-12345, where type is 1?
In the above case, I'd want to see a result like { '1':3, '2':1 }.
Also, is the above an acceptable way of creating the relationships?
For your first question ("Is there a way to count the total number of documents which "belong" to a userId of xxx-12345, where type is 1?"), here is a solution:
db.impressions.aggregate({
$match: {
userId: 'xxx-12345',
type: '1' // type is stored as a string in the sample documents
}
},
{
$group: { _id: null, count: { $sum: 1 } }
});
To get the result grouped by type, as in the format you specified ({ '1':3, '2':1 }), use the code below:
db.impressions.aggregate({
$match: {
userId: 'xxx-12345',
}
},
{
$group: { _id: '$type', totalImpressions: { $sum: 1 } }
});
You can use the Aggregation Pipeline introduced in version 2.2:
db.a.aggregate([
{ $match: { userId: 'xxx-12345' } },
{ $group: { _id: "$type", total: { $sum: 1 } } }
])
This will output:
{
"result" : [
{
"_id" : "2",
"total" : 1
},
{
"_id" : "1",
"total" : 3
}
],
"ok" : 1
}
where "_id" is the type and "total" is the number of times that type appears for user "xxx-12345".
However, if you want to get only the total number of documents which belong to "xxx-12345" where the type is "1" you can do it like this:
db.a.aggregate([
{ $match: { userId: 'xxx-12345', type: "1" } },
{ $group: { _id: null, count: { $sum: 1} } }
])
which will output the following:
{ "result" : [ { "_id" : null, "count" : 3 } ], "ok" : 1 }
where "count" is what you're looking for.
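Because the log documents store type as a string, the value in $match has to be a string too; a numeric 1 would match nothing. Here is a plain JavaScript trace of both pipelines over the question's five log documents (a sketch of the counting logic only, with made-up variable names):

```javascript
// The five log documents from the question.
const logs = [
  { type: "1", userId: "xxx-12345" },
  { type: "1", userId: "xxx-12345" },
  { type: "1", userId: "xxx-12345" },
  { type: "2", userId: "zzz-84638" },
  { type: "2", userId: "xxx-12345" },
];

// $match on userId, then $group by type with { $sum: 1 }.
const perType = {};
for (const l of logs.filter(l => l.userId === "xxx-12345")) {
  perType[l.type] = (perType[l.type] || 0) + 1;
}

// $match on userId and type; strict equality mirrors MongoDB not matching "1" with 1.
const countString = logs.filter(l => l.userId === "xxx-12345" && l.type === "1").length;
const countNumber = logs.filter(l => l.userId === "xxx-12345" && l.type === 1).length;

console.log(perType);     // type "1" -> 3, type "2" -> 1
console.log(countString); // 3
console.log(countNumber); // 0
```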