Grouping with aggregation in MongoDB

I'm currently using aggregation in MongoDB. The documents in my collection have province and religion fields. I'm running this:
const data = await submit.aggregate([
{ "$group": { _id: { province: "$province" ,religion:"$religion"}, count: { $sum: 1 } } },
])
My output looks like this:
[
{ _id: { religion: 'a', province: 'aa' }, count: 1 },
{ _id: { religion: 'b', province: 'bb' }, count: 2 },
{ _id: { religion: 'c', province: 'bb'}, count: 2 },
{ _id: { religion: 'd', province: 'cc' }, count: 1 }
]
Expected output:
[
{ _id: { religion: 'a ' }, count: 1 },
{ _id: { religion: 'a' }, count: 1 },
{ _id: { religion: null }, count: 6 },
{ _id: { religion: 'c' }, count: 1 },
{ _id: { religion: 'd' }, count: 2 },
{ _id: { religion: 'e' }, count: 6 },
{ _id: { religion: 'f' }, count: 15 },
{ _id: { religion: 'g' }, count: 2 },
] [
{ _id: { province: 'aa' }, count: 19 },
{ _id: { province: 'bb' }, count: 2 },
{ _id: { province: 'cc' }, count: 21 },
]

You want two different $group operations over the same input at the same time, which is exactly what $facet is for. Think of $facet as a "multi-group." Given an input set similar to the following:
{ religion: 'a', province: 'aa' },
{ religion: 'b', province: 'aa' },
{ religion: 'c', province: 'aa' },
{ religion: 'c', province: 'bb' },
{ religion: 'd', province: 'bb' },
{ religion: 'e', province: 'cc' },
{ religion: 'f', province: 'aa' },
{ religion: 'f', province: 'aa' },
{ religion: 'f', province: 'aa' },
{ religion: 'f', province: 'cc' }
Then this pipeline:
db.foo.aggregate([
{$facet: {
"by_religion": [
{$group: {_id: '$religion', N:{$sum:1}}}
],
"by_province": [
{$group: {_id: '$province', N:{$sum:1}}}
],
}}
]);
yields this output:
{
"by_religion" : [
{
"_id" : "b",
"N" : 1
},
{
"_id" : "e",
"N" : 1
},
{
"_id" : "d",
"N" : 1
},
{
"_id" : "a",
"N" : 1
},
{
"_id" : "f",
"N" : 4
},
{
"_id" : "c",
"N" : 2
}
],
"by_province" : [
{
"_id" : "bb",
"N" : 2
},
{
"_id" : "cc",
"N" : 2
},
{
"_id" : "aa",
"N" : 6
}
]
}
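To make the semantics concrete, here is a plain JavaScript sketch (not MongoDB API) of what the two $facet branches compute over the sample input. The helper name countBy is hypothetical; each call stands in for one $group with {$sum: 1}, and both calls read the same unmodified input, the way each $facet sub-pipeline does.

```javascript
// Sample input copied from the answer above.
const docs = [
  { religion: 'a', province: 'aa' },
  { religion: 'b', province: 'aa' },
  { religion: 'c', province: 'aa' },
  { religion: 'c', province: 'bb' },
  { religion: 'd', province: 'bb' },
  { religion: 'e', province: 'cc' },
  { religion: 'f', province: 'aa' },
  { religion: 'f', province: 'aa' },
  { religion: 'f', province: 'aa' },
  { religion: 'f', province: 'cc' },
];

// Hypothetical helper: emulates {$group: {_id: '$<key>', N: {$sum: 1}}}.
function countBy(items, key) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item[key], (counts.get(item[key]) || 0) + 1);
  }
  return [...counts].map(([_id, N]) => ({ _id, N }));
}

// Each branch sees the full input, independently of the other branch.
const facetResult = {
  by_religion: countBy(docs, 'religion'),
  by_province: countBy(docs, 'province'),
};
console.log(JSON.stringify(facetResult, null, 2));
```

The element order inside each array differs from the server output above, which is expected: $group makes no ordering promise unless a $sort follows it.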
The OP also wants to refine the output further by turning data values into field names (data-as-LVAL). This is generally considered poor design practice, but it has some useful applications. Add this stage after $facet:
,{$project: {
// Reading this from the inside out:
// We use $map to turn the array of objects:
// [ {_id:'d',N:1},{_id:'f',N:4}, ... ]
// into an array of k-v pairs (array of arrays):
// [ ['d',1] , ['f',4] , ... ]
// That sets us up for $arrayToObject which will take
// that array of arrays and turn it into an object:
// {'d':1, 'f':4, ... }
// The target field name is the same as the input so
// we are simply overwriting the field.
"by_religion": {$arrayToObject: {$map: {
input: '$by_religion',
in: [ '$$this._id', '$$this.N' ]
}}
},
"by_province": {$arrayToObject: {$map: {
input: '$by_province',
in: [ '$$this._id', '$$this.N' ]
}}
}
}}
to yield:
{
"by_religion" : {
"d" : 1,
"b" : 1,
"c" : 2,
"f" : 4,
"a" : 1,
"e" : 1
},
"by_province" : {
"bb" : 2,
"cc" : 2,
"aa" : 6
}
}
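For readers who think in application code, the same two-step reshaping can be sketched in plain JavaScript (this is an analogy, not MongoDB API): Array.prototype.map plays the role of $map and Object.fromEntries plays the role of $arrayToObject. The variable names are illustrative only.

```javascript
// Shape produced by one $facet branch (values taken from the output above).
const byReligion = [
  { _id: 'b', N: 1 }, { _id: 'e', N: 1 }, { _id: 'd', N: 1 },
  { _id: 'a', N: 1 }, { _id: 'f', N: 4 }, { _id: 'c', N: 2 },
];

// $map step: each {_id, N} object becomes a [key, value] pair.
const pairs = byReligion.map((entry) => [entry._id, entry.N]);

// $arrayToObject step: the list of pairs collapses into one object.
const asObject = Object.fromEntries(pairs);
console.log(asObject);
```

As with the pipeline version, two entries with the same _id would collapse into one key, so this shape only makes sense after a $group has made the keys unique.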
A variation on the lval/rval workup uses this $project instead of the one immediately above:
,{$project: {
"by_religion": {$map: {
input: '$by_religion',
in: {$arrayToObject: [ [{k:'$$this._id',v:'$$this.N'}] ]}
}},
"by_province": {$map: {
input: '$by_province',
in: {$arrayToObject: [ [{k:'$$this._id',v:'$$this.N'}] ]}
}},
}}
which yields an array:
{
"by_religion" : [
{"b" : 1},
{"c" : 2},
{"a" : 1},
{"f" : 4},
{"d" : 1},
{"e" : 1}
],
"by_province" : [
{"cc" : 2},
{"aa" : 6},
{"bb" : 2}
]
}
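In plain JavaScript terms (again an analogy, not MongoDB API), this variation moves the object construction inside the map callback, so each element becomes its own single-key object and the result stays an array. A computed property name does the job that {$arrayToObject: [[{k, v}]]} does in the pipeline.

```javascript
// Shape produced by the by_province branch (values from the output above).
const byProvince = [
  { _id: 'bb', N: 2 }, { _id: 'cc', N: 2 }, { _id: 'aa', N: 6 },
];

// One single-key object per input element; the array itself is preserved.
const asSingleKeyObjects = byProvince.map((entry) => ({ [entry._id]: entry.N }));
console.log(asSingleKeyObjects);
```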

Related

Count nested and outer data

I have the following mongo data structure:
[
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b1',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b3',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a2",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
]
Assume this is the Session collection. Each session document has an _id and a libraryName key, plus an array of stage documents. Each stage document has an _id and a type. The _id values are irrelevant here. I want to count two things.
First, for each libraryName, I want to count how many session documents it has.
The solution for this query would be:
const services = await Session.aggregate(
[
{
$group: {
_id: "$libraryName",
count: { $sum: 1 },
},
}
]
);
Second, per libraryName, I want to count how many nested stage documents there are for each stage type.
So the final result I wish to retrieve is:
[
{
libraryName: 'a1',
count: 456,
stages: [
{
type: 'b1',
count: 43,
},
{
type: 'b2',
count: 44,
}
],
},
{
libraryName: 'a2',
count: 4546,
stages: [
{
type: 'b1',
count: 43
},
{
type: 'b3',
count: 44
}
]
}
]
The expected result was later changed to:
[
{
"_id": "a1",
"count": 2,
"stages": [
{
"count": 1,
"type": "b3"
},
{
"count": 3,
"type": "b1"
},
{
"count": 4,
"type": "b2"
}
]
},
{
"_id": "a2",
"count": 1,
"stages": [
{
"count": 1,
"type": "b1"
},
{
"count": 3,
"type": "b2"
}
]
}
]
Using the sample data in the question post and the aggregation query:
db.collection.aggregate([
{
$unwind: "$stages"
},
{
$group: {
_id: { libraryName: "$libraryName", type: "$stages.type" },
type_count: { "$sum": 1 }
}
},
{
$group: {
_id: { libraryName: "$_id.libraryName" },
count: { "$sum": "$type_count" },
stages: { $push: { type: "$_id.type", count: "$type_count" } }
}
},
{
$project: {
libraryName: "$_id.libraryName",
count: 1,
stages: 1,
_id: 0
}
}
])
I get the following results:
{
"libraryName" : "a2",
"count" : 4,
"stages" : [
{
"type" : "b1",
"count" : 1
},
{
"type" : "b2",
"count" : 3
}
]
}
{
"libraryName" : "a1",
"count" : 8,
"stages" : [
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
},
{
"type" : "b2",
"count" : 4
}
]
}
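The three steps of this pipeline can be traced in plain JavaScript (a sketch for illustration, not MongoDB API). The data below keeps only libraryName and the stage types from the question's sample documents; all variable names are hypothetical.

```javascript
const sessions = [
  { libraryName: 'a1', stages: [{ type: 'b1' }, { type: 'b2' }, { type: 'b3' }, { type: 'b1' }] },
  { libraryName: 'a1', stages: [{ type: 'b2' }, { type: 'b2' }, { type: 'b2' }, { type: 'b1' }] },
  { libraryName: 'a2', stages: [{ type: 'b2' }, { type: 'b2' }, { type: 'b2' }, { type: 'b1' }] },
];

// $unwind: one record per (session, stage) pair.
const unwound = sessions.flatMap((session) =>
  session.stages.map((stage) => ({ libraryName: session.libraryName, type: stage.type }))
);

// First $group: count records per (libraryName, type) pair.
const perPair = new Map();
for (const { libraryName, type } of unwound) {
  const key = JSON.stringify([libraryName, type]);
  perPair.set(key, (perPair.get(key) || 0) + 1);
}

// Second $group: fold the pair counts back into one entry per libraryName.
const perLibrary = new Map();
for (const [key, typeCount] of perPair) {
  const [libraryName, type] = JSON.parse(key);
  if (!perLibrary.has(libraryName)) {
    perLibrary.set(libraryName, { libraryName, count: 0, stages: [] });
  }
  const entry = perLibrary.get(libraryName);
  entry.count += typeCount; // sums stage counts, so this is a stage total
  entry.stages.push({ type, count: typeCount });
}
const stageCounts = [...perLibrary.values()];
console.log(JSON.stringify(stageCounts, null, 2));
```

Note that count here is the total number of stages per library (8 and 4), not the number of sessions, because every record was unwound before anything was counted.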
[ EDIT - ADD ]: This answer addresses the question's modified expected result. The query uses the question's sample documents as input.
db.collection.aggregate([
{
$group: {
_id: { libraryName: "$libraryName" },
count: { "$sum": 1 },
stages: { $push: "$stages" }
}
},
{
$unwind: "$stages"
},
{
$unwind: "$stages"
},
{
$group: {
_id: { libraryName: "$_id.libraryName", type: "$stages.type" },
type_count: { "$sum": 1 },
count: { $first: "$count" }
}
},
{
$group: {
_id: "$_id.libraryName",
count: { $first: "$count" },
stages: { $push: { type: "$_id.type", count: "$type_count" } }
}
},
])
The result:
{
"_id" : "a2",
"count" : 1,
"stages" : [
{
"type" : "b2",
"count" : 3
},
{
"type" : "b1",
"count" : 1
}
]
}
{
"_id" : "a1",
"count" : 2,
"stages" : [
{
"type" : "b2",
"count" : 4
},
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
}
]
}
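The key idea in this second pipeline is that the session count is taken before anything is unwound and then carried along with $first. Pushing each stages array in the first $group produces an array of arrays, which is why $unwind appears twice. A plain JavaScript sketch of the same bookkeeping (illustrative names, not MongoDB API):

```javascript
const sessions = [
  { libraryName: 'a1', stages: [{ type: 'b1' }, { type: 'b2' }, { type: 'b3' }, { type: 'b1' }] },
  { libraryName: 'a1', stages: [{ type: 'b2' }, { type: 'b2' }, { type: 'b2' }, { type: 'b1' }] },
  { libraryName: 'a2', stages: [{ type: 'b2' }, { type: 'b2' }, { type: 'b2' }, { type: 'b1' }] },
];

const summary = new Map();
for (const session of sessions) {
  if (!summary.has(session.libraryName)) {
    summary.set(session.libraryName, { _id: session.libraryName, count: 0, types: new Map() });
  }
  const entry = summary.get(session.libraryName);
  // First $group: one increment per session document, before any unwinding.
  entry.count += 1;
  // The two $unwind stages plus the per-type $group: one increment per stage.
  for (const stage of session.stages) {
    entry.types.set(stage.type, (entry.types.get(stage.type) || 0) + 1);
  }
}

// Final $group: emit one document per library with its nested stage counts.
const sessionSummary = [...summary.values()].map(({ _id, count, types }) => ({
  _id,
  count,
  stages: [...types].map(([type, n]) => ({ type, count: n })),
}));
console.log(JSON.stringify(sessionSummary, null, 2));
```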

Apply multistage grouping in MongoDb Aggregation Framework

Let's assume I have the following data:
[
{ name: "Clint", hairColor: "brown", shoeSize: 8, income: 20000 },
{ name: "Clint", hairColor: "blond", shoeSize: 9, income: 30000 },
{ name: "George", hairColor: "brown", shoeSize: 7, income: 30000 },
{ name: "George", hairColor: "blond", shoeSize: 8, income: 10000 },
{ name: "George", hairColor: "blond", shoeSize: 9, income: 20000 }
]
I want to have the following output:
[
{
name: "Clint",
counts: 2,
avgShoesize: 8.5,
shoeSizeByHairColor: [
{ _id: "brown", counts: 1, avgShoesize: 8 },
{ _id: "blond", counts: 1, avgShoesize: 9 },
],
incomeByHairColor: [
{ _id: "brown", counts: 1, avgIncome: 20000 },
{ _id: "blond", counts: 1, avgIncome: 30000 },
]
},
{
name: "George",
counts: 3,
avgShoesize: 8,
shoeSizeByHairColor: [
{ _id: "brown", counts: 1, avgShoesize: 7 },
{ _id: "blond", counts: 2, avgShoesize: 8.5 },
],
incomeByHairColor: [
{ _id: "brown", counts: 1, avgIncome: 30000 },
{ _id: "blond", counts: 2, avgIncome: 15000 },
],
}
]
Basically, I want to group my dataset by some key and then compute multiple groupings of each subset.
First I thought of applying a $group with the key name and then using $facet to get the various aggregations. I guess this will not work, since $facet does not operate on the subsets from the previous $group. If I use $facet first, I would need to split the result into multiple documents.
Any ideas how to properly solve my problem?
You need two $group stages. The first one aggregates by name and hairColor; the second one builds the nested arrays:
db.collection.aggregate([
{
$group: {
_id: { name: "$name", hairColor: "$hairColor" },
count: { $sum: 1 },
sumShoeSize: { $sum: "$shoeSize" },
avgShoeSize: { $avg: "$shoeSize" },
avgIncome: { $avg: "$income" },
docs: { $push: "$$ROOT" }
}
},
{
$group: {
_id: "$_id.name",
count: { $sum: "$count" },
sumShoeSize: { $sum: "$sumShoeSize" },
shoeSizeByHairColor: {
$push: {
_id: "$_id.hairColor", counts: "$count", avgShoeSize: "$avgShoeSize"
}
},
incomeByHairColor: {
$push: {
_id: "$_id.hairColor", counts: "$count", avgIncome: "$avgIncome"
}
}
}
},
{
$project: {
_id: 1,
count: 1,
avgShoeSize: { $divide: [ "$sumShoeSize", "$count" ] },
shoeSizeByHairColor: 1,
incomeByHairColor: 1
}
}
])
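The pipeline carries sumShoeSize and count through both groups and divides at the end, rather than averaging the per-colour averages. A small worked example in plain JavaScript (a sketch using only George's rows from the question; variable names are illustrative) shows why that matters when the groups have different sizes:

```javascript
// George's three documents from the question's sample data.
const george = [
  { hairColor: 'brown', shoeSize: 7 },
  { hairColor: 'blond', shoeSize: 8 },
  { hairColor: 'blond', shoeSize: 9 },
];

// First $group: per hair colour, keep the count and the sum of shoeSize.
const groups = {};
for (const person of george) {
  if (!groups[person.hairColor]) {
    groups[person.hairColor] = { count: 0, sumShoeSize: 0 };
  }
  groups[person.hairColor].count += 1;
  groups[person.hairColor].sumShoeSize += person.shoeSize;
}
const perColor = Object.values(groups).map((g) => ({
  count: g.count,
  sumShoeSize: g.sumShoeSize,
  avgShoeSize: g.sumShoeSize / g.count,
}));

// Second $group + $project: total sum divided by total count.
const totalCount = perColor.reduce((acc, g) => acc + g.count, 0);
const totalSum = perColor.reduce((acc, g) => acc + g.sumShoeSize, 0);
const overallAvg = totalSum / totalCount; // 24 / 3 = 8

// Naive alternative: averaging the averages weights both colours equally.
const avgOfAvgs =
  perColor.reduce((acc, g) => acc + g.avgShoeSize, 0) / perColor.length; // (7 + 8.5) / 2 = 7.75

console.log({ overallAvg, avgOfAvgs });
```

The two numbers agree only when every subgroup has the same size, as happens for Clint in the sample data.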
Phase 1: Group by name and hairColor and accumulate count, avgShoeSize, avgIncome, and hairColors.
Phase 2: Use the $map operator to build the incomeByHairColor and shoeSizeByHairColor arrays from the accumulated values.
Phase 3: Finally, group by name and accumulate incomeByHairColor, shoeSizeByHairColor, and count.
Pipeline:
db.users.aggregate([
{
$group :{
_id: {
name : "$name",
hairColor: "$hairColor"
},
count : {"$sum": 1},
avgShoeSize: {$avg: "$shoeSize"},
avgIncome : {$avg: "$income"},
hairColors : {$addToSet:"$hairColor" }
}
},
{
$project: {
_id:0,
name : "$_id.name",
hairColor: "$_id.hairColor",
count : "$count",
incomeByHairColor : {
$map: {
input: "$hairColors",
as: "key",
in: {
_id: "$$key",
counts: "$count",
avgIncome: "$avgIncome"
}
}
},
shoeSizeByHairColor:{
$map: {
input: "$hairColors",
as: "key",
in: {
_id: "$$key",
counts: "$count",
avgShoeSize: "$avgShoeSize"
}
}
}
}
},
{
$group: {
_id : "$name",
count : {$sum: "$count"},
incomeByHairColor: {$push : "$incomeByHairColor"},
shoeSizeByHairColor : {$push : "$shoeSizeByHairColor"}
}
}
]
)
Output:
/* 1 */
{
"_id" : "Clint",
"count" : 2,
"incomeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 1,
"avgIncome" : 30000
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgIncome" : 20000
}
]
],
"shoeSizeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 1,
"avgShoeSize" : 9
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgShoeSize" : 8
}
]
]
},
/* 2 */
{
"_id" : "George",
"count" : 3,
"incomeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 2,
"avgIncome" : 15000
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgIncome" : 30000
}
]
],
"shoeSizeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 2,
"avgShoeSize" : 8.5
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgShoeSize" : 7
}
]
]
}

Atomically move object by ID from one array to another in same document [duplicate]

I have data that looks like this:
{
"_id": ObjectId("4d525ab2924f0000000022ad"),
"arrayField": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 0, other: 235 },
{ id: 3, other: 765 }
],
"someOtherArrayField": []
}
Given a nested object's ID (0), I'd like to $pull the element from one array (arrayField) and $push it to another array (someOtherArrayField) within the same document. The result should look like this:
{
"_id": ObjectId("id"),
"arrayField": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 3, other: 765 }
],
"someOtherArrayField": [
{ id: 0, other: 235 }
]
}
I realize that I can accomplish this with a find followed by an update, i.e.
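db.foo.findOne({"_id": param._id})
.then((doc)=>{
// Look the element up by its id rather than by its position in the array.
const moved = doc.arrayField.find((el) => el.id === 0)
return db.foo.updateOne(
{
"_id": param._id
},
{
"$pull": {"arrayField": {id: 0}},
"$push": {"someOtherArrayField": moved}
}
)
})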
But I'm looking for an atomic operation like, in pseudocode, this:
db.foo.update({"_id": param._id}, {"$move": [{"arrayField": {id: 0}}, {"someOtherArrayField": 1}]})
Is there an atomic way to do this, perhaps using MongoDB 4.2's ability to specify a pipeline to an update command? How would that look?
I found this post that generously provided the data I used, but the provided solution isn't an atomic operation. Has an atomic solution become possible with MongoDB 4.2?
Here's an example:
> db.baz.find()
> db.baz.insert({
... "_id": ObjectId("4d525ab2924f0000000022ad"),
... "arrayField": [
... { id: 1, other: 23 },
... { id: 2, other: 21 },
... { id: 0, other: 235 },
... { id: 3, other: 765 }
... ],
... "someOtherArrayField": []
... })
WriteResult({ "nInserted" : 1 })
function extractIdZero(arrayFieldName) {
return {$arrayElemAt: [
{$filter: {input: arrayFieldName, cond: {$eq: ["$$this.id", 0]}}},
0
]};
}
extractIdZero("$arrayField")
{
"$arrayElemAt" : [
{
"$filter" : {
"input" : "$arrayField",
"cond" : {
"$eq" : [
"$$this.id",
0
]
}
}
},
0
]
}
db.baz.updateOne(
{_id: ObjectId("4d525ab2924f0000000022ad")},
[{$set: {
arrayField: {$filter: {
input: "$arrayField",
cond: {$ne: ["$$this.id", 0]}
}},
someOtherArrayField: {$concatArrays: [
"$someOtherArrayField",
[extractIdZero("$arrayField")]
]}
}}
])
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.baz.findOne()
{
"_id" : ObjectId("4d525ab2924f0000000022ad"),
"arrayField" : [
{
"id" : 1,
"other" : 23
},
{
"id" : 2,
"other" : 21
},
{
"id" : 3,
"other" : 765
}
],
"someOtherArrayField" : [
{
"id" : 0,
"other" : 235
}
]
}
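To see what the single $set stage computes, here is a plain JavaScript sketch of the same transformation (not MongoDB API; moveById is a hypothetical helper). Both new field values are derived from the document as it was before the stage ran, which is what lets the removal and the append happen in one update. The sketch assumes an element with the requested id exists.

```javascript
const before = {
  arrayField: [
    { id: 1, other: 23 },
    { id: 2, other: 21 },
    { id: 0, other: 235 },
    { id: 3, other: 765 },
  ],
  someOtherArrayField: [],
};

// Hypothetical helper mirroring the $set stage for a given id.
function moveById(doc, id) {
  // extractIdZero: $filter for the matches, then $arrayElemAt index 0.
  const moved = doc.arrayField.filter((el) => el.id === id)[0];
  return {
    ...doc,
    // $filter with $ne: keep every element except the one being moved.
    arrayField: doc.arrayField.filter((el) => el.id !== id),
    // $concatArrays: append the moved element to the target array.
    someOtherArrayField: doc.someOtherArrayField.concat([moved]),
  };
}

const after = moveById(before, 0);
console.log(JSON.stringify(after, null, 2));
```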

MongoDB aggregation, counting each item in array and grouping by item

I have an array containing duplicate items, like this:
[
'gg',
'bb',
'dd',
'cc',
'll',
'aa',
'cc',
'gg',
'bb',
'dd',
'cc',
'bb',
'dd',
'll',
'aa',
]
and what I want to return is this:
{
'gg': 2,
'bb': 3,
'dd': 3,
'cc': 3,
'll': 2,
'aa': 2,
}
Can this be done with MongoDB aggregation? I'd appreciate any help.
Use $unwind and $group as stages of the aggregation pipeline:
Query:
db.collection.aggregate([
{
$unwind: "$items"
},
{
$group: {
_id: "$items",
count: {
$sum: 1
}
}
}
])
Result:
{
"_id": "ll",
"count": 2
},
{
"_id": "gg",
"count": 2
},
{
"_id": "bb",
"count": 3
},
{
"_id": "cc",
"count": 3
},
{
"_id": "aa",
"count": 2
},
{
"_id": "dd",
"count": 3
}
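The pipeline returns one document per distinct item, while the question asked for a single object keyed by item. A plain JavaScript sketch (not MongoDB API; variable names are illustrative) of the counting step and of that final reshaping, using the array from the question:

```javascript
const items = [
  'gg', 'bb', 'dd', 'cc', 'll', 'aa', 'cc', 'gg',
  'bb', 'dd', 'cc', 'bb', 'dd', 'll', 'aa',
];

// $unwind + $group with {$sum: 1}: one counter per distinct item.
const grouped = new Map();
for (const item of items) {
  grouped.set(item, (grouped.get(item) || 0) + 1);
}

// Reshape the per-item counts into the single-object form asked for.
const countsObject = Object.fromEntries(grouped);
console.log(countsObject);
```

On the server side, the same reshaping could be done with $arrayToObject after collecting the grouped documents into one array; the client-side version above is simply the shortest way to show the intended shape.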
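A $group followed by a $match on the count also works well when you only want the values that occur more than once, for example duplicate emails: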
db.users.aggregate([
{
$group: {
_id: "$email",
count: { $sum: 1 }
}
},
{
$match: {
count: { $gt: 1 }
}
}
])
Output:
{ "_id" : "a@gmail.com", "count" : 2 }
{ "_id" : "b@gmail.com", "count" : 2 }
{ "_id" : "c@gmaiL.com", "count" : 8 }
{ "_id" : "d@gmail.com", "count" : 2 }
{ "_id" : "e@gmail.com", "count" : 2 }
{ "_id" : "f@gmail.com", "count" : 2 }

Find documents having a specific count of matches in an array

I've searched high and low but haven't been able to find what I'm looking for, so apologies if this has already been asked.
Consider the following documents:
{
_id: 1,
items: [
{
category: "A"
},
{
category: "A"
},
{
category: "B"
},
{
category: "C"
}]
},
{
_id: 2,
items: [
{
category: "A"
},
{
category: "B"
}]
},
{
_id: 3,
items: [
{
category: "A"
},
{
category: "A"
},
{
category: "A"
}]
}
I'd like to be able to find those documents which have more than 1 category "A" item in the items array. So this should find documents 1 and 3.
Is this possible?
Using aggregation
> db.spam.aggregate([
{$unwind: "$items"},
{$match: {"items.category" :"A"}},
{$group: {
_id: "$_id",
item: {$push: "$items.category"}, count: {$sum: 1}}
},
{$match: {count: {$gt: 1}}}
])
Output
{ "_id" : 3, "item" : [ "A", "A", "A" ], "count" : 3 }
{ "_id" : 1, "item" : [ "A", "A" ], "count" : 2 }
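As a cross-check of the expected result, the same per-document count can be sketched in plain JavaScript (not MongoDB API; variable names are illustrative): count the category "A" items inside each document, then keep the documents whose count exceeds 1.

```javascript
// Sample documents from the question.
const spam = [
  { _id: 1, items: [{ category: 'A' }, { category: 'A' }, { category: 'B' }, { category: 'C' }] },
  { _id: 2, items: [{ category: 'A' }, { category: 'B' }] },
  { _id: 3, items: [{ category: 'A' }, { category: 'A' }, { category: 'A' }] },
];

// $unwind + $match + $group: count the "A" items per document.
const counted = spam.map((doc) => ({
  _id: doc._id,
  count: doc.items.filter((item) => item.category === 'A').length,
}));

// Final $match: keep only documents with more than one "A" item.
const matches = counted.filter((doc) => doc.count > 1);
console.log(matches);
```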