I'm using aggregation in MongoDB. The documents in my collection have a province field and a religion field. I'm doing this:
const data = await submit.aggregate([
{ "$group": { _id: { province: "$province" ,religion:"$religion"}, count: { $sum: 1 } } },
])
My output looks like this:
[
{ _id: { religion: 'a', province: 'aa' }, count: 1 },
{ _id: { religion: 'b', province: 'bb' }, count: 2 },
{ _id: { religion: 'c', province: 'bb'}, count: 2 },
{ _id: { religion: 'd', province: 'cc' }, count: 1 }
]
Expected output:
[
{ _id: { religion: 'a ' }, count: 1 },
{ _id: { religion: 'a' }, count: 1 },
{ _id: { religion: null }, count: 6 },
{ _id: { religion: 'c' }, count: 1 },
{ _id: { religion: 'd' }, count: 2 },
{ _id: { religion: 'e' }, count: 6 },
{ _id: { religion: 'f' }, count: 15 },
{ _id: { religion: 'g' }, count: 2 },
]
[
{ _id: { province: 'aa' }, count: 19 },
{ _id: { province: 'bb' }, count: 2 },
{ _id: { province: 'cc' }, count: 21 },
]
You want two different $group operations at the same time, which is exactly what $facet is for. Think of $facet as a "multi-group". Given an input set similar to the following:
{ religion: 'a', province: 'aa' },
{ religion: 'b', province: 'aa' },
{ religion: 'c', province: 'aa' },
{ religion: 'c', province: 'bb' },
{ religion: 'd', province: 'bb' },
{ religion: 'e', province: 'cc' },
{ religion: 'f', province: 'aa' },
{ religion: 'f', province: 'aa' },
{ religion: 'f', province: 'aa' },
{ religion: 'f', province: 'cc' }
Then this pipeline:
db.foo.aggregate([
{$facet: {
"by_religion": [
{$group: {_id: '$religion', N:{$sum:1}}}
],
"by_province": [
{$group: {_id: '$province', N:{$sum:1}}}
],
}}
]);
yields this output:
{
"by_religion" : [
{
"_id" : "b",
"N" : 1
},
{
"_id" : "e",
"N" : 1
},
{
"_id" : "d",
"N" : 1
},
{
"_id" : "a",
"N" : 1
},
{
"_id" : "f",
"N" : 4
},
{
"_id" : "c",
"N" : 2
}
],
"by_province" : [
{
"_id" : "bb",
"N" : 2
},
{
"_id" : "cc",
"N" : 2
},
{
"_id" : "aa",
"N" : 6
}
]
}
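For readers who want to check the counts without a server, here is a minimal plain-JavaScript sketch of what the two $facet branches compute on the sample input. The helper name countBy is hypothetical (it is not a MongoDB or driver API), and the order of the entries may differ from what the server returns.

```javascript
// countBy is a hypothetical helper that mimics {$group: {_id: '$field', N: {$sum: 1}}}.
function countBy(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Shape each entry like a $group result document: { _id, N }.
  return Array.from(counts, ([_id, N]) => ({ _id, N }));
}

// Same sample input as above.
const docs = [
  { religion: 'a', province: 'aa' },
  { religion: 'b', province: 'aa' },
  { religion: 'c', province: 'aa' },
  { religion: 'c', province: 'bb' },
  { religion: 'd', province: 'bb' },
  { religion: 'e', province: 'cc' },
  { religion: 'f', province: 'aa' },
  { religion: 'f', province: 'aa' },
  { religion: 'f', province: 'aa' },
  { religion: 'f', province: 'cc' },
];

// Each "facet" runs its own grouping over the same input set.
const faceted = {
  by_religion: countBy(docs, 'religion'),
  by_province: countBy(docs, 'province'),
};

console.log(JSON.stringify(faceted, null, 2));
```

The point of the sketch is only that both groupings read the same input independently, which is what $facet gives you in a single pipeline.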
The OP wants to refine the output further by turning data values into field names (data-as-LVAL). Although this is generally considered poor design practice, it has some useful applications. Add this stage after $facet:
,{$project: {
// Reading this from the inside out:
// We use $map to turn the array of objects:
// [ {_id:'d',N:1},{_id:'f',N:4}, ... ]
// into an array of k-v pairs (array of arrays):
// [ ['d',1] , ['f',4] , ... ]
// That sets us up for $arrayToObject which will take
// that array of arrays and turn it into an object:
// {'d':1, 'f':4, ... }
// The target field name is the same as the input so
// we are simply overwriting the field.
"by_religion": {$arrayToObject: {$map: {
input: '$by_religion',
in: [ '$$this._id', '$$this.N' ]
}}
},
"by_province": {$arrayToObject: {$map: {
input: '$by_province',
in: [ '$$this._id', '$$this.N' ]
}}
}
}}
to yield:
{
"by_religion" : {
"d" : 1,
"b" : 1,
"c" : 2,
"f" : 4,
"a" : 1,
"e" : 1
},
"by_province" : {
"bb" : 2,
"cc" : 2,
"aa" : 6
}
}
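The same reshaping can be sketched in plain JavaScript, which may help when reading the $map / $arrayToObject nesting. The helper name groupsToObject is hypothetical; Object.fromEntries plays the role of $arrayToObject here.

```javascript
// Hypothetical helper mirroring $map followed by $arrayToObject.
function groupsToObject(groups) {
  // [ {_id:'d',N:1}, ... ]  ->  [ ['d',1], ... ]  ->  { d: 1, ... }
  return Object.fromEntries(groups.map((g) => [g._id, g.N]));
}

// The by_province array from the $facet output above.
const byProvince = [
  { _id: 'bb', N: 2 },
  { _id: 'cc', N: 2 },
  { _id: 'aa', N: 6 },
];

const provinceCounts = groupsToObject(byProvince);
console.log(provinceCounts); // { bb: 2, cc: 2, aa: 6 }
```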
A variation on the lval/rval workup uses this $project instead of the one immediately above:
,{$project: {
"by_religion": {$map: {
input: '$by_religion',
in: {$arrayToObject: [ [{k:'$$this._id',v:'$$this.N'}] ]}
}},
"by_province": {$map: {
input: '$by_province',
in: {$arrayToObject: [ [{k:'$$this._id',v:'$$this.N'}] ]}
}},
}}
which yields an array:
{
"by_religion" : [
{"b" : 1},
{"c" : 2},
{"a" : 1},
{"f" : 4},
{"d" : 1},
{"e" : 1}
],
"by_province" : [
{"cc" : 2},
{"aa" : 6},
{"bb" : 2}
]
}
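In plain JavaScript terms, this variation keeps the array and turns each element into a one-key object. A minimal sketch, with a hypothetical helper name:

```javascript
// Hypothetical helper: one single-key object per group, array preserved.
function groupsToSingleKeyObjects(groups) {
  return groups.map((g) => ({ [g._id]: g.N }));
}

const byProvinceGroups = [
  { _id: 'cc', N: 2 },
  { _id: 'aa', N: 6 },
  { _id: 'bb', N: 2 },
];

const provinceList = groupsToSingleKeyObjects(byProvinceGroups);
console.log(provinceList); // [ { cc: 2 }, { aa: 6 }, { bb: 2 } ]
```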
I have the following mongo data structure:
[
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b1',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b3',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a2",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
]
Assume this is the Session collection. Each session document has an _id (irrelevant here) and a libraryName key, plus an array of stage documents. Each stage document has an _id (also irrelevant) and a type. I want to count two things.
First - I want to count for each libraryName, how many session objects it has.
The solution for this query would be:
const services = await Session.aggregate(
[
{
$group: {
_id: "$libraryName",
count: { $sum: 1 },
},
}
]
);
Second - I want, per libraryName, to count how many nested stage documents there are for each stage type.
So the final result I wish to retrieve is:
[
{
libraryName: 'a1',
count: 456,
stages: [
{
type: 'b1',
count: 43,
},
{
type: 'b2',
count: 44,
}
],
},
{
libraryName: 'a2',
count: 4546,
stages: [
{
type: 'b1',
count: 43
},
{
type: 'b3',
count: 44
}
]
}
]
Edit: using the sample data above, the expected result changes to:
[
{
"_id": "a1",
"count": 2,
"stages": [
{
"count": 1,
"type": "b3"
},
{
"count": 3,
"type": "b1"
},
{
"count": 4,
"type": "b2"
}
]
},
{
"_id": "a2",
"count": 1,
"stages": [
{
"count": 1,
"type": "b1"
},
{
"count": 3,
"type": "b2"
}
]
}
]
Using the sample data in the question post and the aggregation query:
db.collection.aggregate([
{
$unwind: "$stages"
},
{
$group: {
_id: { libraryName: "$libraryName", type: "$stages.type" },
type_count: { "$sum": 1 }
}
},
{
$group: {
_id: { libraryName: "$_id.libraryName" },
count: { "$sum": "$type_count" },
stages: { $push: { type: "$_id.type", count: "$type_count" } }
}
},
{
$project: {
libraryName: "$_id.libraryName",
count: 1,
stages: 1,
_id: 0
}
}
])
I get the following results:
{
"libraryName" : "a2",
"count" : 4,
"stages" : [
{
"type" : "b1",
"count" : 1
},
{
"type" : "b2",
"count" : 3
}
]
}
{
"libraryName" : "a1",
"count" : 8,
"stages" : [
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
},
{
"type" : "b2",
"count" : 4
}
]
}
[ EDIT - ADD ] : This is an answer after the question post's expected result is modified. This query uses the question post's sample documents as input.
db.collection.aggregate([
{
$group: {
_id: { libraryName: "$libraryName" },
count: { "$sum": 1 },
stages: { $push: "$stages" }
}
},
{
$unwind: "$stages"
},
{
$unwind: "$stages"
},
{
$group: {
_id: { libraryName: "$_id.libraryName", type: "$stages.type" },
type_count: { "$sum": 1 },
count: { $first: "$count" }
}
},
{
$group: {
_id: "$_id.libraryName",
count: { $first: "$count" },
stages: { $push: { type: "$_id.type", count: "$type_count" } }
}
},
])
The result:
{
"_id" : "a2",
"count" : 1,
"stages" : [
{
"type" : "b2",
"count" : 3
},
{
"type" : "b1",
"count" : 1
}
]
}
{
"_id" : "a1",
"count" : 2,
"stages" : [
{
"type" : "b2",
"count" : 4
},
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
}
]
}
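To sanity-check the numbers, here is a plain-JavaScript sketch of what the second pipeline computes: one count per session document and one count per stage type, both keyed by libraryName. The names summarizeSessions and stagesOf are hypothetical, and the irrelevant _id fields are left out of the sample data.

```javascript
// Hypothetical helper: builds stage documents from a list of types.
const stagesOf = (...types) => types.map((type) => ({ type }));

// Same shape as the sample documents in the question, minus the _id fields.
const sessions = [
  { libraryName: 'a1', stages: stagesOf('b1', 'b2', 'b3', 'b1') },
  { libraryName: 'a1', stages: stagesOf('b2', 'b2', 'b2', 'b1') },
  { libraryName: 'a2', stages: stagesOf('b2', 'b2', 'b2', 'b1') },
];

function summarizeSessions(docs) {
  const byLibrary = new Map();
  for (const session of docs) {
    let entry = byLibrary.get(session.libraryName);
    if (!entry) {
      entry = { _id: session.libraryName, count: 0, typeCounts: new Map() };
      byLibrary.set(session.libraryName, entry);
    }
    entry.count += 1; // one per session document
    for (const stage of session.stages) {
      // Walking the nested array is the in-memory analogue of $unwind.
      entry.typeCounts.set(stage.type, (entry.typeCounts.get(stage.type) || 0) + 1);
    }
  }
  return Array.from(byLibrary.values(), ({ _id, count, typeCounts }) => ({
    _id,
    count,
    stages: Array.from(typeCounts, ([type, n]) => ({ type, count: n })),
  }));
}

const summary = summarizeSessions(sessions);
console.log(JSON.stringify(summary, null, 2));
```

Summing the per-type counts instead of counting sessions gives the totals from the first pipeline (8 for a1 and 4 for a2).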
Let's assume I have the following data:
[
{ name: "Clint", hairColor: "brown", shoeSize: 8, income: 20000 },
{ name: "Clint", hairColor: "blond", shoeSize: 9, income: 30000 },
{ name: "George", hairColor: "brown", shoeSize: 7, income: 30000 },
{ name: "George", hairColor: "blond", shoeSize: 8, income: 10000 },
{ name: "George", hairColor: "blond", shoeSize: 9, income: 20000 }
]
I want to have the following output:
[
{
name: "Clint",
counts: 2,
avgShoesize: 8.5,
shoeSizeByHairColor: [
{ _id: "brown", counts: 1, avgShoesize: 8 },
{ _id: "blond", counts: 1, avgShoesize: 9 },
],
incomeByHairColor: [
{ _id: "brown", counts: 1, avgIncome: 20000 },
{ _id: "blond", counts: 1, avgIncome: 30000 },
]
},
{
name: "George",
counts: 3,
avgShoesize: 8,
shoeSizeByHairColor: [
{ _id: "brown", counts: 1, avgShoesize: 7 },
{ _id: "blond", counts: 2, avgShoesize: 8.5 },
],
incomeByHairColor: [
{ _id: "brown", counts: 1, avgIncome: 30000 },
{ _id: "blond", counts: 2, avgIncome: 15000 },
],
}
]
Basically I want to group my dataset by some key and then I want to have multiple groups of the subset.
First I thought of applying a $group with the key name and then using $facet to get the various aggregations. I guess this will not work, since $facet does not use the subset from the previous $group. If I use $facet first, I would need to split the result into multiple documents.
Any ideas how to properly solve my problem?
You need two $group stages. The first one should aggregate by name and hairColor, and the second one can build the nested arrays:
db.collection.aggregate([
{
$group: {
_id: { name: "$name", hairColor: "$hairColor" },
count: { $sum: 1 },
sumShoeSize: { $sum: "$shoeSize" },
avgShoeSize: { $avg: "$shoeSize" },
avgIncome: { $avg: "$income" },
docs: { $push: "$$ROOT" }
}
},
{
$group: {
_id: "$_id.name",
count: { $sum: "$count" },
sumShoeSize: { $sum: "$sumShoeSize" },
shoeSizeByHairColor: {
$push: {
_id: "$_id.hairColor", counts: "$count", avgShoeSize: "$avgShoeSize"
}
},
incomeByHairColor: {
$push: {
_id: "$_id.hairColor", counts: "$count", avgIncome: "$avgIncome"
}
}
}
},
{
$project: {
_id: 1,
count: 1,
avgShoeSize: { $divide: [ "$sumShoeSize", "$count" ] },
shoeSizeByHairColor: 1,
incomeByHairColor: 1
}
}
])
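The two-level grouping can be sketched in plain JavaScript as follows. This is only an in-memory illustration under the sample data from the question; summarizeByName and the internal field names are hypothetical. It also shows why the pipeline carries sumShoeSize forward: the overall average is the total divided by the total count, not an average of the per-colour averages.

```javascript
const people = [
  { name: 'Clint', hairColor: 'brown', shoeSize: 8, income: 20000 },
  { name: 'Clint', hairColor: 'blond', shoeSize: 9, income: 30000 },
  { name: 'George', hairColor: 'brown', shoeSize: 7, income: 30000 },
  { name: 'George', hairColor: 'blond', shoeSize: 8, income: 10000 },
  { name: 'George', hairColor: 'blond', shoeSize: 9, income: 20000 },
];

function summarizeByName(docs) {
  // First pass: one bucket per (name, hairColor) pair.
  const pairs = new Map();
  for (const d of docs) {
    const key = JSON.stringify([d.name, d.hairColor]);
    const bucket = pairs.get(key) ||
      { name: d.name, hairColor: d.hairColor, count: 0, sumShoe: 0, sumIncome: 0 };
    bucket.count += 1;
    bucket.sumShoe += d.shoeSize;
    bucket.sumIncome += d.income;
    pairs.set(key, bucket);
  }
  // Second pass: fold the pair buckets into one entry per name.
  const names = new Map();
  for (const b of pairs.values()) {
    const entry = names.get(b.name) ||
      { _id: b.name, count: 0, sumShoe: 0, shoeSizeByHairColor: [], incomeByHairColor: [] };
    entry.count += b.count;
    entry.sumShoe += b.sumShoe;
    entry.shoeSizeByHairColor.push({ _id: b.hairColor, counts: b.count, avgShoeSize: b.sumShoe / b.count });
    entry.incomeByHairColor.push({ _id: b.hairColor, counts: b.count, avgIncome: b.sumIncome / b.count });
    names.set(b.name, entry);
  }
  // Final step: replace the running sum with the overall average.
  return Array.from(names.values(), ({ sumShoe, ...rest }) => ({
    ...rest,
    avgShoeSize: sumShoe / rest.count,
  }));
}

const byName = summarizeByName(people);
console.log(JSON.stringify(byName, null, 2));
```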
Phase 1: Group by name and hairColor, and accumulate count, avgShoeSize, avgIncome and hairColors.
Phase 2: Use the $map operator to build the incomeByHairColor and shoeSizeByHairColor arrays from the accumulated values.
Phase 3: Finally, group by name and accumulate incomeByHairColor, shoeSizeByHairColor and count.
Pipeline:
db.users.aggregate([
{
$group :{
_id: {
name : "$name",
hairColor: "$hairColor"
},
count : {"$sum": 1},
avgShoeSize: {$avg: "$shoeSize"},
avgIncome : {$avg: "$income"},
hairColors : {$addToSet:"$hairColor" }
}
},
{
$project: {
_id:0,
name : "$_id.name",
hairColor: "$_id.hairColor",
count : "$count",
incomeByHairColor : {
$map: {
input: "$hairColors",
as: "key",
in: {
_id: "$$key",
counts: "$count",
avgIncome: "$avgIncome"
}
}
},
shoeSizeByHairColor:{
$map: {
input: "$hairColors",
as: "key",
in: {
_id: "$$key",
counts: "$count",
avgShoeSize: "$avgShoeSize"
}
}
}
}
},
{
$group: {
_id : "$name",
count : {$sum: "$count"},
incomeByHairColor: {$push : "$incomeByHairColor"},
shoeSizeByHairColor : {$push : "$shoeSizeByHairColor"}
}
}
]
)
Output:
/* 1 */
{
"_id" : "Clint",
"count" : 2,
"incomeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 1,
"avgIncome" : 30000
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgIncome" : 20000
}
]
],
"shoeSizeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 1,
"avgShoeSize" : 9
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgShoeSize" : 8
}
]
]
},
/* 2 */
{
"_id" : "George",
"count" : 3,
"incomeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 2,
"avgIncome" : 15000
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgIncome" : 30000
}
]
],
"shoeSizeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 2,
"avgShoeSize" : 8.5
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgShoeSize" : 7
}
]
]
}
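Note that in this output each hair-colour entry sits inside its own single-element array, whereas the question asks for a flat array. One possible way to reconcile the two on the client side, sketched in plain JavaScript (this is an addition, not part of the pipeline above), is to flatten one level:

```javascript
// The "Clint" result document from the output above.
const clintResult = {
  _id: 'Clint',
  count: 2,
  incomeByHairColor: [
    [{ _id: 'blond', counts: 1, avgIncome: 30000 }],
    [{ _id: 'brown', counts: 1, avgIncome: 20000 }],
  ],
  shoeSizeByHairColor: [
    [{ _id: 'blond', counts: 1, avgShoeSize: 9 }],
    [{ _id: 'brown', counts: 1, avgShoeSize: 8 }],
  ],
};

// Array.prototype.flat() removes one level of nesting by default.
const flattened = {
  ...clintResult,
  incomeByHairColor: clintResult.incomeByHairColor.flat(),
  shoeSizeByHairColor: clintResult.shoeSizeByHairColor.flat(),
};

console.log(JSON.stringify(flattened, null, 2));
```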
I have data that looks like this:
{
"_id": ObjectId("4d525ab2924f0000000022ad"),
"arrayField": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 0, other: 235 },
{ id: 3, other: 765 }
],
"someOtherArrayField": []
}
Given a nested object's ID (0), I'd like to $pull the element from one array (arrayField) and $push it to another array (someOtherArrayField) within the same document. The result should look like this:
{
"_id": ObjectId("id"),
"arrayField": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 3, other: 765 }
],
"someOtherArrayField": [
{ id: 0, other: 235 }
]
}
I realize that I can accomplish this with a find followed by an update, i.e.
db.foo.findOne({"_id": param._id})
.then((doc)=>{
db.foo.update(
{
"_id": param._id
},
{
"$pull": {"arrayField": {id: 0}},
"$push": {"someOtherArrayField": doc.arrayField[2]}
}
)
})
But I'm looking for an atomic operation like, in pseudocode, this:
db.foo.update({"_id": param._id}, {"$move": [{"arrayField": {id: 0}}, {"someOtherArrayField": 1}]})
Is there an atomic way to do this, perhaps using MongoDB 4.2's ability to specify a pipeline to an update command? How would that look?
I found this post that generously provided the data I used, but the provided solution isn't an atomic operation. Has an atomic solution become possible with MongoDB 4.2?
Here's an example:
> db.baz.find()
> db.baz.insert({
... "_id": ObjectId("4d525ab2924f0000000022ad"),
... "arrayField": [
... { id: 1, other: 23 },
... { id: 2, other: 21 },
... { id: 0, other: 235 },
... { id: 3, other: 765 }
... ],
... "someOtherArrayField": []
... })
WriteResult({ "nInserted" : 1 })
function extractIdZero(arrayFieldName) {
return {$arrayElemAt: [
{$filter: {input: arrayFieldName, cond: {$eq: ["$$this.id", 0]}}},
0
]};
}
extractIdZero("$arrayField")
{
"$arrayElemAt" : [
{
"$filter" : {
"input" : "$arrayField",
"cond" : {
"$eq" : [
"$$this.id",
0
]
}
}
},
0
]
}
db.baz.updateOne(
{_id: ObjectId("4d525ab2924f0000000022ad")},
[{$set: {
arrayField: {$filter: {
input: "$arrayField",
cond: {$ne: ["$$this.id", 0]}
}},
someOtherArrayField: {$concatArrays: [
"$someOtherArrayField",
[extractIdZero("$arrayField")]
]}
}}
])
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.baz.findOne()
{
"_id" : ObjectId("4d525ab2924f0000000022ad"),
"arrayField" : [
{
"id" : 1,
"other" : 23
},
{
"id" : 2,
"other" : 21
},
{
"id" : 3,
"other" : 765
}
],
"someOtherArrayField" : [
{
"id" : 0,
"other" : 235
}
]
}
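What the single $set stage does to the document can be sketched as a pure function in plain JavaScript. The name moveById is hypothetical. Both new arrays are computed from the original arrayField, which is why the element can be filtered out and appended in one step. The guard for a missing id is an addition in this sketch; the update above has no such guard.

```javascript
// Hypothetical helper: returns a new document with the element moved.
function moveById(doc, id) {
  // Like extractIdZero: take the first element whose id matches.
  const moved = doc.arrayField.find((el) => el.id === id);
  return {
    ...doc,
    // Like the $filter with $ne: keep everything except the matching id.
    arrayField: doc.arrayField.filter((el) => el.id !== id),
    // Like $concatArrays: append the extracted element (skipped if not found).
    someOtherArrayField: doc.someOtherArrayField.concat(moved === undefined ? [] : [moved]),
  };
}

const before = {
  arrayField: [
    { id: 1, other: 23 },
    { id: 2, other: 21 },
    { id: 0, other: 235 },
    { id: 3, other: 765 },
  ],
  someOtherArrayField: [],
};

const after = moveById(before, 0);
console.log(JSON.stringify(after, null, 2));
```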
Here I have an array with duplicate items, like this:
[
'gg',
'bb',
'dd',
'cc',
'll',
'aa',
'cc',
'gg',
'bb',
'dd',
'cc',
'bb',
'dd',
'll',
'aa',
]
and what I want to return is this:
{
'gg': 2,
'bb': 3,
'dd': 3,
'cc': 3,
'll': 2,
'aa': 2,
}
Can this be done with MongoDB aggregation? I'd appreciate any help.
Use $unwind and $group as stages of the aggregation pipeline:
Query:
db.collection.aggregate([
{
$unwind: "$items"
},
{
$group: {
_id: "$items",
count: {
$sum: 1
}
}
}
])
Result:
{
"_id": "ll",
"count": 2
},
{
"_id": "gg",
"count": 2
},
{
"_id": "bb",
"count": 3
},
{
"_id": "cc",
"count": 3
},
{
"_id": "aa",
"count": 2
},
{
"_id": "dd",
"count": 3
}
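The result above is one document per distinct item, while the question asks for a single object keyed by item. A minimal plain-JavaScript sketch of both steps, counting the items and then folding the { _id, count } documents into one object (the variable names are hypothetical, and the fold could equally be applied to the array returned by the driver):

```javascript
const items = [
  'gg', 'bb', 'dd', 'cc', 'll', 'aa', 'cc', 'gg',
  'bb', 'dd', 'cc', 'bb', 'dd', 'll', 'aa',
];

// In-memory analogue of $unwind + $group with {$sum: 1}.
const grouped = new Map();
for (const item of items) {
  grouped.set(item, (grouped.get(item) || 0) + 1);
}
const resultDocs = Array.from(grouped, ([_id, count]) => ({ _id, count }));

// Fold the per-item documents into the single object the question asks for.
const countsByItem = Object.fromEntries(resultDocs.map((r) => [r._id, r.count]));

console.log(countsByItem);
```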
This also works well for finding duplicates: group by the field, then keep only the groups whose count is greater than 1.
db.users.aggregate([
{
$group: {
_id: "$email",
count: { $sum: 1 }
}
},
{
$match: {
count: { $gt: 1 }
}
}
])
Output:
{ "_id" : "a#gmail.com", "count" : 2 }
{ "_id" : "b#gmail.com", "count" : 2 }
{ "_id" : "c#gmaiL.com", "count" : 8 }
{ "_id" : "d#gmail.com", "count" : 2 }
{ "_id" : "e#gmail.com", "count" : 2 }
{ "_id" : "f#gmail.com", "count" : 2 }
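The group-then-filter idea looks like this in plain JavaScript. The addresses below are made-up example.com values for illustration only, and findDuplicates is a hypothetical helper name.

```javascript
// Hypothetical helper: values that occur more than once, with their counts.
function findDuplicates(values) {
  const counts = new Map();
  for (const value of values) {
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  return Array.from(counts, ([_id, count]) => ({ _id, count }))
    .filter((group) => group.count > 1); // the $match: {count: {$gt: 1}} step
}

// Made-up sample data, not taken from the output above.
const emails = [
  'a@example.com',
  'b@example.com',
  'a@example.com',
  'c@example.com',
  'c@example.com',
  'c@example.com',
];

const duplicates = findDuplicates(emails);
console.log(duplicates);
```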
I've searched high and low but have not been able to find what I'm looking for, so apologies if this has already been asked.
Consider the following documents
{
_id: 1,
items: [
{
category: "A"
},
{
category: "A"
},
{
category: "B"
},
{
category: "C"
}]
},
{
_id: 2,
items: [
{
category: "A"
},
{
category: "B"
}]
},
{
_id: 3,
items: [
{
category: "A"
},
{
category: "A"
},
{
category: "A"
}]
}
I'd like to be able to find those documents which have more than 1 category "A" item in the items array. So this should find documents 1 and 3.
Is this possible?
Using aggregation
> db.spam.aggregate([
{$unwind: "$items"},
{$match: {"items.category" :"A"}},
{$group: {
_id: "$_id",
item: {$push: "$items.category"}, count: {$sum: 1}}
},
{$match: {count: {$gt: 1}}}
])
Output
{ "_id" : 3, "item" : [ "A", "A", "A" ], "count" : 3 }
{ "_id" : 1, "item" : [ "A", "A" ], "count" : 2 }
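As a cross-check of the expected result, here is a plain-JavaScript sketch of the same logic applied to the sample documents: count the category "A" items per document and keep the documents with more than one. The variable names are hypothetical.

```javascript
const cat = (category) => ({ category });

// Same sample documents as in the question.
const spam = [
  { _id: 1, items: [cat('A'), cat('A'), cat('B'), cat('C')] },
  { _id: 2, items: [cat('A'), cat('B')] },
  { _id: 3, items: [cat('A'), cat('A'), cat('A')] },
];

const matches = spam
  // Analogue of $unwind + $match + $group: count the "A" items per document.
  .map((doc) => ({
    _id: doc._id,
    count: doc.items.filter((item) => item.category === 'A').length,
  }))
  // Analogue of the final $match: keep counts greater than 1.
  .filter((row) => row.count > 1);

console.log(matches);
```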