Customising groups in mongo during aggregation - mongodb

I'm using an example from mongo docs which I've changed a bit:
db.books.aggregate(
[
{ $group : { _id : "$genre", books: { $push: "$$ROOT" } } }
]
)
This query will return an array of books by genre.
I want to customise it a bit so that I don't get extra data. The following example is a contrived one, but I'm curious whether it can be implemented in mongo. I want my aggregation to return an array of groups where, if the genre is 'Tragedy', only 1 book is fetched and there is a booksCount field; in all other cases books is an array and there is no booksCount.
So the aggregation result would look something like this:
[
{ _id: '_id of tragedy genre', book: {some book}, booksCount: some int },
{ _id: '_id of some other genre', books: [books] },
...
]
So I want groups to have different keys depending on some condition.

One way to do this is with the $facet aggregation pipeline stage. This stage lets us run multiple sub-pipelines over the same input documents. In this case we have one sub-pipeline for the tragedy genre and another for all the other genres. To get your desired output we need to merge the outputs of the two sub-pipelines. From the docs:
Each sub-pipeline has its own field in the output document where its results are stored as an array of documents.
Because the facet stage returns an array of documents for each pipeline, we need to: concatenate these arrays together, unwind the resulting array so that each element is its own document, and then replace the root of each document to get rid of the unwanted key.
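The merge steps just described can be pictured in plain JavaScript. This is only an in-memory sketch of what $concatArrays, $unwind and $replaceRoot do to the $facet result, not MongoDB code; the facetResult object and its trimmed book documents are made up for illustration.

```javascript
// Hypothetical $facet output: a single document with one array per sub-pipeline.
const facetResult = {
  tragedy: [{ _id: "Tragedy", book: { title: "Titanic" }, booksCount: 2 }],
  other: [
    { _id: "Thriller", books: [{ title: "Shutter Island" }, { title: "Hannibal" }] },
    { _id: "Comedy", books: [{ title: "Blazing Saddles" }] },
  ],
};

// $project with $concatArrays: join both arrays under a single key.
const projected = { documents: [...facetResult.tragedy, ...facetResult.other] };

// $unwind: emit one wrapper document per array element.
const unwound = projected.documents.map((doc) => ({ documents: doc }));

// $replaceRoot: promote the nested document and drop the wrapper key.
const merged = unwound.map((wrapper) => wrapper.documents);

// Each group keeps its own set of keys.
console.log(merged.map((group) => Object.keys(group)));
```

The point of the sketch is that the groups keep whatever keys their own sub-pipeline produced, which is what gives the result its different shapes per genre.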
Example
Say you have the following documents:
db.books.insertMany([{
genre: "Tragedy",
title: "Romeo and Juliet"
}, {
genre: "Tragedy",
title: "Titanic"
}, {
genre: "Comedy",
title: "Hitchhikers Guide to the Galaxy"
}, {
genre: "Comedy",
title: "Blazing Saddles"
}, {
genre: "Thriller",
title: "Shutter Island"
}, {
genre: "Thriller",
title: "Hannibal"
}])
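Then you can use the following query. Note that $arrayElemAt is zero-indexed, so index 1 picks the second tragedy book; use 0 to take the first: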
db.books.aggregate([{
$facet: {
tragedy: [{
$match: {genre: "Tragedy"}
}, {
$group: {
_id: "$genre",
books: {$push: "$$ROOT"}
}
}, {
$project: {
book: {$arrayElemAt: ["$books", 1]},
booksCount: {$size: "$books"}
}
}],
other: [{
$match: {
genre: {$ne: "Tragedy"}
}
}, {
$group: {
_id: "$genre",
books: {$push: "$$ROOT"}
}
}]
}
}, {
$project: {
documents: {$concatArrays: ["$tragedy", "$other"]}
}
}, {
$unwind: "$documents"
}, {
$replaceRoot: {newRoot: "$documents"}
}])
To produce:
{
"_id" : "Tragedy",
"book" : {
"_id" : ObjectId("5c59f15bc59454560b36a5c7"),
"genre" : "Tragedy",
"title" : "Titanic"
},
"booksCount" : 2
}
{
"_id" : "Thriller",
"books" : [
{
"_id" : ObjectId("5c59f15bc59454560b36a5ca"),
"genre" : "Thriller",
"title" : "Shutter Island"
},
{
"_id" : ObjectId("5c59f15bc59454560b36a5cb"),
"genre" : "Thriller",
"title" : "Hannibal"
}
]
}
{
"_id" : "Comedy",
"books" : [
{
"_id" : ObjectId("5c59f15bc59454560b36a5c8"),
"genre" : "Comedy",
"title" : "Hitchhikers Guide to the Galaxy"
},
{
"_id" : ObjectId("5c59f15bc59454560b36a5c9"),
"genre" : "Comedy",
"title" : "Blazing Saddles"
}
]
}

Related

I have two collections, Inward and Outward. Both have similar embedded subdocuments containing product, batch and quantity fields.

Inward collection
{"ord" : 1,
"products" : [
{
"name" : "apple",
"qty" : 10,
"batch" : "jun-2021"
},
{
"name" : "banana",
"qty" : 20,
"batch" : "jan-2021"
}
]
}
Outward collection
{
"_id" : ObjectId("5edde5487957d9efea972a74"),
"inv" : 1,
"products" : [
{
"name" : "apple",
"qty" : 13,
"batch" : "jun-2021"
}
]
}
Now I would like to check the actual stock quantity for a particular product and batch, grouping both collections together.
You may try it this way:
Join the collections on the condition inward.ord = outward.inv.
Flatten the products field.
Group by the product's name and batch to sum the qty values.
db.inward.aggregate([
{
$lookup: {
from: "outward",
let: {
ord: "$ord",
products: "$products"
},
pipeline: [
{
$match: {
$expr: {
$eq: [ "$$ord", "$inv" ]
}
}
},
{
$project: {
products: {
$concatArrays: [
"$$products",
"$products"
]
}
}
},
{
$unwind: "$products"
},
{
$replaceWith: "$products"
}
],
as: "products"
}
},
{
$unwind: "$products"
},
{
$group: {
_id: {
batch: "$products.batch",
name: "$products.name"
},
qty: {
$sum: "$products.qty"
}
}
}
])
Note: $replaceWith requires MongoDB v4.2 or later.
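To make the join, flatten and group steps concrete, here is a rough in-memory JavaScript sketch of what the pipeline computes. It is not MongoDB code: sumByNameAndBatch is a hypothetical helper, and the sample data assumes every qty is stored as a number (as far as I know, $sum ignores non-numeric values such as the string "10").

```javascript
// Sample data from the question, with every qty written as a number (assumption).
const inward = [{
  ord: 1,
  products: [
    { name: "apple", qty: 10, batch: "jun-2021" },
    { name: "banana", qty: 20, batch: "jan-2021" },
  ],
}];
const outward = [{
  inv: 1,
  products: [{ name: "apple", qty: 13, batch: "jun-2021" }],
}];

// Hypothetical helper mirroring the pipeline: join, concatenate, flatten, group.
function sumByNameAndBatch(inwardDocs, outwardDocs) {
  const totals = new Map();
  for (const inDoc of inwardDocs) {
    // $lookup with $expr: keep outward documents where inv equals ord.
    for (const outDoc of outwardDocs.filter((o) => o.inv === inDoc.ord)) {
      // $concatArrays + $unwind: walk inward and outward products together.
      for (const p of [...inDoc.products, ...outDoc.products]) {
        const key = `${p.name}|${p.batch}`;
        const entry = totals.get(key) || { _id: { batch: p.batch, name: p.name }, qty: 0 };
        entry.qty += p.qty; // $sum over qty
        totals.set(key, entry);
      }
    }
  }
  return [...totals.values()];
}

const stock = sumByNameAndBatch(inward, outward);
console.log(stock);
```

Like the pipeline, the sketch adds inward and outward quantities together. In both, an inward document with no matching outward document contributes nothing, because the $lookup result is empty and $unwind drops it.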

How can I count total documents and also grouped counts simultaneously in MongoDB aggregation?

I have a dataset in mongodb collection named visitorsSession like
{ip : '192.2.1.1',country : 'US', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'},
{ip : '192.3.1.8',country : 'UK', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'},
{ip : '192.5.1.4',country : 'UK', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'},
{ip : '192.8.1.7',country : 'US', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'},
{ip : '192.1.1.3',country : 'US', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'}
I am using this mongodb aggregation
[{$match: {
nsp : "/hrm.sbtjapan.com",
creationDate : {
$gte: "2019-12-15T00:00:00.359Z",
$lte: "2019-12-20T23:00:00.359Z"
},
type : "Visitors"
}}, {$group: {
_id : "$country",
totalSessions : {
$sum: 1
}
}}, {$project: {
_id : 0,
country : "$_id",
totalSessions : 1
}}, {$sort: {
country: -1
}}]
Using the above aggregation I am getting results like this:
[{country : 'US',totalSessions : 3},{country : 'UK',totalSessions : 2}]
But I also want the total number of visitors along with the result, like totalVisitors : 5.
How can I do this in a MongoDB aggregation?
You can use the $facet aggregation stage to calculate total visitors as well as visitors by country in a single pass:
db.visitorsSession.aggregate( [
{
$match: {
nsp : "/hrm.sbtjapan.com",
creationDate : {
$gte: "2019-12-15T00:00:00.359Z",
$lte: "2019-12-20T23:00:00.359Z"
},
type : "Visitors"
}
},
{
$facet: {
totalVisitors: [
{
$count: "count"
}
],
countrySessions: [
{
$group: {
_id : "$country",
sessions : { $sum: 1 }
}
},
{
$project: {
country: "$_id",
_id: 0,
sessions: 1
}
}
],
}
},
{
$addFields: {
totalVisitors: { $arrayElemAt: [ "$totalVisitors.count" , 0 ] },
}
}
] )
The output:
{
"totalVisitors" : 5,
"countrySessions" : [
{
"sessions" : 2,
"country" : "UK"
},
{
"sessions" : 3,
"country" : "US"
}
]
}
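For intuition, the shape of that result can be reproduced in plain JavaScript. This is an in-memory sketch only, not MongoDB code; summarize is a hypothetical helper and the sample array is a trimmed copy of the question's documents.

```javascript
// Trimmed copy of the question's sample documents.
const visitorSessions = [
  { ip: "192.2.1.1", country: "US", type: "Visitors" },
  { ip: "192.3.1.8", country: "UK", type: "Visitors" },
  { ip: "192.5.1.4", country: "UK", type: "Visitors" },
  { ip: "192.8.1.7", country: "US", type: "Visitors" },
  { ip: "192.1.1.3", country: "US", type: "Visitors" },
];

// Hypothetical helper: overall count plus per-country counts from the same input.
function summarize(docs) {
  const perCountry = new Map();
  for (const doc of docs) {
    // Same idea as $group with { $sum: 1 } keyed on country.
    perCountry.set(doc.country, (perCountry.get(doc.country) || 0) + 1);
  }
  return {
    totalVisitors: docs.length, // same idea as the $count sub-pipeline
    countrySessions: [...perCountry].map(([country, sessions]) => ({ country, sessions })),
  };
}

const summary = summarize(visitorSessions);
console.log(summary);
```

Both numbers come from the same filtered input, which is what the two $facet sub-pipelines share after the initial $match.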
You could be better off with two queries to do this.
To save the two db round trips, the following aggregation can be used, although IMO it is kinda verbose (and might be a little expensive if the documents are very large) just to count the documents.
Idea: have a $group at the top that counts the documents and preserves the originals using $push and $$ROOT. Then, before the other match/filter operations, $unwind the created array of original docs.
db.collection.aggregate([
{
$group: {
_id: null,
docsCount: {
$sum: 1
},
originals: {
$push: "$$ROOT"
}
}
},
{
$unwind: "$originals"
},
{ $match: "..." }, //and other stages on `originals` which contains the source documents
{
$group: {
_id: "$originals.country",
totalSessions: {
$sum: 1
},
totalVisitors: {
$first: "$docsCount"
}
}
}
]);
Sample output:
[
{
"_id": "UK",
"totalSessions": 2,
"totalVisitors": 5
},
{
"_id": "US",
"totalSessions": 3,
"totalVisitors": 5
}
]

Lookup and sort the foreign collection

I have a collection users, and each document in this collection has, among other properties, an array of ids of documents in another collection: workouts.
Every document in the collection workouts has a property named date.
And here's what I want to get:
For a specific user, I want to get an array of {workoutId, workoutDate} for the workouts that belong to that user, sorted by date.
This is my attempt, which is working fine.
Users.aggregate([
{
$match : {
_id : ObjectId("whateverTheUserIdIs")
}
},
{
$unwind : {
path : "$workouts"
}
}, {
$lookup : {
from : "workouts",
localField : "workouts",
foreignField : "_id",
as : "workoutDocumentsArray"
}
}, {
$project : {
_id : false,
workoutData : {
$arrayElemAt : [
"$workoutDocumentsArray",
0
]
}
}
}, {
$project : {
date : "$workoutData.date",
id : "$workoutData._id"
}
}, {
$sort : {date : -1}
}
])
However I refuse to believe I need all this for what would be such a simple query in SQL!? I believe I must at least be able to merge the two $project stages into one? But I've not been able to figure out how looking at the docs.
Thanks in advance for taking the time! ;)
====
EDIT - This is some sample data
Collection users:
[{
_id:xxx,
workouts: [2,4,6]
},{
_id: yyy,
workouts: [1,3,5]
}]
Collection workouts:
[{
_id:1,
date: 1/1/1901
},{
_id:2,
date: 2/2/1902
},{
_id:3,
date: 3/3/1903
},{
_id:4,
date: 4/4/1904
},{
_id:5,
date: 5/5/1905
},{
_id:6,
date: 6/6/1906
}]
And after running my query, for example for user xxx, I would like to get only the workouts that belong to him (whose ids appear in his workouts array), so the result I want would look like:
[{
id:6,
date: 6/6/1906
},{
id:4,
date: 4/4/1904
},{
id:2,
date: 2/2/1902
}]
You don't need to $unwind the workouts array, as it already contains an array of _ids. You can also use $replaceRoot instead of the $project stages:
Users.aggregate([
{ "$match": { "_id" : ObjectId("whateverTheUserIdIs") }},
{ "$lookup": {
"from" : "workouts",
"localField" : "workouts",
"foreignField" : "_id",
"as" : "workoutDocumentsArray"
}},
{ "$unwind": "$workoutDocumentsArray" },
{ "$replaceRoot": { "newRoot": "$workoutDocumentsArray" }},
{ "$sort" : { "date" : -1 }}
])
Or, with the $lookup pipeline syntax (MongoDB 3.6 or later), you can sort inside the lookup:
Users.aggregate([
{ "$match" : { "_id": ObjectId("whateverTheUserIdIs") }},
{ "$lookup" : {
"from" : "workouts",
"let": { "workouts": "$workouts" },
"pipeline": [
{ "$match": { "$expr": { "$in": ["$_id", "$$workouts"] }}},
{ "$sort" : { "date" : -1 }}
],
"as" : "workoutDocumentsArray"
}},
{ "$unwind": "$workoutDocumentsArray" },
{ "$replaceRoot": { "newRoot": "$workoutDocumentsArray" }}
])
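As a sanity check on what either pipeline should return, here is an in-memory JavaScript sketch using the sample data from the question. It is not MongoDB code: workoutsForUser is a hypothetical helper, and the dates are written as ISO strings on the assumption that the question's 1/1/1901 style values are real dates.

```javascript
// Sample data from the question; dates rewritten in ISO form (assumption).
const users = [
  { _id: "xxx", workouts: [2, 4, 6] },
  { _id: "yyy", workouts: [1, 3, 5] },
];
const workouts = [
  { _id: 1, date: new Date("1901-01-01") },
  { _id: 2, date: new Date("1902-02-02") },
  { _id: 3, date: new Date("1903-03-03") },
  { _id: 4, date: new Date("1904-04-04") },
  { _id: 5, date: new Date("1905-05-05") },
  { _id: 6, date: new Date("1906-06-06") },
];

// Hypothetical helper: the user's workouts as { id, date }, newest first.
function workoutsForUser(userId) {
  const user = users.find((u) => u._id === userId); // $match on _id
  if (!user) return [];
  return workouts
    .filter((w) => user.workouts.includes(w._id)) // $lookup / $in on the id array
    .sort((a, b) => b.date - a.date)              // $sort: { date: -1 }
    .map((w) => ({ id: w._id, date: w.date }));   // keep only id and date
}

console.log(workoutsForUser("xxx").map((w) => w.id));
```

Note that the pipelines above return the whole workout document after $replaceRoot, so a final $project would still be needed to rename _id to id as the sketch does.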

Finding all documents which share the same value in an array

Consider the following data:
{
"id":123,
"name":"apple",
"codes":["ABC", "DEF", "EFG"]
}
{
"id":234,
"name":"pineapple",
"codes":["DEF"]
}
{
"id":345,
"name":"banana",
"codes":["HIJ","KLM"]
}
If I didn't want to search by a specific code, is there a way to find all fruits in my MongoDB collection that share the same code?
db.collection.aggregate([
{ $unwind: '$codes' },
{ $group: { _id: '$codes', count: {$sum:1}, fruits: {$push: '$name'}}},
{ $match: {'count': {$gt:1}}},
{ $group:{_id:null, total:{$sum:1}, data:{$push:{fruits: '$fruits', code:'$_id'}}}}
])
result:
{ "_id" : null, "total" : 1, "data" : [ { "fruits" : [ "apple", "pineapple" ], "code" : "DEF" } ] }
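The same unwind, group and filter logic can be sketched in plain JavaScript to see why only DEF survives. This is an in-memory illustration, not MongoDB code, and fruitsSharingCodes is a hypothetical helper name.

```javascript
// Sample data from the question.
const fruits = [
  { id: 123, name: "apple", codes: ["ABC", "DEF", "EFG"] },
  { id: 234, name: "pineapple", codes: ["DEF"] },
  { id: 345, name: "banana", codes: ["HIJ", "KLM"] },
];

// Hypothetical helper: codes that appear on more than one fruit.
function fruitsSharingCodes(docs) {
  const byCode = new Map();
  for (const doc of docs) {
    for (const code of doc.codes) {             // $unwind: one entry per code
      if (!byCode.has(code)) byCode.set(code, []);
      byCode.get(code).push(doc.name);          // $group on code with $push of name
    }
  }
  return [...byCode]
    .filter(([, names]) => names.length > 1)    // $match: count greater than 1
    .map(([code, names]) => ({ fruits: names, code }));
}

const shared = fruitsSharingCodes(fruits);
console.log(shared);
```

The final $group with _id: null in the pipeline only wraps these entries into one document and counts them.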

Mongodb Aggregate using $group twice

I have a bunch of documents in mongo with the following structure:
{
"_id" : "",
"number" : 2,
"colour" : {
"_id" : "",
"name" : "Green",
"hex" : "00ff00"
},
"position" : {
"_id" : "",
"name" : "Defence",
"type" : "position"
},
"ageGroup" : {
"_id" : "",
"name" : "Minor Peewee",
"type" : "age"
},
"companyId" : ""
}
I'm currently using Mongo's aggregate to group the documents by ageGroup.name which returns:
//Query
Jerseys.aggregate([
{$match: { companyId: { $in: companyId } } },
{$group: {_id: "$ageGroup.name", jerseys: { $push: "$$ROOT" }} }
]);
//returns
{
_id: "Minor Peewee",
jerseys: array[]
}
But I'd like it to also group by position.name within the age groups, i.e.:
{
_id: "Minor Peewee",
positions: array[]
}
//in positions array...
{
_id: "Defence",
jerseys: array[]
}
// or ageGroups->positions->jerseys if that makes more sense.
I've tried multiple groups, but I don't think I'm setting them up correctly; I always seem to get an array of _ids. I'm using Meteor as the server and running this within a Meteor method.
You can use a composite aggregate _id in the first grouping stage.
Then, you can use one of those keys as the "main" _id of the final aggregate and $push the other into another array.
Jerseys.aggregate([
{
$match: { companyId: { $in: companyId } }
},
{
$group: { // each position and age group have an array of jerseys
_id: { position: "$position", ageGroup: "$ageGroup" },
jerseys: { $push: "$$ROOT" }
}
},
{
$group: { // for each age group, create an array of positions
_id: { ageGroup: "$_id.ageGroup" },
positions: { $push: { position: "$_id.position", jerseys:"$jerseys" } }
}
}
]);
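The nesting produced by the two $group stages can be sketched in plain JavaScript. This is an in-memory illustration rather than MongoDB code: groupByAgeThenPosition is a hypothetical helper, the jerseys are made-up sample rows, and it keys on the name fields (as the question asks) rather than on the whole position and ageGroup subdocuments.

```javascript
// Made-up sample jerseys; only the fields needed for grouping are included.
const jerseys = [
  { number: 2, position: { name: "Defence" }, ageGroup: { name: "Minor Peewee" } },
  { number: 7, position: { name: "Forward" }, ageGroup: { name: "Minor Peewee" } },
  { number: 9, position: { name: "Defence" }, ageGroup: { name: "Minor Peewee" } },
  { number: 4, position: { name: "Defence" }, ageGroup: { name: "Bantam" } },
];

// Hypothetical helper: age groups -> positions -> jerseys.
function groupByAgeThenPosition(docs) {
  const ageGroups = new Map();
  for (const jersey of docs) {
    const age = jersey.ageGroup.name;
    const position = jersey.position.name;
    // First $group: one bucket per (ageGroup, position) pair.
    if (!ageGroups.has(age)) ageGroups.set(age, new Map());
    const positions = ageGroups.get(age);
    if (!positions.has(position)) positions.set(position, []);
    positions.get(position).push(jersey);
  }
  // Second $group: collect the position buckets under each age group.
  return [...ageGroups].map(([age, positions]) => ({
    _id: age,
    positions: [...positions].map(([name, list]) => ({ _id: name, jerseys: list })),
  }));
}

const grouped = groupByAgeThenPosition(jerseys);
console.log(JSON.stringify(grouped, null, 2));
```

To get the same name-keyed result from the pipeline, the composite _id would presumably use "$position.name" and "$ageGroup.name" instead of the full subdocuments.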