MongoDB aggregate pipeline group

I am trying to build a pipeline that searches for documents matching certain criteria and groups certain fields to give the desired output. The document structure of deals is:
{
"_id":"123",
"status":"New",
"deal_amount":"5200",
"deal_date":"2018-03-05",
"data_source":"API",
"deal_type":"New Business",
"account_id":"A1"
},
{
"_id":"456",
"status":"New",
"deal_amount":"770",
"deal_date":"2018-02-11",
"data_source":"API",
"deal_type":"New Business",
"account_id":"A2"
},
{
"_id":"885",
"status":"Old",
"deal_amount":"4070",
"deal_date":"2017-09-22",
"data_source":"API",
"deal_type":"New Business",
"account_id":"A2"
},
The account name is a referenced field. Account documents look like this:
{
"_id":"A1",
"name":"Sarah",
},
{
"_id":"A2",
"name":"Amber",
},
The pipeline should search for documents whose 'status' is 'New' and whose 'deal_amount' is more than 2000, and it should group by account name. The pipeline I have used looks like this:
db.deal.aggregate([{
$match: {
status: New,
deal_amount: {
$gte: 2000,
}
}
}, {
$group: {
_id: "$account_name",
}
},{
$lookup:{
from:"accounts",
localField:"account_id",
foreignField:"_id",
as:"acc",
}
}
])
I want to show only the fields deal_amount, deal_type, deal_date and account name in the result.
Expected Result:
{
"_id": "123",
"deal_amount": "5200",
"deal_date": "2018-03-05",
"deal_type": "New Business",
"account_name": "Sarah"
}, {
"_id": "885",
"deal_amount": "4070",
"deal_date": "2017-09-22",
"deal_type": "New Business",
"account_name": "Amber"
},
Do I have to include all of these fields (deal_amount, deal_type, deal_date and account name) in the $group stage in order to show them in the result, or is there another way to do it? Any help is highly appreciated.

Please use this query.
db.deal.aggregate([{
$match: {
status: "New",
deal_amount: {
$gte: 2000,
}
}
},
{
$lookup:{
from:"accounts",
localField:"account_id",
foreignField:"_id",
as:"acc",
}
},
{
$unwind: {
path: '$acc',
preserveNullAndEmptyArrays: true,
},
},
{
$group: {
_id: "$acc._id",
deal_amount: { $first: '$deal_amount' },
deal_date: { $first: '$deal_date' },
deal_type: { $first: '$deal_type' },
account_name: { $first: '$acc.name' },
}
}
])
You can do this in one of two ways:
1) Using $$ROOT, which pushes each whole document into the group:
{ $group : {
_id : "$author",
data: { $push : "$$ROOT" }
}}
2) By assigning each field individually:
{
$group: {
_id: "$account_name",
deal_amount: { $first: '$deal_amount' },
deal_date: { $first: '$deal_date' },
.
.
}
}
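To make the difference between the two options concrete, here is a minimal plain-JavaScript sketch that imitates both accumulators on the sample deals. It runs without MongoDB; the grouping key is `account_id` (an assumption, since the deal documents have no `account_name` field), and the `pushed` and `first` maps are illustrative names only.

```javascript
// Sample deals from the question (trimmed to the fields used here).
const deals = [
  { _id: "123", deal_amount: "5200", deal_date: "2018-03-05", account_id: "A1" },
  { _id: "456", deal_amount: "770", deal_date: "2018-02-11", account_id: "A2" },
  { _id: "885", deal_amount: "4070", deal_date: "2017-09-22", account_id: "A2" },
];

// Option 1: like { $push: "$$ROOT" }, keep every whole document per group.
const pushed = new Map();
// Option 2: like { $first: "$field" }, keep fields of the first document only.
const first = new Map();

for (const deal of deals) {
  const key = deal.account_id;
  if (!pushed.has(key)) pushed.set(key, { _id: key, data: [] });
  pushed.get(key).data.push(deal);
  if (!first.has(key)) {
    first.set(key, {
      _id: key,
      deal_amount: deal.deal_amount,
      deal_date: deal.deal_date,
    });
  }
}

console.log(pushed.get("A2").data.length); // 2
console.log(first.get("A2").deal_amount); // 770
```

Option 1 keeps all deals of an account at the cost of larger group documents; option 2 keeps one value per field and silently discards the rest.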

I am not sure why you need the $group stage. You just need to add a $project stage to output the account name from the referenced collection.
{
"$project": {
"deal_amount": 1,
"deal_type": 1,
"deal_date": 1,
"account_name": {"$let":{"vars":{"accl":{"$arrayElemAt":["$acc", 0]}}, "in":"$$accl.name"}}
}
}

One thing to start with, your $gte operator doesn't work on the string field deal_amount, so you might want to change the field to integers or something similar:
// Convert String to Integer
db.deals.find().forEach(function(data) {
db.deals.update(
{_id:data._id},
{$set:{deal_amount:parseInt(data.deal_amount, 10)}});
});
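To see why the conversion matters, here is a small plain-JavaScript sketch (no MongoDB needed). It assumes that a numeric $gte only matches numeric values; the helper `matchesGte` is a hypothetical stand-in for that behaviour, not a MongoDB API.

```javascript
// Sample deals with deal_amount stored as strings, as in the question.
const deals = [
  { _id: "123", status: "New", deal_amount: "5200" },
  { _id: "456", status: "New", deal_amount: "770" },
  { _id: "885", status: "Old", deal_amount: "4070" },
];

// Hypothetical helper imitating a numeric $gte: non-numeric values never match.
function matchesGte(value, bound) {
  return typeof value === "number" && value >= bound;
}

// Before conversion: no string amount matches a numeric bound.
const before = deals.filter((d) => matchesGte(d.deal_amount, 2000));

// After conversion (the same parseInt step as in the forEach above).
const converted = deals.map((d) => ({
  ...d,
  deal_amount: parseInt(d.deal_amount, 10),
}));
const after = converted.filter(
  (d) => d.status === "New" && matchesGte(d.deal_amount, 2000)
);

console.log(before.length); // 0
console.log(after.map((d) => d._id)); // [ '123' ]
```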
Then, to get just the fields you need, reshape the document using $project:
db.deals.aggregate([{
$match: {
"status": "New",
"deal_amount" : {
"$gte" : 2000
}
}
},
{
$lookup:{
from:"accounts",
localField:"account_id",
foreignField:"_id",
as:"acc",
}
},
{
$project: {
_id: 1,
deal_amount: 1,
deal_type: 1,
deal_date: 1,
"account_name": {"$let":{"vars":{"accl":{"$arrayElemAt":["$acc", 0]}}, in:"$$accl.name"}}
}
}
]);
For me, this produced:
{
"_id" : "123",
"deal_amount" : 5200.0,
"deal_date" : "2018-03-05",
"deal_type" : "New Business",
"account_name" : "Sarah"
}
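As a rough illustration of what the three stages do, the following plain-JavaScript sketch imitates $match, $lookup and $project on in-memory arrays. It is only an approximation under the assumption that amounts are already numeric; the array and variable names are illustrative.

```javascript
const deals = [
  { _id: "123", status: "New", deal_amount: 5200, deal_date: "2018-03-05",
    deal_type: "New Business", account_id: "A1" },
  { _id: "456", status: "New", deal_amount: 770, deal_date: "2018-02-11",
    deal_type: "New Business", account_id: "A2" },
  { _id: "885", status: "Old", deal_amount: 4070, deal_date: "2017-09-22",
    deal_type: "New Business", account_id: "A2" },
];
const accounts = [
  { _id: "A1", name: "Sarah" },
  { _id: "A2", name: "Amber" },
];

const result = deals
  // $match: status "New" and amount of at least 2000
  .filter((d) => d.status === "New" && d.deal_amount >= 2000)
  .map((d) => {
    // $lookup: always yields an array, possibly empty
    const acc = accounts.filter((a) => a._id === d.account_id);
    // $arrayElemAt with index 0: take the first joined account, if any
    const accl = acc[0];
    // $project: keep only the requested fields
    return {
      _id: d._id,
      deal_amount: d.deal_amount,
      deal_type: d.deal_type,
      deal_date: d.deal_date,
      account_name: accl ? accl.name : undefined,
    };
  });

console.log(result.length); // 1
console.log(result[0].account_name); // Sarah
```

Only deal "123" survives the filter, which is consistent with the single document shown in the output above.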

db.deal.aggregate([{$match: {status: {$eq: 'New'}, deal_amount: {$gte: '2000'}}}, {$group: {_id: {accountName: '$account_id', type: '$deal_type', 'amount': '$deal_amount'}}}])

Related

MongoDB Aggregation to get count and Y sample entries

MongoDB version: 4.2.17.
Trying out aggregation on data in a collection.
Example data:
{
"_id" : "244",
"pubName" : "p1",
"serviceIdRef" : "36e9c779-7865-4b74-a30b-e4d6a0cc5295",
"serviceName" : "my-service",
"subName" : "c1",
"pubState" : "INVITED"
}
I would like to:
Do a match by something (let’s say subName) and group by serviceIdRef and then limit to return X entries
Also return for each of the serviceIdRefs, the count of the documents in each of ACTIVE or INVITED states. And Y (for this example, say Y=3) documents that are in this state.
For example, the output would appear as (in brief):
[
{
serviceIdRef: "36e9c779-7865-4b74-a30b-e4d6a0cc5295",
serviceName:
state:[
{
pubState: "INVITED"
count: 200
sample: [ // Get those Y entries (here Y=3)
{
// sample1 like:
"_id" : "244",
"pubName" : "p1",
"serviceIdRef" : "36e9c779-7865-4b74-a30b-e4d6a0cc5295",
"serviceName" : "my-service",
"subName" : "c1",
"pubState" : "INVITED"
},
{
sample2
},
{
sample3
}
]
},
{
pubState: "ACTIVE", // For this state, repeat as we did for "INVITED" state above.
......
}
]
}
{
repeat for another service
}
]
So far I have written the pipeline below, but I am not able to get those Y entries. Is there a (better) way?
This is what I have so far (incomplete, and the output is not exactly in the format above):
db.sub.aggregate(
[{
$match:
{
"subName": {
$in: ["c1", "c2"]
},
"$or": [
{
"pubState": "INVITED",
},
{
"pubState": "ACTIVE",
}
]
}
},
{
$group: {
_id: "$serviceIdRef",
subs: {
$push: "$$ROOT",
}
}
},
{
$sort: {
_id: -1,
}
},
{
$limit: 22
},
{
$facet:
{
facet1: [
{
$unwind: "$subs",
},
{
$group:
{
_id: {
"serviceName" : "$_id",
"pubState": "$subs.pubState",
"subState": "$subs.subsState"
},
count: {
$sum: 1
}
}
}
]
}
}
])
You need a second $group stage to build the nested structure:
$match your conditions
$sort by _id in descending order
$group by serviceIdRef and pubState, take the first value of the required fields, push the documents into the sample array, and count the documents
$group by serviceIdRef only and construct the state array
$slice to limit the number of documents in sample (the query below keeps 22; use your own Y here)
db.collection.aggregate([
{
$match: {
subName: { $in: ["c1", "c2"] },
pubState: { $in: ["INVITED", "ACTIVE"] }
}
},
{ $sort: { _id: -1 } },
{
$group: {
_id: {
serviceIdRef: "$serviceIdRef",
pubState: "$pubState"
},
serviceName: { $first: "$serviceName" },
sample: { $push: "$$ROOT" },
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.serviceIdRef",
serviceName: { $first: "$serviceName" },
state: {
$push: {
pubState: "$_id.pubState",
count: "$count",
sample: { $slice: ["$sample", 22] }
}
}
}
}
])
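The two grouping passes can be sketched in plain JavaScript as follows. This is an in-memory imitation, not the aggregation itself; the documents, the shortened `serviceIdRef` value "s1" and the sample size `Y = 2` are made up for illustration.

```javascript
const docs = [
  { _id: "1", serviceIdRef: "s1", serviceName: "my-service", pubState: "INVITED" },
  { _id: "2", serviceIdRef: "s1", serviceName: "my-service", pubState: "INVITED" },
  { _id: "3", serviceIdRef: "s1", serviceName: "my-service", pubState: "ACTIVE" },
  { _id: "4", serviceIdRef: "s1", serviceName: "my-service", pubState: "INVITED" },
];
const Y = 2; // illustrative sample size

// Pass 1: group by (serviceIdRef, pubState), collecting documents and a count.
const byPair = new Map();
for (const doc of docs) {
  const key = JSON.stringify([doc.serviceIdRef, doc.pubState]);
  if (!byPair.has(key)) {
    byPair.set(key, {
      serviceIdRef: doc.serviceIdRef,
      serviceName: doc.serviceName,
      pubState: doc.pubState,
      sample: [],
      count: 0,
    });
  }
  const group = byPair.get(key);
  group.sample.push(doc);
  group.count += 1;
}

// Pass 2: group by serviceIdRef only, trimming each sample to Y entries.
const byService = new Map();
for (const group of byPair.values()) {
  if (!byService.has(group.serviceIdRef)) {
    byService.set(group.serviceIdRef, {
      _id: group.serviceIdRef,
      serviceName: group.serviceName,
      state: [],
    });
  }
  byService.get(group.serviceIdRef).state.push({
    pubState: group.pubState,
    count: group.count,
    sample: group.sample.slice(0, Y),
  });
}

const result = [...byService.values()];
console.log(result[0].state.map((s) => [s.pubState, s.count, s.sample.length]));
```

The count is taken before the sample is trimmed, so it still reflects all documents in that state.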

How to do LEFT JOIN in MongoDB aggregate function? [duplicate]

I have a collection of users where each document has following structure:
{
"_id": "<id>",
"login": "xxx",
"solved": [
{
"problem": "<problemID>",
"points": 10
},
...
]
}
The field solved may be empty or contain arbitrarily many subdocuments. My goal is to get a list of users together with their total score (sum of points), where users that haven't solved any problem yet are assigned a total score of 0. Is it possible to do this with a single query (ideally using the aggregation framework)?
I was trying to use following query in aggregation framework:
{ "$group": {
"_id": "$_id",
"login": { "$first": "$login" },
"solved": { "$addToSet": { "points": 0 } }
} }
{ "$unwind": "$solved" }
{ "$group": {
"_id": "$_id",
"login": { "$first": "$login" },
"solved": { "$sum": "$solved.points" }
} }
However, I am getting the following error:
exception: The top-level _id field is the only field currently supported for exclusion
Thank you in advance
In MongoDB 3.2 and newer, the $unwind operator accepts options; in particular, the preserveNullAndEmptyArrays option solves this.
If this option is set to true and if the path is null, missing, or an empty array, $unwind outputs the document. If false, $unwind does not output a document if the path is null, missing, or an empty array. In your case, set it to true:
db.collection.aggregate([
{ "$unwind": {
"path": "$solved",
"preserveNullAndEmptyArrays": true
} },
{ "$group": {
"_id": "$_id",
"login": { "$first": "$login" },
"solved": { "$sum": "$solved.points" }
} }
])
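The effect of the option can be imitated in plain JavaScript. The sketch below is a simplification that only distinguishes arrays from everything else, and the helpers `unwindSolved` and `totals` as well as the sample users are hypothetical.

```javascript
const users = [
  { _id: "u1", login: "alice",
    solved: [{ problem: "p1", points: 10 }, { problem: "p2", points: 5 }] },
  { _id: "u2", login: "bob", solved: [] },
  { _id: "u3", login: "carol" }, // no solved field at all
];

// Imitates $unwind on "solved". With preserve = true, users whose array is
// missing, null, or empty still produce one output row (without "solved").
function unwindSolved(docs, preserve) {
  const rows = [];
  for (const doc of docs) {
    const items = Array.isArray(doc.solved) ? doc.solved : [];
    if (items.length === 0) {
      if (preserve) {
        const { solved, ...rest } = doc;
        rows.push(rest);
      }
      continue;
    }
    for (const item of items) rows.push({ ...doc, solved: item });
  }
  return rows;
}

// Imitates the $group stage: sum points per login; missing points add 0.
function totals(rows) {
  const out = new Map();
  for (const row of rows) {
    const points = row.solved ? row.solved.points : 0;
    out.set(row.login, (out.get(row.login) || 0) + points);
  }
  return Object.fromEntries(out);
}

console.log(totals(unwindSolved(users, false))); // { alice: 15 }
console.log(totals(unwindSolved(users, true))); // { alice: 15, bob: 0, carol: 0 }
```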
Here is the solution - it assumes that the field "solved" is either absent, is equal to null or has an array of problems and scores solved. The case it does not handle is "solved" being an empty array - although that would be a simple additional adjustment you could add.
project = {$project : {
"s" : {
"$ifNull" : [
"$solved",
[
{
"points" : 0
}
]
]
},
"login" : 1
}
};
unwind={$unwind:"$s"};
group= { "$group" : {
"_id" : "$_id",
"login" : {
"$first" : "$login"
},
"score" : {
"$sum" : "$s.points"
}
}
}
db.students.aggregate( [ project, unwind, group ] );
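The empty-array gap mentioned above is easy to see in a plain-JavaScript imitation of $ifNull. The second helper shows one possible adjustment; both helper names are hypothetical, and this is a sketch rather than the pipeline itself.

```javascript
const DEFAULT = [{ points: 0 }];

// Imitates { $ifNull: ["$solved", [{ points: 0 }]] }: only null or missing
// values are replaced; an empty array is returned unchanged.
function ifNullSolved(doc) {
  return doc.solved == null ? DEFAULT : doc.solved;
}

// Hypothetical adjustment: also treat an empty array like a missing one.
function ifNullOrEmptySolved(doc) {
  const s = ifNullSolved(doc);
  return s.length === 0 ? DEFAULT : s;
}

const missing = { login: "carol" };
const empty = { login: "bob", solved: [] };

console.log(ifNullSolved(missing).length); // 1, so $unwind keeps the user
console.log(ifNullSolved(empty).length); // 0, so a plain $unwind drops the user
console.log(ifNullOrEmptySolved(empty).length); // 1
```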
Another example: $lookup followed by $unwind on an array inside the looked-up document, where that array may be empty:
let posts = await Post.aggregate<ActivityDoc>([
{
$match: {
_id: new mongoose.Types.ObjectId(req.params.id),
},
},
{
$lookup: {
from: 'users',
localField: 'user',
foreignField: '_id',
as: 'user',
},
},
{
$unwind: '$user',
},
{
$unwind: {
path: '$user.follower',
preserveNullAndEmptyArrays: true,
},
},
{
$match: {
$or: [
{
$and: [
{
'privacy.mode': {
$eq: PrivacyMode.EveryOne,
},
},
],
},
{
$and: [
{
'privacy.mode': {
$eq: PrivacyMode.MyCircle,
},
},
{
'user.follower.id': {
$eq: req.currentUser?.id,
},
},
],
},
],
},
},
]);

Adding up values from array elements in MongoDB

I have done some aggregation to arrive at the below document structure for my given data:
{
"_id" : "test",
"NoOfQuestions" : 3.0,
"info" : [
{
"AnswerrCount" : 3
},
{
"AnswerrCount" : 3
},
{
"AnswerrCount" : 2
}
]
}
However, I am trying to add up all the values in the AnswerrCount column. So from the above example, I want another column that says TotalAnswers: 8 (3+3+2), and then eventually a column computed using NoOfQuestions, FinalTotal: 11 (8+3).
You can use the $sum operator to add up the array values:
db.collection.aggregate([
{ "$addFields": {
"TotalAnswers": {
"$sum": "$info.AnswerrCount"
},
"FinalTotal": {
"$add": [{ "$sum": "$info.AnswerrCount" }, "$NoOfQuestions"]
}
}}
])
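For reference, the arithmetic that this $addFields stage performs can be checked with a few lines of plain JavaScript on the sample document (a sketch of the computation only, not of the aggregation framework):

```javascript
const doc = {
  _id: "test",
  NoOfQuestions: 3.0,
  info: [{ AnswerrCount: 3 }, { AnswerrCount: 3 }, { AnswerrCount: 2 }],
};

// Like { $sum: "$info.AnswerrCount" }: add up one field across the array.
const TotalAnswers = doc.info.reduce((sum, item) => sum + item.AnswerrCount, 0);

// Like { $add: [<TotalAnswers>, "$NoOfQuestions"] }.
const FinalTotal = TotalAnswers + doc.NoOfQuestions;

console.log({ TotalAnswers, FinalTotal }); // { TotalAnswers: 8, FinalTotal: 11 }
```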
db.collection.aggregate([{
$unwind: "$info"
}, {
$group: {
_id: null,
TotalAnswers: {
$sum: '$info.AnswerrCount'
},
doc: {
$first: '$$CURRENT'
}
}
}, {
$project: {
TotalAnswers: 1,
FinalTotal: {
'$add': ['$TotalAnswers', '$doc.NoOfQuestions']
},
_id: 0
}
}])

$group after $lookup is taking way too long

I have the following MongoDB collection:
{
"_id" : "22pTvYLd7azAAPL5T",
"plate" : "ABC-123",
"company": "AMZ",
"_portfolioType" : "account"
},
{
"_id" : "22pTvYLd7azAAPL5T",
"plate" : "ABC-123",
"_portfolioType" : "sale",
"price": 87.3
},
{
"_id" : "22pTvYLd7azAAPL5T",
"plate" : "ABC-123",
"_portfolioType" : "sale",
"price": 88.9
}
And I am trying to aggregate all documents which have same value in plate field. Below is the query I have written so far:
db.getCollection('temp').aggregate([
{
$lookup: {
from: 'temp',
let: { 'p': '$plate', 't': '$_portfolioType' },
pipeline: [{
'$match': {
'_portfolioType': 'sale',
'$expr': { '$and': [
{ '$eq': [ '$plate', '$$p' ] },
{ '$eq': [ '$$t', 'account' ] }
]}
}
}],
as: 'revenues'
},
},
{
$project: {
plate: 1,
company: 1,
totalTrades: { $arrayElemAt: ['$revenues', 0] },
},
},
{
$addFields: {
revenue: { $add: [{ $multiply: ['$totalTrades.price', 100] }, 99] },
},
},
{
$group: {
_id: '$company',
revenue: { $sum: '$revenue' },
}
}
])
The query works fine if I remove the $group stage; however, as soon as I add the $group stage, MongoDB appears to process indefinitely. I tried adding $match as the first stage to limit the number of documents to process, but without any luck. E.g.:
{
$match: { $or: [{ _portfolioType: 'account' }, { _portfolioType: 'sale' }] }
},
I also tried using { explain: true } but it doesn't return anything helpful.
As Neil Lunn noticed, you very likely don't need the lookup to reach your "end goal", which is still quite vague.
Please read comments and adjust as needed:
db.temp.aggregate([
{$group:{
// Get unique plates
_id: "$plate",
// Not clear what you expect if there are documents with
// different company, and the same plate.
// Assuming "it never happens"
// You may need to $cond it here with {$eq: ["$_portfolioType", "account"]}
// but you never voiced it.
company: {$first:"$company"},
// Not exactly all documents with _portfolioType: sale,
// but rather price from all documents for this plate.
// Assuming price field is available only in documents
// with "_portfolioType" : "sale". Otherwise add a $cond here.
// If you really need "all documents", push $$ROOT instead.
prices: {$push: "$price"}
}},
{$project: {
company: 1,
// Apply your math here, or on the previous stage
// to calculate revenue per plate
revenue: "$prices"
}},
{$group: {
// Get document for each "company"
_id: "$company",
// Revenue associated with plate
revenuePerPlate: {$push: {"k":"$_id", "v":"$revenue"}}
}},
{$project:{
_id: 0,
company: "$_id",
// Count of unique plate
platesCnt: {$size: "$revenuePerPlate"},
// arrayToObject if you wish plate names as properties
revenuePerPlate: {$arrayToObject: "$revenuePerPlate"}
}}
])

Distinct array element with condition

My documents look like this:
{
"_id": "1",
"tags": [
{ "code": "01-01", "type": "machine" },
{ "code": "04-06", "type": "gearbox" },
{ "code": "07-01", "type": "machine" }
]
},
{
"_id": "2",
"tags": [
{ "code": "03-04","type": "gearbox" },
{ "code": "01-01", "type": "machine" },
{ "code": "04-11", "type": "machine" }
]
}
I want to get distinct codes only for tags whose type is "machine". so, for the example above, the result should be ["01-01", "07-01", "04-11"].
How do I do this?
Using $unwind and then $group with the tag as the key will give you each tag in a separate document in your result set:
db.collection_name.aggregate([
{
$unwind: "$tags"
},
{
$match: {
"tags.type": "machine"
}
},
{
$group: {
_id: "$tags.code"
}
},
{
$project:{
_id:false,
code: "$_id"
}
}
]);
Or, if you want them put into an array within a single document, you can use $push within a second $group stage:
db.collection_name.aggregate([
{
$unwind: "$tags"
},
{
$match: {
"tags.type": "machine"
}
},
{
$group: {
_id: "$tags.code"
}
},
{
$group:{
_id: null,
codes: {$push: "$_id"}
}
}
]);
Another user suggested including an initial stage of { $match: { "tags.type": "machine" } }. This is a good idea if your data is likely to contain a significant number of documents that do not include "machine" tags. That way you will eliminate unnecessary processing of those documents. Your pipeline would look like this:
db.collection_name.aggregate([
{
$match: {
"tags.type": "machine"
}
},
{
$unwind: "$tags"
},
{
$match: {
"tags.type": "machine"
}
},
{
$group: {
_id: "$tags.code"
}
},
{
$group:{
_id: null,
codes: {$push: "$_id"}
}
}
]);
> db.foo.aggregate( [
... { $unwind : "$tags" },
... { $match : { "tags.type" : "machine" } },
... { $group : { "_id" : "$tags.code" } },
... { $group : { _id : null , "codes" : {$push : "$_id"} }}
... ] )
{ "_id" : null, "codes" : [ "04-11", "07-01", "01-01" ] }
A better way would be to group directly on tags.type and use addToSet on tags.code.
Here's how we can achieve the same output in 3 stages of aggregation :
db.name.aggregate([
{$unwind:"$tags"},
{$match:{"tags.type":"machine"}},
{$group:{_id:"$tags.type","codes":{$addToSet:"$tags.code"}}}
])
Output : { "_id" : "machine", "codes" : [ "04-11", "07-01", "01-01" ] }
Also, if you want the codes for a different tags.type, just replace "machine" in the $match stage with the desired type.
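The same filter-then-deduplicate idea can be sketched in plain JavaScript with a Set, using the two sample documents from the question. Note that a Set keeps insertion order, whereas the order of $addToSet results is not guaranteed.

```javascript
const docs = [
  { _id: "1", tags: [
    { code: "01-01", type: "machine" },
    { code: "04-06", type: "gearbox" },
    { code: "07-01", type: "machine" },
  ] },
  { _id: "2", tags: [
    { code: "03-04", type: "gearbox" },
    { code: "01-01", type: "machine" },
    { code: "04-11", type: "machine" },
  ] },
];

// Walk every tag ($unwind), keep machine tags ($match),
// and collect each code once ($addToSet).
const codes = new Set();
for (const doc of docs) {
  for (const tag of doc.tags) {
    if (tag.type === "machine") codes.add(tag.code);
  }
}

console.log([...codes]); // [ '01-01', '07-01', '04-11' ]
```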