I have the following data in my collection:
[
{
"_id":{
"month":"Jan",
"year":"2022"
},
"products":[
{
"product":"ProdA",
"status":"failed",
"count":15
},
{
"product":"ProdA",
"status":"success",
"count":5
},
{
"product":"ProdB",
"status":"failed",
"count":20
},
{
"product":"ProdB",
"status":"success",
"count":10
}
]
},
...//more such data
]
I want to group the elements of the products array by product name, so that for each month we have the success and failure counts of each product. Every record is guaranteed to have both a success and a failure count each month. The output should look like this:
[
{
"_id":{
"month":"Jan",
"year":"2022"
},
"products":[
{
"product":"ProdA","status":[{"name":"success","count":5},{"name":"failed","count":15}]
},
{
"product":"ProdB","status":[{"name":"success","count":10},{"name":"failed","count":20}]
}
]
},
...//data for succeeding months
]
I have tried to do something like this:
db.collection.aggregate([{ $unwind: "$products" },
{
$group: {
"_id": {
month: "$_id.month",
year: "$_id.year"
},
products: { $push: { "product": "$product", status: { $push: { name: "$status", count: "$count" } } } }
}
}]);
But the above query doesn't work.
At which level do I need to group the fields to obtain the above output?
Please help me find out what I am doing wrong.
Thank you!
Your first $group stage needs to group by both the document _id and the product name, pushing each status and count into a list. A second $group stage then groups by the original _id and builds the products list:
db.collection.aggregate([
{$unwind: "$products"},
{$group: {
_id: {
id: "$_id",
product: "$products.product",
},
status: {
$push: {
name: "$products.status",
count: "$products.count"
}
}
}
},
{$group: {
_id: "$_id.id",
products: {
$push: {
product: "$_id.product",
status: "$status"
}
}
}
}
])
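To make the two grouping levels easier to follow, here is a rough plain-JavaScript emulation of what the two $group stages do to the sample document from the question. It is only a sketch, not MongoDB code: regroupProducts is a hypothetical helper, and unlike this emulation, $group does not guarantee the order of its output documents.

```javascript
// Sample document copied from the question.
const doc = {
  _id: { month: "Jan", year: "2022" },
  products: [
    { product: "ProdA", status: "failed", count: 15 },
    { product: "ProdA", status: "success", count: 5 },
    { product: "ProdB", status: "failed", count: 20 },
    { product: "ProdB", status: "success", count: 10 },
  ],
};

// Hypothetical helper (not a MongoDB API) mirroring the two $group stages.
function regroupProducts(d) {
  // First $group: one bucket per product name within this _id,
  // pushing { name, count } for every status entry.
  const byProduct = new Map();
  for (const { product, status, count } of d.products) {
    if (!byProduct.has(product)) byProduct.set(product, []);
    byProduct.get(product).push({ name: status, count });
  }
  // Second $group: collect the per-product buckets back under the original _id.
  return {
    _id: d._id,
    products: [...byProduct].map(([product, status]) => ({ product, status })),
  };
}

const regrouped = regroupProducts(doc);
console.log(JSON.stringify(regrouped, null, 2));
```

The point of the sketch is that the product name has to be part of the first grouping key; grouping only by month and year, as in the original attempt, leaves nothing to separate ProdA from ProdB.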
Imagine a data set like this:
db.test.insertMany([
{ '_id':1, 'name':'aa1', 'price':10, 'quantity': 2, 'category': ['coffe'] },
{ '_id':2, 'name':'aa2', 'price':20, 'quantity': 1, 'category': ['coffe', 'snack'] },
{ '_id':3, 'name':'aa3', 'price':5, 'quantity':10, 'category': ['snack', 'coffe'] },
{ '_id':4, 'name':'aa4', 'price':5, 'quantity':20, 'category': ['coffe', 'cake'] },
{ '_id':5, 'name':'aa5', 'price':10, 'quantity':10, 'category': ['animal', 'dog'] },
{ '_id':6, 'name':'aa6', 'price':5, 'quantity': 5, 'category': ['dog', 'animal'] },
{ '_id':7, 'name':'aa7', 'price':5, 'quantity':10, 'category': ['animal', 'cat'] },
{ '_id':8, 'name':'aa8', 'price':10, 'quantity': 5, 'category': ['cat', 'animal'] },
]);
I'm trying to make a query with this result (or something like it):
[
{ ['animal', 'dog'], 125 },
{ ['animal', 'cat'], 100 },
{ ['coffe', 'cake'], 100 },
{ ['coffe', 'snack'], 70 },
{ ['coffe'], 20 }
]
Meaning that it is:
Grouped by category.
The category is treated as a set (i.e. order is not important).
The result is sorted by price*quantity per unique category 'set'.
I've tried everything I know (which is very limited) and googled for days without getting anywhere.
Is this even possible in an aggregate query, or do I have to find a different way?
I suppose you need something like this:
db.collection.aggregate([
{
$unwind: "$category"
},
{
$sort: {
_id:-1,
category: -1
}
},
{
$group: {
_id: "$_id",
category: {
$push: "$category"
},
price: {
$first: "$price"
},
quantity: {
$first: "$quantity"
}
}
},
{
$group: {
_id: "$category",
sum: {
$sum: {
$multiply: [
"$price",
"$quantity"
]
}
}
}
},
{
$project: {
mySet: "$_id",
total: "$sum"
}
},
{
$sort: {
total: -1
}
}
])
Explained:
$unwind the category array so the categories can be sorted into a consistent order.
$sort by _id and category so every document's categories come out in the same order.
$group by _id to push the categories back into an array, now sorted.
$group by the category set to sum price*quantity.
$project the needed fields.
$sort by total in descending order, as requested.
Note that the output uses named fields for the set and the total so that it is valid JSON: a result cannot have the form {[X,Y],Z}; it has to be {m:[X,Y],z:Z}.
db.collection.aggregate([
{
"$match": {}
},
{
"$group": {
"_id": {
$function: {
body: "function(arr) { return arr.sort((a,b) => a.localeCompare(b))}",
args: [ "$category" ],
lang: "js"
}
},
"sum": {
"$sum": { "$multiply": [ "$price", "$quantity" ] }
}
}
},
{
"$sort": { sum: -1 }
}
])
From MongoDB 5.2 you can use $sortArray instead of the $function sort that I used.
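A minimal sketch of the $sortArray variant, assuming MongoDB 5.2 or later and the test collection from the question. The variable name sortArrayPipeline and the helper totalsByCategorySet are made up for illustration; the helper is a plain-JavaScript emulation of the same idea, useful only for checking the expected totals without a server.

```javascript
// Pipeline sketch: sort each category array so that ['snack', 'coffe'] and
// ['coffe', 'snack'] produce the same grouping key.
const sortArrayPipeline = [
  {
    $group: {
      _id: { $sortArray: { input: "$category", sortBy: 1 } },
      sum: { $sum: { $multiply: ["$price", "$quantity"] } },
    },
  },
  { $sort: { sum: -1 } },
];
// In mongosh this would be run as: db.test.aggregate(sortArrayPipeline)

// Plain-JS emulation (hypothetical helper, not MongoDB code):
// sort a copy of the array, use it as the key, and sum price * quantity.
function totalsByCategorySet(docs) {
  const totals = new Map();
  for (const { category, price, quantity } of docs) {
    const key = JSON.stringify([...category].sort());
    totals.set(key, (totals.get(key) || 0) + price * quantity);
  }
  return [...totals]
    .map(([key, sum]) => ({ _id: JSON.parse(key), sum }))
    .sort((a, b) => b.sum - a.sum);
}
```

Note that the grouping key is the sorted array, so the _id values in the result are in sorted order rather than in the order they were stored.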
I have a document like this:
{
_id: 1,
data: [
{
_id: 2,
rows: [
{
myFormat: [1,2,3,4]
},
{
myFormat: [1,1,1,1]
}
]
},
{
_id: 3,
rows: [
{
myFormat: [1,2,7,8]
},
{
myFormat: [1,1,1,1]
}
]
}
]
},
I want to get distinct myFormat values as a complete array.
For example: I need the result as: [1,2,3,4], [1,1,1,1], [1,2,7,8]
How can I write a MongoDB query for this?
Thanks for the help.
Please try this if every object in rows has only the one field myFormat:
db.getCollection('yourCollection').distinct('data.rows')
Ref : mongoDB Distinct Values for a field
Or, if you need the result in a single array and the objects in rows also have other fields, try this:
db.yourCollection.aggregate([{$project :{'data.rows.myFormat':1}},{ $unwind: '$data' }, { $unwind: '$data.rows' },
{ $group: { _id: '$data.rows.myFormat' } },
{ $group: { _id: '', distinctValues: { $push: '$_id' } } },
{ $project: { distinctValues: 1, _id: 0 } }])
Or else:
db.yourCollection.aggregate([{ $project: { values: '$data.rows.myFormat' } }, { $unwind: '$values' }, { $unwind: '$values' },
{ $group: { _id: '', distinctValues: { $addToSet: '$values' } } }, { $project: { distinctValues: 1, _id: 0 } }])
The aggregation queries above return what you want, but they can be slow on large datasets, so run them and check for slowness. If this is a one-time job, consider passing {allowDiskUse: true} if needed. Whether it is one-time or not, decide whether the $unwind stages should use preserveNullAndEmptyArrays: true.
Ref : allowDiskUse , $unwind preserveNullAndEmptyArrays
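As a rough illustration of where allowDiskUse goes, here is a sketch that wraps a variant of the pipeline in a function. distinctFormats is a hypothetical helper, and collection stands for anything that exposes aggregate(), for example db.yourCollection in mongosh. The $unwind stages use the default behaviour, which drops documents whose data or rows array is missing or empty.

```javascript
// Hypothetical helper: collects the distinct myFormat arrays into one document.
function distinctFormats(collection) {
  const pipeline = [
    { $unwind: "$data" },
    { $unwind: "$data.rows" },
    // $addToSet treats each myFormat array as a single value, so whole arrays are de-duplicated.
    { $group: { _id: null, distinctValues: { $addToSet: "$data.rows.myFormat" } } },
    { $project: { _id: 0, distinctValues: 1 } },
  ];
  // The options document is the second argument to aggregate().
  return collection.aggregate(pipeline, { allowDiskUse: true });
}
```

In mongosh the call would look like distinctFormats(db.yourCollection). If documents without data or rows should still pass through, each $unwind can be written in its long form with preserveNullAndEmptyArrays: true.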
I wrote this aggregation pipeline, and I want to add a field that contains the global total of all the group totals.
{ "$match": query },
{ "$sort": cursor.sort },
{ "$group": {
_id: { key:"$paymentFromId"},
items: {
$push: {
_id:"$_id",
value:"$value",
transaction:"$transaction",
paymentMethod:"$paymentMethod",
createdAt:"$createdAt",
...
}
},
count:{$sum:1},
total:{$sum:"$value"}
}}
{
//i want to get
...project groups , goupsTotal , groupsCount
}
,{
"$skip":cursor.skip
},{
"$limit":cursor.limit
},
])
You need to use $facet (available from MongoDB 3.4) to apply multiple pipelines to the same set of documents:
First pipeline: skip and limit the docs.
Second pipeline: calculate the total over all groups.
{ "$match": query },
{ "$sort": cursor.sort },
{ "$group": {
_id: { key:"$paymentFromId"},
items: {
$push: "$$CURRENT"
},
count:{$sum:1},
total:{$sum:"$value"}
}
},
{
$facet: {
docs: [
{ $skip:cursor.skip },
{ $limit:cursor.limit }
],
overall: [
{$group: {
_id: null,
groupsTotal: {$sum: '$total'},
groupsCount:{ $sum: '$count'}
}
}
]
}
}
The final output will be:
{
docs: [ .... ], // array of {_id, items, count, total}
overall: [ { ... } ] // single-element array; its document holds groupsTotal and groupsCount
}
PS: For simplicity I've replaced the items expression in the third pipeline stage with $$CURRENT, which pushes the whole document. If you only need specific properties, list them instead.
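Because every $facet sub-pipeline returns an array, overall comes back as an array holding a single summary document. If a plain object is more convenient, one option is a trailing $addFields stage with $arrayElemAt. The sketch below assumes the query and cursor objects from the question; buildPagedPipeline is a hypothetical helper name.

```javascript
// Hypothetical helper that assembles the pipeline from the question's query and cursor objects.
function buildPagedPipeline(query, cursor) {
  return [
    { $match: query },
    { $sort: cursor.sort },
    {
      $group: {
        _id: { key: "$paymentFromId" },
        items: { $push: "$$CURRENT" },
        count: { $sum: 1 },
        total: { $sum: "$value" },
      },
    },
    {
      $facet: {
        // Page of groups.
        docs: [{ $skip: cursor.skip }, { $limit: cursor.limit }],
        // Totals over all groups, computed before paging.
        overall: [
          {
            $group: {
              _id: null,
              groupsTotal: { $sum: "$total" },
              groupsCount: { $sum: "$count" },
            },
          },
        ],
      },
    },
    // Replace the single-element overall array with the document it contains.
    { $addFields: { overall: { $arrayElemAt: ["$overall", 0] } } },
  ];
}
```

It could be used as db.collection.aggregate(buildPagedPipeline(query, cursor)), where the collection name is whatever the original pipeline runs against.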
I did it this way: project the $group result into a new field doc, then $sum the subtotals.
{
$project: {
"doc": {
"_id": "$_id",
"total": "$total",
"items":"$items",
"count":"$count"
}
}
},{
$group: {
"_id": null,
"globalTotal": {
$sum: "$doc.total"
},
"result": {
$push: "$doc"
}
}
},
{
$project: {
"result": 1,
//paging "result": {$slice: [ "$result", cursor.skip,cursor.limit ] },
"_id": 0,
"globalTotal": 1
}
}
the output
[
{
globalTotal: 121500,
result: [ [group1], [group2], [group3], ... ]
}
]
I am trying to build a pipeline which will search for documents based on certain criteria and will group certain fields to give desired output. Document structure of deals is
{
"_id":"123",
"status":"New",
"deal_amount":"5200",
"deal_date":"2018-03-05",
"data_source":"API",
"deal_type":"New Business",
"account_id":"A1"
},
{
"_id":"456",
"status":"New",
"deal_amount":"770",
"deal_date":"2018-02-11",
"data_source":"API",
"deal_type":"New Business",
"account_id":"A2"
},
{
"_id":"885",
"status":"Old",
"deal_amount":"4070",
"deal_date":"2017-09-22",
"data_source":"API",
"deal_type":"New Business",
"account_id":"A2"
},
The account name is a referenced field. Account documents look like this:
{
"_id":"A1",
"name":"Sarah",
},
{
"_id":"A2",
"name":"Amber",
},
The pipeline should find documents whose status is 'New' and whose deal_amount is more than 2000, and group them by account name. The pipeline I have used looks like this:
db.deal.aggregate([{
$match: {
status: New,
deal_amount: {
$gte: 2000,
}
}
}, {
$group: {
_id: "$account_name",
}
},{
$lookup:{
from:"accounts",
localField:"account_id",
foreignField:"_id",
as:"acc",
}
}
])
I want to show only the fields deal_amount, deal_type, deal_date and account name in the result.
Expected Result:
{
"_id": "123",
"deal_amount": "5200",
"deal_date": "2018-03-05",
"deal_type": "New Business",
"account_name": "Sarah"
}, {
"_id": "885",
"deal_amount": "4070",
"deal_date": "2017-09-22",
"deal_type": "New Business",
"account_name": "Amber"
},
Do I have to include all of these fields (deal_amount, deal_type, deal_date and account name) in the $group stage in order to show them in the result, or is there another way to do it? Any help is highly appreciated.
Please use this query.
db.deal.aggregate([{
$match: {
status: "New",
deal_amount: {
$gte: 2000,
}
}
},
{
$lookup:{
from:"accounts",
localField:"account_id",
foreignField:"_id",
as:"acc",
}
},
{
$unwind: {
path: '$acc',
preserveNullAndEmptyArrays: true,
},
},
{
$group: {
_id: "$acc._id",
deal_amount: { $first: '$deal_amount' },
deal_date: { $first: '$deal_date' },
deal_type: { $first: '$deal_type' },
}
}
])
You can do this in one of two ways:
1) Using $$ROOT:
{ $group : {
_id : "$author",
data: { $push : "$$ROOT" }
}}
2) By assigning each field individually:
{
$group: {
_id: "$account_name",
deal_amount: { $first: '$deal_amount' },
deal_date: { $first: '$deal_date' },
.
.
}
}
Not sure why you need $group stage. You just need to add $project stage to output the account name from the referenced collection.
{
"$project": {
"deal_amount": 1,
"deal_type": 1,
"deal_date": 1,
"account_name": {"$let":{"vars":{"accl":{"$arrayElemAt":["$acc", 0]}}, in:"$$accl.name"}}
}
}
One thing to start with: $gte: 2000 will not match the string values stored in deal_amount, because a numeric comparison only matches numeric values. You might want to convert the field to integers or something similar:
// Convert deal_amount from string to integer (one-time migration)
db.deals.find().forEach(function(data) {
db.deals.updateOne(
{_id:data._id},
{$set:{deal_amount:parseInt(data.deal_amount, 10)}});
});
Then, to get just the fields you need, reshape the document using $project:
db.deals.aggregate([{
$match: {
"status": "New",
"deal_amount" : {
"$gte" : 2000
}
}
},
{
$lookup:{
from:"accounts",
localField:"account_id",
foreignField:"_id",
as:"acc",
}
},
{
$project: {
_id: 1,
deal_amount: 1,
deal_type: 1,
deal_date: 1,
"account_name": {"$let":{"vars":{"accl":{"$arrayElemAt":["$acc", 0]}}, in:"$$accl.name"}}
}
}
]);
For me, this produced:
{
"_id" : "123",
"deal_amount" : 5200.0,
"deal_date" : "2018-03-05",
"deal_type" : "New Business",
"account_name" : "Sarah"
}
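If rewriting the stored documents is not an option, an alternative is to convert the value inside the pipeline. This is a sketch under a few assumptions: MongoDB 4.0 or later (for $toInt), the deals and accounts collections from the question, and deal_amount strings that always hold whole numbers. $toInt raises an error on a string it cannot convert, so $convert with onError may be the safer choice for messy data. The name dealsPipeline is only for illustration.

```javascript
// Sketch: filter on the numeric value of deal_amount without changing the stored data.
const dealsPipeline = [
  {
    $match: {
      status: "New",
      // $expr allows aggregation expressions inside $match.
      $expr: { $gte: [{ $toInt: "$deal_amount" }, 2000] },
    },
  },
  {
    $lookup: {
      from: "accounts",
      localField: "account_id",
      foreignField: "_id",
      as: "acc",
    },
  },
  {
    $project: {
      deal_amount: 1,
      deal_type: 1,
      deal_date: 1,
      // "$acc.name" is the array of names from the joined documents; take the first.
      account_name: { $arrayElemAt: ["$acc.name", 0] },
    },
  },
];
// In mongosh this would be run as: db.deals.aggregate(dealsPipeline)
```

The trade-off is that a $match written with $expr and a conversion generally cannot use an index on deal_amount, so the one-time migration is likely the better fit for large collections.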
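// Note: with deal_amount stored as a string, $gte: '2000' compares lexicographically, so '770' also matches.
db.deal.aggregate([
{$match: {status: {$eq: 'New'}, deal_amount: {$gte: '2000'}}},
{$group: {_id: {accountName: '$account_id', type: '$deal_type', 'amount': '$deal_amount'}}}
])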