I have JSON data like this, and I want to aggregate it so that the results are grouped by the "from" date inside data:
{
"series": [
{
"id": "1",
"element": "111",
"data": [
{
"timeFrame": {
"from": "2016-01-01T00:00:00Z",
"to": "2016-01-31T23:59:59Z"
},
"value": 1
},
{
"timeFrame": {
"from": "2016-02-01T00:00:00Z",
"to": "2016-02-29T23:59:59Z"
},
"value": 2
}
]
}
]
}
I have achieved this with the following aggregation:
db.getCollection('col1').aggregate([
{$unwind: "$data"},
{$group :{
element: {$first:"$relatedElement"},
_id : {
day : {$dayOfMonth: "$values.timeFrame.from"},
month:{$month: "$values.timeFrame.from"},
year:{$year: "$values.timeFrame.from"}
},
fromDate : { $first : "$values.timeFrame.from" },
total : {$sum : "$values.value"},
count : {$sum : 1},
}
},
{
$project: {
_id : 0,
element:1,
fromDate : '$fromDate',
avgValue : { $divide: [ "$total", "$count" ] }
}
}])
Output:
{
"id" : "1",
"element" : "3",
"fromDate" : ISODate("2017-05-01T00:00:00.000Z"),
"avgValue" : 0.0378787878787879
}
{
"id" : "1",
"element" : "3",
"fromDate" : ISODate("2017-04-30T22:00:00.000Z"),
"avgValue" : 0.416666666666667
}
But I am getting two documents, and I want to merge them into a single document like this:
{
"id" : "1",
"element" : "3",
"average" : [
{
"fromDate" : ISODate("2017-05-01T00:00:00.000Z"),
"avgValue" : 0.0378787878787879
},
{
"fromDate" : ISODate("2017-04-30T22:00:00.000Z"),
"avgValue" : 0.416666666666667
}
]
}
Can anyone help me with this?
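Add the following $group stage at the end of your aggregation pipeline to merge the current output documents into a single document. Note that the preceding $project sets _id to 0, so "$_id" is missing at this point and every document falls into one group; if you need one merged document per element, group on "$element" instead: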
{$group:{
_id:"$_id",
element: {$first: "$element"},
average:{$push:{
"fromDate": "$fromDate",
"avgValue": "$avgValue"
}}
}}
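To make the effect of that final stage concrete, here is a minimal plain-JavaScript sketch (no MongoDB involved) of what grouping with $push does to the two output documents. It assumes the grouping key is element and uses ISO strings in place of ISODate values; the function name mergeByElement is made up for illustration.

```javascript
// In-memory stand-in for the two documents the pipeline currently emits.
// Assumption: dates are plain ISO strings here instead of ISODate values.
const docs = [
  { element: "3", fromDate: "2017-05-01T00:00:00.000Z", avgValue: 0.0378787878787879 },
  { element: "3", fromDate: "2017-04-30T22:00:00.000Z", avgValue: 0.416666666666667 }
];

// Mimics a $group keyed on element with a $push of { fromDate, avgValue }.
function mergeByElement(input) {
  const groups = new Map();
  for (const doc of input) {
    if (!groups.has(doc.element)) {
      groups.set(doc.element, { element: doc.element, average: [] });
    }
    groups.get(doc.element).average.push({
      fromDate: doc.fromDate,
      avgValue: doc.avgValue
    });
  }
  return Array.from(groups.values());
}

const merged = mergeByElement(docs);
console.log(JSON.stringify(merged, null, 2));
```

Each distinct element value ends up as one object whose average array holds one entry per input document, in input order.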
I have a document with multiple levels of embedded subdocuments, each containing a nested array. I use $unwind and $sort to sort by day in descending order, then $push to combine the rows back into a single array. This works for only one level of $push. When I try to do the same at the nested level while keeping the top-level data, I get "errmsg" : "Unrecognized expression '$push'".
{
"_id" : ObjectId("5f5638d0ff25e01482432803"),
"name" : "XXXX",
"mobileNo" : 323232323,
"payroll" : [
{
"_id" : ObjectId("5f5638d0ff25e01482432801"),
"month" : "Jan",
"salary" : 18200,
"payrollDetails" : [
{
"day" : "1",
"salary" : 200,
},
{
"day" : "2",
"salary" : 201,
}
]
},
{
"_id" : ObjectId("5f5638d0ff25e01482432802"),
"month" : "Feb",
"salary" : 8300,
"payrollDetails" : [
{
"day" : "1",
"salary" : 300,
},
{
"day" : "2",
"salary" : 400,
}
]
}
],
}
Expected Result:
{
"_id" : ObjectId("5f5638d0ff25e01482432803"),
"name" : "XXXX",
"mobileNo" : 323232323,
"payroll" : [
{
"_id" : ObjectId("5f5638d0ff25e01482432801"),
"month" : "Jan",
"salary" : 18200,
"payrollDetails" : [
{
"day" : "2",
"salary" : 201
},
{
"day" : "1",
"salary" : 200
}
]
},
{
"_id" : ObjectId("5f5638d0ff25e01482432802"),
"month" : "Feb",
"salary" : 8300,
"payrollDetails" : [
{
"day" : "2",
"salary" : 400
},
{
"day" : "1",
"salary" : 300
}
]
}
],
}
Only the days should be sorted; everything else stays the same.
I tried the following, but it fails with unrecognized expression '$push':
db.employee.aggregate([
{$unwind: '$payroll'},
{$unwind: '$payroll.payrollDetails'},
{$sort: {'payroll.payrollDetails.day': -1}},
{$group: {_id: '$_id', payroll: {$push: {payrollDetails:{$push:
'$payroll.payrollDetails'} }}}}])
This requires two $group stages, because $push is an accumulator and cannot be nested inside another $push in a single field:
$group by main id and payroll id to construct the payrollDetails array
$sort by payroll id (you can skip this if the order is not required)
$group by main id to construct the payroll array
db.employee.aggregate([
{ $unwind: "$payroll" },
{ $unwind: "$payroll.payrollDetails" },
{ $sort: { "payroll.payrollDetails.day": -1 } },
{
$group: {
_id: {
_id: "$_id",
pid: "$payroll._id"
},
name: { $first: "$name" },
mobileNo: { $first: "$mobileNo" },
payrollDetails: { $push: "$payroll.payrollDetails" },
month: { $first: "$payroll.month" },
salary: { $first: "$payroll.salary" }
}
},
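// After the first $group the payroll id lives at _id.pid; ascending order matches the expected result (Jan, then Feb).
{ $sort: { "_id.pid": 1 } },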
{
$group: {
_id: "$_id._id",
name: { $first: "$name" },
mobileNo: { $first: "$mobileNo" },
payroll: {
$push: {
_id: "$_id.pid",
month: "$month",
salary: "$salary",
payrollDetails: "$payrollDetails"
}
}
}
}
])
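As a cross-check of what the pipeline above is expected to return, here is a small plain-JavaScript sketch that produces the same shape in memory. It is not the aggregation itself: it simply sorts each payrollDetails array in place of the unwind and regroup steps. It compares day numerically, which is an assumption, since day is stored as a string and a string comparison would put "10" before "2". The _id fields are omitted and sortPayrollDetailsDesc is a hypothetical helper name.

```javascript
// Sample document from the question, without the ObjectId fields.
const employee = {
  name: "XXXX",
  mobileNo: 323232323,
  payroll: [
    { month: "Jan", salary: 18200,
      payrollDetails: [{ day: "1", salary: 200 }, { day: "2", salary: 201 }] },
    { month: "Feb", salary: 8300,
      payrollDetails: [{ day: "1", salary: 300 }, { day: "2", salary: 400 }] }
  ]
};

// Returns a copy of the document with every payrollDetails array sorted
// by day, descending. The input document is left untouched.
function sortPayrollDetailsDesc(doc) {
  return {
    ...doc,
    payroll: doc.payroll.map((p) => ({
      ...p,
      payrollDetails: [...p.payrollDetails].sort(
        (a, b) => Number(b.day) - Number(a.day)
      )
    }))
  };
}

const sortedEmployee = sortPayrollDetailsDesc(employee);
console.log(JSON.stringify(sortedEmployee, null, 2));
```

The top-level fields and the order of the payroll entries are preserved; only the inner arrays change order.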
I have a collection like
{
"_id" : ObjectId("5738cb363bb56eb8f76c2ba8"),
"records" : [
{
"Name" : "Joe",
"Salary" : 70000,
"Department" : "IT"
}
]
},
{
"_id" : ObjectId("5738cb363bb56eb8f76c2ba9"),
"records" : [
{
"Name" : "Henry",
"Salary" : 80000,
"Department" : "Sales"
},
{
"Name" : "Jake",
"Salary" : 40000,
"Department" : "Sales"
}
]
},
{
"_id" : ObjectId("5738cb363bb56eb8f76c2baa"),
"records" : [
{
"Name" : "Sam",
"Salary" : 90000,
"Department" : "IT"
},
{
"Name" : "Tom",
"Salary" : 50000,
"Department" : "Sales"
}
]
}
I want to have the results with the highest salary by each department
{"Name": "Sam", "Salary": 90000, "Department": "IT"}
{"Name": "Henry", "Salary": 80000, "Department": "Sales"}
I could get the highest salary. But I could not get the corresponding employee names.
db.HR.aggregate([
{ "$unwind": "$records" },
{ "$group":
{
"_id": "$records.Department",
"max_salary": { "$max": "$records.Salary" }
}
}
])
Could somebody help me?
You need to $sort your documents after $unwind and use the $first operator in the $group stage. You can also use the $last operator, in which case you need to sort your documents in ascending order:
db.HR.aggregate([
{ '$unwind': '$records' },
{ '$sort': { 'records.Salary': -1 } },
{ '$group': {
'_id': '$records.Department',
'Name': { '$first': '$records.Name' } ,
'Salary': { '$first': '$records.Salary' }
}}
])
which produces:
{ "_id" : "Sales", "Name" : "Henry", "Salary" : 80000 }
{ "_id" : "IT", "Name" : "Sam", "Salary" : 90000 }
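If it helps to see why sorting before grouping works, here is a plain-JavaScript sketch of the same three steps run in memory rather than in MongoDB. The array hr copies the sample collection without the _id fields, and topEarnerPerDepartment is a name invented for this example.

```javascript
// Sample collection from the question, without the ObjectId fields.
const hr = [
  { records: [{ Name: "Joe", Salary: 70000, Department: "IT" }] },
  { records: [{ Name: "Henry", Salary: 80000, Department: "Sales" },
              { Name: "Jake", Salary: 40000, Department: "Sales" }] },
  { records: [{ Name: "Sam", Salary: 90000, Department: "IT" },
              { Name: "Tom", Salary: 50000, Department: "Sales" }] }
];

function topEarnerPerDepartment(docs) {
  // Like $unwind: one entry per element of records.
  const unwound = docs.flatMap((d) => d.records);
  // Like $sort: highest salary first (copy so the input is not mutated).
  const sorted = [...unwound].sort((a, b) => b.Salary - a.Salary);
  // Like $group with $first: keep the first record seen per department.
  const firstSeen = new Map();
  for (const r of sorted) {
    if (!firstSeen.has(r.Department)) {
      firstSeen.set(r.Department, { _id: r.Department, Name: r.Name, Salary: r.Salary });
    }
  }
  return Array.from(firstSeen.values());
}

const topEarners = topEarnerPerDepartment(hr);
console.log(topEarners);
```

Because the records arrive sorted by salary, the first one encountered for a department is necessarily its top earner, so the name and salary always come from the same record. If two people tie, only one of them is reported.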
To return the maximum salary together with the list of employees who earn it in each department, use $max in the $group stage to get the maximum "Salary" per group, and the $push accumulator to collect the "Name" and "Salary" of every employee in the group. Then, in the $project stage, use $map with $cond to compare each employee's salary to the maximum, emitting the name on a match and false otherwise. Finally, $setDifference filters out the false values. This is fine as long as the values being filtered are unique, because set operators discard duplicates: if two top earners in the same department shared the same name, they would be collapsed into one entry.
db.HR.aggregate([
{ '$unwind': '$records' },
{ '$group': {
'_id': '$records.Department',
'maxSalary': { '$max': '$records.Salary' },
'persons': {
'$push': {
'Name': '$records.Name',
'Salary': '$records.Salary'
}
}
}},
{ '$project': {
'maxSalary': 1,
'persons': {
'$setDifference': [
{ '$map': {
'input': '$persons',
'as': 'person',
'in': {
'$cond': [
{ '$eq': [ '$$person.Salary', '$maxSalary' ] },
'$$person.Name',
false
]
}
}},
[false]
]
}
}}
])
which yields:
{ "_id" : "Sales", "maxSalary" : 80000, "persons" : [ "Henry" ] }
{ "_id" : "IT", "maxSalary" : 90000, "persons" : [ "Sam" ] }
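The max-then-filter idea can also be sketched in plain JavaScript, which may make the handling of ties easier to follow. This is an in-memory illustration, not the pipeline itself. The record for "Santa" is an assumed extra row added only to create a tie, and maxSalaryHolders is a hypothetical function name. Unlike $setDifference, the plain filter used here keeps duplicates, so two top earners with the same name would both be listed.

```javascript
// Already-unwound records; "Santa" is an assumed extra row to show a tie.
const records = [
  { Name: "Joe", Salary: 70000, Department: "IT" },
  { Name: "Henry", Salary: 80000, Department: "Sales" },
  { Name: "Jake", Salary: 40000, Department: "Sales" },
  { Name: "Sam", Salary: 90000, Department: "IT" },
  { Name: "Santa", Salary: 90000, Department: "IT" },
  { Name: "Tom", Salary: 50000, Department: "Sales" }
];

function maxSalaryHolders(unwound) {
  // Like $group: track the maximum and push every person per department.
  const groups = new Map();
  for (const r of unwound) {
    if (!groups.has(r.Department)) {
      groups.set(r.Department, { _id: r.Department, maxSalary: -Infinity, persons: [] });
    }
    const g = groups.get(r.Department);
    g.maxSalary = Math.max(g.maxSalary, r.Salary);
    g.persons.push({ Name: r.Name, Salary: r.Salary });
  }
  // Like $project: keep only the names whose salary equals the maximum.
  return Array.from(groups.values()).map((g) => ({
    _id: g._id,
    maxSalary: g.maxSalary,
    persons: g.persons.filter((p) => p.Salary === g.maxSalary).map((p) => p.Name)
  }));
}

const holders = maxSalaryHolders(records);
console.log(JSON.stringify(holders));
```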
It's not the most intuitive thing, but instead of $max you should be using $sort and $first:
{ "$unwind": "$records" },
{ "$sort": { "records.Salary": -1 } },
{ "$group" :
{
"_id": "$records.Department",
"max_salary": { "$first": "$records.Salary" },
"name": {$first: "$records.Name"}
}
}
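Alternatively, I thought this might be doable using the $$ROOT variable (fair warning: I've not actually tried this). As written it is rejected, because every field in $group other than _id must be an accumulator expression, and "$$ROOT.records.Name" is a plain field path:
{ "$unwind": "$records" },
{ "$group":
{
"_id": "$records.Department",
"max_salary": { "$max": "$records.Salary" },
"name" : "$$ROOT.records.Name" // not an accumulator, so $group fails here
}
}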
Another possible solution:
db.HR.aggregate([
{"$unwind": "$records"},
{"$group":{
"_id": "$records.Department",
"arr": {"$push": {"Name":"$records.Name", "Salary":"$records.Salary"}},
"maxSalary": {"$max":"$records.Salary"}
}},
{"$unwind": "$arr"},
{"$project": {
"_id":1,
"arr":1,
"isMax":{"$eq":["$arr.Salary", "$maxSalary"]}
}},
{"$match":{
"isMax":true
}}
])
This solution takes advantage of the $eq operator to compare two fields in the $project stage.
Test case:
db.HR.insert({"records": [{"Name": "Joe", "Salary": 70000, "Department": "IT"}]})
db.HR.insert({"records": [{"Name": "Henry", "Salary": 80000, "Department": "Sales"}, {"Name": "Jake", "Salary": 40000, "Department": "Sales"}, {"Name": "Santa", "Salary": 90000, "Department": "IT"}]})
db.HR.insert({"records": [{"Name": "Sam", "Salary": 90000, "Department": "IT"}, {"Name": "Tom", "Salary": 50000, "Department": "Sales"}]})
Result:
{ "_id" : "Sales", "arr" : { "Name" : "Henry", "Salary" : 80000 }, "isMax" : true }
{ "_id" : "IT", "arr" : { "Name" : "Santa", "Salary" : 90000 }, "isMax" : true }
{ "_id" : "IT", "arr" : { "Name" : "Sam", "Salary" : 90000 }, "isMax" : true }
I have a collection of companies that looks like this, and I want to merge the deals from all documents, both the top-level ones and those nested under locations.
I need this:
{
"_id" : ObjectId("561637942d25a7644cae993e"),
"locations" : [
{
"deals" : [
{
"name" : "1",
"_id" : ObjectId("561637942d25a7644cae9940")
},
{
"name" : "2",
"_id" : ObjectId("562f868ce73962c626a16b15")
}
]
}
],
"deals" : [
{
"name" : "3",
"_id" : ObjectId("562f86ebe73962c626a16b17")
}
]
}
{
"_id" : ObjectId("561637942d25a7644cae993e"),
"locations" : [
{
"deals" : [
{
"name" : "4",
"_id" : ObjectId("561637942d25a7644cae9940")
}
]
}
],
"deals" : []
}
To be like this:
{
"deals": [{
"name" : "1",
"_id" : ObjectId("561637942d25a7644cae9940")
},{
"name" : "2",
"_id" : ObjectId("562f868ce73962c626a16b15")
},{
"name" : "3",
"_id" : ObjectId("562f86ebe73962c626a16b17")
},{
"name" : "4",
"_id" : ObjectId("561637942d25a7644cae9949")
}]
}
But so far I have failed to do this. It seems that if I want all the deals grouped into one array I should not use $unwind, since that creates more documents and I only need to group once.
This is my attempt, which does not work at all:
{
"$project": {
"_id": 1,
"locations": 1,
"deals": 1
}
}, {
"$unwind": "$locations"
}, {
"$unwind": "$locations.deals"
}, {
"$unwind": "$deals"
}, {
"$group": {
"_id": null,
"deals": {
"$addToSet": "$locations.deals",
"$addToSet": "$deals"
}
}
}
First filter your documents with the $match operator to reduce the number of documents the pipeline has to process. Next, $unwind the "locations" array, then use the $project operator to reshape the documents. The $cond operator returns the single-element array [false] if the deals field is an empty array, and the deals value otherwise, so that the union never starts from an empty deals array (note that $unwind on an empty array does not raise an error; it simply emits no output for that document). The $setUnion operator returns an array of the elements that appear in either the locations.deals array or the deals array, and the $setDifference operator then filters the false placeholder out of the merged array. Another $unwind stage deconstructs the merged deals array, and from there you can $group the documents:
db.collection.aggregate([
{ "$match": { "locations.0": { "$exists": true } } },
{ "$unwind": "$locations" },
{ "$project": {
"deals": {
"$setDifference": [
{ "$setUnion": [
{ "$cond": [
{ "$eq" : [ { "$size": "$deals" }, 0 ] },
[false],
"$deals"
]},
"$locations.deals"
]},
[false]
]
}
}},
{ "$unwind": "$deals" },
{ "$group": {
"_id": null,
"deals": { "$addToSet": "$deals" }
}}
])
Which returns:
{
"_id" : null,
"deals" : [
{
"name" : "1",
"_id" : ObjectId("561637942d25a7644cae9940")
},
{
"name" : "2",
"_id" : ObjectId("562f868ce73962c626a16b15")
},
{
"name" : "3",
"_id" : ObjectId("562f86ebe73962c626a16b17")
},
{
"name" : "4",
"_id" : ObjectId("561637942d25a7644cae9940")
}
]
}
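For comparison, the same merge can be sketched in plain JavaScript over in-memory objects. This is only an illustration of the intended result, under a few assumptions: the _id values are plain strings instead of ObjectId, duplicates are detected by comparing the whole deal (roughly what $addToSet does), and collectDeals is a made-up helper name.

```javascript
// Sample documents from the question; _id values are plain strings here.
const companies = [
  {
    locations: [{ deals: [
      { name: "1", _id: "561637942d25a7644cae9940" },
      { name: "2", _id: "562f868ce73962c626a16b15" }
    ] }],
    deals: [{ name: "3", _id: "562f86ebe73962c626a16b17" }]
  },
  {
    locations: [{ deals: [{ name: "4", _id: "561637942d25a7644cae9940" }] }],
    deals: []
  }
];

function collectDeals(docs) {
  const seen = new Set();
  const deals = [];
  for (const doc of docs) {
    // Deals nested under every location, then the top-level deals.
    const nested = (doc.locations || []).flatMap((loc) => loc.deals || []);
    for (const deal of [...nested, ...(doc.deals || [])]) {
      // Whole-value identity, similar in spirit to $addToSet.
      const key = JSON.stringify([deal._id, deal.name]);
      if (!seen.has(key)) {
        seen.add(key);
        deals.push(deal);
      }
    }
  }
  return { _id: null, deals };
}

const allDeals = collectDeals(companies);
console.log(JSON.stringify(allDeals, null, 2));
```

Empty or missing arrays contribute nothing, so no placeholder value is needed in this version.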
Here's an example of documents I use :
{
"_id" : ObjectId("554a1f5fe36a768b362ea5c0"),
"store_state" : 1,
"services" : [
{
"id" : "XXX",
"state" : 1,
"active": true
},
{
"id" : "YYY",
"state" : 1,
"active": true
},
...
]
}
I want to output a new field with "Y" if the id is "XXX" and active is true, and "N" in all other cases. The service element with "XXX" as id is not present in every document (output "N" in that case).
Here's my query for the moment :
db.stores.aggregate({
$match : {"store_state":1}
},
{ $project : {
"XXX_active": {
$cond: [ {
$and:[
{$eq:["services.$id","XXX"]},
{$eq:["services.$active",true]}
]},"Y","N"
] }
}
}).pretty()
But it always outputs "N" for the "XXX_active" field.
The expected output I need is :
{
"_id" : ObjectId("554a1f5de36a768b362e7e6f"),
"XXX_active" : "Y"
},
{
"_id" : ObjectId("554a1f5ee36a768b362e9d25"),
"XXX_active" : "N"
},
{
"_id" : ObjectId("554a1f5de36a768b362e73a5"),
"XXX_active" : "Y"
}
Other example of possible result :
{
"_id" : ObjectId("554a1f5de36a768b362e7e6f"),
"XXX_active" : "Y",
"YYY_active" : "N"
},
{
"_id" : ObjectId("554a1f5ee36a768b362e9d25"),
"XXX_active" : "N",
"YYY_active" : "N"
},
{
"_id" : ObjectId("554a1f5de36a768b362e73a5"),
"XXX_active" : "Y",
"YYY_active" : "Y"
}
There should be only one XXX_active per object and no duplicate objects, but I need every object to have an XXX_active field even if no service element with id "XXX" is present. Could someone help, please?
First $unwind the services array, then use $cond as below:
db.stores.aggregate({
"$match": {
"store_state": 1
}
}, {
"$unwind": "$services"
}, {
"$project": {
"XXX_active": {
"$cond": [{
"$and": [{
"$eq": ["$services.id", "XXX"]
}, {
"$eq": ["$services.active", true]
}]
}, "Y", "N"]
}
}
},{"$group":{"_id":"$_id","XXX_active":{"$max":"$XXX_active"}}}) // group by _id; $max picks "Y" over "N", so one matching service is enough
The following aggregation pipeline gives the desired result for stores that have an "XXX" service; note that the later $match on "services.id" drops stores without one. After the initial $match, apply the $unwind operator to the services array field. This deconstructs the services array field from the input documents to output one document per element; each output document replaces the array with an element value.
db.stores.aggregate([
{
"$match" : {"store_state": 1}
},
{
"$unwind": "$services"
},
{
"$project": {
"store_state" : 1,
"services": 1,
"XXX_active": {
"$cond": [
{
"$and": [
{"$eq":["$services.id", "XXX"]},
{"$eq":["$services.active",true]}
]
},"Y","N"
]
}
}
},
{
"$match": {
"services.id": "XXX"
}
},
{
"$group": {
"_id": {
"_id": "$_id",
"store_state": "$store_state",
"XXX_active": "$XXX_active"
},
"services": {
"$push": "$services"
}
}
},
{
"$project": {
"_id": "$_id._id",
"store_state" : "$_id.store_state",
"services": 1,
"XXX_active": "$_id.XXX_active"
}
}
])
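To pin down the per-store result the question asks for, here is a plain-JavaScript sketch that computes the flag in memory. It is an illustration of the expected output, not a MongoDB pipeline. The store ids "a" to "d" and the function name activeFlags are invented for this example. A store gets "Y" if any of its services has the given id and is active, and "N" otherwise, including when the services array is empty or missing.

```javascript
// Hypothetical stores; only the shape follows the question's documents.
const stores = [
  { _id: "a", store_state: 1, services: [
    { id: "YYY", state: 1, active: true },
    { id: "XXX", state: 1, active: true }
  ] },
  { _id: "b", store_state: 1, services: [{ id: "XXX", state: 1, active: false }] },
  { _id: "c", store_state: 1, services: [] },
  { _id: "d", store_state: 0, services: [{ id: "XXX", state: 1, active: true }] }
];

function activeFlags(docs, serviceId) {
  return docs
    // Like the initial $match on store_state.
    .filter((s) => s.store_state === 1)
    // One output object per store, whatever the position of the service.
    .map((s) => ({
      _id: s._id,
      [serviceId + "_active"]: (s.services || []).some(
        (svc) => svc.id === serviceId && svc.active === true
      ) ? "Y" : "N"
    }));
}

const flags = activeFlags(stores, "XXX");
console.log(flags);
```

Store "a" is flagged "Y" even though "XXX" is not its first service, and store "c" is kept with "N" although it has no services at all.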
I have a collection with documents like this:
{
"_id" : "15",
"name" : "empty",
"location" : "5th Ave",
"owner" : "machine",
"visitors" : [
{
"type" : "M",
"color" : "blue",
"owner" : "Steve Cooper"
},
{
"type" : "K",
"color" : "red",
"owner" : "Luis Martinez"
},
// A lot more of these
]
}
I want to group by visitors.owner to find which owner has the most visits. I tried this:
db.mycol.aggregate(
[
{$group: {
_id: {owner: "$visitors.owner"},
visits: {$addToSet: "$visits"},
count: {$sum: "comments"}
}},
{$sort: {count: -1}},
{$limit: 1}
]
)
But I always get count = 0 and visits not corresponding to one owner :/
Please help
Try the following aggregation pipeline:
db.mycol.aggregate([
{
"$unwind": "$visitors"
},
{
"$group": {
"_id": "$visitors.owner",
"count": { "$sum": 1}
}
},
{
"$project": {
"_id": 0,
"owner": "$_id",
"visits": "$count"
}
}
]);
Using the sample document you provided in your question, the result is:
/* 0 */
{
"result" : [
{
"owner" : "Luis Martinez",
"visits" : 1
},
{
"owner" : "Steve Cooper",
"visits" : 1
}
],
"ok" : 1
}
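The pipeline above lists the visit count for every owner. To get only the owner with the most visits, the $sort on the count (descending) and $limit: 1 stages from the question's attempt would still be needed after the $group. The plain-JavaScript sketch below shows the whole idea in memory: count per owner, sort, take the first. The third visitor is an assumed extra entry so that one owner clearly wins, and mostFrequentVisitor is a hypothetical helper name.

```javascript
// Sample document from the question plus one assumed extra visitor.
const places = [
  {
    _id: "15",
    name: "empty",
    visitors: [
      { type: "M", color: "blue", owner: "Steve Cooper" },
      { type: "K", color: "red", owner: "Luis Martinez" },
      { type: "M", color: "green", owner: "Steve Cooper" } // assumed extra visit
    ]
  }
];

function mostFrequentVisitor(docs) {
  // Like $unwind + $group with $sum: 1, counting visits per owner.
  const counts = new Map();
  for (const v of docs.flatMap((d) => d.visitors || [])) {
    counts.set(v.owner, (counts.get(v.owner) || 0) + 1);
  }
  // Like $sort (descending) followed by $limit: 1.
  const ranked = Array.from(counts, ([owner, visits]) => ({ owner, visits }))
    .sort((a, b) => b.visits - a.visits);
  return ranked.length > 0 ? ranked[0] : null;
}

const topVisitor = mostFrequentVisitor(places);
console.log(topVisitor);
```

With no visitors at all the function returns null instead of throwing.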