My collection's data looks something like this:
[
{
ANumberAreaCode: "+98",
BNumberAreaCode: "+1",
AccountingTime: 1629754886,
Length: 123
},
{
ANumberAreaCode: "+44",
BNumberAreaCode: "+98",
AccountingTime: 1629754786,
Length: 123
},
{
ANumberAreaCode: "+98",
BNumberAreaCode: "+96",
AccountingTime: 1629754886,
Length: 998
}
]
I want to group by country code and count the results, summing the occurrences of each code across both ANumberAreaCode and BNumberAreaCode.
This is my group sample:
{ "$group": {
"_id": {
"ANumberAreaCode": "$ANumberAreaCode",
},
"count": { "$sum": 1 }
}},
{ "$group": {
"_id": {
"BNumberAreaCode": "$BNumberAreaCode",
},
"count": { "$sum": 1 }
}},
Now, how can I sum the counts from the two queries above for the country codes they have in common?
I'm looking for a query that gives me this result:
+98 : 3
+44 : 1
+1 : 1
+96 : 1
You can use this aggregation pipeline:
$facet runs both groupings, by A and by B. This creates two arrays: groupA and groupB.
A $project stage with $concatArrays then concatenates the two outputs.
$unwind deconstructs the combined array.
A final $group by code uses $sum to get the total.
db.collection.aggregate([
{
"$facet": {
"groupA": [
{
"$group": {
"_id": "$ANumberAreaCode",
"total": {
"$sum": 1
}
}
}
],
"groupB": [
{
"$group": {
"_id": "$BNumberAreaCode",
"total": {
"$sum": 1
}
}
}
]
}
},
{
"$project": {
"result": {
"$concatArrays": [
"$groupA",
"$groupB"
]
}
}
},
{
"$unwind": "$result"
},
{
"$group": {
"_id": "$result._id",
"total": {
"$sum": "$result.total"
}
}
}
])
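To sanity-check what the pipeline computes, the same steps can be emulated in plain JavaScript on the three sample documents from the question. This is only a sketch that runs outside MongoDB; the names docs, countBy, and totals are made up for illustration.

```javascript
// Sample documents from the question (only the fields that matter here).
const docs = [
  { ANumberAreaCode: "+98", BNumberAreaCode: "+1" },
  { ANumberAreaCode: "+44", BNumberAreaCode: "+98" },
  { ANumberAreaCode: "+98", BNumberAreaCode: "+96" },
];

// Emulates one $group stage: [{ _id: <code>, total: <count> }, ...].
function countBy(items, field) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item[field], (counts.get(item[field]) || 0) + 1);
  }
  return [...counts].map(([_id, total]) => ({ _id, total }));
}

// $facet: two independent groupings over the same input.
const groupA = countBy(docs, "ANumberAreaCode");
const groupB = countBy(docs, "BNumberAreaCode");

// $concatArrays + $unwind + final $group: merge both lists and sum per code.
const totals = {};
for (const { _id, total } of [...groupA, ...groupB]) {
  totals[_id] = (totals[_id] || 0) + total;
}

console.log(totals);
```

For the sample data the totals match the result asked for in the question (+98 three times, the other codes once each).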
In my MongoDB, I would like to group my data by a number (machine_quality) and then compare this number with the maximum value of ALL machine_quality values, not just the maximum within each group.
My non-working query:
db.records.aggregate([
{
'$group': {
'_id': '$machine_quality',
'total': {'$sum': 1}
}
},
{
'$match': {
'_id': {
'$gte': {
'$subtract': [{'$max': '$_id'}, 3]
}
}
}
}
])
Question:
The {'$max': '$_id'} part of the query only refers to each group separately, and therefore will always be equal to that group's _id. However, I would like to compare against the maximum _id across ALL groups. Is there a convenient way to do that?
Any thoughts appreciated.
One way to do this is to use $facet, which lets you run two "parallel-looking" groups in one pipeline. (The second group is your own grouping; the group by null finds the global max.)
Query (after the facet, you can unwind your groups):
db.collection.aggregate([
{
"$facet": {
"global_max": [
{
"$group": {
"_id": null,
"m": {
"$max": "$machine_quality"
}
}
},
{
"$project": {
"_id": 0
}
}
],
"groups": [
{
"$group": {
"_id": "$machine_quality",
"names": {
"$push": "$name"
}
}
},
{
"$addFields": {
"machine_quality": "$_id"
}
},
{
"$project": {
"_id": 0
}
}
]
}
},
{
"$project": {
"global_max": {
"$let": {
"vars": {
"v": {
"$arrayElemAt": [
"$global_max",
0
]
}
},
"in": "$$v.m"
}
},
"groups": 1
}
}
])
This is subject to the $facet limitation: the stage's output document cannot exceed the 16MB document size limit.
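The pipeline above stops once the global maximum and the groups sit side by side. As a rough illustration of the remaining comparison from the question (keep the groups whose machine_quality is within 3 of the global maximum), here is the same logic in plain JavaScript. The records are hypothetical, and the code only emulates the stages; it does not run against MongoDB.

```javascript
// Hypothetical records; field names follow the answer's pipeline.
const records = [
  { name: "a", machine_quality: 10 },
  { name: "b", machine_quality: 10 },
  { name: "c", machine_quality: 8 },
  { name: "d", machine_quality: 5 },
  { name: "e", machine_quality: 7 },
];

// "global_max" facet: $group with _id null and $max.
const globalMax = Math.max(...records.map((r) => r.machine_quality));

// "groups" facet: $group by machine_quality, pushing the names.
const byQuality = new Map();
for (const r of records) {
  if (!byQuality.has(r.machine_quality)) byQuality.set(r.machine_quality, []);
  byQuality.get(r.machine_quality).push(r.name);
}
const groups = [...byQuality].map(([machine_quality, names]) => ({
  machine_quality,
  names,
}));

// The comparison the question asked for: within 3 of the global maximum.
const kept = groups.filter((g) => g.machine_quality >= globalMax - 3);

console.log(globalMax, kept);
```

With this made-up data the group with machine_quality 5 is dropped, while 10, 8, and 7 are kept.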
My collection is structured like this:
{
"_id": 1,
"Trips": [
{
"EndID": 5,
"Tripcount": 12
},
{
"EndID": 6,
"Tripcount": 19
}
]
},
{
"_id": 2,
"Trips": [
{
"EndID": 4,
"Tripcount": 12
},
{
"EndID": 5,
"Tripcount": 19
}
]
}, ...
As can be seen, every document has a Trips array. What I want to find is the top N Tripcounts of all the Trips arrays combined across the documents in the collection. Is that possible?
I already have the following, however this only takes the single greatest Tripcount from each Trips array and then outputs 50 of them. So actually having the top 2 trips in one Trips array results in this query dropping the second one:
var group = db.eplat1.aggregate([
{ "$unwind": "$Trips"},
{ "$sort": {
"Trips.Tripcount": -1
}
},
{ "$limit": 50 },
{ "$group": {
"_id": 1,
"Trips": {
"$push": {
"Start": "$_id",
"Trips": "$Trips"
}
}
}}
], {allowDiskUse: true})
Note that I believe this problem is different from this one, as only one document is given there.
Basically you need to sort the array elements ($unwind/$sort/$group), and then you can $sort by the top value and $limit the results.
Finally, $slice the array in each document to keep its "top N" elements.
db.eplat1.aggregate([
{ "$unwind": "$Trips" },
{ "$sort": { "_id": 1, "Trips.Tripcount": -1 } },
{ "$group": {
"_id": "$_id",
"Trips": { "$push": "$Trips" },
"maxTrip": { "$max": "$Trips.Tripcount" }
}},
{ "$sort": { "maxTrip": -1 } },
{ "$limit": 50 },
{ "$addFields": { "Trips": { "$slice": [ "$Trips", 0 , 2 ] } } }
])
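If the goal is the top N trips across the whole collection, regardless of which document they come from, the logic boils down to flatten, sort, truncate. The plain JavaScript sketch below emulates that on the two sample documents from the question; topTrips and the Start field name are illustrative choices, not part of any MongoDB API.

```javascript
// The two sample documents from the question.
const docs = [
  { _id: 1, Trips: [{ EndID: 5, Tripcount: 12 }, { EndID: 6, Tripcount: 19 }] },
  { _id: 2, Trips: [{ EndID: 4, Tripcount: 12 }, { EndID: 5, Tripcount: 19 }] },
];

function topTrips(documents, n) {
  return documents
    // $unwind: one entry per trip, remembering the parent document.
    .flatMap((doc) => doc.Trips.map((trip) => ({ Start: doc._id, ...trip })))
    // $sort by Tripcount, descending.
    .sort((a, b) => b.Tripcount - a.Tripcount)
    // $limit.
    .slice(0, n);
}

const top3 = topTrips(docs, 3);
console.log(top3);
```

Note that two of the three entries returned here come from the same document, which is the behaviour the question asks for.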
I have documents like:
{
"platform":"android",
"install_date":20151029
}
platform - can take one value from [android|ios|kindle|facebook].
install_date - there are many install dates.
There are also many other fields.
Aim: I am calculating installs per platform on a particular date.
So I am using a group by in the aggregation framework to count by platform. The resulting document should look like:
{
"install_date":20151029,
"platform" : {
"android":1000,
"ios": 2000,
"facebook":1500
}
}
This is what I have so far:
db.collection.aggregate([
{
$group: {
_id: { platform: "$platform",install_date:"$install_date"},
count: { "$sum": 1 }
}
},
{
$group: {
_id: { install_date:"$_id.install_date"},
platform: { $push : {platform :"$_id.platform", count:"$count" } }
}
},
{
$project : { _id: 0, install_date: "$_id.install_date", platform: 1 }
}
])
which gives a document like:
{
"platform": [
{
"platform": "facebook",
"count": 1500
},
{
"platform": "ios",
"count": 2000
},
{
"platform": "android",
"count": 1000
}
],
"install_date": 20151027
}
Problem:
Projecting the array into a single object as "platform".
With MongoDB 3.4.4 and newer, you can use the $arrayToObject operator to get the desired result. You would need to run the following aggregate pipeline:
db.collection.aggregate([
{ "$group": {
"_id": {
"date": "$install_date",
"platform": { "$toLower": "$platform" }
},
"count": { "$sum": 1 }
} },
{ "$group": {
"_id": "$_id.date",
"counts": {
"$push": {
"k": "$_id.platform",
"v": "$count"
}
}
} },
{ "$addFields": {
"install_date": "$_id",
"platform": { "$arrayToObject": "$counts" }
} },
{ "$project": { "counts": 0, "_id": 0 } }
])
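The key step is the reshaping done by $arrayToObject: an array of { k, v } pairs becomes one object. A rough plain JavaScript equivalent, using hypothetical counts for a single install_date, looks like this (it illustrates the transformation only and does not call MongoDB):

```javascript
// Hypothetical "counts" array as produced by the second $group stage.
const counts = [
  { k: "android", v: 1000 },
  { k: "ios", v: 2000 },
  { k: "facebook", v: 1500 },
];

// $arrayToObject: [{ k, v }, ...] -> { k1: v1, k2: v2, ... }
const platform = Object.fromEntries(counts.map(({ k, v }) => [k, v]));

console.log({ install_date: 20151029, platform });
```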
For older versions, take advantage of the $cond operator in the $group pipeline step to evaluate the counts based on the platform field value, something like the following:
db.collection.aggregate([
{ "$group": {
"_id": "$install_date",
"android_count": {
"$sum": {
"$cond": [ { "$eq": [ "$platform", "android" ] }, 1, 0 ]
}
},
"ios_count": {
"$sum": {
"$cond": [ { "$eq": [ "$platform", "ios" ] }, 1, 0 ]
}
},
"facebook_count": {
"$sum": {
"$cond": [ { "$eq": [ "$platform", "facebook" ] }, 1, 0 ]
}
},
"kindle_count": {
"$sum": {
"$cond": [ { "$eq": [ "$platform", "kindle" ] }, 1, 0 ]
}
}
} },
{ "$project": {
"_id": 0, "install_date": "$_id",
"platform": {
"android": "$android_count",
"ios": "$ios_count",
"facebook": "$facebook_count",
"kindle": "$kindle_count"
}
} }
])
In the above, $cond takes a logical condition as its first argument (if) and returns the second argument when the condition is true (then) or the third argument when it is false (else). This turns true/false into 1 and 0 respectively to feed to $sum.
So, for example, if { "$eq": [ "$platform", "facebook" ] } is true, the document contributes 1 to the $sum; otherwise it contributes 0.
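The conditional count can be pictured as an ordinary reduce with a ternary. The sketch below uses a handful of hypothetical documents for one install date; countPlatform is an illustrative helper, not a MongoDB function.

```javascript
// Hypothetical documents for a single install_date.
const docs = [
  { platform: "android", install_date: 20151029 },
  { platform: "ios", install_date: 20151029 },
  { platform: "ios", install_date: 20151029 },
  { platform: "facebook", install_date: 20151029 },
];

// Mirrors { "$sum": { "$cond": [ { "$eq": [ "$platform", name ] }, 1, 0 ] } }:
// each document adds 1 when it matches and 0 when it does not.
const countPlatform = (name) =>
  docs.reduce((sum, doc) => sum + (doc.platform === name ? 1 : 0), 0);

const platform = {
  android: countPlatform("android"),
  ios: countPlatform("ios"),
  facebook: countPlatform("facebook"),
  kindle: countPlatform("kindle"),
};

console.log(platform);
```

A platform with no matching documents (kindle in this made-up data) still appears in the output with a count of 0, which is one practical difference from the $arrayToObject approach.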
[
{ "$match": {
"created":{
"$gte": ISODate("2015-07-19T07:26:49.045Z")
},
"created":{
"$lte": ISODate("2015-07-20T07:37:56.045Z")
}
}},
{ "$group:{
"_id":{
"ln":"$l.ln",
"cid":"$cid"
},
"appCount":{ "$sum": 1 }
}},
{ "$group": {
"_id": { "ln":"$_id.ln" },
"cusappCount": { "$sum": 1 }
}},
{ "$sort":{ "_id.ln":1 } }
]
In the MongoDB query above I am not able to display appCount in the result; I am only able to display cusappCount. Could anyone please help me with this?
The $match is wrong to start with and does not do what you think. It is only selecting the "second" statement:
"created":{
"$lte": ISODate("2015-07-20T07:37:56.045Z")
}
So your selections are incorrect to start with.
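The reason is ordinary JavaScript object behaviour: when an object literal repeats a key, the last value silently replaces the earlier one. A small sketch of this, with new Date(...) standing in for the shell's ISODate(...) so it can run outside the mongo shell:

```javascript
// Same shape as the $match document in the question.
const match = {
  created: { $gte: new Date("2015-07-19T07:26:49.045Z") },
  created: { $lte: new Date("2015-07-20T07:37:56.045Z") }, // overwrites the line above
};

// Only the "$lte" condition survives.
console.log(Object.keys(match.created));

// Putting both operators under one key keeps both conditions.
const fixedMatch = {
  created: {
    $gte: new Date("2015-07-19T07:26:49.045Z"),
    $lte: new Date("2015-07-20T07:37:56.045Z"),
  },
};
console.log(Object.keys(fixedMatch.created));
```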
That and other corrections below:
[
{ "$match": {
"created": {
"$gte": ISODate("2015-07-19T07:26:49.045Z"),
"$lte": ISODate("2015-07-20T07:37:56.045Z")
}
}},
{ "$group":{
"_id": {
"ln":"$l.ln",
"cid":"$cid"
},
"appCount":{ "$sum": 1 }
}},
{ "$group": {
"_id": "$_id.ln",
"cusappCount": { "$sum": "$appCount" },
"distinctCustCount": { "$sum": 1 }
}},
{ "$sort":{ "_id": 1 } }
]
Which seems to be what you are trying to do.
So your earlier "count" is then passed to $sum when grouping at a "broader" level. The "second" count is just for the "distinct" items in the earlier key.
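To make the difference between the two counts concrete, here is the two-level grouping emulated in plain JavaScript on a few hypothetical documents (the values for l.ln and cid are invented; nothing here runs against MongoDB):

```javascript
// Hypothetical documents with the fields used by the pipeline.
const docs = [
  { l: { ln: "en" }, cid: 1 },
  { l: { ln: "en" }, cid: 1 },
  { l: { ln: "en" }, cid: 2 },
  { l: { ln: "fr" }, cid: 3 },
];

// First $group: one entry per (ln, cid) pair with its appCount.
const perPair = new Map();
for (const d of docs) {
  const key = JSON.stringify([d.l.ln, d.cid]);
  perPair.set(key, (perPair.get(key) || 0) + 1);
}

// Second $group: per ln, sum the appCounts and count the distinct pairs.
const perLn = {};
for (const [key, appCount] of perPair) {
  const [ln] = JSON.parse(key);
  perLn[ln] = perLn[ln] || { cusappCount: 0, distinctCustCount: 0 };
  perLn[ln].cusappCount += appCount; // { "$sum": "$appCount" }
  perLn[ln].distinctCustCount += 1; // { "$sum": 1 }
}

console.log(perLn);
```

For "en" this gives a cusappCount of 3 (all documents) but a distinctCustCount of 2 (two different cid values).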
If you are trying to "retain" the values of "appCount", then the problem is that your "grouping" is "taking away" the level of detail at which they appear. So, for what it is worth, this is where you use "arrays" in an output structure:
[
{ "$match": {
"created": {
"$gte": ISODate("2015-07-19T07:26:49.045Z"),
"$lte": ISODate("2015-07-20T07:37:56.045Z")
}
}},
{ "$group":{
"_id": {
"ln":"$l.ln",
"cid":"$cid"
},
"appCount":{ "$sum": 1 }
}},
{ "$group": {
"_id": "$_id.ln",
"cusappCount": { "$sum": 1 },
"custs": { "$push": {
"cid": "$_id.cid", "appCount": "$appCount"
}}
}},
{ "$sort":{ "_id": 1 } }
]
The collection below, named "coll", is stored in MongoDB.
{
{"_id":1, "set":[1,2,3,4,5]},
{"_id":2, "set":[0,2,6,4,5]},
{"_id":3, "set":[1,2,5,10,22]}
}
How can I find the intersection of the set elements in the documents with _id 1 and 3?
Use the aggregation framework to get the desired result. The aggregation set operator that would do the magic is $setIntersection.
The following aggregation pipeline achieves what you are after:
db.test.aggregate([
{
"$match": {
"_id": { "$in": [1, 3] }
}
},
{
"$group": {
"_id": 0,
"set1": { "$first": "$set" },
"set2": { "$last": "$set" }
}
},
{
"$project": {
"set1": 1,
"set2": 1,
"commonToBoth": { "$setIntersection": [ "$set1", "$set2" ] },
"_id": 0
}
}
])
Output:
/* 0 */
{
"result" : [
{
"set1" : [1,2,3,4,5],
"set2" : [1,2,5,10,22],
"commonToBoth" : [1,2,5]
}
],
"ok" : 1
}
UPDATE
For three or more documents, you'd need the $reduce operator to fold the arrays into a single intersection. This lets you intersect any number of arrays, rather than just the two arrays from docs 1 and 3; widen the $match to select the documents you want.
Consider running the following aggregate operation:
db.test.aggregate([
{ "$match": { "_id": { "$in": [1, 3] } } },
{
"$group": {
"_id": 0,
"sets": { "$push": "$set" },
"initialSet": { "$first": "$set" }
}
},
{
"$project": {
"commonSets": {
"$reduce": {
"input": "$sets",
"initialValue": "$initialSet",
"in": { "$setIntersection": ["$$value", "$$this"] }
}
}
}
}
])
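The fold performed by $reduce with $setIntersection can be mirrored with Array.prototype.reduce. The sketch below uses the three sets from the question; intersect and common are illustrative helpers, and unlike $setIntersection this version keeps the element order of the first array and does not deduplicate.

```javascript
// The "set" arrays from the three documents in the question.
const sets = [
  [1, 2, 3, 4, 5],
  [0, 2, 6, 4, 5],
  [1, 2, 5, 10, 22],
];

// Elements of a that also occur in b.
const intersect = (a, b) => a.filter((x) => b.includes(x));

// Start from the first array ("initialValue") and intersect with each array in turn.
const common = (arrays) => arrays.reduce((acc, cur) => intersect(acc, cur), arrays[0]);

const docs1and3 = common([sets[0], sets[2]]);
const allThree = common(sets);

console.log(docs1and3, allThree);
```

For documents 1 and 3 this yields the same commonToBoth values as the first pipeline; across all three documents only 2 and 5 remain.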