I have the following count queries:
db.collection.count({code : {$in : ['detla', 'beta']}});
db.collection.count({code : 'alpha'});
for the given data set:
[
{
"code": "detla",
"name": "delta1"
},
{
"code": "detla",
"name": "delta2"
},
{
"code": "detla",
"name": "delta3"
},
{
"code": "detla",
"name": "delta4"
},
{
"code": "beta",
"name": "beta1"
},
{
"code": "beta",
"name": "beta2"
},
{
"code": "beta",
"name": "beta3"
},
{
"code": "beta",
"name": "beta4"
},
{
"code": "beta",
"name": "beta5"
},
{
"code": "alpha",
"name": "alpha1"
},
{
"code": "alpha",
"name": "alpha2"
},
{
"code": "alpha",
"name": "alpha3"
}
]
Is there any way to achieve this using a single aggregation with nested $group in MongoDB 3.2?
I understand that this is a classic use case for mapReduce(), but I am out of options here: the $group-by parameter (code in the above example) is a dynamically generated attribute, so the aggregation stages are built dynamically as well.
Expected output is same as the result of the two count queries.
-> First count query (delta, beta combined count) -> 9
-> Second count query (alpha count) -> 3
You can use the $group aggregation below:
db.collection.aggregate([
{ "$group": {
"_id": null,
"first": {
"$sum": {
"$cond": [{ "$in": ["$code", ["detla", "beta"]] }, 1, 0]
}
},
"second": {
"$sum": {
"$cond": [{ "$eq": ["alpha", "$code"] }, 1, 0]
}
}
}}
])
However, I would prefer to go with the two count queries for better performance.
Note that the $in aggregation operator was only introduced in version 3.4. For earlier versions, such as 3.2, you can use $setIsSubset instead:
db.collection.aggregate([
{ "$group": {
"_id": null,
"first": {
"$sum": {
"$cond": [{ "$setIsSubset": [["$code"], ["detla", "beta"]] }, 1, 0]
}
},
"second": {
"$sum": {
"$cond": [{ "$eq": ["alpha", "$code"] }, 1, 0]
}
}
}}
])
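Since the question mentions that the grouping values are generated dynamically, the $group stage could also be assembled in code. The following is only a minimal sketch in plain JavaScript: the helper name `buildCountStage` and the shape of its `buckets` argument (output field name mapped to a list of codes) are assumptions made for illustration, not part of MongoDB. It uses $setIsSubset so that it would also suit 3.2.

```javascript
// Hypothetical helper (not a MongoDB API): builds one $group stage that
// counts documents per bucket, where each bucket is a list of "code" values.
function buildCountStage(buckets) {
  const group = { _id: null };
  for (const [field, codes] of Object.entries(buckets)) {
    group[field] = {
      $sum: {
        // ["$code"] is a one-element set; it is a subset of `codes`
        // exactly when the document's code is one of the listed values.
        $cond: [{ $setIsSubset: [["$code"], codes] }, 1, 0]
      }
    };
  }
  return { $group: group };
}

// Bucket names "first" and "second" mirror the answer's field names.
const stage = buildCountStage({
  first: ["detla", "beta"],
  second: ["alpha"]
});

// In the mongo shell this would be run as:
// db.collection.aggregate([stage])
```

Adding another count then only means adding another entry to the object passed to the helper.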
I am working on making a query which can sort the result after grouping keys in MongoDB.
The following is example data in the DB:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"code": "code",
"groupId": "L0LV7ENT",
"version": {
"id": "1.0.0.0"
},
"status": "Done",
"type": "main"
},
{
"_id": ObjectId("5a934e000102030405000001"),
"code": "code",
"groupId": "L0LV7ENT",
"version": {
"id": "2.0.0.0"
},
"status": "Done",
"type": "main"
},
{
"_id": ObjectId("5a934e000102030405000002"),
"code": "code",
"groupId": "F6WJ9QP7",
"version": {
"id": "1.1.0.0"
},
"status": "Done",
"type": "main"
}
]
Here, I would like to sort the result in ascending order according to the version.id and to group the result according to the groupId.
Hence, I used the following query
db.collection.aggregate([
{
"$match": {
"$and": [
{
"type": "main",
"code": {
"$in": [
"code"
]
},
"status": {
"$in": [
"Done",
"Completed"
]
},
"groupId": {
"$in": [
"L0LV7ENT",
"F6WJ9QP7"
]
}
}
]
}
},
{
"$sort": {
"_id": 1,
"version.id": 1
}
},
{
"$group": {
"_id": {
"groupId": "$groupId"
},
"services": {
"$push": "$$ROOT"
}
}
}
])
But the result I am getting is not stable. Sometimes the document with "_id": ObjectId("5a934e000102030405000002") comes first, followed by ObjectId("5a934e000102030405000000") and ObjectId("5a934e000102030405000001").
It seems intermittent. Is there any way to get a stable result?
EDIT
You can try it here
From the documentation:
$group does not order its output documents.
So you will need to sort after the group stage to have a deterministic output order.
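Applied to the pipeline in the question, that could look like the sketch below. It assumes the groups should come back ordered by groupId (the question does not say which group order is wanted), and it sorts on version.id alone before grouping, because a leading sort key on the unique _id would leave the version.id key with nothing to decide. The variable name `pipeline` is only for illustration.

```javascript
const pipeline = [
  {
    $match: {
      type: "main",
      code: { $in: ["code"] },
      status: { $in: ["Done", "Completed"] },
      groupId: { $in: ["L0LV7ENT", "F6WJ9QP7"] }
    }
  },
  // Order documents by version before they are pushed into each group.
  { $sort: { "version.id": 1 } },
  {
    $group: {
      _id: { groupId: "$groupId" },
      services: { $push: "$$ROOT" }
    }
  },
  // $group does not order its output, so sort the groups explicitly.
  { $sort: { "_id.groupId": 1 } }
];

// In the mongo shell this would be run as:
// db.collection.aggregate(pipeline)
```

Note that version.id is a string in the sample data, so it is compared as a string rather than as a version number.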
The documents look like this:
{
"sId": "s1",
"language": "hindi",
"service": "editing",
"count": 5,
},
{
"sId": "s2",
"language": "hindi",
"service": "editing",
"count": 6,
},
{
"sId": "s2",
"language": "hindi",
"service": "reading",
"count": 6,
},
{
"sId": "s3",
"language": "english",
"service": "reading",
"count": 10,
}
I want the result to look like this:
{
"language":"hindi",
"count": 11
},
{
"language":"english",
"count": 10
}
I tried an aggregate query like this:
{
"$group": {
"_id": {
"lang": "$language",
"sId": "$sId"
},
"count": {"$sum": "$count"}
}
}
For sId s2, it should ignore the second object.
Can anyone please give me a hint on how I can achieve the above?
You can use $first to get the first element of each group. You can then use $group again to sum by language.
{
"$group": {
"_id": {
"language": "$language",
"sId": "$sId"
},
"count": {
"$first": {
"$sum": "$count"
}
}
}
}
https://mongoplayground.net/p/3_RjSt1wtRS
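The snippet above shows only the first of the two stages the answer describes. A sketch of the complete pipeline might look like the following; the variable name `pipeline` and the final $project are assumptions added for illustration, and $first is given "$count" directly, which should behave the same here. Because no $sort precedes the first $group, which document counts as "first" for a given sId depends on the order in which documents reach that stage.

```javascript
const pipeline = [
  // Stage 1: keep a single count per (language, sId) pair.
  {
    $group: {
      _id: { language: "$language", sId: "$sId" },
      count: { $first: "$count" }
    }
  },
  // Stage 2: add up the kept counts per language.
  {
    $group: {
      _id: "$_id.language",
      count: { $sum: "$count" }
    }
  },
  // Reshape to the { language, count } form asked for in the question.
  { $project: { _id: 0, language: "$_id", count: 1 } }
];

// In the mongo shell this would be run as:
// db.collection.aggregate(pipeline)
```

For the sample documents, hindi keeps 5 (s1) and 6 (s2), which sums to 11, and english keeps 10 (s3).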
I would like to get the unique elements across all arrays in a collection. Consider the following collection:
[
{
"collection": "collection",
"myArray": [
{
"name": "ABC",
"code": "AB"
},
{
"name": "DEF",
"code": "DE"
}
]
},
{
"collection": "collection",
"myArray": [
{
"name": "GHI",
"code": "GH"
},
{
"name": "DEF",
"code": "DE"
}
]
}
]
I can achieve this by using $unwind and $group like this:
db.collection.aggregate([
{
$unwind: "$myArray"
},
{
$group: {
_id: null,
data: {
$addToSet: "$myArray"
}
}
}
])
And get the output:
[
{
"_id": null,
"data": [
{
"code": "GH",
"name": "GHI"
},
{
"code": "DE",
"name": "DEF"
},
{
"code": "AB",
"name": "ABC"
}
]
}
]
However, the array "myArray" will have several elements (about 6), and the number of documents passed into this stage of the pipeline will be about 600, so unwinding the array would mean about 3600 documents being processed. I would like to know if there's a way to achieve the same result without unwinding.
You can use the aggregation below:
db.collection.aggregate([
{ "$group": {
"_id": null,
"data": { "$push": "$myArray" }
}},
{ "$project": {
"data": {
"$reduce": {
"input": "$data",
"initialValue": [],
"in": { "$setUnion": ["$$this", "$$value"] }
}
}
}}
])
Output
[
{
"_id": null,
"data": [
{
"code": "AB",
"name": "ABC"
},
{
"code": "DE",
"name": "DEF"
},
{
"code": "GH",
"name": "GHI"
}
]
}
]
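To make the $reduce step easier to follow, here is a plain-JavaScript illustration of what it computes. This is not MongoDB code: the `setUnion` function is a hypothetical stand-in for $setUnion that treats two elements as equal when their JSON text matches, which assumes the fields always appear in the same order. The element order of the real $setUnion result is unspecified, so the order produced here carries no meaning.

```javascript
// Stand-in for $setUnion: merges two arrays and drops duplicates.
function setUnion(a, b) {
  const seen = new Map();
  for (const item of [...a, ...b]) {
    seen.set(JSON.stringify(item), item); // same key means same element
  }
  return [...seen.values()];
}

// What the $group stage with $push produces: an array of arrays.
const pushed = [
  [{ name: "ABC", code: "AB" }, { name: "DEF", code: "DE" }],
  [{ name: "GHI", code: "GH" }, { name: "DEF", code: "DE" }]
];

// Equivalent of $reduce with initialValue [] and
// in: { $setUnion: ["$$this", "$$value"] }.
const data = pushed.reduce((value, current) => setUnion(current, value), []);

console.log(data.length); // number of unique elements
```

Each step folds one document's array into the running set, so no per-element documents are created the way $unwind would create them.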
I want to fetch data from two independent collections and sort the combined results by date in a single query. Is that even possible in MongoDB? I have these collections:
OrderType1
{
"id": "1",
"name": "Hello1",
"date": "2016-09-23T15:07:38.000Z"
},
{
"id": "2",
"name": "Hello1",
"date": "2015-09-23T15:07:38.000Z"
}
OrderType2
{
"id": "3",
"name": "Hello3",
"date": "2012-09-23T15:07:38.000Z"
},
{
"id": "4",
"name": "Hello4",
"date": "2018-09-23T15:07:38.000Z"
}
Expected Result
[
{
"id": "3",
"name": "Hello3",
"date": "2012-09-23T15:07:38.000Z"
},
{
"id": "2",
"name": "Hello1",
"date": "2015-09-23T15:07:38.000Z"
},
{
"id": "1",
"name": "Hello1",
"date": "2016-09-23T15:07:38.000Z"
},
{
"id": "4",
"name": "Hello4",
"date": "2018-09-23T15:07:38.000Z"
}
]
Now, I want to fetch both types of orders in a single query sorted by date.
You can try the aggregation below with MongoDB 3.6 and above, but I think you should use two queries, because for a large data set the $lookup pipeline will exceed the 16 MB BSON document size limit. It also depends on your $match condition or $limit: if they are applied inside the $lookup pipeline and keep the result small enough, the aggregation will work.
db.OrderType1.aggregate([
{ "$limit": 1 },
{ "$facet": {
"collection1": [
{ "$limit": 1 },
{ "$lookup": {
"from": "OrderType1",
"pipeline": [{ "$match": { } }],
"as": "collection1"
}}
],
"collection2": [
{ "$limit": 1 },
{ "$lookup": {
"from": "OrderType2",
"pipeline": [{ "$match": { } }],
"as": "collection2"
}}
]
}},
{ "$project": {
"data": {
"$concatArrays": [
{ "$arrayElemAt": ["$collection1.collection1", 0] },
{ "$arrayElemAt": ["$collection2.collection2", 0] },
]
}
}},
{ "$unwind": "$data" },
{ "$replaceRoot": { "newRoot": "$data" } },
{ "$sort": { "date": 1 } }
])
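On MongoDB 4.4 or later, which is newer than the version this answer targets, the $unionWith stage offers a shorter route to the same result. The sketch below swaps in that different technique in place of $facet and $lookup; the variable name `pipeline` is only for illustration. Because the documents are streamed rather than collected into one array, the 16 MB concern about a single combined document should not arise in the same way.

```javascript
const pipeline = [
  // Append every document from OrderType2 to the OrderType1 stream.
  { $unionWith: { coll: "OrderType2" } },
  // Sort the combined stream by date, oldest first.
  { $sort: { date: 1 } }
];

// In the mongo shell this would be run against the first collection:
// db.OrderType1.aggregate(pipeline)
```

The dates in the sample data are ISO 8601 strings in a single format, so sorting them as strings also orders them chronologically.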
I've got a collection that looks like:
[{
"org": "A",
"type": "simple",
"payFor": 3,
"price": 100
},
{
"org": "A",
"type": "custom",
"payFor": 2,
"price": 115
},
{
"org": "B",
"type": "simple",
"payFor": 1,
"price": 110
},
{
"org": "B",
"type": "custom",
"payFor": 2,
"price": 200
},
{
"org": "B",
"type": "custom",
"payFor": 4,
"price": 220
}]
I need a query that groups by "org", where the payment for each "type" sums only the first "payFor" prices.
I'm trying to use the result of a $slice expression inside $add, but this does not work.
Pipeline:
[{
"$group": {
"_id": {
"org": "$org",
"type": "$type"
},
"payFor": {
"$max": "$payFor"
},
"count": {
"$sum": 1
},
"prices": {
"$push": "$price"
}
}
},
{
"$group": {
"_id": "$_id.org",
"payments": {
"$push": {
"type": "$_id.type",
"forFirst": "$payFor",
"sum": {
"$cond": [
{
"$gte": [
"$payFor",
"$count"
]
},
{
"$add": {
"$prices": {
"$slice": "$count"
}
}
},
{
"$add": "$prices"
}
]
}
}
}
}
}]
I know that it is possible to traverse the unwound prices and pick only the first "payFor" of them, but the real collections are richer than the example above, and this operation would add unnecessary overhead.
I would appreciate some advice from the community. Thanks.
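One possible approach, offered as a sketch and not a tested answer: $add takes a list of numbers and not an array field, whereas in a $project stage $sum can total the elements of an array expression, and $slice takes its arguments as [array, n]. The sketch assumes the goal is to sum the first "payFor" prices per org and type, that payFor is a positive integer, and that the order in which prices are pushed is acceptable. The names `pipeline` and `sumFirst` are hypothetical; `sumFirst` only mirrors the $sum and $slice expression in plain JavaScript so the arithmetic can be checked.

```javascript
// Plain-JavaScript mirror of { $sum: { $slice: ["$prices", "$payFor"] } }.
function sumFirst(prices, n) {
  return prices.slice(0, n).reduce((total, price) => total + price, 0);
}

const pipeline = [
  // Collect prices and the largest payFor per (org, type).
  {
    $group: {
      _id: { org: "$org", type: "$type" },
      payFor: { $max: "$payFor" },
      prices: { $push: "$price" }
    }
  },
  // Sum only the first payFor prices. $slice returns the whole array
  // when payFor exceeds its length, so no $cond is needed.
  {
    $project: {
      payFor: 1,
      sum: { $sum: { $slice: ["$prices", "$payFor"] } }
    }
  },
  // Regroup by org, one payment entry per type.
  {
    $group: {
      _id: "$_id.org",
      payments: {
        $push: { type: "$_id.type", forFirst: "$payFor", sum: "$sum" }
      }
    }
  }
];

// In the mongo shell this would be run as:
// db.collection.aggregate(pipeline)
```

With the sample data, org B's custom type has prices 200 and 220 and a maximum payFor of 4, so both prices are included in the sum.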