I have this collection (a single document never contains more than one of the fields A, B and C):
{_id:1,A:a}
{_id:2,B:b}
{_id:3,C:a}
{_id:4,A:a}
{_id:5,A:b}
{_id:6,A:c}
{_id:7,C:a}
How can I get this result?
a:4
b:2
c:1
You can get this result with the MongoDB aggregation framework.
First put all the values into a single array field, then $unwind that array, drop the nulls left by missing fields, and perform a $group on that field:
db.collection.aggregate([{
"$project": {
"v": ["$A", "$B", "$C"]
}
}, {
"$unwind": "$v"
}, {
"$match": {
"v": {
"$ne": null
}
}
}, {
"$group": {
"_id": "$v",
"count": {
"$sum": 1
}
}
}])
result:
[
{
"_id": "c",
"count": 1
},
{
"_id": "b",
"count": 2
},
{
"_id": "a",
"count": 4
}
]
you can try it here: mongoplayground.net/p/rGHUPWsw2ee
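To check the logic without a running server, here is a minimal Python sketch that mimics the four stages on the sample documents. It is a plain-Python emulation, not the driver API: the helper name count_values is hypothetical, and the values a, b, c are assumed to be strings.

```python
from collections import Counter

# Sample documents from the question; the letters are assumed to be strings.
docs = [
    {"_id": 1, "A": "a"},
    {"_id": 2, "B": "b"},
    {"_id": 3, "C": "a"},
    {"_id": 4, "A": "a"},
    {"_id": 5, "A": "b"},
    {"_id": 6, "A": "c"},
    {"_id": 7, "C": "a"},
]


def count_values(documents, fields=("A", "B", "C")):
    """Emulate $project -> $unwind -> $match -> $group in plain Python."""
    counts = Counter()
    for doc in documents:
        # $project: collect the fields into one array; missing fields become None
        values = [doc.get(field) for field in fields]
        # $unwind + $match: visit each element and skip the None placeholders
        for value in values:
            if value is not None:
                # $group with {"$sum": 1}
                counts[value] += 1
    return dict(counts)


print(count_values(docs))
```

The None check plays the role of the $match stage: without it, every missing field would be counted as its own null group.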
I have a user collection:
[
{"_id": 1, "name": "John", "age": 25, "valid_user": true},
{"_id": 2, "name": "Bob", "age": 40, "valid_user": false},
{"_id": 3, "name": "Jacob", "age": 27, "valid_user": null},
{"_id": 4, "name": "Amelia", "age": 29, "valid_user": true}
]
I run a '$facet' stage on this collection. Check out this MongoPlayground.
I want to talk about the first output from the facet stage. The following is the response currently:
{
"user_by_valid_status": [
{
"_id": false,
"count": 1
},
{
"_id": true,
"count": 2
},
{
"_id": null,
"count": 1
}
]
}
However, I want to restructure the output in this way:
"analytics": {
"invalid_user": {
"_id": false,
"count": 1
},
"valid_user": {
"_id": true,
"count": 2
},
"user_with_unknown_status": {
"_id": null,
"count": 1
}
}
The problem with using a '$project' stage along with '$arrayElemAt' is that the order of the groups is not guaranteed, so I cannot tie an index to an attribute like 'valid_user' or the others. It gets further complicated because, unlike the sample documents that I have shared, my collection may not always contain all three categories of users.
Is there some way I can do this?
You can use the $switch conditional operator.
In the $project stage inside the facet, v holds the value part as an object with the _id and count fields, and k holds the key name chosen by the $switch condition:
db.collection.aggregate([
{
"$facet": {
"user_by_valid_status": [
{
"$group": {
"_id": "$valid_user",
"count": { "$sum": 1 }
}
},
{
$project: {
_id: 0,
v: { _id: "$_id", count: "$count" },
k: {
$switch: {
branches: [
{ case: { $eq: ["$_id", null] }, then: "user_with_unknown_status" },
{ case: { $eq: ["$_id", false] }, then: "invalid_user" },
{ case: { $eq: ["$_id", true] }, then: "valid_user" }
]
}
}
}
}
],
"users_above_30": [{ "$match": { "age": { "$gt": 30 } } }]
}
},
// $project stage at the root: convert the user_by_valid_status array to an object using $arrayToObject
{
$project: {
analytics: { $arrayToObject: "$user_by_valid_status" },
users_above_30: 1
}
}
])
Playground
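As a rough illustration of why this copes with any group order and with missing categories, here is a small Python sketch of the k/v labelling followed by the array-to-object step. The names LABELS and to_analytics are hypothetical, and the code only emulates the idea; it does not call MongoDB.

```python
# Label for each group key; this mapping mirrors the $switch branches.
LABELS = {
    None: "user_with_unknown_status",
    False: "invalid_user",
    True: "valid_user",
}


def to_analytics(groups):
    """Emulate the k/v $project followed by $arrayToObject."""
    # $project: build one {"k": ..., "v": ...} pair per group document
    pairs = [
        {"k": LABELS[group["_id"]], "v": {"_id": group["_id"], "count": group["count"]}}
        for group in groups
    ]
    # $arrayToObject: turn the list of pairs into a single object
    return {pair["k"]: pair["v"] for pair in pairs}


# Group output in arbitrary order, as $group may return it.
groups = [
    {"_id": False, "count": 1},
    {"_id": True, "count": 2},
    {"_id": None, "count": 1},
]
print(to_analytics(groups))
```

Because each key is derived from the group's own _id, a category that is absent from the collection is simply absent from the result instead of shifting the other entries.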
I've been using MongoDB for just a week and I have problems achieving this result: I want to group my documents by date while also keeping track of the number of entries that have a certain field set to a certain value.
So, my documents look like this:
{
"_id" : ObjectId("5f3f79fc266a891167ca8f65"),
"recipe" : "A",
"timestamp" : ISODate("2020-08-22T09:38:36.306Z")
}
where recipe is either "A", "B" or "C". Right now I'm grouping the documents by date using this pymongo query:
mongo.db.aggregate(
# Pipeline
[
# Stage 1
{
"$project": {
"createdAt": {
"$dateToString": {
"format": "%Y-%m-%d",
"date": "$timestamp"
}
},
"progressivo": 1,
"temperatura_fusione": 1
}
},
# Stage 2
{
"$group": {
"_id": {
"createdAt": "$createdAt"
},
"products": {
"$sum": 1
}
}
},
# Stage 3
{
"$project": {
"label": "$_id.createdAt",
"value": "$products",
"_id": 0
}
}])
Which gives me results like this:
[{"label": "2020-08-22", "value": 1}, {"label": "2020-08-15", "value": 2}, {"label": "2020-08-11", "value": 1}, {"label": "2020-08-21", "value": 5}]
What I'd like to have is also the counting of how many times each recipe appears on every date. So, if for example on August 21 I have 2 entries with the "A" recipe, 3 with the "B" recipe and 0 with the "C" recipe, the desired output would be
{"label": "2020-08-21", "value": 5, "A": 2, "B":3, "C":0}
Do you have any tips?
Thank you!
What you have done so far is a good start. You can extend it as follows:
The first $group counts the documents per date and recipe. The second $group, by date only, sums those counts into the total value and pushes each recipe's count into an array.
$map goes through that array and reshapes each element into a k/v pair.
$arrayToObject converts the array of k/v pairs produced by $map into an object.
$ifNull supplies 0 when a date has no entries for "A", "B" or "C", as your expected output requires.
Here is the code (note that the date ends up in _id rather than in label):
[
{
"$project": {
"createdAt": {
"$dateToString": {
"format": "%Y-%m-%d",
"date": "$timestamp"
}
},
recipe: 1,
"progressivo": 1,
"temperatura_fusione": 1
}
},
{
"$group": {
"_id": {
"createdAt": "$createdAt",
"recipeName": "$recipe",
},
"products": {
$sum: 1
}
}
},
{
"$group": {
"_id": "$_id.createdAt",
value: {
$sum: "$products"
},
recipes: {
$push: {
name: "$_id.recipeName",
val: "$products"
}
}
}
},
{
$project: {
"content": {
"$arrayToObject": {
"$map": {
"input": "$recipes",
"as": "el",
"in": {
"k": "$$el.name",
"v": "$$el.val"
}
}
}
},
value: 1
}
},
{
$project: {
_id: 1,
value: 1,
A: {
$ifNull: [
"$content.A",
0
]
},
B: {
$ifNull: [
"$content.B",
0
]
},
C: {
$ifNull: [
"$content.C",
0
]
}
}
}
]
Working Mongo playground
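If it helps to see the same idea outside the pipeline, here is a minimal Python sketch of the per-date, per-recipe counting with zero defaults. The function name recipes_per_day and the sample documents are made up for illustration, and the rows are sorted by date only to make the output predictable; the pipeline above does not sort.

```python
from collections import Counter, defaultdict
from datetime import datetime


def recipes_per_day(documents, recipes=("A", "B", "C")):
    """Count documents per day and per recipe, defaulting missing recipes to 0."""
    per_day = defaultdict(Counter)
    for doc in documents:
        # Same role as $dateToString with format "%Y-%m-%d"
        day = doc["timestamp"].strftime("%Y-%m-%d")
        # Same role as the $group on (createdAt, recipeName)
        per_day[day][doc["recipe"]] += 1

    rows = []
    for day in sorted(per_day):
        counts = per_day[day]
        # Same role as the second $group: total per day
        row = {"label": day, "value": sum(counts.values())}
        for name in recipes:
            # Same role as $ifNull: 0 when the recipe did not occur that day
            row[name] = counts.get(name, 0)
        rows.append(row)
    return rows


# Hypothetical sample: two "A" and three "B" entries on 2020-08-21.
sample = [
    {"recipe": "A", "timestamp": datetime(2020, 8, 21, 9, 0)},
    {"recipe": "A", "timestamp": datetime(2020, 8, 21, 10, 0)},
    {"recipe": "B", "timestamp": datetime(2020, 8, 21, 11, 0)},
    {"recipe": "B", "timestamp": datetime(2020, 8, 21, 12, 0)},
    {"recipe": "B", "timestamp": datetime(2020, 8, 21, 13, 0)},
]
print(recipes_per_day(sample))
```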
[
{ "$match": {
"created":{
"$gte": ISODate("2015-07-19T07:26:49.045Z")
},
"created":{
"$lte": ISODate("2015-07-20T07:37:56.045Z")
}
}},
{ "$group:{
"_id":{
"ln":"$l.ln",
"cid":"$cid"
},
"appCount":{ "$sum": 1 }
}},
{ "$group": {
"_id": { "ln":"$_id.ln" },
"cusappCount": { "$sum": 1 }
}},
{ "$sort":{ "_id.ln":1 } }
]
With the MongoDB query above I cannot get appCount to show in the result; I only get cusappCount. Could anyone please help me with this?
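The $match is wrong to start with and does not do what you think. Because the key "created" appears twice in the same object, the later value overwrites the earlier one, so it is only selecting on the "second" statement: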
"created":{
"$lte": ISODate("2015-07-20T07:37:56.045Z")
}
So your selections are incorrect to start with.
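The same behaviour is easy to reproduce in Python, where a dict literal with a repeated key also keeps only the last value. A small sketch, with plain strings standing in for the ISODate values (an assumption made so the snippet runs without a driver):

```python
# A repeated key in a dict literal is not an error; the later value silently wins.
broken = {
    "created": {"$gte": "2015-07-19T07:26:49.045Z"},
    "created": {"$lte": "2015-07-20T07:37:56.045Z"},
}

# Both bounds belong in one sub-document under a single "created" key.
fixed = {
    "created": {
        "$gte": "2015-07-19T07:26:49.045Z",
        "$lte": "2015-07-20T07:37:56.045Z",
    }
}

print(broken)                     # only the "$lte" condition is left
print(sorted(fixed["created"]))   # both operators are present
```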
That and other corrections below:
[
{ "$match": {
"created": {
"$gte": ISODate("2015-07-19T07:26:49.045Z"),
"$lte": ISODate("2015-07-20T07:37:56.045Z")
}
}},
{ "$group":{
"_id": {
"ln":"$l.ln",
"cid":"$cid"
},
"appCount":{ "$sum": 1 }
}},
{ "$group": {
"_id": "$_id.ln",
"cusappCount": { "$sum": "$appCount" },
"distinctCustCount": { "$sum": 1 }
}},
{ "$sort":{ "_id": 1 } }
]
Which seems to be what you are trying to do.
So your earlier "count" is then passed to $sum when grouping at a "broader" level. The "second" count is just for the "distinct" items in the earlier key.
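To make the difference between the two sums concrete, here is a plain-Python sketch of the two grouping levels. The function name summarize and the sample documents are hypothetical; only the field names l.ln and cid come from the question.

```python
from collections import Counter, defaultdict


def summarize(documents):
    """Emulate the two $group stages and the final $sort."""
    # First $group: count documents per (ln, cid) pair, i.e. appCount
    app_count = Counter((doc["l"]["ln"], doc["cid"]) for doc in documents)

    # Second $group: roll the pairs up per ln
    summary = defaultdict(lambda: {"cusappCount": 0, "distinctCustCount": 0})
    for (ln, _cid), count in app_count.items():
        summary[ln]["cusappCount"] += count      # {"$sum": "$appCount"}
        summary[ln]["distinctCustCount"] += 1    # {"$sum": 1}

    # $sort on the group key
    return [{"_id": ln, **summary[ln]} for ln in sorted(summary)]


# Hypothetical documents: "x" has three entries from two distinct customers.
sample = [
    {"l": {"ln": "x"}, "cid": 1},
    {"l": {"ln": "x"}, "cid": 1},
    {"l": {"ln": "x"}, "cid": 2},
    {"l": {"ln": "y"}, "cid": 3},
]
print(summarize(sample))
```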
If you are trying to "retain" the values of "appCount", then the problem is that your second "grouping" is "taking away" the level of detail at which that value exists. So for what it is worth, this is where you use "arrays" in an output structure:
[
{ "$match": {
"created": {
"$gte": ISODate("2015-07-19T07:26:49.045Z"),
"$lte": ISODate("2015-07-20T07:37:56.045Z")
}
}},
{ "$group":{
"_id": {
"ln":"$l.ln",
"cid":"$cid"
},
"appCount":{ "$sum": 1 }
}},
{ "$group": {
"_id": "$_id.ln",
"cusappCount": { "$sum": 1 },
"custs": { "$push": {
"cid": "$_id.cid", "appCount": "$appCount"
}}
}},
{ "$sort":{ "_id": 1 } }
]
I have the following vote data in a large collection:
{
"user_id" : ObjectId("53ac7bce4eaf6de4d5601c1a"),
"article_id" : ObjectId("53ab27504eaf6de4d5601be5"),
"score" : 5
},
{
"user_id" : ObjectId("53ac7bce4eaf6de4d5601c1b"),
"article_id" : ObjectId("53ab27504eaf6de4d5601be5"),
"score" : 3
},
{
"user_id" : ObjectId("53ac7bce4eaf6de4d5601c1c"),
"article_id" : ObjectId("53ab27504eaf6de4d5601be5"),
"score" : 3
},
...
I'm looking to filter this collection where more than 3 votes have been obtained for a single article (as above) and output as-is (excluding any vote entries on articles < 3 total votes).
Any help much appreciated. This collection can be huge so efficiency would be ideal.
Normally not something you do in a single operation, but you can do this if those really are your only fields and there are not too many matching documents.
db.collection.aggregate([
{ "$group": {
"_id": "$article_id",
"docs": {
"$push": {
"user_id": "$user_id",
"article_id": "$article_id",
"score": "$score"
}
},
"votes": { "$sum": 1 }
}},
{ "$match": { "votes": { "$gt": 3 } } },
{ "$unwind": "$docs" },
{ "$project": {
"user_id": "$docs.user_id",
"article_id": "$docs.article_id",
"score": "$docs.score"
}}
])
You can clean that up a little with MongoDB 2.6 and greater, which provides the system variable $$ROOT in the pipeline:
db.collection.aggregate([
{ "$group": {
"_id": "$article_id",
"docs": {
"$push": "$$ROOT"
},
"votes": { "$sum": 1 }
}},
{ "$match": { "votes": { "$gt": 3 } } },
{ "$unwind": "$docs" },
{ "$project": {
"user_id": "$docs.user_id",
"article_id": "$docs.article_id",
"score": "$docs.score"
}}
])
Otherwise you can accept that you are doing this in a few steps and process the list of "article_id" values returned with a "count" greater than three:
var ids = db.collection.aggregate([
{ "$group": {
"_id": "$article_id",
"votes": { "$sum": 1 }
}},
{ "$match": { "votes": { "$gt": 3 } } },
]).toArray().map(function(x){ return x._id });
db.collection.find({ "article_id": { "$in": ids } })
The .toArray() call applies to the cursor that the shell returns from MongoDB 2.6 onwards. In earlier versions the shell returned a single document by default, and you would read the array from its "result" key instead.
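The two-step approach can be sketched in plain Python as well. This is only an emulation of the logic under stated assumptions: the function name votes_on_popular_articles and the sample votes are hypothetical, and strings stand in for the ObjectId values.

```python
from collections import Counter


def votes_on_popular_articles(votes, min_votes=3):
    """Return the votes for articles that have more than min_votes votes."""
    # Step 1: the $group + $match, giving the qualifying article ids
    counts = Counter(vote["article_id"] for vote in votes)
    ids = {article_id for article_id, n in counts.items() if n > min_votes}
    # Step 2: the equivalent of find({"article_id": {"$in": ids}})
    return [vote for vote in votes if vote["article_id"] in ids]


# Hypothetical votes: article "x" has four votes, article "y" has two.
sample = [
    {"user_id": "u1", "article_id": "x", "score": 5},
    {"user_id": "u2", "article_id": "x", "score": 3},
    {"user_id": "u3", "article_id": "x", "score": 3},
    {"user_id": "u4", "article_id": "x", "score": 1},
    {"user_id": "u1", "article_id": "y", "score": 2},
    {"user_id": "u2", "article_id": "y", "score": 4},
]
print(votes_on_popular_articles(sample))
```

Splitting the work this way keeps the first step small (one id and one count per article), which is why it tends to suit large collections better than pushing every document into an array.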
I want to get two objects $first and $last after grouping. Is it possible?
Something like this, but this is not working:
{ "$group": {
"_id": "type",
"values": [{
"time": { "$first": "$time" },
"value": { "$first": "$value" }
},
{
"time": { "$last": "$time" },
"value": { "$last": "$value" }
}]
}
}
In order to get the $first and $last values from an array with the aggregation framework, you need to use $unwind first to "de-normalize" the array as individual documents. There is also another trick to put those back in an array.
Assuming a document like this
{
"type": "abc",
"values": [
{ "time": ISODate("2014-06-12T22:35:42.260Z"), "value": "ZZZ" },
{ "time": ISODate("2014-06-12T22:36:45.921Z"), "value": "KKK" },
{ "time": ISODate("2014-06-12T22:37:18.237Z"), "value": "AAA" }
]
}
Assuming that your array is already sorted, there are two ways to go.
If you do not care about the results being in an array, just $unwind and $group:
db.junk.aggregate([
{ "$unwind": "$values" },
{ "$group": {
"_id": "$type",
"ftime": { "$first": "$values.time" },
"fvalue": { "$first": "$values.value" },
"ltime": { "$last": "$values.time" },
"lvalue": { "$last": "$values.value" },
}}
])
If you want the results in an array, there is a trick to it:
db.collection.aggregate([
{ "$unwind": "$values" },
{ "$project": {
"type": 1,
"values": 1,
"indicator": { "$literal": ["first", "last"] }
}},
{ "$group": {
"_id": "$type",
"ftime": { "$first": "$values.time" },
"fvalue": { "$first": "$values.value" },
"ltime": { "$last": "$values.time" },
"lvalue": { "$last": "$values.value" },
"indicator": { "$first": "$indicator" }
}},
{ "$unwind": "$indicator" },
{ "$project": {
"values": {
"time": {
"$cond": [
{ "$eq": [ "$indicator", "first" ] },
"$ftime",
"$ltime"
]
},
"value": {
"$cond": [
{ "$eq": [ "$indicator", "first" ] },
"$fvalue",
"$lvalue"
]
}
}
}},
{ "$group": {
"_id": "$_id",
"values": { "$push": "$values" }
}}
])
If your array is not sorted, place an additional $sort stage after the $unwind and before the very first $group to make sure your items are in the order you want them to be evaluated by $first and $last. A logical order here is by the "time" field, so:
{ "$sort": { "type": 1, "values.time": 1 } }
The $literal declares an array holding the values "first" and "last", which is later "unwound" to create two copies of each grouped document. These are then evaluated using the $cond operator to re-assign to a single field for "values", which is finally pushed back into an array using $push.
Remember to always try to $match first in the pipeline in order to reduce the number of documents you are working on to what you reasonably want. You pretty much never want to do this over whole collections, especially when you are using $unwind on arrays.
As a final note, $literal was introduced in MongoDB 2.6. For prior versions you can substitute the undocumented $const.
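For comparison, the end result of that pipeline for a single document can be sketched in a few lines of Python. The function name first_and_last is hypothetical, the sketch sorts by "time" itself rather than assuming a sorted array, and it handles one document at a time instead of grouping by type across documents.

```python
from datetime import datetime


def first_and_last(document):
    """Return the earliest and latest entries of "values" as a two-element array."""
    # Same role as the optional $sort on "values.time"
    values = sorted(document["values"], key=lambda item: item["time"])
    if not values:
        return {"_id": document["type"], "values": []}
    # Same role as $first / $last followed by the $push back into an array
    return {"_id": document["type"], "values": [values[0], values[-1]]}


# The sample document from above, deliberately given out of order.
doc = {
    "type": "abc",
    "values": [
        {"time": datetime(2014, 6, 12, 22, 36, 45, 921000), "value": "KKK"},
        {"time": datetime(2014, 6, 12, 22, 37, 18, 237000), "value": "AAA"},
        {"time": datetime(2014, 6, 12, 22, 35, 42, 260000), "value": "ZZZ"},
    ],
}
print(first_and_last(doc))
```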