MongoDB aggregation query for the sum of a field value's lengths

Say, I have following documents:
{name: 'A', fav_fruits: ['apple', 'mango', 'orange'], 'type':'test'}
{name: 'B', fav_fruits: ['apple', 'orange'], 'type':'test'}
{name: 'C', fav_fruits: ['cherry'], 'type':'test'}
I am trying to find the total number of fav_fruits entries across all documents returned by:
cursor = db.collection.find({'type': 'test'})
I am expecting output like:
cursor.count() = 3 // this part I already get
I don't know much about aggregation. Can the MongoDB aggregation framework help me get either of these:
1. sum up the lengths of all 'fav_fruits' field: 6
and/or
2. unique 'fav_fruit' field values = ['apple', 'mango', 'orange', 'cherry']

You need to $project your documents after the $match stage and use the $size operator, which returns the number of items in each array. Then, in the $group stage, use the $sum accumulator operator to return the total count.
db.collection.aggregate([
{ "$match": { "type": "test" } },
{ "$project": { "count": { "$size": "$fav_fruits" } } },
{ "$group": { "_id": null, "total": { "$sum": "$count" } } }
])
Which returns:
{ "_id" : null, "total" : 6 }
To get unique fav_fruits simply use .distinct()
> db.collection.distinct("fav_fruits", { "type": "test" } )
[ "apple", "mango", "orange", "cherry" ]
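As a sanity check on the expected numbers, here is a minimal sketch in plain JavaScript (no MongoDB driver involved) that mirrors what the pipeline and distinct() compute on the sample documents. The variable names are illustrative assumptions, and the array methods only imitate the stages; they are not MongoDB APIs.

```javascript
// In-memory copy of the sample documents from the question.
const docs = [
  { name: 'A', fav_fruits: ['apple', 'mango', 'orange'], type: 'test' },
  { name: 'B', fav_fruits: ['apple', 'orange'], type: 'test' },
  { name: 'C', fav_fruits: ['cherry'], type: 'test' },
];

// Imitates { $match: { type: 'test' } }.
const matched = docs.filter((doc) => doc.type === 'test');

// Imitates $project with $size followed by $group with $sum.
const total = matched.reduce((sum, doc) => sum + doc.fav_fruits.length, 0);

// Imitates distinct('fav_fruits', ...): flatten, then de-duplicate.
const uniqueFruits = [...new Set(matched.flatMap((doc) => doc.fav_fruits))];

console.log(total);        // 6
console.log(uniqueFruits); // [ 'apple', 'mango', 'orange', 'cherry' ]
```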

Do this to get just the number of fruits in the fav_fruits array:
db.fruits.aggregate([
{ $match: { type: 'test' } },
{ $unwind: "$fav_fruits" },
{ $group: { _id: "$type", count: { $sum: 1 } } }
]);
This will return the total number of fruits.
But if you want the array of unique fav_fruits along with the total number of fav_fruits elements across the matched documents, do this:
db.fruits.aggregate([
{ $match: { type: 'test' } },
{ $unwind: "$fav_fruits" },
{ $group: { _id: "$type", count: { $sum: 1 }, fav_fruits: { $addToSet: "$fav_fruits" } } }
])

You can try this; it may be helpful. Note that it counts the matching documents (3 for the sample data), not the elements of fav_fruits.
db.collection.aggregate([{ $match : { type: "test" } }, {$group : { _id : null, count:{$sum:1} } }])

Related

Fetching sum of rows for a type of column value in mongodb as a single output

I am trying to get the sum of field 'score.number' based on the type of a column value work.type in MongoDB. It should fetch sum as 25 for 'hw' ,and 'cw' as 5 as a single output for the student 'A'. Is there a way to achieve it using mongodb queries ? I tried the $group as well but it doesn't seem to fetch the worktype and the sum for each worktype against it for a single student record 'A'.
Expected Output:
After $match, use $group like this:
db.collection.aggregate([
  {
    $match: {
      student: { $in: ["A"] },
      "work.type": { $in: ["hw", "cw"] }
    }
  },
  {
    $group: {
      _id: { worktype: "$work.type", student: "$student" },
      workScore: { $sum: "$score.number" }
    }
  }
])
https://mongoplayground.net/p/qzghM5KoAbp
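I was able to get the sum per work type with these two stages:
db.collection.aggregate([
  // Keep only student A's homework and classwork records.
  { $match: { student: { $in: ['A'] }, 'work.type': { $in: ['hw', 'cw'] } } },
  // Sum score.number for each work type.
  { $group: { _id: '$work.type', totalAmount: { $sum: '$score.number' } } }
])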
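To get a single document per student, group by work type and student first, then group again by student and push each work type with its sum into a list:
db.collection.aggregate([
  { $match: { student: { $in: ["A"] }, "work.type": { $in: ["hw", "cw"] } } },
  // First $group: one document per (work type, student) pair with its sum.
  {
    $group: {
      _id: { worktype: "$work.type", student: "$student" },
      workScore: { $sum: "$score.number" }
    }
  },
  // Second $group: collapse the pairs into one list per student.
  {
    $group: {
      _id: { student: "$_id.student" },
      list: { $push: { worktype: "$_id.worktype", workScore: "$workScore" } }
    }
  }
])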
Solved output:
Solves the issue.
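The two-level grouping can be checked outside the database with a small plain JavaScript sketch. The sample documents are hypothetical: their layout is only inferred from the paths "work.type" and "score.number" used in the pipelines, and the values are chosen so that 'hw' sums to 25 and 'cw' to 5 as described in the question.

```javascript
// Hypothetical documents; the field layout is inferred from the paths
// "work.type" and "score.number" used in the pipelines above.
const scores = [
  { student: 'A', work: { type: 'hw' }, score: { number: 10 } },
  { student: 'A', work: { type: 'hw' }, score: { number: 15 } },
  { student: 'A', work: { type: 'cw' }, score: { number: 5 } },
  { student: 'B', work: { type: 'hw' }, score: { number: 7 } },
];

// Imitates $match: student A, work types hw and cw.
const wanted = scores.filter(
  (d) => d.student === 'A' && ['hw', 'cw'].includes(d.work.type)
);

// Imitates the first $group: sum score.number per work type.
const perType = new Map();
for (const d of wanted) {
  perType.set(d.work.type, (perType.get(d.work.type) || 0) + d.score.number);
}

// Imitates the second $group: push each (worktype, workScore) pair into one list.
const result = {
  _id: { student: 'A' },
  list: [...perType].map(([worktype, workScore]) => ({ worktype, workScore })),
};

console.log(JSON.stringify(result));
// {"_id":{"student":"A"},"list":[{"worktype":"hw","workScore":25},{"worktype":"cw","workScore":5}]}
```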

Mongodb find maximum based on nested object key

I have below schema where I need to identify the object which has highest rank.
{ "team" : {
"member1" : [ { "rank": 2, "goal": 50 } ],
"member2" : [ { "rank": 5, "goal": 30 } ],
"member3" : [ { "rank": 1, "goal": 80 } ]
}}
$unwind will not work on the nested objects. Tried to convert this object as Array and tried to find the max of rank key. Any help would be appreciated.
If the intent is only to find the maximum rank that exists, the idea is a two-stage aggregation: the first $project uses $objectToArray to turn the keyed team object into an array whose items share the common keys k and v, and the second applies $max to the rank values.
Query:
db.collection.aggregate([
  {
    $project: {
      teamsData: { $objectToArray: "$team" }
    }
  },
  {
    $project: {
      maxRank: { $max: "$teamsData.v.rank" }
    }
  }
]);
To get the details of the object that has the maximum rank: use $unwind on the array projected in the previous stage, sort by rank in descending order with $sort, and then pick the first item with $first in the $group stage.
Query:
db.collection.aggregate([
  {
    $project: {
      team: { $objectToArray: "$team" }
    }
  },
  { $unwind: "$team" },
  { $sort: { "team.v.rank": -1 } },
  {
    $group: {
      _id: null,
      maxRankObj: { $first: "$$ROOT" }
    }
  }
]);
Sample O/P:
[
{
"_id": null,
"maxRankObj": {
"_id": ObjectId("5a934e000102030405000000"),
"team": {
"k": "member2",
"v": [
{
"goal": 30,
"rank": 5
}
]
}
}
}
]
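The same idea can be sketched in plain JavaScript, where Object.entries plays the role of $objectToArray. This is only an in-memory illustration under that assumption, with made-up variable names; it does not reproduce the exact output shape of either pipeline.

```javascript
// The sample document from the question.
const teamDoc = {
  team: {
    member1: [{ rank: 2, goal: 50 }],
    member2: [{ rank: 5, goal: 30 }],
    member3: [{ rank: 1, goal: 80 }],
  },
};

// Imitates $objectToArray: { member1: [...] } becomes [{ k: 'member1', v: [...] }, ...].
const members = Object.entries(teamDoc.team).map(([k, v]) => ({ k, v }));

// Highest rank found inside one member's array.
const maxRankOf = (member) => Math.max(...member.v.map((item) => item.rank));

// Overall maximum rank.
const maxRank = Math.max(...members.map(maxRankOf));

// The member holding that rank (what $sort + $first pick out).
const best = members.reduce((a, b) => (maxRankOf(b) > maxRankOf(a) ? b : a));

console.log(maxRank); // 5
console.log(best.k);  // member2
```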

MongoDB multiple levels embedded array query

I have a document like this:
{
_id: 1,
data: [
{
_id: 2,
rows: [
{
myFormat: [1,2,3,4]
},
{
myFormat: [1,1,1,1]
}
]
},
{
_id: 3,
rows: [
{
myFormat: [1,2,7,8]
},
{
myFormat: [1,1,1,1]
}
]
}
]
},
I want to get distinct myFormat values as a complete array.
For example: I need the result as: [1,2,3,4], [1,1,1,1], [1,2,7,8]
How can I write mongoDB query for this?
Thanks for the help.
Try this if every object in rows has only the one field, myFormat:
db.getCollection('yourCollection').distinct('data.rows')
Ref : mongoDB Distinct Values for a field
Or, if you need it in an array and the objects in rows also have other fields, try this:
db.yourCollection.aggregate([
  { $project: { 'data.rows.myFormat': 1 } },
  { $unwind: '$data' },
  { $unwind: '$data.rows' },
  { $group: { _id: '$data.rows.myFormat' } },
  { $group: { _id: '', distinctValues: { $push: '$_id' } } },
  { $project: { distinctValues: 1, _id: 0 } }
])
Or else:
db.yourCollection.aggregate([
  { $project: { values: '$data.rows.myFormat' } }, { $unwind: '$values' }, { $unwind: '$values' },
  { $group: { _id: '', distinctValues: { $addToSet: '$values' } } }, { $project: { distinctValues: 1, _id: 0 } }
])
The aggregation queries above return what you want, but they can be slow on large datasets, so run them and check for slowness. For a one-off query you can pass {allowDiskUse: true} if needed. Either way, decide whether the $unwind stages need preserveNullAndEmptyArrays: true.
Ref : allowDiskUse , $unwind preserveNullAndEmptyArrays
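To see what the double $unwind and $addToSet amount to, here is a rough in-memory sketch in plain JavaScript. Since JavaScript compares arrays by reference, it assumes a JSON string of each array is an acceptable de-duplication key; the variable names are illustrative only.

```javascript
// The sample document from the question.
const sample = {
  _id: 1,
  data: [
    { _id: 2, rows: [{ myFormat: [1, 2, 3, 4] }, { myFormat: [1, 1, 1, 1] }] },
    { _id: 3, rows: [{ myFormat: [1, 2, 7, 8] }, { myFormat: [1, 1, 1, 1] }] },
  ],
};

// Imitates the two $unwind stages: one myFormat array per row.
const formats = sample.data.flatMap((d) => d.rows.map((r) => r.myFormat));

// Imitates $addToSet: keep the first occurrence of each array, compared by value.
const seen = new Map();
for (const format of formats) {
  const key = JSON.stringify(format);
  if (!seen.has(key)) seen.set(key, format);
}
const distinctValues = [...seen.values()];

console.log(JSON.stringify(distinctValues)); // [[1,2,3,4],[1,1,1,1],[1,2,7,8]]
```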

Query multiple properties in at the same time getting an overall average and an array

Given the following data, I'm trying to get an average of all their ages, at the same time I want to return an array of their names. Ideally, I want to do this in just one query but can't seem to figure it out.
Data:
users:[
{user:{
id: 1,
name: "Bob",
age: 23
}},
{user:{
id: 1,
name: "Susan",
age: 32
}},
{user:{
id: 2,
name: "Jeff",
age: 45
}
}]
Query:
var dbmatch = db.users.aggregate([
{$match: {"id" : 1}},
{$group: {_id: null, avg_age: { $avg: "$age" }}},
{$group: {_id : { name: "$name"}}}
])
Running the above groups one at a time outputs the results I expect, either an _id of null and an average of 27.5, or an array of the names.
When I combine them with a comma, as shown above, I get:
[ { _id: {name: null } } ]
Expected output:
[
{name:"Bob"},
{name:"Susan"},
avg_age: 27.5
]
Any help would be greatly appreciated!
Not sure if this is exactly what you want, but this query
db.users.aggregate([
  { $match: { id: 1 } },
  {
    $group: {
      _id: "$id",
      avg_age: { $avg: "$age" },
      names: { $push: { name: "$name" } }
    }
  },
  { $project: { _id: 0 } }
])
Results in this result:
[
{
"avg_age": 27.5,
"names": [
{
"name": "Bob"
},
{
"name": "Susan"
}
]
}
]
This keeps duplicate names, so if two documents have the name Bob, it will appear twice in the array. If you don't want duplicates, change $push to $addToSet.
Also, if you want names to be a plain array of names instead of objects, change the names accumulator to
names: {
$push: "$name"
}
This will result in
[
{
"avg_age": 27.5,
"names": ["Bob", "Susan"]
}
]
Hope it helps,
Tomas :)
You can use the $facet aggregation stage to run multiple queries at once:
db.collection.aggregate([
{ "$facet": {
"firstQuery": [
{ "$match": { "id": 1 }},
{ "$group": {
"_id": null,
"avg_age": { "$avg": "$age" }
}}
],
"secondQuery": [
{ "$match": { "id": 1 }},
{ "$group": { "_id": "$name" }}
]
}}
])
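For reference, the average and the list of names that these pipelines aim for can be reproduced in plain JavaScript. This sketch assumes flat documents with top-level id, name and age fields, as the pipelines in the answers imply (the question's data nests them under user); the names used are illustrative.

```javascript
// Flat version of the question's data (an assumption, see above).
const people = [
  { id: 1, name: 'Bob', age: 23 },
  { id: 1, name: 'Susan', age: 32 },
  { id: 2, name: 'Jeff', age: 45 },
];

// Imitates { $match: { id: 1 } }.
const selected = people.filter((p) => p.id === 1);

// Imitates $avg: "$age" and $push: "$name" inside a single $group.
const summary = {
  avg_age: selected.reduce((sum, p) => sum + p.age, 0) / selected.length,
  names: selected.map((p) => p.name),
};

console.log(summary); // { avg_age: 27.5, names: [ 'Bob', 'Susan' ] }
```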

Mongo aggregation pipeline, finding out the total number of entries in an array per user

I have a collection, let's call it 'users'. Each document in this collection has a property entries, which holds a variable-length array of strings.
I want to find the total number of these strings across my collection.
db.users.find()
> [{ entries: [] }, { entries: ['entry1','entry2']}, {entries: ['entry1']}]
So far I have made many attempts; here is my closest one.
db.users.aggregate([
{ $project:
{ numberOfEntries:
{ $size: "$entries" } }
},
{ $group:
{_id: { total_entries: { $sum: "$entries"}
}
}
}
])
What this gives me is a list of the users with each user's number of entries. What I want is all of those figures added up to get my total. Any idea what I am doing wrong, or is there a better way to approach this?
A possible solution could be:
db.users.aggregate([{
$group: {
_id: 'some text here',
count: {$sum: {$size: '$entries'}}
}
}]);
This will give you the total count of all entries across all users and look like
[
{
_id: 'some text here',
count: 3
}
]
If you want a count for each individual entry, I would use $unwind:
db.users.aggregate([
  { $unwind: '$entries' },
  { $group: { _id: '$entries', count: { $sum: 1 } } }
])
and this will give you something along the lines of:
[
{
_id: 'entry1',
count: 2
},
{
_id: 'entry2',
count: 1
}
]
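A quick way to convince yourself of these per-entry counts is a plain JavaScript sketch over the same three sample documents. It only imitates $unwind and $group in memory, and the variable names are assumptions.

```javascript
// The sample documents from the question.
const userDocs = [
  { entries: [] },
  { entries: ['entry1', 'entry2'] },
  { entries: ['entry1'] },
];

// Imitates $unwind: one element per array item (empty arrays contribute nothing).
const unwound = userDocs.flatMap((u) => u.entries);

// Imitates $group with _id: '$entries' and count: { $sum: 1 }.
const counts = {};
for (const entry of unwound) {
  counts[entry] = (counts[entry] || 0) + 1;
}

console.log(counts); // { entry1: 2, entry2: 1 }
```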
In case you want the overall number of distinct entries:
> db.users.aggregate([
{ $unwind: "$entries" },
{ $group: { _id: "$entries" } },
{ $count: "total" }
])
{ "total" : 2 }
In case you want the overall number of entries:
> db.users.aggregate( [ { $unwind: "$entries" }, { $count: "total" } ] )
{ "total" : 3 }
This makes use of the "unwind" operator which flattens elements of an array from records:
> db.users.aggregate( [ { $unwind: "$entries" } ] )
{ "_id" : ObjectId("5a81a7a1318e1cfc10250430"), "entries" : "entry1" }
{ "_id" : ObjectId("5a81a7a1318e1cfc10250430"), "entries" : "entry2" }
{ "_id" : ObjectId("5a81a7a1318e1cfc10250431"), "entries" : "entry1" }
You were heading in the right direction. You just needed to specify an _id of null in the $group stage, which computes the accumulated values over all input documents as a whole:
db.users.aggregate([
{
"$project": {
"numberOfEntries": {
"$size": {
"$ifNull": ["$entries", []]
}
}
}
},
{
"$group": {
"_id": null, /* _id of null to get the accumulated values for all the docs */
"totalEntries": { "$sum": "$numberOfEntries" }
}
}
])
Or with just a single pipeline stage:
db.users.aggregate([
{
"$group": {
"_id": null, /* _id of null to get the accumulated values for all the docs */
"totalEntries": {
"$sum": {
"$size": {
"$ifNull": ["$entries", []]
}
}
}
}
}
])
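The role of $ifNull here is to treat a missing or null entries field as an empty array so that $size does not fail. A minimal in-memory sketch of the same guard in plain JavaScript, where the fourth document without an entries field is a hypothetical addition to the question's sample:

```javascript
// The question's sample plus one hypothetical document lacking "entries".
const allUsers = [
  { entries: [] },
  { entries: ['entry1', 'entry2'] },
  { entries: ['entry1'] },
  { name: 'no entries field' },
];

// Imitates $sum over $size of $ifNull: ["$entries", []].
const totalEntries = allUsers.reduce(
  (sum, user) => sum + (user.entries ?? []).length,
  0
);

console.log(totalEntries); // 3
```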