MongoDB aggregation grouping: fill missing values

I'm using the MongoDB aggregation framework. I have a collection with documents like this:
{
'step': 1,
'name': 'house',
'score': 2
}
{
'step': 1,
'name': 'car',
'score': 3
}
{
'step': 2,
'name': 'house',
'score': 4
}
I'm grouping the documents with the same 'step' and pushing 'name' and 'score' into an array of objects. What I get is:
{
'step': 1,
'scores':
[
{'name':'house','score':2},
{'name':'car','score':3}
]
}
{
'step': 2,
'scores':
[
{'name':'house','score':4}
]
}
For each 'step', I need to copy the value from the previous 'step' when a 'name' does not exist. The result should look like this:
{
'step': 1,
'scores':
[
{'name':'house','score':2},
{'name':'car','score':3}
]
}
{
'step': 2,
'scores':
[
{'name':'house','score':4},
**{'name': 'car', 'score':3}**
]
}
In the second document, the element {'name':'car','score':3} has been copied from the previous document because at 'step': 2 there are no documents with a 'score' for 'car'.
I can't figure out how to do this with MongoDB aggregation. Any help would be much appreciated.

This requires $lookup with a pipeline. Step by step:
$group by step and push all scores into one array, scores
also push every name for that step into names; it is used in the match condition inside the lookup
db.collection.aggregate([
{
$group: {
_id: "$step",
scores: {
$push: {
name: "$name",
score: "$score"
}
},
names: { $push: "$name" }
}
},
$unwind scores, because it is an array and we are about to run a lookup
{ $unwind: "$scores" },
$lookup with let variables step_id (_id minus 1, i.e. the previous step) and names, made available to the pipeline
$match with an $expr expression that combines 3 conditions:
the size of names must be 1, i.e. the step has only one of car or house,
the looked-up document's step must equal step_id,
the looked-up document's name must not be in names, e.g. if car is already present the lookup searches for house; this needs a separate $not wrapped around $in
$project to keep only the required fields
the lookup result is stored in clone_score
{
$lookup: {
from: "collection",
let: {
step_id: { $subtract: ["$_id", 1] },
names: "$names"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{ $eq: [{ $size: "$$names" }, 1] },
{ $eq: ["$$step_id", "$step"] },
{ $not: [{ $in: ["$name", "$$names"] }] }
]
}
}
},
{
$project: {
_id: 0,
name: 1,
score: 1
}
}
],
as: "clone_score"
}
},
$group by step (_id), push all scores back into one array scores, and keep the first clone_score
{
$group: {
_id: "$_id",
scores: { $push: "$scores" },
clone_score: { $first: "$clone_score" }
}
},
The stages above leave two separate arrays, scores and clone_score.
$project concatenates both of them into scores
{
$project: {
_id: 0,
step: "$_id",
scores: {
$concatArrays: ["$scores", "$clone_score"]
}
}
}
])
Playground: https://mongoplayground.net/p/Fytf7NEU7uG
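To sanity-check what the filled result should look like, independent of the pipeline, here is a minimal plain JavaScript sketch of the same forward-fill idea. The helper name fillMissingScores is hypothetical and not part of MongoDB. Unlike the pipeline above, it assumes every name seen in any earlier step should be carried forward, not only names from the immediately preceding step.

```javascript
// Hypothetical helper: carries each name's latest score forward to later steps.
function fillMissingScores(docs) {
  // Group the raw documents by step, as the first $group stage does.
  const byStep = new Map();
  for (const { step, name, score } of docs) {
    if (!byStep.has(step)) byStep.set(step, []);
    byStep.get(step).push({ name, score });
  }

  const latest = new Map(); // name -> most recent score seen so far
  const result = [];
  for (const step of [...byStep.keys()].sort((a, b) => a - b)) {
    const scores = byStep.get(step);
    const present = new Set(scores.map((s) => s.name));
    // Copy forward every name known from earlier steps but absent here.
    for (const [name, score] of latest) {
      if (!present.has(name)) scores.push({ name, score });
    }
    // Remember this step's scores for the steps that follow.
    for (const { name, score } of scores) latest.set(name, score);
    result.push({ step, scores });
  }
  return result;
}

// Sample documents from the question.
const filled = fillMissingScores([
  { step: 1, name: "house", score: 2 },
  { step: 1, name: "car", score: 3 },
  { step: 2, name: "house", score: 4 },
]);
console.log(JSON.stringify(filled, null, 2));
```

For the sample data, step 2 ends up with house (4) followed by the copied car (3), which matches the output asked for in the question.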

Related

How to match an element in array? MongoDB aggregation

I need to check all the documents whose specific field is contained in an array.
For example I have the array
arr = ['a', 'b', 'a']
I want to match all the documents that have the field my_letter equal to a or b.
I have the documents:
[
{
_id: ObjectID(),
my_letter:'d'
},
{
_id: ObjectID(),
my_letter:'a'
},
{
_id: ObjectID(),
my_letter:'b'
}
]
I want the aggregation to return
[
{
_id: ObjectID(),
my_letter:'a'
},
{
_id: ObjectID(),
my_letter:'b'
}
]
I tried this in my $match stage:
{
$match: {
_id: {
$elemMatch: {
$or: [
{ $eq: ["a"] },
{ $eq: ["b"] },
],
},
},
},
},
Of course it doesn't work. How would you suggest completing the $match stage?
db.collection.find({
my_letter: {
$in: [ "a", "b" ]
}
})
mongoplayground
db.collection.aggregate([
{
$match: {
my_letter: {
$in: [ "a", "b" ]
}
}
}
])
mongoplayground
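Since the question starts from an array variable that contains a duplicate, a small sketch of building the $match stage from that array may help. The helper name matchIn is hypothetical; only $match and $in are MongoDB operators.

```javascript
// The array from the question; it contains a duplicate "a".
const arr = ["a", "b", "a"];

// Hypothetical helper that builds a $match stage from any array of values.
function matchIn(field, values) {
  // A Set drops duplicates. $in accepts duplicates too; this only keeps the stage tidy.
  return { $match: { [field]: { $in: [...new Set(values)] } } };
}

const stage = matchIn("my_letter", arr);
console.log(JSON.stringify(stage));
// prints {"$match":{"my_letter":{"$in":["a","b"]}}}
```

In the shell, the resulting object could then be passed as db.collection.aggregate([stage]).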

MongoDB: sort nested array using dynamic parameter of field name

Assume in an aggregation pipeline one of the steps produces the following results:
{
customer: "WN",
sort_category: "category_a",
locations: [
{
city: "Elkana",
category_a: 11904.0,
category_b: 74.0,
category_c: 657.0,
},
{
city: "Haifa",
category_a: 20.0,
category_b: 841.0,
category_c: 0,
},
{
city: "Jerusalem",
category_a: 451.0,
category_b: 45.0,
category_c: 712.0,
}
]
}
{
...
}
The next step is to sort the list of nested objects in each document.
The list should be sorted by a dynamic parameter that contains the field name.
For example, the list of locations should be sorted by the value of category_a.
category_a is the parameter given in the sort_category field.
Here is a solution that does not use $function:
db.tests.aggregate([
{$unwind:"$locations"},
{$project: {
_id: 1,
customer: 1,
sort_category: 1,
locations: 1,
locationsKV:{$objectToArray:"$locations"}
}
},
{$unwind:"$locationsKV"},
{$project:{
_id: 1,
customer: 1,
sort_category: 1,
locations: 1,
locationsKV: 1,
category: {
$cond:[{$eq: ["$sort_category","$locationsKV.k"]},
"$locationsKV.v", 0]},
agg: {
$cond: [{$eq: ["$sort_category","$locationsKV.k"]}, true, false]
}
}
},
{$match: {agg: true}},
{$sort: {category: 1}},
{$group: {
_id: "$_id",
customer: {$first: "$customer"},
sort_category: {$first: "$sort_category"},
locations: {$push: "$locations"}
}
},
{$project:{ _id: 0}}
])
You can try a custom approach:
$addFields uses $map and $reduce to add a field named category to every object in the locations array; it holds a copy of the value whose key is named by sort_category
$unwind deconstructs the locations array
$sort by the category field added to each location
$project removes the category field
$group by _id and reconstruct the locations array
$project removes the _id field
db.collection.aggregate([
{
$addFields: {
locations: {
$map: {
input: "$locations",
as: "l",
in: {
$mergeObjects: [
"$$l",
{
category: {
$reduce: {
input: { $objectToArray: "$$l" },
initialValue: null,
in: {
$cond: [{ $eq: ["$$this.k", "$sort_category"] }, "$$this.v", "$$value"]
}
}
}
}
]
}
}
}
}
},
{ $unwind: "$locations" },
{ $sort: { "locations.category": 1 } },
{ $project: { "locations.category": 0 } },
{
$group: {
_id: "$_id",
customer: { $first: "$customer" },
sort_category: { $first: "$sort_category" },
locations: { $push: "$locations" }
}
},
{ $project: { _id: 0 } }
])
Playground
If you are planning to upgrade to MongoDB v4.4 (this may also help others), $function is an option for custom, user-defined operations.
It takes 3 properties:
body The function definition, given as either BSON type Code or String. Here we define our own function using function(){ and pass it the locations array and the dynamic sort_category field; the function sorts in descending order
args The arguments passed to the function body
lang The language used in the body. You must specify lang: "js"
db.collection.aggregate([
{
$addFields: {
locations: {
$function: {
body: function(locations, sort_category){
return locations.sort(function(a, b){
// DESCENDING ORDER
return b[sort_category] - a[sort_category]
// ASCENDING ORDER
// return a[sort_category] - b[sort_category]
})
},
args: ["$locations", "$sort_category"],
lang: "js"
}
}
}
}
])
For more guidelines about restrictions and considerations, refer to the $function documentation.
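The comparator passed to $function can be tried outside MongoDB first. The sketch below is plain JavaScript run against the sample document from the question; the helper name sortByField and its descending flag are assumptions made for illustration. It copies the array before sorting, which the $function body above does not do.

```javascript
// Hypothetical helper mirroring the $function body, but without mutating its input.
function sortByField(locations, field, descending = false) {
  const sign = descending ? -1 : 1;
  // slice() copies the array so the original order is left untouched.
  return locations.slice().sort((a, b) => sign * (a[field] - b[field]));
}

// Sample document from the question.
const doc = {
  customer: "WN",
  sort_category: "category_a",
  locations: [
    { city: "Elkana", category_a: 11904.0, category_b: 74.0, category_c: 657.0 },
    { city: "Haifa", category_a: 20.0, category_b: 841.0, category_c: 0 },
    { city: "Jerusalem", category_a: 451.0, category_b: 45.0, category_c: 712.0 },
  ],
};

const ascending = sortByField(doc.locations, doc.sort_category);
console.log(ascending.map((l) => l.city));
// prints [ 'Haifa', 'Jerusalem', 'Elkana' ]
```

Passing true as the third argument reverses the order, which corresponds to the descending comparator used in the $function example.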

Convert array to new field, using keys as the values of this array and values as frequency of these items (aggregation framework)

I have this problem, but I can't solve it.
I have to transform the array s into a new field called shares.
The new field should contain the array's distinct values as keys and their frequencies as values.
Suppose I have these documents:
{
'name': 'igor',
's': ['a', 'a', 'a', 'b', 'b']
},
{
'name': 'jones',
's': ['c', 'b']
}
Expected output:
{
'name': 'igor',
'shares': {
'a': 3,
'b': 2
}
},
{
'name': 'jones',
'shares': {
'c': 1,
'b': 1
}
}
You can try the aggregation query below:
db.collection.aggregate([
/** unwind `s` array */
{
$unwind: "$s"
},
/** group on unique pairs of `_id + s` & retain name field, count sum of matching docs */
{
$group: { _id: { k: "$s", _id: "$_id" }, name: { $first: "$name" }, v: { $sum: 1 } }
},
/** group on unique pairs of just `_id` & retain name field, push docs into shares array `[{k :..., v:...}]` */
{
$group: { _id: "$_id._id", name: { $first: "$name" }, shares: { $push: { k: "$_id.k", v: "$v" } } }
},
/** Re-create shares field from array to object */
{
$addFields: { shares: { $arrayToObject: "$shares" } }
}
])
Test : mongoplayground
It's bad practice to add heterogeneous elements (in your case 'a': 3, 'b': 2) to an array, so I changed the type of each shares entry to something like:
{
key: "$_id.shares",
count: "$count"
}
You need to do the following, in order:
Unwind the array s.
Group by the composite _id of name and s.
Group again by _id.name and push objects with key and count into the shares array.
You can try the query below:
db.collection.aggregate([
{
$unwind: "$s"
},
{
$group: {
_id: {
name: "$name",
shares: "$s"
},
count: {
$sum: 1
}
}
},
{
$group: {
_id: "$_id.name",
shares: {
$push: {
key: "$_id.shares",
count: "$count"
}
}
}
}
])
Output
[
{
"_id": "jones",
"shares": [
{
"count": 1,
"key": "c"
},
{
"count": 1,
"key": "b"
}
]
},
{
"_id": "igor",
"shares": [
{
"count": 3,
"key": "a"
},
{
"count": 2,
"key": "b"
}
]
}
]
MongoPlayGroundLink
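As a reference for the counts either pipeline should produce, the frequency computation can be sketched in plain JavaScript. The helper name countShares is hypothetical; it only reproduces the object form of shares asked for in the question.

```javascript
// Hypothetical helper: maps each distinct array value to how often it occurs.
function countShares(s) {
  const shares = {};
  for (const item of s) {
    shares[item] = (shares[item] || 0) + 1;
  }
  return shares;
}

// Sample documents from the question.
const people = [
  { name: "igor", s: ["a", "a", "a", "b", "b"] },
  { name: "jones", s: ["c", "b"] },
];

const withShares = people.map(({ name, s }) => ({ name, shares: countShares(s) }));
console.log(JSON.stringify(withShares));
```

For igor this gives a: 3 and b: 2, and for jones c: 1 and b: 1, matching the expected output.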

Top documents per bucket

I would like to get the documents with the N highest values of a field for each of N categories. For example, the posts with the 3 highest scores from each of the past 3 months. So each month would have 3 posts that "won" for that month.
Here is what I have so far, simplified.
// simplified
db.posts.aggregate([
{$bucket: {
groupBy: "$createdAt",
boundaries: [
ISODate('2019-06-01'),
ISODate('2019-07-01'),
ISODate('2019-08-01')
],
default: "Other",
output: {
posts: {
$push: {
// ===
// This gets all the posts, bucketed by month
score: '$score',
title: '$title'
// ===
}
}
}
}},
{$match: {_id: {$ne: "Other"}}}
])
I attempted to use the $slice operator between the // === markers, but got an error (below).
postResults: {
$each: [{
score: '$score',
title: '$title'
}],
$sort: {score: -1},
$slice: 2
}
An object representing an expression must have exactly one field: { $each: [ { score: \"$score\", title: \"$title\" } ], $sort: { baseScore: -1.0 }, $slice: 2.0 }
The $slice you're trying to use is the update modifier, meant for update operations. To get the top N posts you need to run $unwind, then $sort and $group to get an ordered array. As the last step you can use $slice (aggregation). Try:
db.posts.aggregate([
{$bucket: {
groupBy: "$createdAt",
boundaries: [
ISODate('2019-06-01'),
ISODate('2019-07-01'),
ISODate('2019-08-01')
],
default: "Other",
output: {
posts: {
$push: {
score: '$score',
title: '$title'
}
}
}
}},
{ $match: {_id: {$ne: "Other"}}},
{ $unwind: "$posts" },
{ $sort: { "posts.score": -1 } },
{ $group: { _id: "$_id", posts: { $push: { "score": "$posts.score", "title": "$posts.title" } } } },
{ $project: { _id: 1, posts: { $slice: [ "$posts", 3 ] } } }
])
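The bucket, sort and slice logic can also be sketched in plain JavaScript to check what the pipeline is expected to return. The helper name topPerBucket and the sample posts are invented for illustration. Like $bucket, the sketch treats each range as including its lower boundary and excluding its upper one, and it skips ranges without posts.

```javascript
// Hypothetical helper: top `n` posts by score within each boundary range.
function topPerBucket(posts, boundaries, n) {
  const buckets = [];
  for (let i = 0; i < boundaries.length - 1; i++) {
    const lower = boundaries[i];
    const upper = boundaries[i + 1];
    const top = posts
      .filter((p) => p.createdAt >= lower && p.createdAt < upper) // like $bucket
      .sort((a, b) => b.score - a.score) // like $sort: { "posts.score": -1 }
      .slice(0, n) // like $slice: ["$posts", n]
      .map(({ score, title }) => ({ score, title }));
    if (top.length > 0) buckets.push({ _id: lower, posts: top });
  }
  return buckets;
}

// Invented sample data; the real posts collection is not shown in the question.
const posts = [
  { title: "a", score: 5, createdAt: new Date("2019-06-03") },
  { title: "b", score: 9, createdAt: new Date("2019-06-10") },
  { title: "c", score: 1, createdAt: new Date("2019-06-20") },
  { title: "d", score: 7, createdAt: new Date("2019-07-02") },
  { title: "e", score: 3, createdAt: new Date("2019-09-01") },
];
const boundaries = [
  new Date("2019-06-01"),
  new Date("2019-07-01"),
  new Date("2019-08-01"),
];

const winners = topPerBucket(posts, boundaries, 2);
console.log(winners.map((b) => b.posts.map((p) => p.title)));
// prints [ [ 'b', 'a' ], [ 'd' ] ]
```

Post "e" falls outside all boundaries, which corresponds to the "Other" bucket that the pipeline filters out with $match.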

mongodb aggregation query for field value length's sum

Say I have the following documents:
{name: 'A', fav_fruits: ['apple', 'mango', 'orange'], 'type':'test'}
{name: 'B', fav_fruits: ['apple', 'orange'], 'type':'test'}
{name: 'C', fav_fruits: ['cherry'], 'type':'test'}
I am trying to find the total count of fav_fruits entries across all documents returned by:
cursor = db.collection.find({'type': 'test'})
I am expecting output like:
cursor.count() = 3 // this is what I get now
I don't know much about aggregation. Can the MongoDB aggregation framework help me achieve either of these:
1. sum up the lengths of all 'fav_fruits' field: 6
and/or
2. unique 'fav_fruit' field values = ['apple', 'mango', 'orange', 'cherry']
You need to $project your documents after the $match stage and use the $size operator, which returns the number of items in each array. Then, in the $group stage, use the $sum accumulator operator to return the total count.
db.collection.aggregate([
{ "$match": { "type": "test" } },
{ "$project": { "count": { "$size": "$fav_fruits" } } },
{ "$group": { "_id": null, "total": { "$sum": "$count" } } }
])
Which returns:
{ "_id" : null, "total" : 6 }
To get the unique fav_fruits, simply use .distinct():
> db.collection.distinct("fav_fruits", { "type": "test" } )
[ "apple", "mango", "orange", "cherry" ]
Do this to get just the number of fruits in the fav_fruits arrays:
db.fruits.aggregate([
{ $match: { type: 'test' } },
{ $unwind: "$fav_fruits" },
{ $group: { _id: "$type", count: { $sum: 1 } } }
]);
This returns the total number of fruits.
If you also want the array of unique fav_fruits along with the total number of elements across the fav_fruits fields, do this:
db.fruits.aggregate([
{ $match: { type: 'test' } },
{ $unwind: "$fav_fruits" },
{ $group: { _id: "$type", count: { $sum: 1 }, fav_fruits: { $addToSet: "$fav_fruits" } } }
])
You can try this; it may be helpful. Note that it counts the matching documents, not the elements of fav_fruits:
db.collection.aggregate([{ $match : { type: "test" } }, {$group : { _id : null, count:{$sum:1} } }])
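To confirm the two numbers the question asks for, the same computation can be written in plain JavaScript over the sample documents. This is only a reference calculation, not a replacement for the aggregation; the variable names are made up for the sketch.

```javascript
// Sample documents from the question.
const docs = [
  { name: "A", fav_fruits: ["apple", "mango", "orange"], type: "test" },
  { name: "B", fav_fruits: ["apple", "orange"], type: "test" },
  { name: "C", fav_fruits: ["cherry"], type: "test" },
];

// Equivalent of { $match: { type: "test" } }.
const matching = docs.filter((d) => d.type === "test");

// Equivalent of summing $size over fav_fruits.
const total = matching.reduce((sum, d) => sum + d.fav_fruits.length, 0);

// Equivalent of distinct("fav_fruits"); a Set keeps first-seen order.
const unique = [...new Set(matching.flatMap((d) => d.fav_fruits))];

console.log(matching.length, total, unique);
// prints 3 6 [ 'apple', 'mango', 'orange', 'cherry' ]
```

The 3 is the document count that cursor.count() and the last pipeline return, while 6 is the summed array length produced by the $size and $unwind variants.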