MongoDB aggregation framework - generate _id from function

Is it possible to have a custom function in the _id field in $group? I couldn't make it work although the documentation seems to indicate that the field can be computed.
For example, let's say I have a set of documents having a number field that ranges 1 to 100. I want to classify the number into several buckets e.g. 1-20, 21-40, etc. Then, I will sum/avg a different field with this bucket identifier. So I am trying to do this:
$group : { _id : bucket("$numberfield") , sum: { $sum: "$otherfield" } }
...where bucket is a function that returns a string e.g. "1-20".
That didn't work.
http://docs.mongodb.org/manual/reference/operator/aggregation/group/#pipe._S_group
For this _id field, you can specify various expressions, including a single field from the documents in the pipeline, a computed value from a previous stage, a document that consists of multiple fields, and other valid expressions, such as constant or subdocument fields.

As of MongoDB 2.4, you cannot implement custom functions in the Aggregation Framework. If you want to $group by one or more computed fields, you need to add them either through aggregation operators and expressions, or via an explicit update() if you don't want to calculate them each time.
Using the Aggregation Framework, you can add a computed bucket field in a $project pipeline step with the $cond operator.
Here is an example of calculating ranges based on numberfield that can then be used in a $group pipeline for sum/avg/etc.:
db.data.aggregate([
    { $project: {
        numberfield: 1,
        someotherfield: 1,
        bucket: {
            $cond: [ {$and: [ {$gte: ["$numberfield", 1]}, {$lte: ["$numberfield", 20]} ] }, '1-20', {
            $cond: [ {$lt: ["$numberfield", 41]}, '21-40', {
            $cond: [ {$lt: ["$numberfield", 61]}, '41-60', {
            $cond: [ {$lt: ["$numberfield", 81]}, '61-80', {
            $cond: [ {$lt: ["$numberfield", 101]}, '81-100', '100+' ]
            }]}]}]}]
        }
    }},
    { $group: {
        _id: "$bucket",
        sum: { $sum: "$someotherfield" }
    }}
])
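Since the question asks for something like a bucket() function, here is a minimal sketch of a client-side helper that generates the nested $cond expression instead of writing it by hand. The helper name bucketExpr and its parameters are hypothetical, and it assumes numberfield is always a number of at least 1 (smaller values would land in the first bucket).

```javascript
// Hypothetical helper: builds the nested $cond expression for fixed-width buckets.
function bucketExpr(field, width, max) {
  // Start from the fallback label and wrap it in one $cond per bucket,
  // working from the highest bucket down to the lowest.
  let expr = `${max}+`;
  for (let upper = max; upper >= width; upper -= width) {
    const lower = upper - width + 1;
    expr = { $cond: [{ $lte: [field, upper] }, `${lower}-${upper}`, expr] };
  }
  return expr;
}

// The pipeline is plain data, so it can be passed to db.data.aggregate(pipeline).
const pipeline = [
  { $project: { someotherfield: 1, bucket: bucketExpr("$numberfield", 20, 100) } },
  { $group: { _id: "$bucket", sum: { $sum: "$someotherfield" } } }
];
```

The function runs in the client, not in the server: it only saves typing, and the server still evaluates ordinary $cond operators.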

Related

Project nested array element to top level using MongoDB aggregation pipeline

I have a groups collection with documents of the form
{
"_id": "g123",
...,
"invites": [
{
"senderAccountId": "a456",
"recipientAccountId": "a789"
},
...
]
}
I want to be able to list all the invites received by a user.
I thought of using an aggregation pipeline on the groups collection that filters the groups to return only those to which the user has been invited.
db.groups.aggregate([
{
$match: {
"invites.recipientAccountId": "<user-id>"
}
}
])
Lastly I want to project this array of groups to end up with an array of the form
[
{
"senderAccountId": "a...",
"recipientAccountId": "<user-id>",
"groupId": "g...", // Equal to "_id" field of document.
},
...
]
But I'm missing the "project" step in my aggregation pipeline to bring to the top-level the nested senderAccountId and recipientAccountId fields. I have seen examples online of projections in MongoDB queries and aggregation pipelines but I couldn't find examples for projecting the previously matched element of an array field of a document to the top-level.
I've thought of using Array Update Operators to reference the matched element but couldn't get any meaningful progress using this method.
There are multiple ways to do this; a combination of $unwind and $project would work as well. $unwind creates one document for each element of the array, and $project lets you choose how to structure the result from the current fields.
db.collection.aggregate([
{
"$unwind": "$invites"
},
{
"$match": {
"invites.recipientAccountId": "a789"
}
},
{
"$project": {
recipientAccountId: "$invites.recipientAccountId",
senderAccountId: "$invites.senderAccountId",
groupId: "$_id",
_id: 0 // don't show _id key:value
}
}
])
You can also use nimrod serok's $replaceRoot stage in place of the $project stage:
{$replaceRoot: {newRoot: {$mergeObjects: ["$invites", {group: "$_id"}]}}}
nimrod serok's solution might be a bit better because mine unwinds the array first and only then matches, but I believe mine is more readable.
I think what you want is $replaceRoot:
db.collection.aggregate([
{$match: {"invites.recipientAccountId": "a789"}},
{$set: {
invites: {$first: {
$filter: {
input: "$invites",
cond: {$eq: ["$$this.recipientAccountId", "a789"]}
}
}}
}},
{$replaceRoot: {newRoot: {$mergeObjects: ["$invites", {group: "$_id"}]}}}
])
See how it works on the playground example
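To make the reshaping concrete, here is a rough in-memory emulation of the $unwind / $match / $project pipeline in plain JavaScript. The sample data is modelled on the question, the second group is made up for illustration, and invitesFor is a hypothetical name; this is only a sketch of the logic, not how the server executes it.

```javascript
// Sample data modelled on the question; "g124" is invented for illustration.
const groups = [
  { _id: "g123", invites: [
    { senderAccountId: "a456", recipientAccountId: "a789" },
    { senderAccountId: "a456", recipientAccountId: "a000" }
  ] },
  { _id: "g124", invites: [
    { senderAccountId: "a111", recipientAccountId: "a789" }
  ] }
];

// Hypothetical helper mirroring the pipeline stages.
function invitesFor(allGroups, userId) {
  return allGroups.flatMap(g =>
    (g.invites || [])                                    // $unwind: one item per invite
      .filter(inv => inv.recipientAccountId === userId)  // $match on the unwound invite
      .map(inv => ({ ...inv, groupId: g._id }))          // $project: lift fields, add groupId
  );
}

console.log(invitesFor(groups, "a789"));
```

Note that the $filter plus $first variant keeps only the first matching invite per group, whereas the $unwind variant (and this emulation) returns every matching invite.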

Removing items from a nested document array while also accounting for a null/empty array

I'm new to MongoDB and I've been working on this query for quite some time. I've found solutions using "$project", "$group" and "$match". The overall goal: if a document within the nested array has its "internal" attribute set to false, remove it from the array.
$project and $group DO work, BUT they then throw off the projection. I don't even see a current projection in this query, but once I add $project or $group it ONLY returns the specific nested document array I'm working with.
$match won't work because the attribute I'm using to remove items from the nested array can be true or false, or the array can be empty, and in those cases $match just doesn't return the main document.
Here's an example using $group:
{ '$unwind': '$notes' },
{
    $group: {
        _id: "$_id",
        notes: {
            $push: {
                $cond: {
                    if: { $eq: [ "$notes.internal", false ] },
                    then: "$$REMOVE",
                    else: "$notes.internal"
                }
            }
        }
    }
}
You may be able to use $addFields with $filter:
{$addFields: {
notes: {$filter: {
input: "$notes",
as: "item",
cond: {$ne: [ "$$item.internal", false ]}
}}
}}
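Because the question also mentions missing or empty arrays, here is a hedged variant of the stage above that wraps the input in $ifNull, so a missing or null notes field is treated as an empty array. The variable and function names are hypothetical, and the small in-memory function is only there to illustrate the intended result.

```javascript
// Pipeline stage: $ifNull swaps a missing or null "notes" for [] before filtering.
const removeInternalFalse = {
  $addFields: {
    notes: {
      $filter: {
        input: { $ifNull: ["$notes", []] },
        as: "item",
        cond: { $ne: ["$$item.internal", false] }
      }
    }
  }
};

// In-memory equivalent of the stage, for illustration only.
function applyStage(doc) {
  const notes = doc.notes == null ? [] : doc.notes;
  // Keep every note whose "internal" is not exactly false (true or absent).
  return { ...doc, notes: notes.filter(item => item.internal !== false) };
}

console.log(applyStage({ _id: 1, notes: [{ text: "a", internal: false }, { text: "b", internal: true }] }));
console.log(applyStage({ _id: 2 }));
```

Unlike $unwind followed by $group, $addFields leaves all other fields of the document in place, which is what the question was missing.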

MongoDB map filtered array inside another array with aggregate $project

I am using Azure Cosmos DB's API for MongoDB with PyMongo. My goal is to filter an array nested inside another array and return only the filtered results. The aggregation query works for the outer array, but still returns the full inner array after the $map and $filter operations. Please find a reproducible example in Mongo Playground: https://mongoplayground.net/p/zS8A7zDMrmK
The current query uses $project to filter and return results by the selected options, but it still returns every object in Discount_Price, although the query has an additional filter to check for a specific Sales_Week value.
Let me know in the comments whether my question is clear. Many thanks for any help and suggestions.
It seems you are having trouble filtering the nested array.
options:{
$filter: {
input: {
$map: {
input: "$Sales_Options",
as: 's',
in: {
City: "$$s.City",
Country: "$$s.Country",
Discount_Price: {
$filter: {
input: "$$s.Discount_Price",
as: "d",
cond: {
$in: ["$$d.Sales_Week", [2, 7]]
}
}
}
}
}
},
as: 'pair',
cond: {
$and: [{
$in: [
'$$pair.Country',
[
'UK'
]
]
},
{
$in: [
'$$pair.City',
[
'London'
]
]
}
]
}
}
}
Working Mongo playground. If you need price1, you can use $project in the next stage.
Note: if you carry a field over from the previous stage in a projection, using 1 or 0 to include or exclude it is good practice.
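The order of operations in the expression above (first $map to trim each inner array, then $filter on the outer array) can be sketched in plain JavaScript as follows. The sample document, including the Price field, is invented for illustration, and filterOptions is a hypothetical name.

```javascript
// Invented sample document using the field names from the question.
const doc = {
  Store: "AB",
  Sales_Options: [
    { Country: "UK", City: "London", Discount_Price: [
      { Sales_Week: 2, Price: 10 },
      { Sales_Week: 3, Price: 9 }
    ] },
    { Country: "FR", City: "Paris", Discount_Price: [
      { Sales_Week: 2, Price: 12 }
    ] }
  ]
};

// Hypothetical helper mirroring the $map + $filter expression.
function filterOptions(d, countries, cities, weeks) {
  return d.Sales_Options
    // $map: rebuild each option with its inner array already filtered
    .map(s => ({
      City: s.City,
      Country: s.Country,
      Discount_Price: s.Discount_Price.filter(p => weeks.includes(p.Sales_Week))
    }))
    // outer $filter: keep only the options for the requested country and city
    .filter(pair => countries.includes(pair.Country) && cities.includes(pair.City));
}

console.log(JSON.stringify(filterOptions(doc, ["UK"], ["London"], [2, 7])));
```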
I'd steer you towards the $unwind operator and everything becomes a lot simpler:
db.collection.aggregate([
{$match: {"Store": "AB"}},
{$unwind: "$Sales_Options"},
{$unwind: "$Sales_Options.Discount_Price"},
{$match: {"Sales_Options.Country": {$in: [ "UK" ]},
"Sales_Options.City": {$in: [ "London" ]},
"Sales_Options.Discount_Price.Sales_Week": {$in: [ 2, 7 ]}
}
}
])
Now just $project the fields as appropriate for your output.

MongoDB projections and fields subset

I would like to use mongo projections in order to return less data to my application. I would like to know if it's possible.
Example:
user: {
id: 123,
some_list: [{x:1, y:2}, {x:3, y:4}],
other_list: [{x:5, y:2}, {x:3, y:4}]
}
Given a query for user_id = 123 and some 'projection filter' like user.some_list.x = 1 and user.other_list.x = 1 is it possible to achieve the given result?
user: {
id: 123,
some_list: [{x:1, y:2}],
other_list: []
}
The idea is to make mongo work a little more and return less data to the application. In some cases, we are discarding 80% of the elements of the collections on the application's side. So it would be better not to return them at all.
Questions:
Is it possible?
How can I achieve this. $elemMatch doesn't seem to help me. I'm trying something with unwind, but not getting there
If it's possible, can this projection filtering benefit from an index on user.some_list.x, for example? Or not at all, once the user has already been found by its id?
Thank you.
What you can do in MongoDB v3.0 is this:
db.collection.aggregate({
$match: {
"user.id": 123
}
}, {
$redact: {
$cond: {
if: {
$or: [ // those are the conditions for when to include a (sub-)document
"$user", // if it contains a "user" field (as is the case when we're on the top level)
"$some_list", // if it contains a "some_list" field (would be the case for the "user" sub-document)
"$other_list", // the same here for the "other_list" field
{ $eq: [ "$x", 1 ] } // and lastly, when we're looking at the innermost sub-documents, we only want to include items where "x" is equal to 1
]
},
then: "$$DESCEND", // descend into sub-document
else: "$$PRUNE" // drop sub-document
}
}
})
Depending on your data, you could simplify this query a little by saying: include everything that either has no "x" field or has "x" equal to 1, like so:
$redact: {
$cond: {
if: {
$eq: [ { "$ifNull": [ "$x", 1 ] }, 1 ] // we only want to include items where "x" is equal to 1 or where "x" does not exist
},
then: "$$DESCEND", // descend into sub-document
else: "$$PRUNE" // drop sub-document
}
}
The index you suggested won't do anything for the $redact stage. You can benefit from it, however, if you change the $match stage at the start to get rid of all documents which don't match anyway like so:
$match: {
"user.id": 123,
"user.some_list.x": 1 // this will use your index
}
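To see how $$DESCEND and $$PRUNE walk the document level by level, here is a rough plain-JavaScript emulation of the first $redact condition. It is a simplified sketch: the function names are hypothetical, it treats "the field exists" as the truthiness test, and it ignores special value types such as dates.

```javascript
function isPlainObject(v) {
  return v !== null && typeof v === "object" && !Array.isArray(v);
}

// Hypothetical emulation of $redact with $$DESCEND / $$PRUNE.
function redact(doc, shouldDescend) {
  if (!shouldDescend(doc)) return undefined;            // $$PRUNE: drop this (sub-)document
  const out = {};                                       // $$DESCEND: keep this level ...
  for (const [key, value] of Object.entries(doc)) {
    if (Array.isArray(value)) {
      // ... and re-evaluate every embedded document inside arrays
      out[key] = value
        .map(v => (isPlainObject(v) ? redact(v, shouldDescend) : v))
        .filter(v => v !== undefined);
    } else if (isPlainObject(value)) {
      const sub = redact(value, shouldDescend);
      if (sub !== undefined) out[key] = sub;
    } else {
      out[key] = value;                                 // scalar fields are kept as they are
    }
  }
  return out;
}

// Mirrors the $or condition of the first pipeline.
const condition = d =>
  "user" in d || "some_list" in d || "other_list" in d || d.x === 1;

const sample = { user: {
  id: 123,
  some_list: [{ x: 1, y: 2 }, { x: 3, y: 4 }],
  other_list: [{ x: 5, y: 2 }, { x: 3, y: 4 }]
} };

console.log(JSON.stringify(redact(sample, condition)));
```

With the sample document from the question, some_list keeps only the element with x equal to 1 and other_list ends up empty.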
Very possible.
With findOne, the query is the first argument and the projection is the second. In Node/JavaScript (similar to the mongo shell):
db.collection('users').findOne( {
id: 123
}, {
other_list: 0 // recent Node.js drivers expect { projection: { other_list: 0 } } here
} )
This will return the whole object without the other_list field. Alternatively, you could specify { some_list: 1 } as the projection, and ONLY _id and some_list will be returned.
$filter is your friend here. Below produces the output you seek. Experiment with changing the $eq fields and target values to see more or less items in the array get picked up. Note how we $project the new fields (some_list and other_list) "on top of" the old ones, essentially replacing them with the filtered versions.
db.foo.aggregate([
{$match: {"user.id": 123}}
,{$project: { "user.some_list": { $filter: {
input: "$user.some_list",
as: "z",
cond: {$eq: [ "$$z.x", 1 ]}
}},
"user.other_list": { $filter: {
input: "$user.other_list",
as: "z",
cond: {$eq: [ "$$z.x", 1 ]}
}}
}}
]);

Mongodb query specific month|year not date

How can I query for a specific month in MongoDB, not a date range? I need the month to build a list of customer birthdays for the current month.
In SQL it would be something like this:
SELECT * FROM customer WHERE MONTH(bday)='09'
Now I need to translate that to MongoDB.
Note: My dates are already saved as the MongoDate type. I chose it thinking it would be easy to work with, but now I can't easily find how to do this simple thing.
With MongoDB 3.6 and newer, you can use the $expr operator in your find() query. This allows you to build query expressions that compare fields from the same document in a $match stage.
db.customer.find({ "$expr": { "$eq": [{ "$month": "$bday" }, 9] } })
For older MongoDB versions, consider running an aggregation pipeline that uses the $redact operator. It lets you do in a single stage what would otherwise take a $project (to create a field holding the month of the date field) followed by a $match (to keep only the documents whose month is September).
In the pipeline below, $redact uses the $cond ternary operator to provide the conditional expression that decides the redaction. The logical expression in $cond checks whether the month extracted by the date operator equals the given value; if it does, $redact keeps the document using the $$KEEP system variable, otherwise it discards it using $$PRUNE.
Running the following pipeline should give you the desired result:
db.customer.aggregate([
{ "$match": { "bday": { "$exists": true } } },
{
"$redact": {
"$cond": [
{ "$eq": [{ "$month": "$bday" }, 9] },
"$$KEEP",
"$$PRUNE"
]
}
}
])
This is similar to a $project + $match combo, but with that approach you'd need to list all the other fields that should pass through the pipeline:
db.customer.aggregate([
{ "$match": { "bday": { "$exists": true } } },
{
"$project": {
"month": { "$month": "$bday" },
"bday": 1,
"field1": 1,
"field2": 1,
.....
}
},
{ "$match": { "month": 9 } }
])
Another alternative, albeit a slow one, is to use the find() method with $where. Note that JavaScript's getMonth() is zero-based, so 8 means September:
db.customer.find({ "$where": "this.bday.getMonth() === 8" })
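Since the question asks for the current month rather than a hard-coded 9, here is a small sketch of building the $expr filter on the client. The helper name birthdayMonthFilter is hypothetical, and it assumes a UTC-based month is acceptable, which is how $month behaves when no timezone is given.

```javascript
// Hypothetical helper: builds a find() filter for birthdays in the month of `now`.
function birthdayMonthFilter(now = new Date()) {
  // getUTCMonth() is zero-based (0 = January); $month is one-based (1 = January).
  const month = now.getUTCMonth() + 1;
  return { $expr: { $eq: [{ $month: "$bday" }, month] } };
}

// Could then be used as: db.customer.find(birthdayMonthFilter())
console.log(JSON.stringify(birthdayMonthFilter(new Date(Date.UTC(2023, 8, 10)))));
```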
You can do that using aggregate with the $month projection operator:
db.customer.aggregate([
{$project: {name: 1, month: {$month: '$bday'}}},
{$match: {month: 9}}
]);
First, check whether the field is stored as an ISODate.
If not, you can convert it as in the following example.
db.collectionName.find().forEach(function (doc) {
    doc.your_date_field = new ISODate(doc.your_date_field);
    db.collectionName.save(doc);
})
Now you can find it in two ways
db.collectionName.find({ $expr: {
$eq: [{ $year: "$your_date_field" }, 2017]
}});
Or by aggregation
db.collectionName.aggregate([
    {$project: {
        field1_you_need_in_result: 1,
        field12_you_need_in_result: 1,
        your_year_variable: {$year: '$your_date_field'},
        your_month_variable: {$month: '$your_date_field'}
    }},
    {$match: {your_year_variable: 2017, your_month_variable: 3}}
]);
Yes, you can filter on the month and year of a date like this:
db.collection.find({
$expr: {
$and: [
{
"$eq": [
{
"$month": "$date"
},
3
]
},
{
"$eq": [
{
"$year": "$date"
},
2020
]
}
]
}
})
If you're concerned about efficiency, you may want to store the month data in a separate field within each document.
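A minimal sketch of that idea, assuming MongoDB 4.2 or newer (for pipeline-style updates) and a hypothetical field name bdayMonth. The objects below are plain data; the shell calls that would use them are shown in comments.

```javascript
// Pipeline-style update that copies the month of "bday" into "bdayMonth" (assumed name).
const addMonthField = [{ $set: { bdayMonth: { $month: "$bday" } } }];

// Only touch documents whose "bday" is actually a date, since $month fails on other types.
const onlyDates = { bday: { $type: "date" } };

// Index spec and query on the precomputed field.
const monthIndex = { bdayMonth: 1 };
const septemberBirthdays = { bdayMonth: 9 };

// In the mongo shell these would be used roughly as:
//   db.customer.updateMany(onlyDates, addMonthField)
//   db.customer.createIndex(monthIndex)
//   db.customer.find(septemberBirthdays)
```

The precomputed field has to be kept in sync whenever bday is written, which is the price of being able to use an ordinary index.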