Optional break in aggregation pipeline - mongodb

I have the following pipeline:
Match a single document from collection collection1.
Take key k_id, which is an ObjectId, and look up another single document in collection2 with that _id.
$unwind the result.
The problem is that I am not sure k_id exists, or that collection2 contains the document I am looking up. In that case I'd like to do some checks and break out of the aggregation pipeline after the first step. I haven't found any operator with similar functionality.
For now I have this complicated mess:
db['collection1'].aggregate([
{$match: ...},
// make field that always exist and put either $k_id or null there
{$addFields: {
k_id_ensured: {$ifNull: ['$k_id', null]}
}},
// $document_unensured may be empty array in case no documents found
{$lookup: {
from: 'collection2',
...,
as: 'document_unensured'
}},
// replace empty array with [null], ...
{$addFields: {
document_ensured: {
$cond: {
// (if $document_unensured is empty array...)
if: {$eq: [ {$size: '$document_unensured'}, 0 ]},
// (...then make it contain at least `null`)
then: [ null ],
else: '$document_unensured'
}
}
}},
// ...because unwinding an empty array makes the whole
// document disappear
{$unwind: {path: '$document_ensured'}},
{$project: /*delete all non needed fields*/}
])
Is there a more elegant way to do it?
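One possible simplification, sketched here under assumptions rather than taken from the thread: $unwind accepts a preserveNullAndEmptyArrays option (MongoDB 3.2 and later) that keeps the input document when the array is empty, missing, or null. The helper name buildOptionalLookupPipeline and its matchFilter parameter are hypothetical, the $match contents stay elided as in the question, and the localField/foreignField pairing is assumed from the description (k_id looked up against _id).

```javascript
// Hypothetical helper: `matchFilter` stands in for the elided $match body.
function buildOptionalLookupPipeline(matchFilter) {
  return [
    { $match: matchFilter },
    // If k_id is missing or nothing matches, `document` is an empty array.
    {
      $lookup: {
        from: 'collection2',
        localField: 'k_id',   // assumed from the question's description
        foreignField: '_id',
        as: 'document',
      },
    },
    // Keep the source document even when `document` is empty.
    { $unwind: { path: '$document', preserveNullAndEmptyArrays: true } },
  ];
}
```

With this shape, a document whose lookup found nothing should pass through without a document field, so a later stage can test for its presence instead of padding the array with null.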

Related

Project nested array element to top level using MongoDB aggregation pipeline

I have a groups collection with documents of the form
{
"_id": "g123",
...,
"invites": [
{
"senderAccountId": "a456",
"recipientAccountId": "a789"
},
...
]
}
I want to be able to list all the invites received by a user.
I thought of using an aggregation pipeline on the groups collection that filters the groups to return only those to which the user has been invited.
db.groups.aggregate([
{
$match: {
"invites.recipientAccountId": "<user-id>"
}
}
])
Lastly I want to project this array of groups to end up with an array of the form
[
{
"senderAccountId": "a...",
"recipientAccountId": "<user-id>",
"groupId": "g...", // Equal to "_id" field of document.
},
...
]
But I'm missing the "project" step in my aggregation pipeline to bring to the top-level the nested senderAccountId and recipientAccountId fields. I have seen examples online of projections in MongoDB queries and aggregation pipelines but I couldn't find examples for projecting the previously matched element of an array field of a document to the top-level.
I've thought of using Array Update Operators to reference the matched element but couldn't get any meaningful progress using this method.
There are multiple ways to do this. A combination of $unwind and $project would work as well: $unwind creates one document per array element, and $project lets you choose how to structure the result from the current fields.
db.collection.aggregate([
{
"$unwind": "$invites"
},
{
"$match": {
"invites.recipientAccountId": "a789"
}
},
{
"$project": {
recipientAccountId: "$invites.recipientAccountId",
senderAccountId: "$invites.senderAccountId",
groupId: "$_id",
_id: 0 // don't show _id key:value
}
}
])
You can also use nimrod serok's $replaceRoot stage in place of the $project stage:
{$replaceRoot: {newRoot: {$mergeObjects: ["$invites", {group: "$_id"}]}}}
nimrod serok's solution might be a bit better, because mine unwinds the array first and only then matches, but I believe mine is more readable.
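A possible middle ground, offered as a sketch rather than as either answerer's method: run the $match once before $unwind so only groups containing an invite for the user are unwound, then match again on the unwound elements. The function name buildInvitesPipeline and its userId parameter are hypothetical.

```javascript
// Hypothetical helper that builds the pipeline for a given recipient id.
function buildInvitesPipeline(userId) {
  return [
    // Narrow to groups that contain at least one invite for the user.
    { $match: { 'invites.recipientAccountId': userId } },
    // One document per invite.
    { $unwind: '$invites' },
    // Drop invites in those groups that were addressed to someone else.
    { $match: { 'invites.recipientAccountId': userId } },
    {
      $project: {
        _id: 0,
        groupId: '$_id',
        senderAccountId: '$invites.senderAccountId',
        recipientAccountId: '$invites.recipientAccountId',
      },
    },
  ];
}

// Usage in the shell, assuming the collection is named `groups`:
// db.groups.aggregate(buildInvitesPipeline('a789'))
```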
I think what you want is $replaceRoot:
db.collection.aggregate([
{$match: {"invites.recipientAccountId": "a789"}},
{$set: {
invites: {$first: {
$filter: {
input: "$invites",
cond: {$eq: ["$$this.recipientAccountId", "a789"]}
}
}}
}},
{$replaceRoot: {newRoot: {$mergeObjects: ["$invites", {group: "$_id"}]}}}
])

Removing item out of nested document array and while also accounting for null/empty document array

I'm new to MongoDB and I've been working on this query for quite some time. I've found solutions using $project, $group and $match. The overall goal: if a document within the nested array has its "internal" attribute set to false, remove it from the array.
$project and $group DO work, BUT they then throw off the projection. I don't even see a current projection in this query, but once I add $project or $group it ONLY returns the specific nested document array I'm working with.
$match won't work because the attribute I'm using to remove items from the nested array may be true or false, or the array may be empty, and in some of those cases $match just doesn't return the main document.
Here's an example using $group:
{ '$unwind': '$notes' },
{
$group: {
_id: "$_id",
notes: {
$push: {
$cond: {
if: { $eq: [ "$notes.internal", false ] },
then: "$$REMOVE",
else: "$notes.internal"
}
}
}
}
}
You may be able to use $addFields with $filter:
{$addFields: {
notes: {$filter: {
input: "$notes",
as: "item",
cond: {$ne: [ "$$item.internal", false ]}
}}
}}
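Since the question also mentions null or empty arrays, here is a hedged variant of that stage. $filter returns null when its input is null or missing, so wrapping the input in $ifNull should make such documents end up with an empty notes array instead. The function name buildNotesFilterStage is hypothetical.

```javascript
// Hypothetical helper returning an $addFields stage that drops notes
// whose `internal` flag is false and tolerates a missing/null `notes`.
function buildNotesFilterStage() {
  return {
    $addFields: {
      notes: {
        $filter: {
          // Fall back to an empty array when `notes` is null or missing.
          input: { $ifNull: ['$notes', []] },
          as: 'item',
          // Keep every note whose `internal` is not exactly false.
          cond: { $ne: ['$$item.internal', false] },
        },
      },
    },
  };
}
```

Because $addFields only overwrites notes, the other fields of the main document stay in the output, which is the behaviour the $group attempt lost.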

Find documents based on a property values progressively in MongoDB

Assume I have a collection with many documents, each of which has a property called "status". Status accepts any Int value. I want to find all documents whose status is 1. If there are none, find all documents whose status is 2, and so on. Is there a way to do this in a single query?
That is possible.
If you create an index on {status:1}, you can run a query with limit:1 and sort:{status:1}, and project to return only the status field (excluding _id). This would be a covered query that is quite efficient and should only examine a single index key.
Then use that value to query for documents with a matching status. This query would also use the index to minimize the number of documents examined.
The difference between doing this in 2 queries vs 1 is likely small.
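The two-query approach described above could look roughly like this. It is a sketch written for the mongo shell, where cursor methods return synchronously; with the Node driver the calls would need await. The function name findLowestStatusDocs is hypothetical, and it assumes every document has a status field.

```javascript
// Hypothetical helper; `coll` is a shell collection such as db.collection.
function findLowestStatusDocs(coll) {
  // Query 1: smallest status value, returning only the status field.
  const first = coll
    .find({}, { _id: 0, status: 1 })
    .sort({ status: 1 })
    .limit(1)
    .toArray()[0];

  // Empty collection: nothing to return.
  if (first === undefined) {
    return [];
  }

  // Query 2: every document that shares that status.
  return coll.find({ status: first.status }).toArray();
}

// Usage in the shell:
// findLowestStatusDocs(db.collection)
```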
You could perform both in an aggregation:
db.collection.aggregate([
{$sort: {status: 1}},
{$limit: 1},
{$project: {
_id: 0,
status: 1
}},
{$lookup: {
as: "matched",
from: "collection",
let: {target: "$status"},
pipeline: [
{$match: {
$expr: {
$eq: [
"$status",
"$$target"
]
}
}}
]
}},
{$unwind: "$matched"},
{$replaceRoot: {newRoot: "$matched"}}
])
It is not completely clear whether the $lookup part will be able to use the index, so you should test to see if that actually performs better than running 2 queries from the client.

MongoDB projections and fields subset

I would like to use mongo projections in order to return less data to my application. I would like to know if it's possible.
Example:
user: {
id: 123,
some_list: [{x:1, y:2}, {x:3, y:4}],
other_list: [{x:5, y:2}, {x:3, y:4}]
}
Given a query for user_id = 123 and some 'projection filter' like user.some_list.x = 1 and user.other_list.x = 1 is it possible to achieve the given result?
user: {
id: 123,
some_list: [{x:1, y:2}],
other_list: []
}
The idea is to make mongo work a little more and return less data to the application. In some cases, we are discarding 80% of the elements of the collections on the application's side. So it would be better not to return them at all.
Questions:
Is it possible?
How can I achieve this? $elemMatch doesn't seem to help me. I'm trying something with $unwind, but not getting there.
If it's possible, can this projection filtering benefit from an index on user.some_list.x, for example? Or not at all, once the user was already found by its id?
Thank you.
What you can do in MongoDB v3.0 is this:
db.collection.aggregate({
$match: {
"user.id": 123
}
}, {
$redact: {
$cond: {
if: {
$or: [ // those are the conditions for when to include a (sub-)document
"$user", // if it contains a "user" field (as is the case when we're on the top level)
"$some_list", // if it contains a "some_list" field (would be the case for the "user" sub-document)
"$other_list", // the same here for the "other_list" field
{ $eq: [ "$x", 1 ] } // and lastly, when we're looking at the innermost sub-documents, we only want to include items where "x" is equal to 1
]
},
then: "$$DESCEND", // descend into sub-document
else: "$$PRUNE" // drop sub-document
}
}
})
Depending on your data setup, you could also simplify this query a little by saying: include everything that either has no "x" field or has "x" equal to 1, like so:
$redact: {
$cond: {
if: {
$eq: [ { "$ifNull": [ "$x", 1 ] }, 1 ] // we only want to include items where "x" is equal to 1 or where "x" does not exist
},
then: "$$DESCEND", // descend into sub-document
else: "$$PRUNE" // drop sub-document
}
}
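To make the $$DESCEND / $$PRUNE behaviour of the simplified condition easier to follow, here is a plain JavaScript imitation that runs without a server. It is a simplified model for illustration, not MongoDB's implementation: the names redact, keep and isPlainObject are hypothetical, and only plain objects and arrays of objects are handled.

```javascript
// True for embedded documents, false for arrays, null and scalars.
function isPlainObject(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v);
}

// Simplified model of $redact: prune the (sub-)document when `keep`
// rejects it, otherwise keep its fields and descend into embedded documents.
function redact(doc, keep) {
  if (!keep(doc)) {
    return undefined; // plays the role of $$PRUNE
  }
  const out = {}; // plays the role of $$DESCEND
  for (const [key, value] of Object.entries(doc)) {
    if (Array.isArray(value)) {
      out[key] = value
        .map(v => (isPlainObject(v) ? redact(v, keep) : v))
        .filter(v => v !== undefined);
    } else if (isPlainObject(value)) {
      const sub = redact(value, keep);
      if (sub !== undefined) out[key] = sub;
    } else {
      out[key] = value;
    }
  }
  return out;
}

// Same rule as the $ifNull condition: keep when "x" is absent or equals 1.
const keep = d => d.x == null || d.x === 1;

const input = {
  user: {
    id: 123,
    some_list: [{ x: 1, y: 2 }, { x: 3, y: 4 }],
    other_list: [{ x: 5, y: 2 }, { x: 3, y: 4 }],
  },
};

const result = redact(input, keep);
console.log(JSON.stringify(result));
// {"user":{"id":123,"some_list":[{"x":1,"y":2}],"other_list":[]}}
```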
The index you suggested won't do anything for the $redact stage. You can benefit from it, however, if you change the $match stage at the start to get rid of all documents which don't match anyway like so:
$match: {
"user.id": 123,
"user.some_list.x": 1 // this will use your index
}
Very possible.
With findOne, the query is the first argument and the projection is the second. In Node/JavaScript (similar to the mongo shell):
db.collection('users').findOne( {
id: 123
}, {
other_list: 0
} )
This will return the whole object without the other_list field. Or you could specify { some_list: 1 } as the projection, and ONLY the _id and some_list will be returned.
$filter is your friend here. Below produces the output you seek. Experiment with changing the $eq fields and target values to see more or less items in the array get picked up. Note how we $project the new fields (some_list and other_list) "on top of" the old ones, essentially replacing them with the filtered versions.
db.foo.aggregate([
{$match: {"user.id": 123}}
,{$project: { "user.some_list": { $filter: {
input: "$user.some_list",
as: "z",
cond: {$eq: [ "$$z.x", 1 ]}
}},
"user.other_list": { $filter: {
input: "$user.other_list",
as: "z",
cond: {$eq: [ "$$z.x", 1 ]}
}}
}}
]);

mongodb aggregation framework - generate _id from function

Is it possible to have a custom function in the _id field in $group? I couldn't make it work although the documentation seems to indicate that the field can be computed.
For example, let's say I have a set of documents having a number field that ranges 1 to 100. I want to classify the number into several buckets e.g. 1-20, 21-40, etc. Then, I will sum/avg a different field with this bucket identifier. So I am trying to do this:
$group : { _id : bucket("$numberfield") , sum: { $sum: "$otherfield" } }
...where bucket is a function that returns a string e.g. "1-20".
That didn't work.
http://docs.mongodb.org/manual/reference/operator/aggregation/group/#pipe._S_group
For this _id field, you can specify various expressions, including a single field from the documents in the pipeline, a computed value from a previous stage, a document that consists of multiple fields, and other valid expressions, such as constant or subdocument fields.
As at MongoDB 2.4, you cannot implement any custom functions in the Aggregation Framework. If you want to $group by one or more fields, you need to add those either through aggregation operators and expressions or via an explicit update() if you don't want to calculate each time.
Using the Aggregation Framework you can add a computed bucket field in a $project pipeline step with the $cond operator.
Here is an example of calculating ranges based on numberfield that can then be used in a $group pipeline step for sum/avg/etc.:
db.data.aggregate(
{ $project: {
numberfield: 1,
someotherfield: 1,
bucket: {
$cond: [ {$and: [ {$gte: ["$numberfield", 1]}, {$lte: ["$numberfield", 20]} ] }, '1-20', {
$cond: [ {$lt: ["$numberfield", 41]}, '21-40', {
$cond: [ {$lt: ["$numberfield", 61]}, '41-60', {
$cond: [ {$lt: ["$numberfield", 81]}, '61-80', {
$cond: [ {$lt: ["$numberfield", 101]}, '81-100', '100+' ]
}]}]}]}]
}
}},
{ $group: {
_id: "$bucket",
sum: { $sum: "$someotherfield" }
}}
)
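On newer servers (MongoDB 3.4 and later) the $bucket stage can express the same grouping without nested $cond. This is a sketch of that different technique, not part of the 2.4-era answer above. The function name buildBucketStage is hypothetical, and the 'other' label for out-of-range values is an assumption.

```javascript
// Hypothetical helper returning a $bucket stage for the ranges above.
function buildBucketStage() {
  return {
    $bucket: {
      groupBy: '$numberfield',
      // Lower bounds are inclusive, upper bounds exclusive:
      // [1, 21), [21, 41), [41, 61), [61, 81), [81, 101)
      boundaries: [1, 21, 41, 61, 81, 101],
      // Values outside every range (below 1, 101 and above, or missing).
      default: 'other',
      output: { sum: { $sum: '$someotherfield' } },
    },
  };
}

// Usage in the shell:
// db.data.aggregate([buildBucketStage()])
```

Each output _id is the lower bound of its bucket (1, 21, ...) rather than a string such as "1-20", so the labels would need a later $project if the string form matters.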