Pipeline function in MongoDB: how to make the join

I am looking at a query in MongoDB.
Essentially, I want to join records, but only when the records in collection mongo2 meet certain conditions (those in the $and statement).
I have 2 questions about this:
1. Where can I put the localField and foreignField settings? It says I cannot define them when using pipeline.
2. It says that my $gt and $lt statements are wrong. They work in single find statements, but I am getting the error:
Expression $gt takes exactly 2 arguments. 1 were passed in.
Any help will be massively appreciated :)
Thanks guys
db.mongo.aggregate([
  { $lookup: {
      from: "mongo2",
      pipeline: [
        { $match: {
            $expr: {
              $and: [ { Age: { $gt: 50 } }, { Age: { $lt: 100 } } ]
            }
        } }
      ],
      as: "filters"
  } }
])

The only way to access fields from the mongo collection inside the pipeline is to define them as variables using the let option. For instance:
db.mongo.aggregate([
{
$lookup: {
from: "mongo2",
let: { "mongo_collection_id": "$_id" },
pipeline: [
{
$match: { $expr: { $eq: [ "$$mongo_collection_id", "$_id" ] } }
}
],
as: "filters"
}
}
])
Please note that you need a double dollar sign ($$) to refer to that variable within the pipeline. A single dollar sign references fields of the mongo2 collection's documents.
Answering the second question: there are two forms of $gt and $lt in MongoDB (which can be confusing), the query operators and the aggregation operators. Since you have to use $expr here, the only way is to use $gt (aggregation), so the syntax is a bit different:
{ $expr:
  { $and: [ { $gt: [ "$Age", 50 ] }, { $lt: [ "$Age", 100 ] } ] }
}
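Putting both parts together (the let/$$ binding and the aggregation-form comparison operators), the corrected stage could be sketched as below; note that joining on _id is an assumption carried over from the earlier example:

```javascript
// Sketch of the corrected $lookup stage (assumes _id is the join key).
const lookupStage = {
  $lookup: {
    from: "mongo2",
    let: { mongo_collection_id: "$_id" },
    pipeline: [
      { $match: {
          $expr: {
            $and: [
              { $eq: ["$$mongo_collection_id", "$_id"] }, // join condition via the let variable
              { $gt: ["$Age", 50] },                      // aggregation-form $gt takes an array
              { $lt: ["$Age", 100] }                      // aggregation-form $lt
            ]
          }
      } }
    ],
    as: "filters"
  }
};
```

In mongosh this would then be run as db.mongo.aggregate([lookupStage]).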

Related

Using conditions for both collections (original and foreign) in lookup $match

I'm not sure if it is a real problem or just a lack of documentation.
You can put conditions on documents in the foreign collection in a lookup $match.
You can also put conditions on documents of the original collection in a lookup $match with $expr.
But when I want to use both of those features, it doesn't work. This is a sample lookup in the aggregation:
{ $lookup: {
    from: 'books',
    localField: 'itemId',
    foreignField: '_id',
    let: { "itemType": "$itemType" },
    pipeline: [
      { $match: { $expr: { $eq: ["$$itemType", "book"] } } }
    ],
    as: 'bookData'
} }
$expr puts a condition on the original documents. But what if I want to get only foreign documents with status: 'OK'? Something like:
{ $match: { status: "OK", $expr: { $eq: ["$$itemType", "book"] } }}
Does not work.
I tried to play with the situation you provided.
Try putting $expr as the first key of the $match object, and it should do the thing.
{ $lookup: {
    from: 'books',
    localField: 'itemId',
    foreignField: '_id',
    let: { "itemType": "$itemType" },
    pipeline: [
      { $match: { $expr: { $eq: ["$$itemType", "book"] }, status: 'OK' } }
    ],
    as: 'bookData'
} }
The currently accepted answer is "wrong" in the sense that it doesn't actually change anything. The ordering that the fields for the $match predicate are expressed in does not make a difference. I would demonstrate this with your specific situation, but there is an extra complication there which we will get to in a moment. In the meantime, consider the following document:
{
_id: 1,
status: "OK",
key: 123
}
This query:
db.collection.find({
status: "OK",
$expr: {
$eq: [
"$key",
123
]
}
})
And this query, which just has the order of the predicates reversed:
db.collection.find({
$expr: {
$eq: [
"$key",
123
]
},
status: "OK"
})
Will both find and return that document. A playground demonstration of the first can be found here and the second one is here.
Similarly, your original $match:
{ $match: { status: "OK", $expr: { $eq: ["$$itemType", "book"] } }}
Will behave the same as the one in the accepted answer:
{ $match: { $expr: { $eq: ["$$itemType", "book"] }, status: 'OK' }}
Said another way, there is no difference in behavior based on whether or not the $expr is used first. However, I suspect the overall aggregation is not expressing your desired logic. Let's explore that a little further. First, we need to address this:
$expr puts a condition on the original documents.
This is not really true. According to the documentation for $expr, that operator "allows the use of aggregation expressions within the query language."
A primary use of this functionality, and indeed the first one listed in the documentation, is to compare two fields from a single document. In the context of $lookup, this ability to refer to fields from the original documents allows you to compare their values against the collection that you are joining with. The documentation has some examples of that, such as here and other places on that page which refer to $expr.
With that in mind, let's come back to your aggregation. If I am understanding correctly, your intent with the { $expr: { $eq: ["$$itemType", "book"] } } predicate is to filter documents from the original collection. Is that right?
If so, then that is not what your aggregation is currently doing. You can see in this playground example that the $match nested inside of the $lookup pipeline does not affect the documents from the original collection. Instead, you should do that filtering via an initial $match on the base pipeline. So something like this:
db.orders.aggregate([
{
$match: {
$expr: {
$eq: [
"$itemType",
"book"
]
}
}
}
])
Or, more simply, this:
db.orders.aggregate([
{
$match: {
"itemType": "book"
}
}
])
Based on all of this, your final pipeline should probably look similar to the following:
db.orders.aggregate([
{
$match: {
"itemType": "book"
}
},
{
$lookup: {
from: "books",
localField: "itemId",
foreignField: "_id",
let: {
"itemType": "$itemType"
},
pipeline: [
{
$match: {
status: "OK"
}
}
],
as: "bookData"
}
}
])
Playground example here. This pipeline:
Filters the data in the original collection (orders) by their itemType. From the sample data, it removes the document with _id: 3 as it has a different itemType than the one we are looking for ("book").
It uses the localField/foreignField syntax to find data in books where the _id of the books document matches the itemId of the source document(s) in the orders collection.
It further uses the let/pipeline syntax to express the additional condition that the status of the books document is "OK". This is why books document with the status of "BAD" does not get pulled into the bookData for the orders document with _id: 2.
Documentation for the (combined) second and third parts is here.
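As a sanity check, the logic of that final pipeline can be emulated in plain JavaScript over in-memory arrays. The sample documents below are hypothetical, shaped to match the description above:

```javascript
// Hypothetical sample data shaped like the answer's description
const orders = [
  { _id: 1, itemId: 10, itemType: "book" },
  { _id: 2, itemId: 11, itemType: "book" },
  { _id: 3, itemId: 12, itemType: "dvd" }   // removed by the initial $match
];
const books = [
  { _id: 10, status: "OK" },
  { _id: 11, status: "BAD" }                // excluded by the $match inside the $lookup
];

const result = orders
  .filter(o => o.itemType === "book")       // the initial { $match: { itemType: "book" } }
  .map(o => ({
    ...o,
    bookData: books.filter(b =>
      b._id === o.itemId &&                 // localField/foreignField equality join
      b.status === "OK")                    // the { $match: { status: "OK" } } in the pipeline
  }));
```

As described above, the _id: 3 order is dropped entirely, and the _id: 2 order keeps an empty bookData because its only matching book has status "BAD".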

MongoDB $lookup with no common value to match

I am a novice to MongoDB and am trying to join two collections where there is no common value.
I have two collections.
Collection 1: Role
Fields: Role, UserName
Collection 2: mysite
Fields: userName, userEmail
In collection 1:
eg :
{
  'Role': "admin",
  'UserName': "abc.efg"
}
In collection 2:
eg:
{
  'userName': "abc Mr, efg",
  'userEmail': "abc.efg#company.com"
}
The value of the username differs in format between the two collections, so I am looking for a way to join them anyway.
Is there any way to merge these two collections, please?
Kindly help on this.
To perform uncorrelated subqueries between two collections as well as allow other join conditions besides a single equality match, the $lookup stage has the following syntax:
{
  $lookup: {
    from: <collection to join>,
    let: { <var_1>: <expression>, …, <var_n>: <expression> },
    pipeline: [ <pipeline to execute on the collection to join> ],
    as: <output array field>
  }
}
You can use the following aggregation query using the above $lookup syntax:
db.Role.aggregate([
{
"$lookup": {
"from": "mysite",
let: {
"userName": "$UserName"
},
"pipeline": [
{
$match: {
"$expr": {
"$ne": [
{
"$indexOfCP": [
"$userEmail",
"$$userName"
]
},
-1
]
}
}
}
],
"as": "mysite"
}
},
{
"$unwind": "$mysite"
},
])
MongoDB Playground
$indexOfCP searches a string for an occurrence of a substring and returns the index of the first occurrence. If the substring is not found, it returns -1.
So, in the following stage, it checks whether the UserName substring is present in userEmail: if not, $indexOfCP returns -1; if present, it returns the index at which the substring is located.
Hence, by using $ne with -1 inside $expr, it matches all documents that have the UserName substring present in userEmail, ignoring the documents where the given substring is not present.
{
$match: {
"$expr": {
"$ne": [
{
"$indexOfCP": [
"$userEmail",
"$$userName"
]
},
-1
]
}
}
}
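The predicate mirrors plain string search. A small plain-JavaScript sketch, using the sample values from the question (for ASCII strings like these, String.prototype.indexOf behaves the same as $indexOfCP, which counts UTF code points):

```javascript
const userName = "abc.efg";               // $$userName from the Role document
const userEmail = "abc.efg#company.com";  // $userEmail from the mysite document

const idx = userEmail.indexOf(userName);  // stands in for $indexOfCP; -1 when absent
const joins = idx !== -1;                 // the { $ne: [ ..., -1 ] } condition
```

Here idx is 0 (the substring starts the email), so the two documents would be joined.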

Lookup with concat followed by match, doesn't return the result of the latter

I'm new to MongoDB and I'm trying to perform an aggregate over multiple collections in order to validate an HTTP request. Right now I need to look up a collection, concat a value from the looked-up collection with a given string, and use the result of the concat in an equality match. The code below is what I currently have:
{
  from: "Patient",
  let: {
    subject: "$subject.reference"
  },
  pipeline: [
    {
      $project: {
        concatTest: {
          $concat: [ "Patient/", "$id" ]
        }
      }
    },
    {
      $match: {
        $expr: {
          $eq: [ "$concatTest", "$$subject" ]
        }
      }
    }
  ],
  as: "result"
}
The problem:
The result array does not output the collection documents after filtering with $match, and instead outputs the results of the concat, as shown:
result: Array
  0: Object
    _id: 5d6d13175def3532dd905767
    concatTest: "Patient/5d6d13175def3532dd905767"
I'm guessing this is a pretty simple problem of placing the right output, but I can't find a solution to it. Maybe I shouldn't be doing the concat inside the pipeline? Or have I completely misunderstood how the pipeline works?
Thanks in advance.
Yes, it was just a simple problem of having the concat as its own stage of the pipeline instead of within the match. The solution was to write the match as follows:
$match: {
  $expr: {
    $eq: [
      "$$subject",
      { $concat: [ "Patient/", "$id" ] }
    ]
  }
}
This way the concat is resolved within the match rather than in its own pipeline stage, so the documents are no longer replaced by a $project that keeps only concatTest.
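A plain-JavaScript sketch of the fixed predicate, using the ObjectId string from the question's output:

```javascript
const subject = "Patient/5d6d13175def3532dd905767"; // $$subject from the outer document
const id = "5d6d13175def3532dd905767";              // $id of the Patient document

// $eq: [ "$$subject", { $concat: ["Patient/", "$id"] } ]
const matches = subject === "Patient/" + id;
```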

MongoDB: adding fields based on partial match query - expression vs query

So I have one collection that I'd like to query/aggregate. The query is made up of several parts that are OR'ed together. For every part of the query, I have a specific set of fields that need to be shown.
So my hope was to do this with an aggregate that will $match the queries OR'ed together all at once, and then use $project with $cond to see which fields are needed. The problem here is that $cond uses expressions, while $match uses queries. Which is a problem, since some query features are not available as an expression. So a simple conversion is not an option.
So I need another solution:
- I could just make a separate aggregate per query, because there I know which fields to match, and then merge the results together. But this will not work if I use pagination in the queries (limit/skip etc.).
- Find some other way to tag every document so I can (afterwards) remove any fields not needed. It might not be super efficient, but it would work. No clue yet how to do that.
- Figure out a way to make queries that are made only of expressions. For my purpose that might be good enough, and it would mean a rewrite of the query parser. It could work, but is not ideal.
So this is the next incarnation right here. It will deduplicate and merge records and finally transform them back into something resembling a normal query result:
db.getCollection('somecollection').aggregate(
[
  {
    "$facet": {
      "f1": [
        {
          "$match": { <some query 1> }
        },
        {
          "$project": { <some fixed field projection> }
        }
      ],
      "f2": [
        {
          "$match": { <some query 2> }
        },
        {
          "$project": { <some fixed field projection> }
        }
      ]
    }
  },
  {
    $project: {
      "rt": { $concatArrays: [ "$f1", "$f2" ] }
    }
  },
  { $unwind: { path: "$rt" } },
  { $replaceRoot: { newRoot: "$rt" } },
  { $group: { _id: "$_id", items: { $push: { item: "$$ROOT" } } } },
  {
    $project: {
      "rt": { $mergeObjects: "$items" }
    }
  },
  { $replaceRoot: { newRoot: "$rt.item" } }
]
);
There might still be some optimisation to be done, so any comments are welcome.
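The dedupe/merge tail of that pipeline can be emulated in plain JavaScript (hypothetical sample documents). One thing worth noting: since each pushed element wraps the whole document under item, $mergeObjects effectively keeps the last document seen per _id rather than merging individual fields:

```javascript
// What the two $facet branches might emit (hypothetical)
const f1 = [{ _id: 1, a: 1 }, { _id: 2, a: 2 }];
const f2 = [{ _id: 1, a: 1, b: 9 }];        // the same _id appears in both facets

const rt = [...f1, ...f2];                  // $concatArrays + $unwind

// $group by _id pushing { item: "$$ROOT" }, then $mergeObjects:
// merging [{ item: d1 }, { item: d2 }] yields { item: d2 }, i.e. last one wins
const byId = new Map();
for (const doc of rt) byId.set(doc._id, doc);

const result = [...byId.values()];          // final $replaceRoot to "$rt.item"
```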
I found an extra option using $facet. This way, I can make a facet for every group of fields/subqueries. This seems to work fine, except that the result is a single document with a bunch of arrays; not yet sure how to convert that back to multiple documents.
Okay, so now I have it figured out. I'm not sure yet about all the intricacies of this solution, but it seems to work in general. Here is an example:
db.getCollection('somecollection').aggregate(
[
  {
    "$facet": {
      "f1": [
        {
          "$match": { <some query 1> }
        },
        {
          "$project": { <some fixed field projection> }
        }
      ],
      "f2": [
        {
          "$match": { <some query 2> }
        },
        {
          "$project": { <some fixed field projection> }
        }
      ]
    }
  },
  {
    $project: {
      "rt": { $concatArrays: [ "$f1", "$f2" ] }
    }
  },
  { $unwind: { path: "$rt" } },
  { $replaceRoot: { newRoot: "$rt" } }
]
);

MongoDB projections and fields subset

I would like to use mongo projections in order to return less data to my application. I would like to know if it's possible.
Example:
user: {
id: 123,
some_list: [{x:1, y:2}, {x:3, y:4}],
other_list: [{x:5, y:2}, {x:3, y:4}]
}
Given a query for user_id = 123 and some 'projection filter' like user.some_list.x = 1 and user.other_list.x = 1 is it possible to achieve the given result?
user: {
id: 123,
some_list: [{x:1, y:2}],
other_list: []
}
The idea is to make Mongo work a little more and return less data to the application. In some cases, we are discarding 80% of the elements of the collections on the application side. So it would be better not to return them at all.
Questions:
Is it possible?
How can I achieve this? $elemMatch doesn't seem to help me. I'm trying something with $unwind, but not getting there.
If it's possible, can this projection filtering benefit from an index on user.some_list.x, for example? Or not at all, once the user has already been found by its id?
Thank you.
What you can do in MongoDB v3.0 is this:
db.collection.aggregate({
$match: {
"user.id": 123
}
}, {
$redact: {
$cond: {
if: {
$or: [ // those are the conditions for when to include a (sub-)document
"$user", // if it contains a "user" field (as is the case when we're on the top level)
"$some_list", // if it contains a "some_list" field (would be the case for the "user" sub-document)
"$other_list", // the same here for the "other_list" field
{ $eq: [ "$x", 1 ] } // and lastly, when we're looking at the innermost sub-documents, we only want to include items where "x" is equal to 1
]
},
then: "$$DESCEND", // descend into sub-document
else: "$$PRUNE" // drop sub-document
}
}
})
Depending on your data setup, what you could also do to simplify this query a little is to say: include everything that either does not have an "x" field or, if it is present, has it equal to 1, like so:
$redact: {
$cond: {
if: {
$eq: [ { "$ifNull": [ "$x", 1 ] }, 1 ] // we only want to include items where "x" is equal to 1 or where "x" does not exist
},
then: "$$DESCEND", // descend into sub-document
else: "$$PRUNE" // drop sub-document
}
}
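The condition in that simplified $redact reads as a small predicate; a plain-JavaScript sketch:

```javascript
// { $eq: [ { $ifNull: ["$x", 1] }, 1 ] } — keep when "x" is missing or equals 1
const keep = doc => !("x" in doc) || doc.x === 1;

keep({ y: 2 });        // no "x" field -> $$DESCEND
keep({ x: 1, y: 2 });  // x equals 1   -> $$DESCEND
keep({ x: 5, y: 2 });  // x differs    -> $$PRUNE
```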
The index you suggested won't do anything for the $redact stage. You can benefit from it, however, if you change the $match stage at the start to get rid of all documents which don't match anyway like so:
$match: {
"user.id": 123,
"user.some_list.x": 1 // this will use your index
}
Very possible.
With findOne, the query is the first argument and the projection is the second. In Node/JavaScript (similar to the shell):
db.collection('users').findOne( {
  id: 123
}, {
  other_list: 0
} )
Will return the whole object without the other_list field. OR you could specify { some_list: 1 } as the projection and returned will be ONLY the _id and some_list.
$filter is your friend here. Below produces the output you seek. Experiment with changing the $eq fields and target values to see more or fewer items in the array get picked up. Note how we $project the new fields (some_list and other_list) "on top of" the old ones, essentially replacing them with the filtered versions.
db.foo.aggregate([
{$match: {"user.id": 123}}
,{$project: { "user.some_list": { $filter: {
input: "$user.some_list",
as: "z",
cond: {$eq: [ "$$z.x", 1 ]}
}},
"user.other_list": { $filter: {
input: "$user.other_list",
as: "z",
cond: {$eq: [ "$$z.x", 1 ]}
}}
}}
]);
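To see what those two $filter expressions compute, here is a plain-JavaScript emulation using the sample document from the question:

```javascript
const user = {
  id: 123,
  some_list: [{ x: 1, y: 2 }, { x: 3, y: 4 }],
  other_list: [{ x: 5, y: 2 }, { x: 3, y: 4 }]
};

// Each $filter keeps the elements whose x equals 1, mirroring
// cond: { $eq: [ "$$z.x", 1 ] }
const projected = {
  id: user.id,
  some_list: user.some_list.filter(z => z.x === 1),   // -> one element
  other_list: user.other_list.filter(z => z.x === 1)  // -> empty array
};
```

This matches the desired result from the question: some_list keeps only {x:1, y:2} and other_list becomes empty.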