MongoDB - How to compare fields of different collections in $match of aggregate?

Let's suppose I have the collection A and the collection B. My query is as follows:
db.A.aggregate([
{
$lookup: {
from: "B",
localField: "_id",
foreignField: "custom_id",
as: "B"
}
},
{
$match: {
"B.anotherId": "A.anotherId" // not working, is it possible?
}
}
])
I'm curious to know if it's possible to do what I tried to do in $match. The goal is to get only the documents that have the same "anotherId" value in the A and B documents. Is it supported? And if yes, how do I do it?

You can use $lookup with an aggregation pipeline:
use let to define both fields as variables, then compare them in the sub-pipeline's $match with $expr and $and
db.A.aggregate([
{
$lookup: {
from: "B",
let: {
custom_id: "$_id",
anotherId: "$anotherId"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{ $eq: ["$$custom_id", "$custom_id"] },
{ $eq: ["$$anotherId", "$anotherId"] }
]
}
}
}
],
as: "B"
}
}
])
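When the join conditions vary, the same stage can be assembled in code. The following is a minimal sketch in plain JavaScript, assuming MongoDB 3.6 or later (where $lookup accepts let and pipeline). The helper name buildCorrelatedLookup and its parameters are hypothetical, not part of any MongoDB API.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $lookup stage that joins
// on several field pairs at once using let + $expr.
// `pairs` maps a field of the outer collection to a field of the joined one.
function buildCorrelatedLookup(from, pairs, as) {
  const letVars = {};
  const conditions = [];
  Object.entries(pairs).forEach(([localField, foreignField], i) => {
    // Generated variable name; let variables must start with a lowercase letter.
    const varName = `v${i}`;
    letVars[varName] = `$${localField}`;
    conditions.push({ $eq: [`$$${varName}`, `$${foreignField}`] });
  });
  return {
    $lookup: {
      from,
      let: letVars,
      pipeline: [{ $match: { $expr: { $and: conditions } } }],
      as
    }
  };
}

const stage = buildCorrelatedLookup(
  "B",
  { _id: "custom_id", anotherId: "anotherId" },
  "B"
);
console.log(JSON.stringify(stage, null, 2));
```

In the mongo shell the resulting object would be passed as db.A.aggregate([stage]).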

It's not clear what you are trying to achieve here. $lookup produces an array of values. If you are trying to filter that array, you have to use $filter.
However, based on your question of how to compare two fields, you have to use $expr.
{
$match: {
$expr: {
$eq: ["$firstField", "$secondField"]
}
}
}
If however you are trying to filter the collection B based on a value in A, you will have to use $filter
{
$set: {
B: {
$filter: {
input: "$B",
as: "b",
cond: {
$eq: ["$anotherId", "$$b.anotherId"] // "$anotherId" is the field on the A document
}
}
}
}
}

Related

arguments to $lookup must be strings

Trying to use this $lookup query in mongo DB.
db.request_user.aggregate( [
{
$lookup:
{
from: 'request',
let: {req_id: "$requestId",curr_user:"$user"},
pipeline: [
{ $match:
{ $expr:
{ $and:
[
{$eq: [ "$requestId","$$req_id"]},
{$eq: [ "$currentUser","$$curr_user"]}
]
}
}
},
],
as: "result"
}
}
] )
I am getting this error:
{
"message" : "arguments to $lookup must be strings, let: { req_id: '$requestId' } is type object",
"ok" : 0,
"code" : 4570,
"codeName" : "Location4570"
}
I found some sources saying let is not compatible with MongoDB 3.x versions. I am using version 3.4. If that's true, can someone please suggest an alternative?
As I mentioned in the comments, this $lookup syntax is only available starting with version 3.6. The alternative is to use the "old" $lookup syntax, which allows a join on a single field only.
Then add the additional filtering based on the other condition, like so:
db.request_user.aggregate( [
{
$lookup:
{
from: 'request',
localField: "user",
foreignField: "currentUser",
as: "result"
}
},
{
$addFields: {
result: {
$filter: {
input: "$result",
cond: {$eq: ["$$this.requestId", "$requestId"]}
}
}
}
}
] )
It is better to use the field that matches fewer documents in the $lookup and the other field in the $filter expression. Only you can know which one that is, based on the distribution of your data.
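To see why the two-step version returns the same rows as a two-condition join, here is a plain-JavaScript emulation of both steps. This is only a sketch: the sample documents are made up for illustration, and the array methods merely mimic what $lookup and $filter do on the server.

```javascript
// Hypothetical sample data standing in for the two collections.
const requestUsers = [{ requestId: 1, user: "ann" }];
const requests = [
  { requestId: 1, currentUser: "ann" },
  { requestId: 2, currentUser: "ann" },
  { requestId: 1, currentUser: "bob" }
];

const output = requestUsers.map((ru) => {
  // Step 1, like $lookup with localField "user" / foreignField "currentUser".
  const joined = requests.filter((r) => r.currentUser === ru.user);
  // Step 2, like $addFields + $filter on the second condition (requestId).
  const result = joined.filter((r) => r.requestId === ru.requestId);
  return { ...ru, result };
});

console.log(JSON.stringify(output));
```

Only the request with requestId 1 and currentUser "ann" survives both steps. The first step still materialises two joined rows, which is why the more selective field belongs in the $lookup.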

"iterate" through all document fields in mongodb

I have a collection with documents in this form:
{
"fields_names": ["field1", "field2", "field3"],
"field1": 1,
"field2": [1, 2, 3],
"field3": "12345"
}
where field1, field2, field3 are "dynamic" for each document (each document lists its own field names in the "fields_names" array).
I would like to test whether 2 documents are equal using the aggregation framework.
I used a $lookup stage to fetch the other documents.
My issue is: how can I "iterate" through all the fields of my collection?
db.collection.aggregate([
{ $match: { "my_id": "test_id" } },
{
$lookup: {
from: "collection",
let: { my_id: "$my_id", prev_id: "$_id" },
pipeline: [
// let variables can only be read inside $expr
{ $match: { $expr: { $and: [{ $eq: ["$my_id", "$$my_id"] }, { $ne: ["$_id", "$$prev_id"] }] } } }
],
as: "lookup_test"
}
}
])
and in the pipeline of the lookup, I would like to iterate the "fields_names" array for getting the names of the fields, and then access their value and compare between the "orig document" (not the $lookup) and the other documents ($lookup documents).
OR: just to iterate all fields (not include the "fields_names" array)
I would like to fill the "lookup_test" array with all documents which have the same field values.
You will have to compare the relevant "partial" part of each document inside the $lookup, for every candidate document. Needless to say, this is going to be a very expensive pipeline. With that said, here's how I would do it:
db.collection.aggregate([
{
$match: {
"my_id": "test_id"
}
},
{
"$lookup": {
"from": "collection",
"let": {
id: "$_id",
partialRoot: {
$filter: {
input: {
"$objectToArray": "$$ROOT"
},
as: "fieldObj",
cond: {
"$setIsSubset": [
[
"$$fieldObj.k"
],
"$fields_names"
]
}
}
}
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$ne: [
"$$id",
"$_id"
]
},
{
$eq: [
{
$size: "$$partialRoot"
},
{
$size: {
"$setIntersection": [
"$$partialRoot",
{
$filter: {
input: {
"$objectToArray": "$$ROOT"
},
as: "fieldObj",
cond: {
"$setIsSubset": [
[
"$$fieldObj.k"
],
"$fields_names"
]
}
}
}
]
}
}
]
}
]
}
}
},
],
"as": "x"
}
}
])
If you can build the query dynamically in code, you can make this much more efficient by reusing the same match query inside the $lookup stage, like so:
const query = { my_id: "test_id" };
db.collection.aggregate([
{
$match: query
},
{
$lookup: {
...
pipeline: [
{ $match: query },
... rest of pipeline ...
]
}
}
])
This way the $lookup only considers documents that also match the initial query, which should drastically improve query performance (how much obviously depends on how selective the matched field's values are).
One more caveat to note: if x documents match, you will get the same result x times, so you probably want to add a $limit: 1 stage to your pipeline.
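The query reuse described above could be wrapped in a small function. This is a sketch in plain JavaScript: buildPipeline is a hypothetical name, and the sub-pipeline stages passed in the usage example are placeholders chosen for illustration, not a reconstruction of the elided parts of the answer.

```javascript
// Hypothetical helper: reuses one filter object in the outer $match and at the
// start of the $lookup sub-pipeline. `lookup` carries from/let/as plus the
// remaining sub-pipeline stages, which are left to the caller.
function buildPipeline(query, lookup) {
  const { pipeline: rest = [], ...spec } = lookup;
  return [
    { $match: query },
    { $lookup: { ...spec, pipeline: [{ $match: query }, ...rest] } }
  ];
}

const query = { my_id: "test_id" };
const pipeline = buildPipeline(query, {
  from: "collection",
  let: { id: "$_id" },
  // Placeholder stage: exclude the document itself from the joined results.
  pipeline: [{ $match: { $expr: { $ne: ["$$id", "$_id"] } } }],
  as: "x"
});

console.log(JSON.stringify(pipeline, null, 2));
```

Keeping the filter in a single object means the outer and inner $match cannot drift apart when the query changes.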

How filter using part of a field in a lookup collection

I'm doing a "join" between two MongoDB collections and want to filter the lookup collection (before the join) using part of a field as a criterion. My first option would be a regex, but I couldn't find how to do it in the MongoDB docs. I found 3 different ways to use regex: $regex, $regexMatch and $regexFind.
None of them worked, or I don't know how to use them.
Any ideas?
I tried to use each of these 3 regex operators in this part of the example, without success:
$and: [{ $eq: ['$id', '$$key'] }, { $eq: ['$x', 0], [here regex maybe or something] }]
I want something like in SQL "WHERE substr(field,3,1) = 'A'" for example
db.collection('collectionA').aggregate([
{
$lookup: {
from: 'collectionB',
let: { key: '$key' },
pipeline: [
{
$match: {
$expr: {
$and: [{ $eq: ['$id', '$$key'] }, { $eq: ['$x', 0] }]
}
}
}
],
as: 'i'
}
}
])
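Why not use the $substr operator for this?
db.collection('collectionA').aggregate([
{
$lookup: {
from: 'collectionB',
let: { key: '$key' },
pipeline: [
{
$match: {
$expr: {
// $substr takes [string, start, length]; start is zero-based
$and: [{ $eq: ['$id', '$$key'] }, { $eq: [{ $substr: ['$x', 2, 1] }, 'A'] }]
}
}
}
],
as: 'i'
}
}
])
This should be equivalent to the SQL statement you mentioned above. Note that $substr counts from 0 while SQL's substr counts from 1, so substr(field,3,1) corresponds to a start index of 2.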
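The index shift between SQL and MongoDB is easy to get wrong: SQL's substr is 1-based in most dialects, while $substr takes a zero-based start. A small helper can do the conversion. This is a sketch in plain JavaScript, and sqlSubstrEquals is a hypothetical name, not a MongoDB function.

```javascript
// Hypothetical helper: translates SQL's 1-based
//   substr(field, start, length) = value
// into a condition usable inside $expr, using MongoDB's zero-based $substr.
function sqlSubstrEquals(field, sqlStart, length, value) {
  return { $eq: [{ $substr: [`$${field}`, sqlStart - 1, length] }, value] };
}

const condition = sqlSubstrEquals("x", 3, 1, "A");
console.log(JSON.stringify(condition));
// prints {"$eq":[{"$substr":["$x",2,1]},"A"]}
```

The returned object can be placed in the $and array of the $expr shown above.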

MongoDB: Using match with input document variables

Why do I have to use this code: { $match: { $expr: { <aggregation expression> } } } to match a document using a document input variable as opposed to doing: { $match: { <query> } } ?
For example:
$lookup: {
from: "comments",
let: { myvar: '$myInputDocVariable'},
pipeline: [
{ $match:
{ $expr:
{ $and:
[
{ $eq: [ "$varFromCommentDocument", "$$myvar" ] },
]
}
}
},
],
as: "returnedValue"
}
The query above works fine but the query below does not work as expected. Why is this? Does this mean that if you are using input variables in a $lookup pipeline you have to use $expr? why is that?
$lookup: {
from: "comments",
let: { myvar: '$myInputDocVariable'},
pipeline: [
{ $match: { varFromCommentDocument: "$$myvar" } }
],
as: "returnedValue"
}
When you perform a correlated sub-query with the $lookup operator:
If you need to compare against a field of the parent collection within the pipeline, MongoDB cannot apply the standard query syntax (field: value) to variables or aggregation expressions. In this case, you need to use the $expr operator.
Example:
{ $match:
{ $expr:
{ $and:[
{ $eq: [ "$varFromCommentDocument", "$$myvar" ] },
]}
}
}
If it matches against "hard-coded" values, you don't need to use the $expr operator.
Example:
$lookup: {
from: "comments",
pipeline: [
{ $match:{
"key": "value",
"key2": "value2"
}}
],
as: "returnedValue"
}
Does this mean that if you are using input variables in a $lookup
pipeline you have to use $expr
Yes, correct. By default, a query filter (the filter part of .find() or a $match aggregation stage) cannot refer to the value of another field in the document.
If you need to use an existing field's value in your query filter, you have to use an aggregation expression, and to use an aggregation expression in .find() or in $match you need to wrap it in $expr. In the same way, a $match that accesses the variables created by let in $lookup needs to wrap its condition in $expr.
Let's consider below example :
Sample Docs :
[
{
"key": 1,
"value": 2
},
{
"key": 2,
"value": 4
},
{
"key": 5,
"value": 5
}
]
Query :
db.collection.find({ key: { $gt: 1 }, value: { $gt: 4 } })
Or
db.collection.aggregate([ { $match: { key: { $gt: 1 }, value: { $gt: 4 } } } ])
In the query above, both inputs (1 and 4) are constants passed into the query. The query below, which tries to match documents whose key field equals their value field, doesn't work:
db.collection.aggregate([ { $match: { key: { $eq: "$value" } } } ])
The query above fails because "$value" is treated as a string literal: it looks for documents whose key field equals the string "$value". To say that it is a reference to the value field rather than a string, you need to use the $eq aggregation operator inside $expr instead of the $eq query operator, like below:
db.collection.aggregate([ { $match: { $expr: { $eq: [ "$key", "$value" ] } } } ])
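The difference between the three queries can be mimicked in plain JavaScript on the sample documents. This is only an emulation of the matching semantics described above, not MongoDB code.

```javascript
const docs = [
  { key: 1, value: 2 },
  { key: 2, value: 4 },
  { key: 5, value: 5 }
];

// Like { key: { $gt: 1 }, value: { $gt: 4 } }: both operands are constants.
const byConstants = docs.filter((d) => d.key > 1 && d.value > 4);

// Like { key: { $eq: "$value" } }: "$value" is compared as a literal string.
const byLiteral = docs.filter((d) => d.key === "$value");

// Like { $expr: { $eq: ["$key", "$value"] } }: "$value" is resolved per document.
const byExpr = docs.filter((d) => d.key === d.value);

console.log(byConstants.length, byLiteral.length, byExpr.length);
// prints 1 0 1
```

The literal comparison matches nothing because no document stores the string "$value" in key, while the per-document comparison finds the one document where both fields are 5.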

How do I recombine unwinded documents?

I have the following document:
{
"_id" : ObjectId("5881cfa62189aa40268b458a"),
"description" : "Document A",
"companies" : [
{"code" : "0001"},
{"code" : "0002"},
{"code" : "0003"}
]
}
I want to filter the companies array to remove some objects based on the code field.
I've tried to use unwind and then match to filter out the companies, but I don't know how to recombine the objects. Is there another way of doing this?
Here's what I've tried so far:
db.getCollection('test').aggregate([
{
$unwind: {
'path': '$companies'
}
},
{
$match: {
'companies.code': {$in: ['0001', '0003']}
}
}
// How do I merge them back into a single document?
]);
A better way would be to just use the $filter operator on the array.
db.getCollection('test').aggregate([
{
$project:
{
companies: {
$filter: {
input: '$companies',
as: 'company',
cond: {$in: ['$$company.code', ['0001', '0003']]}
}
}
}
}
])
You can $group and control the document structure that way, but it's tedious work, as you have to specify each and every field you want to preserve.
Instead of unwinding, I recommend using $filter to match the companies, like so:
db.getCollection('test').aggregate([
{
$addFields: {
companies: {
$filter: {
input: "$companies",
as: "company",
cond: {$in: ["$$company.code", ['0001', '0003']]}
}
}
}
},
{ // we now need this match as documents with no matched companies might exist
$match: {
"companies.0": {$exists: true}
}
}
])
If you want to keep your current $unwind approach in the aggregation pipeline, regroup the documents with $group:
db.getCollection('test').aggregate([
{$unwind: {'path': '$companies'}},
{$match: {'companies.code': {$in: ['0001', '0003']}}},
{$group: {_id: "$_id", description: {$first: "$description"}, companies: {$push: "$companies"}}}
])
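To compare the two approaches, here is a plain-JavaScript emulation of $unwind, $match and $group next to the $filter version, run on the sample document from the question. It is a sketch of the semantics only; the _id is kept as a plain string for simplicity.

```javascript
const doc = {
  _id: "5881cfa62189aa40268b458a",
  description: "Document A",
  companies: [{ code: "0001" }, { code: "0002" }, { code: "0003" }]
};
const wanted = ["0001", "0003"];

// $unwind: one output document per array element.
const unwound = doc.companies.map((c) => ({ ...doc, companies: c }));
// $match: keep only the wanted codes.
const matched = unwound.filter((d) => wanted.includes(d.companies.code));
// $group by _id: $first for description, $push for companies.
const groups = new Map();
for (const d of matched) {
  if (!groups.has(d._id)) {
    groups.set(d._id, { _id: d._id, description: d.description, companies: [] });
  }
  groups.get(d._id).companies.push(d.companies);
}
const regrouped = [...groups.values()];

// $filter: trim the array in place, no regrouping needed.
const filtered = {
  ...doc,
  companies: doc.companies.filter((c) => wanted.includes(c.code))
};

console.log(JSON.stringify(regrouped[0].companies));
console.log(JSON.stringify(filtered.companies));
```

Both print the same two companies. The approaches differ for documents with no matching company: regrouping drops them entirely, while filtering keeps them with an empty array, which is what the extra "companies.0" $match stage in the $filter answer accounts for.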