MongoDB find join array - mongodb

I have a collection with data that looks sort of like this
{
"part": [
{ "a": "1", "b": "a" },
{ "a": "23", "b": "b" },
{ "a": "4", "b": "c" },
]
}
What I would like is a way of searching for documents where the concatenation of all the a values in part equals the string I am searching for.
For example, 1234 should match the document above, but 124 should not.
Is this possible with MongoDB?

You can do it with Aggregation framework:
$match with $eq - To filter only documents where concatenated a properties of part array are equal to the input string.
$reduce with $concat - To concatenate all a properties of part array for each document.
db.collection.aggregate([
{
"$match": {
"$expr": {
"$eq": [
"1234",
{
"$reduce": {
"input": "$part",
"initialValue": "",
"in": {
"$concat": [
"$$value",
"$$this.a"
]
}
}
}
]
}
}
}
])
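To see why "1234" matches and "124" does not, here is a minimal sketch that emulates the $reduce/$concat expression in plain JavaScript, without a database. The function name joinParts is made up for illustration, and the sketch assumes every element of part has a string a field (the real $concat returns null if any argument is null or missing).

```javascript
// Plain-JavaScript emulation of the $reduce/$concat expression (no database needed).
// joinParts is a hypothetical helper name, not part of any MongoDB API.
function joinParts(doc) {
  // Mirrors: { $reduce: { input: "$part", initialValue: "",
  //                       in: { $concat: ["$$value", "$$this.a"] } } }
  return doc.part.reduce((value, item) => value + item.a, "");
}

const samplePartDoc = {
  part: [
    { a: "1", b: "a" },
    { a: "23", b: "b" },
    { a: "4", b: "c" },
  ],
};

const joined = joinParts(samplePartDoc);
console.log(joined);            // 1234
console.log(joined === "1234"); // true  -> the document matches
console.log(joined === "124");  // false -> the document does not match
```

The comparison is on the whole concatenated string, so a search for a prefix or substring of "1234" will not match.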

You can use aggregate with $reduce to join string then $match to filter your string.

Related

Find in triple nested array mongodb [duplicate]

I have this Collection in mongodb
{
"_id" : "777",
"someKey" : "someValue",
"someArray" : [
{
"name" : "name1",
"someNestedArray" : [
{
"name" : "value"
},
{
"name" : "delete me"
}
]
}
]
}
I want to find a document based on someArray.someNestedArray.name,
but I can't find any useful link; all the search results are about updating nested arrays.
I am trying this, but it returns nothing:
db.mycollection.find({"someArray.$.someNestedArray":{"$elemMatch":{"name":"1"}}})
db.mycollection.find({"someArray.$.someNestedArray.$.name":"1"})
and some other things.
How can I find a document by an element in a doubly nested array in MongoDB?
In the simplest sense this just follows the basic form of "dot notation" as used by MongoDB. That will work regardless of which array member the inner array member is in, as long as it matches a value:
db.mycollection.find({
"someArray.someNestedArray.name": "value"
})
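As a rough illustration of why plain dot notation works here without any $ placeholder, the sketch below emulates the path traversal in plain JavaScript. The helpers valuesAtPath and matchesPath are hypothetical names, and the emulation is simplified: it ignores numeric array indexes in paths and other query operators.

```javascript
// Collect every value reachable at a dotted path, descending into arrays on the way.
// valuesAtPath and matchesPath are illustrative helpers, not MongoDB functions.
function valuesAtPath(value, pathParts) {
  if (Array.isArray(value)) {
    // Arrays are traversed element by element at every level of the path.
    return value.flatMap((element) => valuesAtPath(element, pathParts));
  }
  if (pathParts.length === 0) return [value];
  if (value === null || typeof value !== "object") return [];
  const [head, ...rest] = pathParts;
  return head in value ? valuesAtPath(value[head], rest) : [];
}

// A document matches when any reachable value equals the expected one.
function matchesPath(doc, path, expected) {
  return valuesAtPath(doc, path.split(".")).includes(expected);
}

const nestedDoc = {
  _id: "777",
  someKey: "someValue",
  someArray: [
    { name: "name1", someNestedArray: [{ name: "value" }, { name: "delete me" }] },
  ],
};

console.log(matchesPath(nestedDoc, "someArray.someNestedArray.name", "value")); // true
console.log(matchesPath(nestedDoc, "someArray.someNestedArray.name", "1"));     // false
```

The second call mirrors the question's search for "1", which finds nothing because no inner name has that value.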
That is fine for a "single field" value. For matching multiple fields you would use $elemMatch:
db.mycollection.find({
"someArray": {
"$elemMatch": {
"name": "name1",
"someNestedArray": {
"$elemMatch": {
"name": "value",
"otherField": 1
}
}
}
}
})
That matches a document that contains a field at that "path" matching the value. If you intended to "match and filter" the result so only the matched element was returned, this is not possible with the positional operator projection, as quoted:
Nested Arrays
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value
Modern MongoDB
We can do this by applying $filter and $map here. The $map is really needed because the "inner" array can change as a result of the "filtering", and the "outer" array of course does not match the conditions when the "inner" was stripped of all elements.
Again following the example of actually having multiple properties to match within each array:
db.mycollection.aggregate([
{ "$match": {
"someArray": {
"$elemMatch": {
"name": "name1",
"someNestedArray": {
"$elemMatch": {
"name": "value",
"otherField": 1
}
}
}
}
}},
{ "$addFields": {
"someArray": {
"$filter": {
"input": {
"$map": {
"input": "$someArray",
"as": "sa",
"in": {
"name": "$$sa.name",
"someNestedArray": {
"$filter": {
"input": "$$sa.someNestedArray",
"as": "sn",
"cond": {
"$and": [
{ "$eq": [ "$$sn.name", "value" ] },
{ "$eq": [ "$$sn.otherField", 1 ] }
]
}
}
}
}
},
},
"as": "sa",
"cond": {
"$and": [
{ "$eq": [ "$$sa.name", "name1" ] },
{ "$gt": [ { "$size": "$$sa.someNestedArray" }, 0 ] }
]
}
}
}
}}
])
Therefore on the "outer" array the $filter actually looks at the $size of the "inner" array after it was "filtered" itself, so you can reject those results when the whole inner array does in fact match nothing.
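The inner-then-outer filtering can be sketched in plain JavaScript as below. This is only an emulation of the $map and $filter logic, not the server's implementation; the function name filterNested and the sample data (including the otherField values, which the question's document does not have) are assumptions for illustration.

```javascript
// Emulates the $addFields stage: filter the inner arrays first, then the outer array.
// filterNested is a hypothetical helper; the sample data is invented for this sketch.
function filterNested(doc) {
  const someArray = doc.someArray
    // $map: rebuild each outer element with its inner array already filtered
    .map((sa) => ({
      name: sa.name,
      someNestedArray: sa.someNestedArray.filter(
        (sn) => sn.name === "value" && sn.otherField === 1
      ),
    }))
    // outer $filter: keep elements whose name matches and whose inner array is not empty
    .filter((sa) => sa.name === "name1" && sa.someNestedArray.length > 0);
  return { ...doc, someArray };
}

const nestedInput = {
  _id: "777",
  someArray: [
    { name: "name1", someNestedArray: [{ name: "value", otherField: 1 }, { name: "delete me", otherField: 1 }] },
    { name: "name1", someNestedArray: [{ name: "other", otherField: 1 }] },
    { name: "name2", someNestedArray: [{ name: "value", otherField: 1 }] },
  ],
};

console.log(JSON.stringify(filterNested(nestedInput).someArray));
// [{"name":"name1","someNestedArray":[{"name":"value","otherField":1}]}]
```

The second outer element is dropped only because its inner array became empty after filtering, which is the role of the $size check.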
Older MongoDB
In order to "project" only the matched element, you need the .aggregate() method:
db.mycollection.aggregate([
// Match possible documents
{ "$match": {
"someArray.someNestedArray.name": "value"
}},
// Unwind each array
{ "$unwind": "$someArray" },
{ "$unwind": "$someArray.someNestedArray" },
// Filter just the matching elements
{ "$match": {
"someArray.someNestedArray.name": "value"
}},
// Group to inner array
{ "$group": {
"_id": {
"_id": "$_id",
"name": "$someArray.name"
},
"someKey": { "$first": "$someKey" },
"someNestedArray": { "$push": "$someArray.someNestedArray" }
}},
// Group to outer array
{ "$group": {
"_id": "$_id._id",
"someKey": { "$first": "$someKey" },
"someArray": { "$push": {
"name": "$_id.name",
"someNestedArray": "$someNestedArray"
}}
}}
])
That allows you to "filter" the matches in nested arrays for one or more results within the document.
You can also try something like below:
db.collection.aggregate([
{ $unwind: '$someArray' },
{
$project: {
'filteredValue': {
$filter: {
input: "$someArray.someNestedArray",
as: "someObj",
cond: { $eq: [ '$$someObj.name', 'delete me' ] }
}
}
}
}
])

Looking for sub-documents containing a field in a document's array

Assuming I have the following persons collection:
{
"_id": ObjectId("569d07a38e61973f6aded134"),
"name": "john",
"pets": [
{
"name": "spot",
"type": "dog",
"special": "spot eye"
},
{
"name": "bob",
"type": "cat",
}
]
},
{
"_id": ObjectId("569d07a38e61973f6aded135"),
"name": "susie",
"pets": [
{
"name": "fred",
"type": "cat",
}
]
}
How can I retrieve the persons whose pets have a special field? I'm looking to have the returned pets array contain only the pets with a special field.
For example, the expected result from the collection above would be:
{
"_id": ObjectId("569d07a38e61973f6aded134"),
"name": "john",
"pets": [
{
"name": "spot",
"type": "dog",
"special": "spot eye"
}
]
}
I'm trying to implement this in hopefully one query with pymongo, although even just a working MongoDB or mongoose query would be lovely.
I've tried to start with:
db.persons.find({pets:{special:{$exists:true}}});
but that has returned 0 records, even though there should be some.
If the array holds embedded documents, you can query for specific fields in the embedded documents using dot notation.
Without dot notation you are querying array documents for a complete match.
Try the following query:
db.persons.find({'pets.special':{$exists:true}});
You can use the aggregation framework to get the desired result. Run the following aggregation pipeline:
db.persons.aggregate([
{
"$match": {
"pets.special": { "$exists": true }
}
},
{
"$project": {
"name": 1,
"pets": {
"$setDifference": [
{
"$map": {
"input": "$pets",
"as": "el",
"in": {
"$cond": [
{ "$gt": [ "$$el.special", null ] },
"$$el", false
]
}
}
},
[false]
]
}
}
}
])
Sample Output
{
"result" : [
{
"_id" : ObjectId("569d07a38e61973f6aded134"),
"name" : "john",
"pets" : [
{
"name" : "spot",
"type" : "dog",
"special" : "spot eye"
}
]
}
],
"ok" : 1
}
The operators that make the significant difference are $setDifference and $map. The $map operator in essence creates a new array holding the result of evaluating a subexpression against each element of the input array. Here that subexpression is a $cond, which uses the comparison operator $gt to test whether the element's special field is greater than null (that is, whether it exists with a non-null value) and returns either the element itself or false. The $setDifference operator then returns a set with the elements that appear in the first set but not in the second, i.e. the relative complement of the second set in the first. Given [false] as the second set, it strips the false placeholders and leaves a pets array that contains only the pets with a special field.
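The map-then-strip idea can be illustrated with a small plain-JavaScript emulation over the question's sample data. The helper name petsWithSpecial is made up, and the null check is only an approximation of the $gt comparison against null.

```javascript
// Emulation of $map + $cond + $setDifference on the sample persons (no database needed).
const persons = [
  { name: "john", pets: [{ name: "spot", type: "dog", special: "spot eye" }, { name: "bob", type: "cat" }] },
  { name: "susie", pets: [{ name: "fred", type: "cat" }] },
];

// petsWithSpecial is a hypothetical helper name.
function petsWithSpecial(pets) {
  // $map + $cond: keep the pet if "special" exists and is not null, otherwise emit false
  const mapped = pets.map((pet) => (pet.special != null ? pet : false));
  // $setDifference with [false]: drop the false placeholders
  // (the real operator also de-duplicates; that is not emulated here)
  return mapped.filter((entry) => entry !== false);
}

const specialResult = persons
  // $match on "pets.special": { $exists: true }
  .filter((p) => p.pets.some((pet) => "special" in pet))
  // $project of name and the filtered pets
  .map((p) => ({ name: p.name, pets: petsWithSpecial(p.pets) }));

console.log(JSON.stringify(specialResult));
// [{"name":"john","pets":[{"name":"spot","type":"dog","special":"spot eye"}]}]
```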

Matching two equal fields in arrays without using unwind

The problem is that, given documents with two arrays that each contain documents as their elements, I want to find the documents that essentially have:
"obj1.a" === "obj2.b"
So given the sample documents below, but actually expecting much larger arrays, how do I do this?
{
"obj1": [
{ "a": "a", "b": "b" },
{ "a": "a", "b": "c" }
],
"obj2": [
{ "a": "c", "b": "b" },
{ "a": "c", "b": "c" }
]
},
{
"obj1": [
{ "a": "a", "b": "b" }
],
"obj2": [
{ "a": "a", "b": "a" }
]
}
One approach might be to compare these with JavaScript and the $where operator, but looping large arrays from within JavaScript doesn't sound very favorable.
Another approach is using the aggregation framework to do the comparison, but this involves unwinding two arrays on top of each other which can create a lot of documents to be processed in the pipeline:
db.objects.aggregate([
{ "$unwind": "$obj1" },
{ "$unwind": "$obj2" },
{ "$project": {
"match": { "$eq": [ "$obj1.a", "$obj2.b" ] }
}},
{ "$group": {
"_id": "$_id",
"match": { "$max": "$match" }
}},
{ "$match": { "match": true } }
])
Where performance is a concern it is easy to see how the number of documents actually processing through $project and $group can end up many times larger than the original documents in the collection.
So in order to do this there has to be some way of comparing the array elements without needing to perform an $unwind on those arrays and end up grouping the documents back together. How could this be done?
You can get this sort of result using the $map operator that was introduced in MongoDB 2.6. This operates by taking an input array and allowing an expression to be evaluated over each element producing a new array as the result:
db.objects.aggregate([
{ "$project": {
"match": {
"$size": {
"$setIntersection": [
{ "$map": {
"input": "$obj1",
"as": "el",
"in": { "$concat": ["$$el.a",""] }
}},
{ "$map": {
"input": "$obj2",
"as": "el",
"in": { "$concat": ["$$el.b",""] }
}}
]
}
}
}},
{ "$match": { "match": { "$gte": 1 } } }
])
Here this is used with the $setIntersection and $size operators. As the $map returns just the property values from the elements that you want to compare you end up with two arrays just containing those values.
The only catch is that the "in" option for $map currently requires an operator to be present within the Object {} notation of its arguments. You cannot presently say:
"in": "$$el.a"
To get around this we are using $concat to join the string value with an empty string. Other operators can be used for different types, or even $ifNull, which is fairly generic and gets around "type" problems:
"in": { "$ifNull": [ "$$el.a", false ] }
The $setIntersection that wraps these is used to determine which values of those "sets" are the same, and returns its result as another array containing only the matching values.
Finally the $size operator here is an aggregation method that returns the actual "size" of the array as an integer. So this can be used in the following $match to then filter out any results that did not return a "size" value of 1 or greater.
Essentially this does, within two simple passes, all the work that was done in four individual stages, where the first two multiply the number of documents to be processed, and it does so without increasing the number of documents that were received as input.
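For a feel of what the $map, $setIntersection and $size combination computes, here is a plain-JavaScript emulation run against the question's two sample documents. The _id values and the function name countSharedValues are assumptions added for illustration.

```javascript
// Emulation of $size over $setIntersection of two $map results (no database needed).
// The _id values are invented so the matching document can be identified.
const objectDocs = [
  { _id: 1, obj1: [{ a: "a", b: "b" }, { a: "a", b: "c" }], obj2: [{ a: "c", b: "b" }, { a: "c", b: "c" }] },
  { _id: 2, obj1: [{ a: "a", b: "b" }], obj2: [{ a: "a", b: "a" }] },
];

// countSharedValues is a hypothetical helper name.
function countSharedValues(doc) {
  const left = new Set(doc.obj1.map((el) => el.a));  // $map over obj1, keeping "a"
  const right = new Set(doc.obj2.map((el) => el.b)); // $map over obj2, keeping "b"
  // $setIntersection followed by $size
  return [...left].filter((value) => right.has(value)).length;
}

// $match: { match: { $gte: 1 } }
const matchedIds = objectDocs
  .filter((doc) => countSharedValues(doc) >= 1)
  .map((doc) => doc._id);

console.log(matchedIds); // [ 2 ]
```

Only the second document has an obj1.a value ("a") that also appears as an obj2.b value, so it is the only one kept.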

Mongodb aggregation pipeline is slow

I have a database of 30 MB, with 300 documents stored in a single collection, and their sizes vary from 10 KB to 1 MB. I am using the new aggregation framework which comes with 2.6 and I do not have any indexes.
I have an aggregation pipeline as following:
1. $match > first query match
2. $project > exclude some fields for efficiency
3. $unwind > unwind one of the arrays
4. $unwind > unwind second array
5. $project > projection to find matching fields among two arrays with $eq
6. $match > same:true
7. $group > put the unwinded arrays together
8. $limit(50)
The pipeline above takes 30 seconds. If I remove $limit, it takes ages. My question is:
The database size is only 30 MB, and the pipeline is not complicated at all. Why is it taking so long? Any ideas on that?
EDIT
My schema is as following:
{
username: string (max 20 chars)
userid : string max 20 chars
userage : string max 20 chars
userobj1: array of objects, length: ~300-500
// example of userobj1:
[
{
innerobj1: array of objects, length: ~30-50
innerobj2: array of objects, length: ~100-200
userinfo1: string max 20 chars
userinfo2: string max 20 chars
userinfo3: string max 20 chars
userinfo4: string max 20 chars
} ...
]
userobj2: same as userobj1
userobj3: same as userobj1
userobj4: same as userobj1
}
The document above has inner objects up to 3-4 levels deep. Sorry that I cannot provide a real example, but the aliases should be enough. An example query is as follows:
1. $match:
$and : [
{userobj1: $elemMatch: {userinfo1:a}},
{userobj1: $elemMatch: {userinfo4:b}}
]
2. $project {username:1, userid:1, userobj1:1, userobj2:1}
3. $unwind userobj1
4. $unwind userobj2
5. $project
{
username:1,
userid:1,
userobj1:1,
userobj2:1,
userobj3:1,
userobj4:1,
"same" : {
$eq: [ userobj3.userinfo4, userobj4.userinfo4 ]
}
}
6. $match {same:true}
7. $group all arrays back
8. limit 50.
There is something here that I just don't get about what you are actually trying to do here. So please bear with me on the possible actual questions and answers that I see.
Considering this simplified data set to your case:
{
"obj1": [
{ "a": "a", "b": "b" },
{ "a": "a", "b": "c" }
],
"obj2": [
{ "a": "c", "b": "b" },
{ "a": "c", "b": "c" }
]
},
{
"obj1": [
{ "a": "a", "b": "b" }
],
"obj2": [
{ "a": "a", "b": "c" }
]
}
Q: "Are you not just trying to match the documents with { "a": "a", "b": "b" } in "obj1" and also { "b": "b" } in "obj2"?"
If that is the case then this is just a simple query with .find():
db.collection.find({
"obj1": {
"$elemMatch": { "a": "a", "b": "b" }
},
"obj2.b": "b"
})
This matches only the documents that meet the conditions, in this case just the one:
{
"obj1": [
{ "a": "a", "b": "b" },
{ "a": "a", "b": "c" }
],
"obj2": [
{ "a": "c", "b": "b" },
{ "a": "c", "b": "c" }
]
}
Q: "Are you possibly trying to find the positions in the array where your conditions are true?"
If so there are some operators available in MongoDB 2.6 that help you without using $unwind:
db.objects.aggregate([
{ "$match": {
"obj1": {
"$elemMatch": { "a": "a", "b": "b" }
},
"obj2.b": "b"
}},
{ "$project": {
"obj1": 1,
"obj2": 1,
"match1": {
"$map": {
"input": "$obj1",
"as": "el",
"in": {
"$and": [
{ "$eq": [ "$$el.a", "a" ] },
{ "$eq": [ "$$el.b", "b" ] }
]
}
}
},
"match2": {
"$map": {
"input": "$obj2",
"as": "el",
"in": {
"$eq": [ "$$el.b", "b" ]
}
}
}
}}
])
Gives you:
{
"obj1": [
{ "a": "a", "b": "b" },
{ "a": "a", "b": "c" }
],
"obj2": [
{ "a": "c", "b": "b" },
{ "a": "c", "b": "c" }
],
"match1" : [
true,
false
],
"match2" : [
true,
false
]
}
Q: "Or are you possibly trying to "filter" only the matching array elements to those conditions?"
You can do this with more set operators in MongoDB 2.6 without using $unwind:
db.objects.aggregate([
{ "$match": {
"obj1": {
"$elemMatch": { "a": "a", "b": "b" }
},
"obj2.b": "b"
}},
{ "$project": {
"obj1": {
"$setDifference": [
{ "$map": {
"input": "$obj1",
"as": "el",
"in": {
"$cond": [
{ "$and": [
{ "$eq": [ "$$el.a", "a" ] },
{ "$eq": [ "$$el.b", "b" ] }
]},
"$$el",
false
]
}
}},
[false]
]
},
"obj2": {
"$setDifference": [
{ "$map": {
"input": "$obj2",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.b", "b" ] },
"$$el",
false
]
}
}},
[false]
]
}
}}
])
And the result:
{
"obj1": [
{ "a": "a", "b": "b" },
],
"obj2": [
{ "a": "c", "b": "b" },
]
}
The last entry there is the cutest which combines $cond, $map and $setDifference to do some complex filtering of the objects in the array in order to filter just the matches to the conditions. You previously would have to $unwind and $match to get those results.
So it is both $unwind and $group that are not required to actually get to any of these results, and those are really killing you. Also your big "pass through" on the "unwound" arrays with $eq suggests trying to get to the end result of one of the above, but in the way you have implemented it would be very costly.
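A back-of-the-envelope calculation shows how quickly the double $unwind grows. The numbers below come from the question (300 documents, arrays of roughly 300 to 500 elements); the function names are made up, and the figures are upper bounds that assume every document passes the initial $match.

```javascript
// Rough count of documents flowing through the pipeline after two $unwind stages.
// unwoundCount and inPlaceElementCount are hypothetical helper names.
function unwoundCount(docCount, lenA, lenB) {
  // Each document becomes lenA * lenB documents after unwinding both arrays.
  return docCount * lenA * lenB;
}

// Rough count of array elements visited when both arrays are processed in place with $map.
function inPlaceElementCount(docCount, lenA, lenB) {
  return docCount * (lenA + lenB);
}

console.log(unwoundCount(300, 300, 300));        // 27000000
console.log(unwoundCount(300, 500, 500));        // 75000000
console.log(inPlaceElementCount(300, 500, 500)); // 300000
```

So a 30 MB collection can turn into tens of millions of intermediate documents, each carrying a copy of the remaining fields, which is consistent with the slowness described.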
Also try to have an index within one of those arrays for the element to match that is going to reduce your working results down as far as possible. In all cases it's going to improve things even if you cannot have a compound "multi-key" index due to the restrictions there.
Anyhow, I am hoping that at least something here either matches your intent or is at least close to what you are trying to do.
Since your comments went this way, matching values of "obj1.a" to "obj2.b" without the filtering is not much different to the general cases shown.
db.objects.aggregate([
{ "$project": {
"match": {
"$size": {
"$setIntersection": [
{ "$map": {
"input": "$obj1",
"as": "el",
"in": { "$concat": ["$$el.a",""] }
}},
{ "$map": {
"input": "$obj2",
"as": "el",
"in": { "$concat": ["$$el.b",""] }
}}
]
}
}
}},
{ "$match": { "match": { "$gte": 1 } } }
])
All simply done without using $unwind.
I know this is an old question, but it looks like a simple answer was never reached and it involves using an expression that was available in 2.6 so it would have worked back then too. You don't need to do any $unwinding or complex $mapping you just need to do a $setIntersection on the two arrays that you want to find a match in.
Using the example data from the very long answer:
db.foo.aggregate([
{$match:{"obj1.a":"a"}},
{$project:{keep:{$setIntersection:["$obj1.b","$obj2.b"]},obj1:1,obj2:1}},
{$match:{keep:{$ne:[]}}}
])
{ "_id" : ObjectId("588a8206c01d80beca3a8e45"), "obj1" : [ { "a" : "a", "b" : "b" }, { "a" : "a", "b" : "c" } ], "obj2" : [ { "a" : "c", "b" : "b" }, { "a" : "c", "b" : "c" } ], "keep" : [ "b", "c" ] }
Only one of the two documents is kept, the one that had two values of "b" in both obj1 and obj2 arrays.
In your original "syntax" the $project stage would be
same: {$setIntersection: [ '$userobj3.userinfo4', '$userobj4.userinfo4' ]}
My guess is that it takes that long because there are no indexes so it does a full collection scan every time it needs a record.
Try adding an index on userinfo1:a and I think you will see a good performance gain. I will also recommend that you remove the AND syntax from the match phase and rewrite it as a list.
I think it would be really helpful for both you and the question to give us the output of the aggregation's explain. In mongo 2.6 you can have explain in aggregation pipeline.
db.collection.aggregate( [ ... stages ...], { explain:true } )

How to match multiple array elements without using unwind?

I have a collection which contains documents with multiple arrays. These are generally quite large, but for purposes of explaining you can consider the following two documents:
{
"obj1": [
{ "a": "a", "b": "b" },
{ "a": "a", "b": "c" },
{ "a": "a", "b": "b" }
],
"obj2": [
{ "a": "a", "b": "b" },
{ "a": "a", "b": "c" }
]
},
{
"obj1": [
{ "a": "c", "b": "b" }
],
"obj2": [
{ "a": "c", "b": "c" }
]
}
The idea is to just get the matching elements in the array to the query. There are multiple matches required and within multiple arrays so this is not within the scope of what can be done with projection and the positional $ operator. The desired result would be like:
{
"obj1": [
{ "a": "a", "b": "b" },
{ "a": "a", "b": "b" }
],
"obj2": [
{ "a": "a", "b": "b" },
]
},
A traditional approach would be something like this:
db.objects.aggregate([
{ "$match": {
"obj1": {
"$elemMatch": { "a": "a", "b": "b" }
},
"obj2.b": "b"
}},
{ "$unwind": "$obj1" },
{ "$match": {
"obj1.a": "a",
"obj1.b": "b"
}},
{ "$unwind": "$obj2" },
{ "$match": { "obj2.b": "b" }},
{ "$group": {
"_id": "$_id",
"obj1": { "$addToSet": "$obj1" },
"obj2": { "$addToSet": "$obj2" }
}}
])
But the use of $unwind there for both arrays causes the overall set to use a lot of memory and slows things down. There are also possible problems there with $addToSet and splitting the $group stages for each array can make things even slower.
So I am looking for a process that is not so intensive but arrives at the same result.
Since MongoDB 3.2 we have the $filter operator, which makes this really quite simple:
db.objects.aggregate([
{ "$match": {
"obj1": {
"$elemMatch": { "a": "a", "b": "b" }
},
"obj2.b": "b"
}},
{ "$project": {
"obj1": {
"$filter": {
"input": "$obj1",
"as": "el",
"cond": {
"$and": [
{ "$eq": [ "$$el.a", "a" ] },
{ "$eq": [ "$$el.b", "b" ] }
]
}
}
},
"obj2": {
"$filter": {
"input": "$obj2",
"as": "el",
"cond": { "$eq": [ "$$el.b", "b" ] }
}
}
}}
])
For earlier releases, MongoDB 2.6 introduced the $map operator, which can act on arrays in place without the need to $unwind. Combined with some other logical operators and the additional set operators that were added to the aggregation framework, there is a solution to this problem and others.
db.objects.aggregate([
{ "$match": {
"obj1": {
"$elemMatch": { "a": "a", "b": "b" }
},
"obj2.b": "b"
}},
{ "$project": {
"obj1": {
"$setDifference": [
{ "$map": {
"input": "$obj1",
"as": "el",
"in": {
"$cond": [
{ "$and": [
{ "$eq": [ "$$el.a", "a" ] },
{ "$eq": [ "$$el.b", "b" ] }
]},
"$$el",
false
]
}
}},
[false]
]
},
"obj2": {
"$setDifference": [
{ "$map": {
"input": "$obj2",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.b", "b" ] },
"$$el",
false
]
}
}},
[false]
]
}
}}
])
The core of this is in the $map operator which works like an internalized $unwind by allowing processing of all the array elements, but also allows operations to act on those array elements in the same statement. Typically this would be done in several pipeline stages but here we can process within a single $project, $group or $redact stage.
In this case that inner processing utilizes the $cond operator which combines with a logical condition in order to return a different result for true or false. Here we act on usage of the $eq operator to test values of the fields contained within the current element in much the same way as a separate $match pipeline stage would be used. The $and condition is another logical operator which works on combining the results of multiple conditions on the element, much in the same way as the $elemMatch operator would work within a $match pipeline stage.
Finally, since our $cond operator was used to either return the value of the current element or false if the condition was not true, we need to "filter" any false values from the array produced by the $map operation. This is where the $setDifference operator is used to compare the two input arrays and return the difference. So when compared to an array that only contains false for its element, the result will be the elements that were returned from the $map without the false elements coming out of $cond when the conditions were not met. Note that because $setDifference treats its inputs as sets, identical matching elements are collapsed into one, whereas $filter keeps duplicates.
The result filters only the matching elements from the array without having to run through separate pipeline stages for $unwind, $match and $group.
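The difference between the two approaches shows up when the array contains duplicate matches, as obj1 in the question does. The sketch below emulates both in plain JavaScript; isMatch and viaSetDifference are hypothetical names, and JSON.stringify is used as a crude stand-in for the server's value equality.

```javascript
// obj1 from the first sample document, which holds the same matching element twice.
const sampleObj1 = [
  { a: "a", b: "b" },
  { a: "a", b: "c" },
  { a: "a", b: "b" },
];
const isMatch = (el) => el.a === "a" && el.b === "b";

// $filter: keeps every matching element, duplicates included.
const viaFilter = sampleObj1.filter(isMatch);

// $map + $cond + $setDifference: the result is a set, so equal elements collapse.
function viaSetDifference(array) {
  const mapped = array.map((el) => (isMatch(el) ? el : false));
  const seen = new Set();
  const out = [];
  for (const entry of mapped) {
    if (entry === false) continue;      // removed by the difference with [false]
    const key = JSON.stringify(entry);  // crude value-equality key, enough for this sample
    if (!seen.has(key)) {
      seen.add(key);
      out.push(entry);
    }
  }
  return out;
}

console.log(viaFilter.length);                    // 2
console.log(viaSetDifference(sampleObj1).length); // 1
```

If the desired output must keep both copies, as in the question's expected result, the $filter form is the one that preserves them.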
To return more than one match:
const { timeSlots } = req.body;
let ts = [];
for (const slot of timeSlots) {
ts.push({
$eq: ['$$timeSlots.id',slot.id],
});
}
const products = await Product.aggregate<ProductDoc>([
{
$match: {
_id: req.params.productId,
recordStatus: RecordStatus.Active,
},
},
{
$project: {
timeSlots: {
$filter: {
input: '$timeSlots',
as: 'timeSlots',
cond: {
$or: ts,
},
},
},
name: 1,
mrp: 1,
},
},
]);
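The loop that builds one $eq per slot could possibly be replaced by a single $in expression, which the aggregation framework supports from MongoDB 3.4. The sketch below only builds the stage objects in plain JavaScript so their shape can be inspected; the payload, the slot variable name and the helper timeSlotProjection are assumptions, and nothing here talks to a database.

```javascript
// Hypothetical request payload; only the "id" field of each slot is used.
const requestedSlots = [{ id: "s1" }, { id: "s2" }];

// Approach from the answer above: one $eq per requested slot, combined with $or.
const orCond = {
  $or: requestedSlots.map((slot) => ({ $eq: ["$$slot.id", slot.id] })),
};

// Alternative sketch: a single $in test against the list of requested ids.
const inCond = {
  $in: ["$$slot.id", requestedSlots.map((slot) => slot.id)],
};

// Builds the $project stage around either condition (helper name is made up).
function timeSlotProjection(cond) {
  return {
    $project: {
      timeSlots: { $filter: { input: "$timeSlots", as: "slot", cond } },
      name: 1,
      mrp: 1,
    },
  };
}

console.log(JSON.stringify(timeSlotProjection(orCond), null, 2));
console.log(JSON.stringify(timeSlotProjection(inCond), null, 2));
```

Either stage would then be passed to aggregate in place of the hand-built $project; which one reads better is largely a matter of taste.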