Using $elemMatch and $or to implement a fallback logic (in projection) - mongodb

db.projects.findOne({"_id": "5CmYdmu2Aanva3ZAy"},
{
"responses": {
"$elemMatch": {
"match.nlu": {
"$elemMatch": {
"intent": "intent1",
"$and": [
{
"$or": [
{
"entities.entity": "entity1",
"entities.value": "value1"
},
{
"entities.entity": "entity1",
"entities.value": {
"$exists": false
}
}
]
}
],
"entities.1": {
"$exists": false
}
}
}
}
}
})
In a given project I need a projection containing only one response, hence $elemMatch. Ideally, look for an exact match:
{
"entities.entity": "entity1",
"entities.value": "value1"
}
But if such a match doesn't exist, look for a record where entities.value does not exist.
The query above doesn't work because if it finds an item with entities.value not set, it returns it, even when an exact match exists. How can I get this fallback logic in a Mongo query?
Here is an example document:
{
"_id": "5CmYdmu2Aanva3ZAy",
"responses": [
{
"match": {
"nlu": [
{
"entities": [],
"intent": "intent1"
}
]
},
"key": "utter_intent1_p3vE6O_XsT"
},
{
"match": {
"nlu": [
{
"entities": [{
"entity": "entity1",
"value": "value1"
}],
"intent": "intent1"
}
]
},
"key": "utter_intent1_p3vE6O_XsT"
},
{
"match": {
"nlu": [
{
"intent": "intent2",
"entities": []
},
{
"intent": "intent1",
"entities": [
{
"entity": "entity1"
}
]
}
]
},
"key": "utter_intent2_Laag5aDZv2"
}
]
}

To answer the question: what you want is not as simple as an $elemMatch projection and requires the projection logic of the aggregation framework. The second main principle here is that "nesting arrays is a really bad idea", and this is exactly why:
db.collection.aggregate([
{ "$match": { "_id": "5CmYdmu2Aanva3ZAy" } },
{ "$addFields": {
"responses": {
"$filter": {
"input": {
"$map": {
"input": "$responses",
"in": {
"match": {
"nlu": {
"$filter": {
"input": {
"$map": {
"input": "$$this.match.nlu",
"in": {
"entities": {
"$let": {
"vars": {
"entities": {
"$filter": {
"input": "$$this.entities",
"cond": {
"$and": [
{ "$eq": [ "$$this.entity", "entity1" ] },
{ "$or": [
{ "$eq": [ "$$this.value", "value1" ] },
{ "$ifNull": [ "$$this.value", false ] }
]}
]
}
}
}
},
"in": {
"$cond": {
"if": { "$gt": [{ "$size": "$$entities" }, 1] },
"then": {
"$slice": [
{ "$filter": {
"input": "$$entities",
"cond": { "$eq": [ "$$this.value", "value1" ] }
}},
0
]
},
"else": "$$entities"
}
}
}
},
"intent": "$$this.intent"
}
}
},
"cond": { "$ne": [ "$$this.entities", [] ] }
}
}
},
"key": "$$this.key"
}
}
},
"cond": { "$ne": [ "$$this.match.nlu", [] ] }
}
}
}}
])
Will return:
{
"_id" : "5CmYdmu2Aanva3ZAy",
"responses" : [
{
"match" : {
"nlu" : [
{
"entities" : [
{
"entity" : "entity1",
"value" : "value1"
}
],
"intent" : "intent1"
}
]
},
"key" : "utter_intent1_p3vE6O_XsT"
}
]
}
That is extracting ( as best I can determine your specification ), the first matching element from the nested inner array of entities where the conditions for both entity and value are met OR where the value property does not exist.
Note the additional fallback in that if both conditions meant returning multiple array elements, then only the first match where the value was present and matching would be the result returned.
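Stated outside of MongoDB, the fallback the question asks for is a two-pass search: look for an exact match first, and only if there is none accept an entity without a value. The following is a minimal plain JavaScript sketch of that rule, not a MongoDB API. The function name pickResponse, its parameters, and the sample data are hypothetical, and it assumes (like the "entities.1" check in the question) that a matching nlu entry holds exactly one entity.

```javascript
// Plain JavaScript sketch of the fallback rule (illustrative, not a MongoDB API).
// pickResponse and its parameters are hypothetical names.
function pickResponse(responses, intent, entity, value) {
  // True when some nlu entry has the intent and a single entity that passes `test`.
  const hasNlu = (response, test) =>
    response.match.nlu.some(
      (n) =>
        n.intent === intent &&
        n.entities.length === 1 &&
        n.entities[0].entity === entity &&
        test(n.entities[0])
    );

  // 1. Prefer a response whose entity carries the exact value.
  const exact = responses.find((r) => hasNlu(r, (e) => e.value === value));
  if (exact) return exact;

  // 2. Otherwise fall back to a response whose entity has no value at all.
  return responses.find((r) => hasNlu(r, (e) => e.value === undefined)) || null;
}

// Sample data made up for illustration; the fallback candidate comes first on purpose.
const responses = [
  { key: "a", match: { nlu: [{ intent: "intent1", entities: [] }] } },
  { key: "b", match: { nlu: [{ intent: "intent1", entities: [{ entity: "entity1" }] }] } },
  { key: "c", match: { nlu: [{ intent: "intent1", entities: [{ entity: "entity1", value: "value1" }] }] } },
];

const exactHit = pickResponse(responses, "intent1", "entity1", "value1");
const fallbackHit = pickResponse(responses, "intent1", "entity1", "other");
console.log(exactHit.key, fallbackHit.key);
```

The point of the two passes is that the order of the array no longer decides the outcome: the exact match wins even though the value-less entry appears before it.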
Querying deeply nested arrays requires chaining $map and $filter to traverse the array contents and return only the items that match the conditions. You cannot specify these conditions in an $elemMatch projection, and until recent releases of MongoDB it was not even possible to atomically update such structures without overwriting significant parts of the document or introducing problems with update concurrency.
More detailed explanation of this is on my existing answer to Updating a Nested Array with MongoDB and from the query side on Find in Double Nested Array MongoDB.
Note that both responses there show usage of $elemMatch as a "query" operator, which is really only about "document selection" ( therefore does not apply to an _id match condition ) and cannot be used in concert with the former "projection" variant nor the positional $ projection operator.
You would be advised then to "not nest arrays" and instead take the option of "flatter" data structures as those answers already discuss at length.

Related

How to search an array of objects in mongoDB without using aggregate query?

We have a use case where the data is stored in the below format
[
{
"Name": [
{
"KM": "2"
},
{
"Weld Joint Number": "JN2"
},
{
"Status": "Accepted"
},
{
"Upstream": "PP1"
},
{
"Downstream": "PP2"
}
]
},
{
"Name": [
{
"Pipe No": "PP5731A-08"
},
{
"Km": "1"
},
{
"Section Length (m)": "12.22"
}
]
}
]
We are looking for a way to search these records with a find query (without aggregate) that matches the search criteria against the values in that array of objects.
In the search scenario, the search value can match any value in the array.
Here's one way to find documents with a given single value in an object nested in arrays.
db.collection.find({
"$expr": {
"$in": [
// search value goes here
"Accepted",
{ // this puts all the values into a flat array
"$reduce": {
// instead of "stuff", use the field name of the array
"input": "$stuff.Name",
"initialValue": [],
"in": {
"$concatArrays": [
"$$value",
{
"$map": {
"input": "$$this",
"as": "elem",
"in": {
"$getField": {
"field": "v",
"input": {"$first": {"$objectToArray": "$$elem"}}
}
}
}
}
]
}
}
}
]
}
})
Try it on mongoplayground.net.
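For readers who find the nested $reduce / $map / $objectToArray expression hard to follow, here is a rough plain JavaScript analogue of what it computes. It is only a sketch: containsValue is a hypothetical name, "stuff" is the same placeholder field name used in the query above, and the sample document is made up.

```javascript
// Plain JavaScript analogue of the $reduce/$map/$objectToArray expression (illustrative only).
function containsValue(doc, searched) {
  // Flatten every Name array, taking the value of the first key of each single-key object.
  const values = doc.stuff.flatMap((item) =>
    item.Name.map((elem) => Object.values(elem)[0])
  );
  // Equivalent of { $in: [searched, values] }.
  return values.includes(searched);
}

// Sample document made up for illustration.
const sampleDoc = {
  stuff: [
    { Name: [{ KM: "2" }, { Status: "Accepted" }] },
    { Name: [{ "Pipe No": "PP5731A-08" }, { Km: "1" }] },
  ],
};

const found = containsValue(sampleDoc, "Accepted");
const missing = containsValue(sampleDoc, "Rejected");
console.log(found, missing);
```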

MongoDB $filter nested array by date does not work

I have a document with a nested array which looks like this:
[
{
"id": 1,
data: [
[
ISODate("2000-01-01T00:00:00Z"),
2,
3
],
[
ISODate("2000-01-03T00:00:00Z"),
2,
3
],
[
ISODate("2000-01-05T00:00:00Z"),
2,
3
]
]
},
{
"id": 2,
data: []
}
]
As you can see, we have an array of arrays. For each element in the data array, the first element is a date.
I wanted to create an aggregation pipeline which filters only the elements of data where the date is larger than a given date.
db.collection.aggregate([
{
"$match": {
"id": 1
}
},
{
"$project": {
"data": {
"$filter": {
"input": "$data",
"as": "entry",
"cond": {
"$gt": [
"$$entry.0",
ISODate("2000-01-04T00:00:00Z")
]
}
}
}
}
}
])
The problem is that with $gt, this just returns an empty array for data. With $lt this returns all elements. So the filtering clearly does not work.
Expected result:
[
{
"id": 1,
"data": [
[
ISODate("2000-01-05T00:00:00Z"),
2,
3
]
]
}
]
Any ideas?
I believe the issue is that in an aggregation expression $$entry.0 is not read as "element 0 of entry": the .0 is treated as a field name, and since the inner array holds no sub-documents with a field called 0, the comparison never sees the date. You can use the $first array operator to get the first element instead, like so:
db.collection.aggregate([
{
"$match": {
"id": 1
}
},
{
"$project": {
"data": {
"$filter": {
"input": "$data",
"as": "entry",
"cond": {
"$gt": [
{
$first: "$$entry"
},
ISODate("2000-01-04T00:00:00Z")
]
}
}
}
}
}
])
I don't think $$entry.0 works to get the first element of the array. Instead, use the $arrayElemAt operator.
db.collection.aggregate([
{
"$match": {
"id": 1
}
},
{
"$project": {
"data": {
"$filter": {
"input": "$data",
"as": "entry",
"cond": {
"$gt": [
{
"$arrayElemAt": [
"$$entry",
0
]
},
ISODate("2000-01-04T00:00:00Z")
]
}
}
}
}
}
])
To specify which element of the array you are comparing, it is better to use $arrayElemAt instead of a path such as $$ARRAY.0. $arrayElemAt takes two arguments: the first is the array, which in your case is $$entry, and the second is the index, which in your case is 0.
This is the solution I came up with:
db.collection.aggregate([
{
"$match": {
"id": 1
}
},
{
"$project": {
"data": {
"$filter": {
"input": "$data",
"as": "entry",
"cond": {
"$gt": [
{
"$arrayElemAt": [
"$$entry",
0
]
},
ISODate("2000-01-04T00:00:00Z")
]
}
}
}
}
}
])
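As a mental model, the $filter with $arrayElemAt above behaves roughly like the following plain JavaScript. This is just a sketch to show what the pipeline stage computes; keepAfter is a hypothetical helper name and nothing here talks to MongoDB.

```javascript
// Plain JavaScript analogue of $filter + $arrayElemAt (illustrative only).
function keepAfter(data, cutoff) {
  // entry[0] plays the role of { $arrayElemAt: ["$$entry", 0] }.
  return data.filter((entry) => entry[0] > cutoff);
}

const data = [
  [new Date("2000-01-01T00:00:00Z"), 2, 3],
  [new Date("2000-01-03T00:00:00Z"), 2, 3],
  [new Date("2000-01-05T00:00:00Z"), 2, 3],
];

const kept = keepAfter(data, new Date("2000-01-04T00:00:00Z"));
console.log(kept.length);
```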

MongoDB/PyMongo: upsert array element

I have the following document:
{'software_house': 'k1',
'client_id': '1234',
'transactions': [
{'antecedents': 12345,
'consequents': '015896018',
'antecedent support': 0.0030889166727954697},
{'antecedents': '932696735',
'consequents': '939605046',
'antecedent support': 0.0012502757961314996}
...
]}
Here the 'transactions' key stores an array in which each item has 3 features.
I would like to update the item in the 'transactions' array that matches a given 'software_house', 'client_id', 'transactions.antecedents' and 'transactions.consequents', that is:
Overwrite the element within the array if it exists
Append a new element to 'transactions' if it doesn't
How could I achieve that using pymongo?
You can do this with an update that uses an aggregation pipeline. First $filter out the matched element, then $setUnion the result with the item you want to upsert.
PyMongo:
db.collection.update_many(
    filter={
        # the criteria you want to match outside the array
        "software_house": "k1",
        "client_id": "1234",
    },
    update=[
        {
            "$addFields": {
                "transactions": {
                    "$filter": {
                        "input": "$transactions",
                        "as": "t",
                        "cond": {
                            "$not": {
                                "$and": [
                                    # the criteria you want to match in the array
                                    {"$eq": ["$$t.antecedents", 12345]},
                                    {"$eq": ["$$t.consequents", "015896018"]},
                                ]
                            }
                        },
                    }
                }
            }
        },
        {
            "$addFields": {
                "transactions": {
                    "$setUnion": [
                        "$transactions",
                        [
                            {
                                "antecedents": 12345,
                                "consequents": "the entry you want to upsert",
                                "antecedent support": -1,
                            }
                        ],
                    ]
                }
            }
        },
    ],
)
Native MongoDB query:
db.collection.update({
// the criteria you want to match outside array
"software_house": "k1",
"client_id": "1234"
},
[
{
"$addFields": {
"transactions": {
"$filter": {
"input": "$transactions",
"as": "t",
"cond": {
$not: {
$and: [
// the criteria you want to match in array
{
$eq: [
"$$t.antecedents",
12345
]
},
{
$eq: [
"$$t.consequents",
"015896018"
]
}
]
}
}
}
}
}
},
{
"$addFields": {
"transactions": {
"$setUnion": [
"$transactions",
[
{
"antecedents": 12345,
"consequents": "the entry you want to upsert",
"antecedent support": -1
}
]
]
}
}
}
],
{
multi: true
})
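The two pipeline stages boil down to "remove the old element if present, then add the new one". Here is a minimal plain JavaScript sketch of that idea, run on an in-memory array rather than against MongoDB; upsertTransaction is a hypothetical name and the sample values are made up. Note that $setUnion treats the array as a set, so duplicates collapse and element order is not guaranteed, whereas this sketch simply appends.

```javascript
// Plain JavaScript analogue of the $filter + $setUnion pipeline (illustrative only).
function upsertTransaction(transactions, entry) {
  // Drop any element with the same (antecedents, consequents) pair...
  const others = transactions.filter(
    (t) =>
      !(t.antecedents === entry.antecedents &&
        t.consequents === entry.consequents)
  );
  // ...then add the new entry, so it either replaces the old one or is appended.
  return [...others, entry];
}

// Sample data made up for illustration.
const transactions = [
  { antecedents: 12345, consequents: "015896018", "antecedent support": 0.003 },
  { antecedents: "932696735", consequents: "939605046", "antecedent support": 0.001 },
];

const replaced = upsertTransaction(transactions, {
  antecedents: 12345,
  consequents: "015896018",
  "antecedent support": -1,
});
const appended = upsertTransaction(transactions, {
  antecedents: 1,
  consequents: "2",
  "antecedent support": 0.5,
});
console.log(replaced.length, appended.length);
```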

MongoDB how to filter in nested array

I have the data below. In the inner array that belongs to name=name2, I want to keep only value=v2 and remove every other value. How do I write an aggregation for this? The hard part for me is filtering only the nestedArray that belongs to name=name2.
{
"_id": 1,
"array": [
{
"name": "name1",
"nestedArray": [
{
"value": "v1"
},
{
"value": "v2"
}
]
},
{
"name": "name2",
"nestedArray": [
{
"value": "v1"
},
{
"value": "v2"
}
]
}
]
}
And the desired output is below. Please note the value=v1 remains under name=name1 while value=v1 under name=name2 is removed.
{
"_id": 1,
"array": [
{
"name": "name1",
"nestedArray": [
{
"value": "v1"
},
{
"value": "v2"
}
]
},
{
"name": "name2",
"nestedArray": [
{
"value": "v2"
}
]
}
]
}
You can try:
$set to update the array field, $map to iterate over it, and $cond to check whether name is name2; if so, $filter keeps only the elements of nestedArray whose value is v2, and $mergeObjects merges the filtered nestedArray back into the current object.
let name = "name2", value = "v2";
db.collection.aggregate([
{
$set: {
array: {
$map: {
input: "$array",
in: {
$mergeObjects: [
"$$this",
{
$cond: [
{ $eq: ["$$this.name", name] }, //name add here
{
nestedArray: {
$filter: {
input: "$$this.nestedArray",
cond: { $eq: ["$$this.value", value] } //value add here
}
}
},
{}
]
}
]
}
}
}
}
}
])
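If the $map / $mergeObjects / $cond combination is hard to read, the same transformation in plain JavaScript may help as a mental model. This is only a sketch on an in-memory array; filterNested is a hypothetical name. The spread syntax plays the role of $mergeObjects and the ternary plays the role of $cond.

```javascript
// Plain JavaScript analogue of $map + $mergeObjects + $cond + $filter (illustrative only).
function filterNested(array, name, value) {
  return array.map((item) =>
    item.name === name
      ? // Overwrite nestedArray with its filtered version, keep every other key.
        { ...item, nestedArray: item.nestedArray.filter((n) => n.value === value) }
      : // Other elements pass through unchanged (the empty {} branch of $cond).
        item
  );
}

const array = [
  { name: "name1", nestedArray: [{ value: "v1" }, { value: "v2" }] },
  { name: "name2", nestedArray: [{ value: "v1" }, { value: "v2" }] },
];

const result = filterNested(array, "name2", "v2");
console.log(JSON.stringify(result));
```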
You can use the following aggregation query:
db.collection.aggregate([
{
$project: {
"array": {
"$concatArrays": [
{
"$filter": {
"input": "$array",
"as": "array",
"cond": {
"$ne": [
"$$array.name",
"name2"
]
}
}
},
{
"$filter": {
"input": {
"$map": {
"input": "$array",
"as": "array",
"in": {
"name": "$$array.name",
"nestedArray": {
"$filter": {
"input": "$$array.nestedArray",
"as": "nestedArray",
"cond": {
"$eq": [
"$$nestedArray.value",
"v2"
]
}
}
}
}
}
},
"as": "array",
"cond": {
"$eq": [
"$$array.name",
"name2"
]
}
}
}
]
}
}
}
])

How to find document and single subdocument matching given criterias in MongoDB collection

I have a collection of products. Each product contains an array of items.
> db.products.find().pretty()
{
"_id" : ObjectId("54023e8bcef998273f36041d"),
"shop" : "shop1",
"name" : "product1",
"items" : [
{
"date" : "01.02.2100",
"purchasePrice" : 1,
"sellingPrice" : 10,
"count" : 15
},
{
"date" : "31.08.2014",
"purchasePrice" : 10,
"sellingPrice" : 1,
"count" : 5
}
]
}
Can you please advise how I can query MongoDB to retrieve all products, each with only the single item whose date equals the date I pass to the query as a parameter?
The result for "31.08.2014" must be:
{
"_id" : ObjectId("54023e8bcef998273f36041d"),
"shop" : "shop1",
"name" : "product1",
"items" : [
{
"date" : "31.08.2014",
"purchasePrice" : 10,
"sellingPrice" : 1,
"count" : 5
}
]
}
What you are looking for is the positional $ operator and "projection". For a single field you need to match the required array element using "dot notation", for more than one field use $elemMatch:
db.products.find(
{ "items.date": "31.08.2014" },
{ "shop": 1, "name":1, "items.$": 1 }
)
Or the $elemMatch for more than one matching field:
db.products.find(
{ "items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}},
{ "shop": 1, "name":1, "items.$": 1 }
)
These work for a single array element only though and only one will be returned. If you want more than one array element to be returned from your conditions then you need more advanced handling with the aggregation framework.
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$unwind": "$items" },
{ "$match": { "items.date": "31.08.2014" } },
{ "$group": {
"_id": "$_id",
"shop": { "$first": "$shop" },
"name": { "$first": "$name" },
"items": { "$push": "$items" }
}}
])
Or possibly in shorter/faster form since MongoDB 2.6 where your array of items contains unique entries:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$project": {
"shop": 1,
"name": 1,
"items": {
"$setDifference": [
{ "$map": {
"input": "$items",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.date", "31.08.2014" ] },
"$$el",
false
]
}
}},
[false]
]
}
}}
])
Or possibly with $redact, but a little contrived:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$redact": {
"$cond": [
{ "$eq": [ { "$ifNull": [ "$date", "31.08.2014" ] }, "31.08.2014" ] },
"$$DESCEND",
"$$PRUNE"
]
}}
])
More modern, you would use $filter:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$addFields": {
"items": {
"$filter": {
"input": "$items",
"cond": { "$eq": [ "$$this.date", "31.08.2014" ] }
}
}
}}
])
And with multiple conditions, the $elemMatch and $and within the $filter:
db.products.aggregate([
{ "$match": {
"items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}
}},
{ "$addFields": {
"items": {
"$filter": {
"input": "$items",
"cond": {
"$and": [
{ "$eq": [ "$$this.date", "31.08.2014" ] },
{ "$eq": [ "$$this.purchasePrice", 1 ] }
]
}
}
}
}}
])
So it just depends on whether you always expect a single element to match or multiple elements, and then which approach is better. But where possible the .find() method will generally be faster since it lacks the overhead of the other operations, although the last two forms do not lag that far behind.
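The difference between the positional $ projection and $filter is essentially the difference between "first match" and "all matches". A rough plain JavaScript analogy, with made-up sample items, would be:

```javascript
// Plain JavaScript analogy (illustrative only, sample data made up).
const items = [
  { date: "31.08.2014", purchasePrice: 10 },
  { date: "01.02.2100", purchasePrice: 1 },
  { date: "31.08.2014", purchasePrice: 7 },
];

// Like the positional "items.$" projection: only the first matching element.
const firstOnly = items.find((i) => i.date === "31.08.2014");

// Like $filter in the aggregation pipeline: every matching element.
const allMatches = items.filter((i) => i.date === "31.08.2014");

console.log(firstOnly.purchasePrice, allMatches.length);
```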
As a side note, your "dates" are represented as strings which is not a very good idea going forward. Consider changing these to proper Date object types, which will greatly help you in the future.
Based on Neil Lunn's code, I use this solution; it automatically includes all first-level keys (but you could also exclude keys if you want):
db.products.find(
{ "items.date": "31.08.2014" },
{ items: { $elemMatch: { date: "31.08.2014" } } },
)
With multiple requirements:
db.products.find(
{ "items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}},
{ items: { $elemMatch: { "date": "31.08.2014", "purchasePrice": 1 } } },
)
Mongo supports dot notation for sub-queries.
See: http://docs.mongodb.org/manual/reference/glossary/#term-dot-notation
Depending on your driver, you want something like:
db.products.find({"items.date":"31.08.2014"});
Note that the attribute is in quotes for dot notation, even if usually your driver doesn't require this.
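To see why a path like "items.date" can match inside an array, it can help to picture how a dotted path is resolved. The sketch below is a deliberately simplified plain JavaScript model, not MongoDB's actual matching code: valuesAtPath is a hypothetical name, and it ignores details such as numeric array indexes in the path.

```javascript
// Simplified model of dot-notation resolution (illustrative only).
// Collects every value reachable through a dotted path, descending into arrays on the way.
function valuesAtPath(doc, path) {
  let current = [doc];
  for (const key of path.split(".")) {
    current = current
      .flatMap((v) => (Array.isArray(v) ? v : [v])) // step into array elements
      .filter((v) => v !== null && typeof v === "object")
      .map((v) => v[key])
      .filter((v) => v !== undefined);
  }
  return current.flatMap((v) => (Array.isArray(v) ? v : [v]));
}

const product = {
  shop: "shop1",
  items: [
    { date: "01.02.2100", count: 15 },
    { date: "31.08.2014", count: 5 },
  ],
};

const dates = valuesAtPath(product, "items.date");
const matches = dates.includes("31.08.2014");
console.log(dates, matches);
```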