mongodb - how it processes $lt on subdocuments - mongodb

I have a collection where one of the fields is a subdocument. I am confused about how MongoDB supports the $lt and $gt query operators on the complete subdocument.
Sample:
db.test.insert({a:1, subdocA:{x:4, y:7, z:10}, b:10})
db.test.insert({a:9, subdocA:{x:2, y:70, z:5}, b:9})
db.test.insert({a:4, subdocA:{x:8, y:2, z:45}, b:19})
In the above collection, I see that mongodb supports a query like:
db.test.find({subdocA:{$lt:{x:6, y:5, z:25}}})
In fact it also supports similar queries with $gt operator. It also supports sort({subdocA:1}) on the query.
I would like to know the "logic" it uses to compare the subdocuments and thereby process the $lt, $gt operators.
I see mongodb documentation about how exact matches are processed with subdocuments. But I don't see any documentation on how $lt, $gt are handled with subdocuments.
Thanks.

You have to specify the operator for each field, naming the field with a dot (.) to reach inside the embedded document. The documentation about $gt hints at this.
So to query for subdocuments with z lower than 20, you actually search for subdocA.z being lower than 20, like this:
> db.test.find({'subdocA.z':{$lt:20}}, {_id:0})
{ "a" : 1, "subdocA" : { "x" : 4, "y" : 7, "z" : 10 }, "b" : 10 }
{ "a" : 9, "subdocA" : { "x" : 2, "y" : 70, "z" : 5 }, "b" : 9 }
You can add other criteria in the same way, here with subdocA.x lower than 3:
> db.test.find({'subdocA.z':{$lt:20}, 'subdocA.x':{$lt:3}}, {_id:0})
{ "a" : 9, "subdocA" : { "x" : 2, "y" : 70, "z" : 5 }, "b" : 9 }
Finally, you can mix and match fields from the "base" document:
> db.test.find({'subdocA.z':{$lt:20}, 'a':{$gt:3}}, {_id:0})
{ "a" : 9, "subdocA" : { "x" : 2, "y" : 70, "z" : 5 }, "b" : 9 }
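As for the whole-subdocument comparison the question asks about: as far as I understand MongoDB's documented comparison order, embedded documents are compared pair by pair in field order (value type first, then field name, then value), and a document that runs out of pairs first sorts lower. The plain JavaScript sketch below imitates that rule for numeric fields only. compareSubdocs is a hypothetical helper written for this illustration, not a MongoDB API, and it runs in Node without a server.

```javascript
// Hypothetical helper: imitates the pairwise comparison of embedded
// documents, restricted to numeric field values for simplicity.
function compareSubdocs(left, right) {
  const l = Object.entries(left);
  const r = Object.entries(right);
  const n = Math.min(l.length, r.length);
  for (let i = 0; i < n; i++) {
    const [lName, lValue] = l[i];
    const [rName, rValue] = r[i];
    // Types are assumed equal (all numbers), so go straight to field names.
    if (lName !== rName) return lName < rName ? -1 : 1;
    if (lValue !== rValue) return lValue < rValue ? -1 : 1;
  }
  // All shared pairs are equal: the document with fewer pairs is smaller.
  return l.length - r.length;
}

// Sample documents from the question.
const docs = [
  { a: 1, subdocA: { x: 4, y: 7, z: 10 }, b: 10 },
  { a: 9, subdocA: { x: 2, y: 70, z: 5 }, b: 9 },
  { a: 4, subdocA: { x: 8, y: 2, z: 45 }, b: 19 },
];

// Imitates find({subdocA: {$lt: {x:6, y:5, z:25}}}).
const bound = { x: 6, y: 5, z: 25 };
const matched = docs.filter((d) => compareSubdocs(d.subdocA, bound) < 0);
console.log(matched.map((d) => d.a)); // [ 1, 9 ]
```

Under this rule the first field, x, already decides the outcome for all three sample documents; y and z would only be looked at if x were equal. That is why the dot-notation queries above are usually the better tool when you care about one particular field.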

Related

How can I update a field starting with dollar in an embedded document in MongoDB?

I have a document in a MongoDB collection with a field x whose value is an embedded document, created this way:
> db.c.insert({_id: 1, x: {$a: 2, b: 3}})
WriteResult({ "nInserted" : 1 })
> db.c.findOne({_id: 1})
{ "_id" : 1, "x" : { "$a" : 2, "b" : 3 } }
Note there is a field in the embedded document which starts with a dollar sign ($a) and a field that doesn't (b). I can update the field without the dollar sign without any problem:
> db.c.updateOne({_id: 1}, {$set: {"x.b": 30}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.c.findOne({_id: 1})
{ "_id" : 1, "x" : { "$a" : 2, "b" : 30 } }
However, if I try to update the field which starts with a dollar sign in the same way, I get an error:
> db.c.updateOne({_id: 1}, {$set: {"x.$a": 20}})
WriteError({
"index" : 0,
"code" : 52,
"errmsg" : "The dollar ($) prefixed field '$a' in 'x.$a' is not valid for storage.",
"op" : {
"q" : {
"_id" : 1
},
"u" : {
"$set" : {
"x.$a" : 20
}
},
"multi" : false,
"upsert" : false
}
})
Thus, how can I update a field starting with a dollar sign in an embedded document?
I'm using MongoDB 4.4.1, in case you need to know.
Thanks!
It is not possible to update a field that starts with $; see the MongoDB documentation on field names:
Restrictions on Field Names:
Field names cannot contain the null character.
Top-level field names cannot start with the dollar sign ($) character.
Otherwise, starting in MongoDB 3.6, the server permits storage of field names that contain dots (i.e. .) and dollar signs (i.e. $).
IMPORTANT
The MongoDB Query Language cannot always meaningfully express queries over documents whose field names contain these characters (see SERVER-30575).
Until support is added in the query language, the use of $ and . in field names is not recommended and is not supported by the official MongoDB drivers.

MongoDB $or + sort + index. How to avoid sorting in memory?

I am having trouble creating a proper index for my MongoDB query that would avoid the SORT stage. I am not even sure whether that is possible in my case. Here is my query with execution stats:
db.getCollection('test').find(
{
"$or" : [
{
"a" : { "$elemMatch" : { "_id" : { "$in" : [4577] } } },
"b" : { "$in" : [290] },
"c" : { "$in" : [35, 49, 57, 101, 161, 440] },
"d" : { "$lte" : 399 }
},
{
"e" : { "$elemMatch" : { "numbers" : { "$in" : ["1K0407151AC", "0K20N51150A"] } } },
"d" : { "$lte" : 399 }
}]
})
.sort({ "X" : 1, "d" : 1, "Y" : 1, "Z" : 1 }).explain("executionStats")
The fields 'm', 'a' and 'e' are arrays, that is why 'm' is not included in any index.
If you check the execution stats screenshot, you will see that memory usage is pretty close to maximum and unfortunately I had cases where the query failed to execute because of the 32MB limit.
Index for the first part of the $or query:
{
"a._id" : 1,
"X" : 1,
"d" : 1,
"Y" : 1,
"Z" : 1,
"b" : 1,
"c" : 1
}
Index for the second part of the $or query:
{
"e.numbers" : 1,
"X" : 1,
"d" : 1,
"Y" : 1,
"Z" : 1
}
The indexes are used by the query, but not for sorting. Instead of a SORT stage I would like to see a SORT_MERGE stage, but I have had no success so far. If I run the individual parts of the $or as separate queries, each one is able to use its index to avoid sorting in memory. That is OK as a workaround, but I would then need to merge and re-sort the results in the application.
MongoDB version is 3.4.2. I checked that and that question. My query is the result. Probably I missed something?
Edit: the documents look like this:
{
"_id" : "290_440_K760A03",
"Z" : "K760A03",
"c" : 440,
"Y" : "NPS",
"b" : 290,
"X" : "Schlussleuchte",
"e" : [
{
"..." : 184,
"numbers" : [
"0K20N51150A"
]
}
],
"a" : [
{
"_id" : 4577,
"..." : [
{
"..." : [
{
"..." : "R",
}
]
}
]
},
{
"_id" : 4578
}
],
"d" : 101,
"m" : [
"AT",
"BR",
"CH"
],
"moreFields":"..."
}
Edit 2: removed the field "m" from the query to decrease complexity and attached a test collection dump for anyone who wants to help :)
Here is the solution:
I added one document to my test collection as shown in your question (edit part). Then I created the four indexes below:
1. {"m":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1}
2. {"a._id":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1}
3. {"m":1,"X":1,"d":1,"Y":1,"Z":1}
4. {"e.numbers":1,"X":1,"d":1,"Y":1,"Z":1}
When I then ran the given query with execution stats, it showed the SORT_MERGE stage as expected.
Here is the explanation:
MongoDB has a rule of thumb called equality-sort-range, which says a lot about how index keys should be ordered. I followed this rule and kept the index keys in that order, so the index should be {Equality fields, "X":1,"d":1,"Y":1,"Z":1, Range fields}. The query has a range on field "d" only ("d" : { "$lte" : 101 }), but "d" is already covered by the sort fields of the index ("X":1,"d":1,"Y":1,"Z":1), so the range part (i.e. field "d") can be left off the end of the index.
If "d" had NOT been in the sort/equality predicate, I would have added it as a range field and the index would have looked like {Equality fields, "X":1,"Y":1,"Z":1,"d":1}.
Now the index is {Equality fields, "X":1,"d":1,"Y":1,"Z":1}, and only the equality fields remain to be decided. To work them out I checked the find predicates and found two conditions combined by the OR operator.
The first condition has equality on "a._id", "b", "c" and "m" ("d" has a range, not an equality). So I would need an index like "a._id":1,"m":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1, but creating it gives an error because it contains two array fields, "a._id" and "m", and MongoDB does not allow a compound index on parallel arrays. So I created two separate indexes and let the query planner choose between them. That is how I got the first and second index.
The second condition of the OR operator has "e.numbers" and "m". Both are array fields, so I had to create two indexes as for the first condition, and that is how I got the third and fourth index.
Each branch of the OR can use only one index at a time, and I don't know in advance which one the query planner will pick, so I need to create all of these indexes.
Note: if you are concerned about index size, you can keep only one index from the first two and one from the last two. Or you can keep all four and hint MongoDB to use the proper index if you know which one is best ahead of the query planner.
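To illustrate why a SORT_MERGE stage avoids the in-memory sort: if each $or branch can be read from its index already ordered by the sort key, the results only need to be merged, not re-sorted. Below is a conceptual sketch in plain JavaScript. mergeSorted, compareRows and the sample rows are made up for illustration, the sort key is shortened to { X: 1, d: 1 }, and this is not MongoDB's actual implementation.

```javascript
// Compare two rows on the (shortened) sort key { X: 1, d: 1 }.
function compareRows(p, q) {
  if (p.X !== q.X) return p.X < q.X ? -1 : 1;
  return p.d - q.d;
}

// Merge two inputs that are each already sorted on the key, skipping
// documents that both branches returned (same _id).
function mergeSorted(left, right) {
  const out = [];
  const seen = new Set();
  let i = 0;
  let j = 0;
  while (i < left.length || j < right.length) {
    const takeLeft =
      j >= right.length ||
      (i < left.length && compareRows(left[i], right[j]) <= 0);
    const row = takeLeft ? left[i++] : right[j++];
    if (!seen.has(row._id)) {
      seen.add(row._id);
      out.push(row);
    }
  }
  return out;
}

// Hypothetical results of the two $or branches, each in index order.
const branch1 = [
  { _id: "a", X: "Bremse", d: 10 },
  { _id: "b", X: "Lampe", d: 5 },
];
const branch2 = [
  { _id: "c", X: "Filter", d: 7 },
  { _id: "b", X: "Lampe", d: 5 },
];

const merged = mergeSorted(branch1, branch2);
console.log(merged.map((r) => r._id)); // [ 'a', 'c', 'b' ]
```

The merge only ever looks at the head of each input, which is why it does not need to buffer the whole result set the way a blocking SORT does. It only works if every branch delivers its rows in sort order, which is what the index layouts above are meant to achieve.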

count the subdocument field and total amount in mongodb

I've a collection with below documents:
{
"_id" : ObjectId("54acfb67a81bf9509246ed81"),
"Billno" : 1234,
"details" : [
{
"itemcode" : 12,
"itemname" : "Paste100g",
"qty" : 2,
"price" : 50
},
{
"itemcode" : 14,
"itemname" : "Paste30g",
"qty" : 4,
"price" : 70
},
{
"itemcode" : 12,
"itemname" : "Paste100g",
"qty" : 4,
"price" : 100
}
]
}
{
"_id" : ObjectId("54acff86a81bf9509246ed82"),
"Billno" : 1237,
"details" : [
{
"itemcode" : 12,
"itemname" : "Paste100g",
"qty" : 3,
"price" : 75
},
{
"itemcode" : 19,
"itemname" : "dates100g",
"qty" : 4,
"price" : 170
},
{
"itemcode" : 22,
"itemname" : "dates200g",
"qty" : 2,
"price" : 160
}
]
}
I need to display below output. Please help
Required Output:
---------------------------------------------------------------------------------
itemcode itemname totalprice totalqty
---------------------------------------------------------------------------------
12 Paste100g 225 9
14 Paste30g 70 4
19 dates100g 170 4
22 dates200g 160 2
The MongoDB aggregation pipeline is available to solve your problem. You get the details out of the array by processing with $unwind and then using $group to "sum" the totals:
db.collection.aggregate([
// Unwind the array to de-normalize as documents
{ "$unwind": "$details" },
// Group on the key you want and provide other values
{ "$group": {
"_id": "$details.itemcode",
"itemname": { "$first": "$details.itemname" },
"totalprice": { "$sum": "$details.price" },
"totalqty": { "$sum": "$details.qty" }
}}
])
Ideally you want a $match stage in there to filter out any irrelevant data first. It is basically a MongoDB query and takes all the same arguments and operators.
Most of this is simple, really. The $unwind is sort of like a "JOIN" in SQL, except that in an embedded structure the "join" is already made, so you are just "de-normalizing" like a join would do between "one to many" table relationships, but within the document itself. It basically "repeats" the "parent" document parts for each array member as a new document.
Then the $group works off a key, as in "GROUP BY", where the "key" is the _id value. Everything there is "distinct" and all other values are gathered by "grouping operators".
This is where operations like $first come in. As described on the manual page, this takes the "first" value from the "grouping boundary" mentioned in the "key" earlier. You want this because all values of this field are "likely" to be the same, so it is a logical choice to just pick the "first" match.
Finally there is the $sum grouping operator, which does what you would expect. All supplied values under the "key" are "added" or "summed" together to provide a total, just like SQL SUM().
Also note that the $ prefixed names there are how the aggregation framework refers to "field/property" names within the current document being processed. "Dot notation" is used to reference the embedded "fields/properties" nested within a parent property name.
It is useful to learn aggregation in MongoDB. It is to general queries what anything beyond a basic "SELECT" statement is to SQL. It is not just for "grouping" but for other manipulation as well.
Read through the documentation of all aggregation operators and also take a look at the SQL to Aggregation Mapping chart in the documentation as a general guide if you have some familiarity with SQL to begin with. It helps explain concepts and shows some things that can be done.
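To check the pipeline against the required output without a server, the same two steps can be imitated in plain JavaScript. This is only a sketch of what $unwind and $group do to the sample documents from the question (with _id left out). unwindDetails and groupByItemcode are hypothetical helper names, not MongoDB functions.

```javascript
// Sample bills from the question, without the ObjectId _id values.
const bills = [
  { Billno: 1234, details: [
    { itemcode: 12, itemname: "Paste100g", qty: 2, price: 50 },
    { itemcode: 14, itemname: "Paste30g", qty: 4, price: 70 },
    { itemcode: 12, itemname: "Paste100g", qty: 4, price: 100 },
  ] },
  { Billno: 1237, details: [
    { itemcode: 12, itemname: "Paste100g", qty: 3, price: 75 },
    { itemcode: 19, itemname: "dates100g", qty: 4, price: 170 },
    { itemcode: 22, itemname: "dates200g", qty: 2, price: 160 },
  ] },
];

// Like $unwind: one output document per array element, parent fields repeated.
function unwindDetails(docs) {
  return docs.flatMap((doc) =>
    doc.details.map((item) => ({ ...doc, details: item }))
  );
}

// Like $group on "$details.itemcode" with $first and $sum accumulators.
function groupByItemcode(unwound) {
  const groups = new Map();
  for (const { details } of unwound) {
    let g = groups.get(details.itemcode);
    if (!g) {
      // First document seen for this key supplies itemname, as $first would.
      g = { _id: details.itemcode, itemname: details.itemname, totalprice: 0, totalqty: 0 };
      groups.set(details.itemcode, g);
    }
    g.totalprice += details.price;
    g.totalqty += details.qty;
  }
  return [...groups.values()];
}

const totals = groupByItemcode(unwindDetails(bills));
console.log(totals);
```

For itemcode 12 this gives totalprice 225 and totalqty 9, matching the required output. Note that the sketch returns groups in first-seen order, whereas the real $group stage does not guarantee any output order, so add a $sort stage if the order matters.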

mongodb $elemMatch in query return all sub docs

db.aaa.insert({"_id":1, "m":[{"_id":1,"a":1},{"_id":2,"a":2}]})
db.aaa.find({"_id":1,"m":{$elemMatch:{"_id":1}}})
{ "_id" : 1, "m" : [ { "_id" : 1, "a" : 1 }, { "_id" : 2, "a" : 2 } ] }
Using $elemMatch as a query operator, it returns all sub-documents in 'm'! Strange!
Using it as a projection operator:
db.aaa.find({"_id":1},{"m":{$elemMatch:{"_id":1}}})
{ "_id" : 1, "m" : [ { "_id" : 1, "a" : 1 } ] }
This is OK. Following this logic, using it as a query operator in an update should change all sub-documents in 'm'. So I do:
db.aaa.update({"_id":1,"m":{$elemMatch:{"_id":1}}},{$set:{"m.$.a":3}})
db.aaa.find()
{ "_id" : 1, "m" : [ { "_id" : 1, "a" : 3 }, { "_id" : 2, "a" : 2 } ] }
It works in the same manner as the second example (projection operator). This really confuses me.
Please explain.
It isn't strange, it's how it works.
You are using $elemMatch to match an element within an array contained in your document. That means it matches the "document" and not the "array element", so it does not selectively display only the array element that was matched.
What you can do, and what you did with the $set operator, is use the positional $ operator to indicate the matched "position" from your query side:
db.aaa.find({"_id":1,"m":{$elemMatch:{"_id":1}}},{ "m.$": 1 })
And that will show you only one element of the array. But it is of course still an array in the result shown, and you cannot cast it to a different type.
The other part of the usage is that this will only match once, and only the first match will be assigned to the positional operator.
So perhaps the most succinct explanation is that you are matching the "document that contains" the properties of the sub-document you specified in your query, and not just the "sub-document" itself.
See the documentation for more:
http://docs.mongodb.org/manual/reference/operator/projection/positional/
http://docs.mongodb.org/manual/reference/operator/query/elemMatch/
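The difference between the three usages can be imitated in plain JavaScript. This is a sketch with hypothetical helpers (matchesElem, findDocs, projectFirstMatch, setOnFirstMatch), not how the server is implemented: the query side only decides whether the whole document is returned, while the projection and the positional $ act on the first matching element only.

```javascript
const collection = [
  { _id: 1, m: [{ _id: 1, a: 1 }, { _id: 2, a: 2 }] },
];

// True when an array element has every field/value listed in the criteria.
function matchesElem(elem, criteria) {
  return Object.entries(criteria).every(([k, v]) => elem[k] === v);
}

// Query-side $elemMatch: selects whole documents, the array is left untouched.
function findDocs(docs, criteria) {
  return docs.filter((d) => d.m.some((e) => matchesElem(e, criteria)));
}

// Projection-side $elemMatch: keeps only the first matching element of m.
function projectFirstMatch(doc, criteria) {
  const first = doc.m.find((e) => matchesElem(e, criteria));
  return first ? { _id: doc._id, m: [first] } : { _id: doc._id };
}

// Positional $ in an update: changes only the first matching element.
function setOnFirstMatch(doc, criteria, changes) {
  const idx = doc.m.findIndex((e) => matchesElem(e, criteria));
  if (idx >= 0) Object.assign(doc.m[idx], changes);
  return doc;
}

const found = findDocs(collection, { _id: 1 });
const projected = projectFirstMatch(collection[0], { _id: 1 });
console.log(found[0].m.length, projected.m.length); // 2 1

setOnFirstMatch(collection[0], { _id: 1 }, { a: 3 });
console.log(JSON.stringify(collection[0].m));
```

The last line prints the array with a set to 3 on the first element only, which is the same result as the update in the question.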

Adding unique index in MongoDB ignoring nulls

I'm trying to add a unique index on a group of fields in MongoDB. Not all of those fields are present in all of the documents, and I'd like to index only the documents which have all of the fields.
So, I'm trying to run this:
db.mycollection.ensureIndex({date:1, type:1, reference:1}, {sparse: true, unique: true})
But I get an E11000 duplicate key error on documents which are missing the 'type' field (there are many of them and they are duplicates, but I just want to ignore them).
Is this possible in MongoDB, or is there some workaround?
There are multiple people who want this feature and because there is no workaround for this, I would recommend voting up feature request Jira tickets in jira.mongodb.org:
SERVER-785 - support filtered (partial) indexes
SERVER-2193 - sparse indexes only support single field
Note that because 785 would provide a way to enforce this feature, 2193 is marked "won't fix" so it may be more productive to vote up and add your comments to 785.
You can guarantee the uniqueness by using an upsert operation instead of an insert. This makes sure that if a matching document already exists it is updated, and a new document is inserted only if none exists:
test:Mongo > db.test4.ensureIndex({ a : 1, b : 1, c : 1}, {sparse : 1})
test:Mongo > db.test4.update({a : 1, b : 1}, {$set : { d : 1}}, true, false)
test:Mongo > db.test4.find()
{ "_id" : ObjectId("51ae978960d5a3436edbaf7d"), "a" : 1, "b" : 1, "d" : 1 }
test:Mongo > db.test4.update({a : 1, b : 1, c : 1}, {$set : { d : 1}}, true, false)
test:Mongo > db.test4.find()
{ "_id" : ObjectId("51ae978960d5a3436edbaf7d"), "a" : 1, "b" : 1, "d" : 1 }
{ "_id" : ObjectId("51ae97b960d5a3436edbaf7e"), "a" : 1, "b" : 1, "c" : 1, "d" : 1 }
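The behaviour in the shell session above can be imitated in plain JavaScript to see why the upsert does not create a duplicate on a repeated call. This is a sketch only: upsert here is a hypothetical in-memory stand-in limited to equality conditions on top-level fields, not the real update command.

```javascript
// Hypothetical stand-in for update(query, {$set: setFields}, true, false).
function upsert(collection, query, setFields) {
  const matches = (doc) =>
    Object.entries(query).every(([k, v]) => doc[k] === v);
  const existing = collection.find(matches);
  if (existing) {
    Object.assign(existing, setFields); // update the first matching document
    return "updated";
  }
  // No match: insert a document built from the query fields plus the $set fields.
  collection.push({ ...query, ...setFields });
  return "inserted";
}

const test4 = [];
const results = [
  upsert(test4, { a: 1, b: 1 }, { d: 1 }),        // nothing there yet
  upsert(test4, { a: 1, b: 1, c: 1 }, { d: 1 }),  // first document has no c
  upsert(test4, { a: 1, b: 1, c: 1 }, { d: 2 }),  // same key again
];
console.log(results, test4.length); // [ 'inserted', 'inserted', 'updated' ] 2
```

One caveat: the real upsert is only a check-then-write, so without a unique index two concurrent upserts on the same key can both insert. The sketch is single-threaded and cannot show that.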