mongodb $elemMatch in query return all sub docs - mongodb

db.aaa.insert({"_id":1, "m":[{"_id":1,"a":1},{"_id":2,"a":2}]})
db.aaa.find({"_id":1,"m":{$elemMatch:{"_id":1}}})
{ "_id" : 1, "m" : [ { "_id" : 1, "a" : 1 }, { "_id" : 2, "a" : 2 } ] }
Using $elemMatch as query operator, it return all sub docs in 'm' !! Strange!
Use it as project operator:
db.aaa.find({"_id":1},{"m":{$elemMatch:{"_id":1}}})
{ "_id" : 1, "m" : [ { "_id" : 1, "a" : 1 } ] }
This is OK. Following this logic, use it as query operator in update will change all sub docs in 'm'. So I do:
db.aaa.update({"_id":1,"m":{$elemMatch:{"_id":1}}},{$set:{"m.$.a":3}})
db.aaa.find()
{ "_id" : 1, "m" : [ { "_id" : 1, "a" : 3 }, { "_id" : 2, "a" : 2 } ] }
It works in manner of as second example(project operator). This really confuse me.
Give me a explain

It Isn't strange, it's how it works.
You are using $elemMatch to match an element within an array contained in your document. That means it mactches the "document" and not the "array element", so it does not just selectively display only the array element that was matched.
What you can do, and how you used it in with the $set operator, is use a positional $ operator to indicate the matched "position" from your query side:
db.aaa.find({"_id":1},{"m":{$elemMatch:{"_id":1}}},{ "m.$": 1 })
And that will show you only one element of the array. But it is of course *still an array in the result shown, and you cannot cast it to a different type.
The other part of the usage is that this will only match once. And only the first match will be assigned to the positional operator.
So perhaps the most succinct explaination is you matching the "document that contains" the properties of the sub-document your specified in your query, and not just the "sub-document" itself.
See the documentation for more:
http://docs.mongodb.org/manual/reference/operator/projection/positional/
http://docs.mongodb.org/manual/reference/operator/query/elemMatch/

Related

MongoDB $or + sort + index. How to avoid sorting in memory?

I have an issue to generate proper index for my mongo query, which would avoid SORT stage. I am not even sure if that is possible in my case. So here is my query with execution stats:
db.getCollection('test').find(
{
"$or" : [
{
"a" : { "$elemMatch" : { "_id" : { "$in" : [4577] } } },
"b" : { "$in" : [290] },
"c" : { "$in" : [35, 49, 57, 101, 161, 440] },
"d" : { "$lte" : 399 }
},
{
"e" : { "$elemMatch" : { "numbers" : { "$in" : ["1K0407151AC", "0K20N51150A"] } } },
"d" : { "$lte" : 399 }
}]
})
.sort({ "X" : 1, "d" : 1, "Y" : 1, "Z" : 1 }).explain("executionStats")
The fields 'm', 'a' and 'e' are arrays, that is why 'm' is not included in any index.
If you check the execution stats screenshot, you will see that memory usage is pretty close to maximum and unfortunately I had cases where the query failed to execute because of the 32MB limit.
Index for the first part of the $or query:
{
"a._id" : 1,
"X" : 1,
"d" : 1,
"Y" : 1,
"Z" : 1,
"b" : 1,
"c" : 1
}
Index for the second part of the $or query:
{
"e.numbers" : 1,
"X" : 1,
"d" : 1,
"Y" : 1,
"Z" : 1
}
The indexes are used by the query, but not for sorting. Instead of SORT stage I would like too see SORT_MERGE stage, but no success for now. If I run the part queries inside $or separately, they are able to use the index to avoid sorting in a memory. As a workaround it is ok, but I would need to merge and resort the results by the application.
MongoDB version is 3.4.2. I checked that and that question. My query is the result. Probably I missed something?
Edit: mongo documents look like that:
{
"_id" : "290_440_K760A03",
"Z" : "K760A03",
"c" : 440,
"Y" : "NPS",
"b" : 290,
"X" : "Schlussleuchte",
"e" : [
{
"..." : 184,
"numbers" : [
"0K20N51150A"
]
}
],
"a" : [
{
"_id" : 4577,
"..." : [
{
"..." : [
{
"..." : "R",
}
]
}
]
},
{
"_id" : 4578
}
],
"d" : 101,
"m" : [
"AT",
"BR",
"CH"
],
"moreFields":"..."
}
Edit 2: removed the filed "m" from query to decrease complexity and attached test collection dump for someone, who wants to help :)
Here is the solution-
I just added one document in my test collection as shown in your question (edit part). Then I created below four indices-
1. {"m":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1}
2. {"a._id":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1}
3. {"m":1,"X":1,"d":1,"Y":1,"Z":1}
4. {"e.numbers":1,"X":1,"d":1,"Y":1,"Z":1}
And when I executed given query for execution stats then it shows me the SORT_MERGE state as expected.
Here is the explanation-
MongoDB has a thing called equality-sort-range which tells a lot how we should create our indices. I just followed this rule and kept the index in that order. So Here the index should be {Equality fields, "X":1,"d":1,"Y":1,"Z":1, Range fields}. You can see that the query has range on field "d" only ("d" : { "$lte" : 101 }) but "d" is already covered in SORT fields of index ("X":1,"d":1,"Y":1,"Z":1) so we can skip range part (i.e. field "d") from the end of index.
If "d" had NOT been in sort/equality predicate then I would have taken it in index for range index field and my index would have looked like {Equality fields, "X":1,"Y":1,"Z":1,"d":1}.
Now my index is {Equality fields, "X":1,"d":1,"Y":1,"Z":1} and I am just concerned about equality fields. So to figure out equality fields I just checked the query find predicates and I found there are two conditions combined by OR operator.
The first condition has equality on "a._id", "b", "c", "m" ("d" has range, not equality). So I need to create an index like "a._id":1,"m":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1 but this will give error because it has two array fields "a_id" and "m". And as we know Mongo doesn't allow compound index on parallel arrays so it will fail. So I created two separate index just to allow Mongo to use whatever is chosen by query planner. And hence I created first and second index.
The second condition of OR operator has "e.numbers" and "m". Both are arrays fields so I had to create two indices as done for first condition and that's how I got my third and fourth index.
Now we know that at a time a single query can use only and only one index so I need to create these indices because I don't know which branch of OR operator will be executed.
Note: If you are concerned about size of index then you can keep only one index from first two and one from last two. Or you can also keep all four and hint mongo to use proper index if you know it well before query planner.

$Avg aggregation in Mongodb [duplicate]

For a given record id, how do I get the average of a sub document field if I have the following in MongoDB:
/* 0 */
{
"item" : "1",
"samples" : [
{
"key" : "test-key",
"value" : "1"
},
{
"key" : "test-key2",
"value" : "2"
}
]
}
/* 1 */
{
"item" : "1",
"samples" : [
{
"key" : "test-key",
"value" : "3"
},
{
"key" : "test-key2",
"value" : "4"
}
]
}
I want to get the average of the values where key = "test-key" for a given item id (in this case 1). So the average should be $avg (1 + 3) = 2
Thanks
You'll need to use the aggregation framework. The aggregation will end up looking something like this:
db.stack.aggregate([
{ $match: { "samples.key" : "test-key" } },
{ $unwind : "$samples" },
{ $match : { "samples.key" : "test-key" } },
{ $project : { "new_key" : "$samples.key", "new_value" : "$samples.value" } },
{ $group : { `_id` : "$new_key", answer : { $avg : "$new_value" } } }
])
The best way to think of the aggregation framework is like an assembly line. The query itself is an array of JSON documents, where each sub-document represents a different step in the assembly.
Step 1: $match
The first step is a basic filter, like a WHERE clause in SQL. We place this step first to filter out all documents that do not contain an array element containing test-key. Placing this at the beginning of the pipeline allows the aggregation to use indexes.
Step 2: $unwind
The second step, $unwind, is used for separating each of the elements in the "samples" array so we can perform operations across all of them. If you run the query with just that step, you'll see what I mean.
Long story short :
{ name : "bob",
children : [ {"name" : mary}, { "name" : "sue" } ]
}
becomes two documents :
{ name : "bob", children : [ { "name" : mary } ] }
{ name : "bob", children : [ { "name" : sue } ] }
Step 3: $match
The third step, $match, is an exact duplicate of the first $match stage, but has a different purpose. Since it follows $unwind, this stage filters out previous array elements, now documents, that don't match the filter criteria. In this case, we keep only documents where samples.key = "test-key"
Step 4: $project (Optional)
The fourth step, $project, restructures the document. In this case, I pulled the items out of the array so I could reference them directly. Using the example above..
{ name : "bob", children : [ { "name" : mary } ] }
becomes
{ new_name : "bob", new_child_name : mary }
Note that this step is entirely optional; later stages could be completed even without this $project after a few minor changes. In most cases $project is entirely cosmetic; aggregations have numerous optimizations under the hood such that manually including or excluding fields in a $project should not be necessary.
Step 5: $group
Finally, $group is where the magic happens. The _id value what you will be "grouping by" in the SQL world. The second field is saying to average over the value that I defined in the $project step. You can easily substitute $sum to perform a sum, but a count operation is typically done the following way: my_count : { $sum : 1 }.
The most important thing to note here is that the majority of the work being done is to format the data to a point where performing the operation is simple.
Final Note
Lastly, I wanted to note that this would not work on the example data provided since samples.value is defined as text, which can't be used in arithmetic operations. If you're interested, changing the type of a field is described here: MongoDB How to change the type of a field

Meteor/Mongo nested arrays update

I'm new to meteor/mongo/js as a stack and I'm getting lost in JSON arrays and referencing them. Based another SO answer ( and the docs) I think I am close...
A document in the Orders collection, the document has nested arrays.
Order -> orderLines -> lineItems:
Sample doc:
{
"_id" : "27tGpRtMWYqPpFkDN",
"orderLines" : [
{
"lineId" : 1,
"name" : "Cheese & Pickle",
"instructions" : "",
"lineItems" : [
{
"name" : "Cheddar Cheese",
"quantity" : 1
},
{
"name" : "Branston Pickle",
"quantity" : 1
},
{
"name" : "Focaccia Roll",
"quantity" : 1
}
]
}
]
}
What I'm trying to do from the meteor/mongo shell:
Add "instructions" of "foo" to orderLines where lineId = 1
Put a new item on lineItems array
This appears to hang...
meteor:PRIMARY> db.orders.update({_id:"27tGpRtMWYqPpFkDN","orderLines.lineId":"1", {$set: {"orderLines.$.instructions":"foo"}})
...
This doesn't like the identifier in the query
meteor:PRIMARY> db.orders.update({_id:"27tGpRtMWYqPpFkDN", "orderLines.lineId":"1"}, {$push:{"orderLines.$.lineItems":" { "name" : "butter", "quantity" : 1}"}});
2015-10-27T16:09:54.489+0100 SyntaxError: Unexpected identifier
Thanks all for your comments... but I found out some answers, posted for reference
Item 1 - using $set on a value within an array
This was failing due to two typos, one missing closing } at the end of the query, second was quoting the value "1" for itemId in the query.
This works:
db.orders.update({_id:"27tGpRtMWYqPpFkDN", orderLines.lineId":1},
{$set: {"orderLines.$.instructions":"foo"}})
I also realised that when I said "It appears to hang" it is the cli waiting for a valid statement, so hints at a missing } or )!
Item 2 - using $push to add data to an array - 2 levels nested
This was failing due to quoting around the array data
db.orders.update({_id:"27tGpRtMWYqPpFkDN", "orderLines.lineId":1 },
{$push:{"orderLines.$.lineItems": { "name" : "butter", "quantity" : 1} }})
Nested Arrays: can use $ positional operator
What I want to do next is use $set on an item in the second level array, and this would require using the $ positional operator twice:
db.orders.update({"orderLines.lineId":1, lineItems.name:"Cheddar Cheese"},
{$set: {"orderLines.$.lineItems.$.quantity": 2}})
This throws an error:
Too many positional (i.e. '$') elements found in path
There is an open MongoDB enhancement request for this but it's been open since 2010
$push adds a new element to an array. You're merely trying to set the value of a particular key in an array element.
Try:
db.orders.update({ _id: "27tGpRtMWYqPpFkDN", "orderLines.lineId": 1},
{ $set: { "orderlines.$.instructions": "foo" }})
docs

List all values of a certain field in mongodb

How would I get an array containing all values of a certain field for all of my documents in a collection?
db.collection:
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be9"), "x" : 4 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bea"), "x" : 5 }
"db.collection.ListAllValuesForfield(x)"
Result: [1,2,3,4,5]
Also, what if this field was an array?
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "y" : [1,2] }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "y" : [3,4] }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "y" : [5,6] }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be9"), "y" : [1,2] }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bea"), "y" : [3,4] }
"db.collection.ListAllValuesInArrayField(y)"
Result: [1,2,3,4,5,6,1,2,3,4]
Additionally, can I make this array unique? [1,2,3,4,5,6]
db.collection.distinct('x')
should give you an array of unique values for that field.
Notice: My answer is a fork from the original answer.
Before any "thumbs up" here, "thumbs up" the accepted answer :).
db.collection.distinct("NameOfTheField")
Finds the distinct values for a specified field across a single collection or view and returns the results in an array.
Reference: https://docs.mongodb.com/manual/reference/method/db.collection.distinct/
This would return an array of docs, containing just it's x value...
db.collection.find(
{ },
{ x: 1, y: 0, _id:0 }
)
db.collection_name.distinct("key/field_name") - This will return a list of distinct values in the key name from the entire dictionary.
Just make sure you don't use the curly brackets after round brackets.

matching 2 out of 3 (or excluding one) in MongoDb aggregation

Let's say I have a mongo db restaurant collection that has an array of different foods, and I want to average the price of the "sandwich" and the "burger" for each restaurant i.e. to not include the steak. How do I match 2 out of the 3 types in this situation i.e. or, in other words, filter out the steak?
For example, for the match operator, I can (assuming I have already unwound the array) do something like this
{ $match : { foods : "burger" } }
but I want to do something more like this (which leaves out steak)
{ $match : { foods : ["burger", "sandwich" ]} }
except that code doesn't work.
Can you explain?
"_id" : ObjectId("50b59cd75bed76f46522c34e"),
"restaurant_id" : 0,
"foods" : [
{
"type" : "sandwich",
"price" : 6.99
},
{
"type" : "burger",
"price" : 5.99
},
{
"type" : "steak"
"price" : 9.99
}
]
Use $in to match one of multiple values:
{ $match : { foods : { $in: ["burger", "sandwich" ]}}}
JohnyHK's answer is right and concise.
For the "Can you explain?" part, when you specified the match as follows:
{ $match : { foods : ["burger", "sandwich" ]} }
You are requiring the document to have a field "foods" containing an array with "burger" and "sandwich" as elements. This is an equals comparison.
The operator $in is not directly explained with the $match, see here:
http://docs.mongodb.org/manual/reference/aggregation/match/
since $in is a query operator, which is explained here (linked from $match):
http://docs.mongodb.org/manual/tutorial/query-documents/#read-operations-query-argument