Using the db.collection.find query in a sub-document - mongodb

Is there a way to use db.collection.find() to query for a specific value in a sub-document and find those documents that match. For example:
{
{ 'Joe' : {eyecolor : 'brown'},
{ 'Mary' : {eyecolor : 'blue'},
....
}
I want to return the names of all people whose eyecolor is blue.

You need to specify the full path to a value for search to work:
db.people.find({ "Joe.eyecolor" : "brown" })
You can't switch to an array of people instead of an associative array style you're using now, as there is no way to return only array elements that match conditions. You can use $elemMatch to return the first match, but that's not likely what you'd want. Or, you could still use arrays, but you'd need to filter the array further within your client code (not the database).
You might be able to use the Aggregation framework, but it wouldn't use indexes efficiently, as you'd need to $unwind the entire array, and then do filtering, brute force. And if the data contained is more complex, the fact that projections when using the AF require you to manually specify all fields, it becomes a bit cumbersome.
To most efficiently do the query you're showing, you'd need to not use subdocuments, and instead place the people as individual documents:
{
name: "Joe",
eyecolor: "brown"
}
Then, you could just do a simple search like:
db.people.find({eyecolor: "brown"})

Yes and no. You can query for all documents that have a matching person, but you can't query for all persons directly. In other words, subdocuments are not virtual collections, you'll always have the 'parent' document returned.
The example you posted comes with the additional complexity that you're using the name as a field key, which prevents you from using the dot notation.
In general, if you have a number of similar things, it's best to put them in a list, e.g.
{
"_id" : 132,
"ppl" : [ { "Name" : "John", "eyecolor" : "blue" },
{ "Name" : "Mary", "eyecolor" : "brown" },
...
]
}
Then, you can query using the aggregation framework:
db.collection.aggregate([
// only match documents that have a person w/ blue eyes (can use indexing)
{$match : { "ppl.eyecolor" : "blue" } },
// unwind the array of people
{$unwind : "$ppl" },
// match only those with blue eyes
{$match : { "ppl.eyecolor" : "blue" }},
// optional projection to make the result a list of people
{$project : { Name : "$ppl.Name", EyeColor: "$ppl.eyecolor" }} ]);
Which gives a result like
"result" : [
{
"_id" : 132,
"Name" : "John",
"EyeColor" : "blue"
},
{
"_id" : 12,
"Name" : "Jimmy",
"EyeColor" : "blue"
},
{
"_id" : 4312,
"Name" : "Jimmy",
"EyeColor" : "blue"
},
{
"_id" : 4312,
"Name" : "Marc",
"EyeColor" : "blue"
}
],
"ok" : 1

Related

mongodb - is it possible to filter after an $elemMatch projection in a find query?

I have documents like this in a collection called 'variants':
{
"_id" : "An_FM000900_Var_10_100042505_100042505_G_A",
"analysisId" : "FM000900",
"chromosome" : 10,
"start" : 100042505,
"end" : 100042505,
"size" : 1,
"reference" : "G",
"alternative" : "A",
"effects" : [
{
"_id" : "Analysis:FM000900-Variant:An_FM000900_Var_10_100042505_100042505_G_A-Effect:0",
"biotype" : "protein_coding",
"impact" : "LOW",
},
{
"_id" : "Analysis:FM000900-Variant:An_FM000900_Var_10_100042505_100042505_G_A-Effect:1",
"biotype" : "protein_coding",
"impact" : "MODERATE",
}
]
}
I want to find documents in that collection that meet some criteria ("analysisId":"FM000900"), and after that I want to project over 'effects' array field to bring just the first element in 'effects' array that meet some criteria ("biotype" : "protein_coding" and "impact" : "MODERATE").
The thing is that I just want to show the main 'variant' document if and only if at least one element in the 'effects' array has meet the criteria.
With the following query I get the expected result except that I get 'variant' documents with 'effects' array field empty.
db.getCollection('variants').find(
{
"analysisId":"FM000900"
}
,
{
"effects":{
"$elemMatch" : {
"biotype" : "protein_coding",
"impact" : "MODERATE"
}
}
}
).skip(0).limit(200)
Can somebody transform this query to only get 'variant' documents with some element in 'effect' array after the projection if possible?
Can it be done in another way, without using aggregation framework if possible? as the collection has millions of documents and it has to be performant.
Thanks a lot, guys!!!
Simply use $elemMatch as query operator in addition of your projection, it will filter variants that have at least one effects array element that match all conditions.
So your query will be :
db.getCollection('variants').find(
{
"analysisId":"FM000900",
"effects":{
"$elemMatch" : {
"biotype" : "protein_coding",
"impact" : "MODERATE"
}
}
}
,
{
"effects":{
"$elemMatch" : {
"biotype" : "protein_coding",
"impact" : "MODERATE"
}
}
}
).skip(0).limit(200)
In addition, a compound multikey index that covers both query and projection can improve reading performance, but use it carefully as it can drastically reduce writing performances.

Using $last on Mongo Aggregation Pipeline

I searched for similar questions but couldn't find any. Feel free to point me in their direction.
Say I have this data:
{ "_id" : ObjectId("5694c9eed4c65e923780f28e"), "name" : "foo1", "attr" : "foo" }
{ "_id" : ObjectId("5694ca3ad4c65e923780f290"), "name" : "foo2", "attr" : "foo" }
{ "_id" : ObjectId("5694ca47d4c65e923780f294"), "name" : "bar1", "attr" : "bar" }
{ "_id" : ObjectId("5694ca53d4c65e923780f296"), "name" : "bar2", "attr" : "bar" }
If I want to get the latest record for each attribute group, I can do this:
> db.content.aggregate({$group: {_id: '$attr', name: {$last: '$name'}}})
{ "_id" : "bar", "name" : "bar2" }
{ "_id" : "foo", "name" : "foo2" }
I would like to have my data grouped by attr and then sorted by _id so that only the latest record remains in each group, and that's how I can achieve this. BUT I need a way to avoid naming all the fields that I want in the result (in this example "name") because in my real use case they are not known ahead.
So, is there a way to achieve this, but without having to explicitly name each field using $last and just taking all fields instead? Of course, I would sort my data prior to grouping and I just need to somehow tell Mongo "take all values from the latest one".
See some possible options here:
Do multiple find().sort() queries for each of the attr values you
want to search.
Grab the original _id of the $last doc, then do a findOne() for each of those values (this is the more extensible option).
Use the $$ROOT system variable as shown here.
This wouldn't be the quickest operation, but I assume you're using this more for analytics, not in response to a user behavior.
Edited to add slouc's example posted in comments:
db.content.aggregate({$group: {_id: '$attr', lastItem: { $last: "$$ROOT" }}}).

Extract two sub array values in mongodb by $elemMatch

Aggregate, $unwind and $group is not my solution as they make query very slow, there for I am looking to get my record by db.collection.find() method.
The problem is that I need more then one value from sub array. For example from the following example I want to get the "type" : "exam" and "type" : "quiz" elements.
{
"_id" : 22,
"scores" : [
{
"type" : "exam",
"score" : 75.04996547553947
},
{
"type" : "quiz",
"score" : 10.23046475899236
},
{
"type" : "homework",
"score" : 96.72520512117761
},
{
"type" : "homework",
"score" : 6.488940333376703
}
]
}
I am looking something like
db.students.find(
// Search criteria
{ '_id': 22 },
// Projection
{ _id: 1, scores: { $elemMatch: { type: 'exam', type: 'quiz' } }}
)
The result should be like
{ "_id": 22, "scores" : [ { "type" : "exam", "type" : "quiz" } ] }
But this over ride the type: 'exam' and returns only type: 'quiz'. Have anybody any idea how to do this with db.find()?
This is not possible directly using find and elemMatch because of following limitation of elemMatch and mongo array fields.
The $elemMatch operator limits the contents of an field from the query results to contain only the first element matching the $elemMatch condition. ref. from $elemMacth
and mongo array field limitations as below
Only one positional $ operator may appear in the projection document.
The query document should only contain a single condition on the array field being projected. Multiple conditions may override each other internally and lead to undefined behavior. ref from mongo array field limitations
So either you tried following this to find out only exam or quiz
db.collectionName.find({"_id":22,"scores":{"$elemMatch":{"type":"exam"}}},{"scores.$.type":1}).pretty()
is shows only exam scores array.
Otherwise you should go through aggregation

matching 2 out of 3 (or excluding one) in MongoDb aggregation

Let's say I have a mongo db restaurant collection that has an array of different foods, and I want to average the price of the "sandwich" and the "burger" for each restaurant i.e. to not include the steak. How do I match 2 out of the 3 types in this situation i.e. or, in other words, filter out the steak?
For example, for the match operator, I can (assuming I have already unwound the array) do something like this
{ $match : { foods : "burger" } }
but I want to do something more like this (which leaves out steak)
{ $match : { foods : ["burger", "sandwich" ]} }
except that code doesn't work.
Can you explain?
"_id" : ObjectId("50b59cd75bed76f46522c34e"),
"restaurant_id" : 0,
"foods" : [
{
"type" : "sandwich",
"price" : 6.99
},
{
"type" : "burger",
"price" : 5.99
},
{
"type" : "steak"
"price" : 9.99
}
]
Use $in to match one of multiple values:
{ $match : { foods : { $in: ["burger", "sandwich" ]}}}
JohnyHK's answer is right and concise.
For the "Can you explain?" part, when you specified the match as follows:
{ $match : { foods : ["burger", "sandwich" ]} }
You are requiring the document to have a field "foods" containing an array with "burger" and "sandwich" as elements. This is an equals comparison.
The operator $in is not directly explained with the $match, see here:
http://docs.mongodb.org/manual/reference/aggregation/match/
since $in is a query operator, which is explained here (linked from $match):
http://docs.mongodb.org/manual/tutorial/query-documents/#read-operations-query-argument

How to do query on multiple nested data fields in MongoDB

So, what I'm trying to do is query all documents that have a City of 'Paris' and a State of 'France'. I need to do some kind of join, but I haven't been able to figure out how to construct it.
I'm using the c# driver, but I'll gladly accept help using any method.
{
"_id" : ObjectId("519b407f3c22a73a7c29269f"),
"DocumentID" : "1",
"Meta" : [{
"Name" : "City",
"Value" : "Paris",
}, {
"Name" : "State",
"Value" : "France",
}
}]
}
{
"_id" : ObjectId("519b407f3c22a73a7c29269g"),
"DocumentID" : "2",
"Meta" : [{
"Name" : "City",
"Value" : "Paris",
}, {
"Name" : "State",
"Value" : "Texas",
}
}]
}
The $elemMatch operator is used to indicate that all the conditions within it must be matched by the same array element. So (to switch to shell syntax) to match all documents which have meta city Paris you would do
db.collection.find( {Meta:{$elemMatch:{Name:"City",Value:"Paris"}}} )
This assures you won't match something which has Name: "somethingelse", Value: "Paris" somewhere in its array with a different array element matching the Name:"City".
Now, default combination for combining query conditions is "and" so you can continue adding attributes:
db.collection.find( {Meta: {
$elemMatch:{Name:"City",Value:"Paris"},
$elemMatch:{Name:"State",Value:"France"}
}
}
)
Now if you want to add another condition you keep adding it but if you want a NOT then you do it like this:
db.collection.find( {Meta: {
$elemMatch:{Name:"City",Value:"Paris"},
$elemMatch:{Name:"State",Value:"France"},
$not: {$elemMatch:{Name:"Arrondissement",Value:"Louvre"}}
}
}
)
I might be answering my own question here, but I'm new to MongoDB, so while this appears to give me the results I'm after, it might not be the optimum approach.
var result = collection.Find(
Query.And(
Query.ElemMatch("Meta", Query.EQ("Name", "City")),
Query.ElemMatch("Meta", Query.EQ("Value", "Paris")),
Query.ElemMatch("Meta", Query.EQ("Name", "State")),
Query.ElemMatch("Meta", Query.EQ("Value", "France")))
);
Which leads to a follow up - how would I get all of the documents whose 'City' is 'Paris' and 'State' is 'France' but whose 'Arrondissement' is not 'Louvre'?