MongoDB elemMatch subdocuments - mongodb

I have the following data structure
{
"_id" : ObjectId("523331359245b5a07b903ccc"),
"a" : "a",
"b" : [
{
"c" : {
"_id" : ObjectId("5232b5090364678515db9a82"),
"d" : "d1"
}
},
{
"c" : {
"_id" : ObjectId("5232b5090364678515db9a83"),
"d" : "d2"
}
}
]
}
For the following queries, mongo returns
> db.test.find({b : {$elemMatch : {'c.d': 'd1'}}}).count();
1
> db.test.find({b : {$elemMatch : {c: {d: 'd1'}}}}).count();
0
Unfortunately, for the following statements
B b = new B();
C c = new C();
b.c = c;
b.c.d = "d1";
createQuery().field("b").hasThisElement(b).asList();
Morphia generates db.test.find({b : {$elemMatch : {c: {d: 'd1'}}}}), which returns no matches.
Is this a MongoDB bug or a Morphia bug? Is there a workaround that returns the matched document?
Please note that in practice I have two conditions inside $elemMatch, so I have to use $elemMatch rather than a plain dot-notation match. The example above is simplified for readability.
I'm running MongoDB 2.4.6 and Morphia 1.2.3.
Thanks!

It may be too late, but others may find this handy.
I found this solution: https://groups.google.com/forum/#!topic/morphia/FlEjBoSqkhg
updateQuery.filter("b elem",
BasicDBObjectBuilder.start("c.d", "d1").get());
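To illustrate why the two-condition case mentioned in the question needs $elemMatch rather than plain dot notation, here is a minimal sketch in plain JavaScript that imitates the two matching rules in memory. It does not call MongoDB or Morphia; getPath, elemMatch, and dottedMatch are hypothetical helpers, and the second field "e" is invented for the example.

```javascript
// Hypothetical in-memory model of the matching rules; not MongoDB code.

// Resolve a dotted path such as "c.d" inside one array element.
function getPath(obj, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    obj
  );
}

// $elemMatch style: one single element must satisfy every condition.
function elemMatch(array, conditions) {
  return array.some((element) =>
    Object.entries(conditions).every(
      ([path, expected]) => getPath(element, path) === expected
    )
  );
}

// Plain dot-notation style: each condition may be satisfied by a
// different element of the array.
function dottedMatch(array, conditions) {
  return Object.entries(conditions).every(([path, expected]) =>
    array.some((element) => getPath(element, path) === expected)
  );
}

// "e" is an invented second field, added only for illustration.
const b = [
  { c: { d: "d1", e: 1 } },
  { c: { d: "d2", e: 2 } },
];

const crossConditions = { "c.d": "d1", "c.e": 2 };
console.log(elemMatch(b, crossConditions));   // false
console.log(dottedMatch(b, crossConditions)); // true
```

No single element has both d equal to "d1" and e equal to 2, so only the dot-notation model reports a match. That is why the filter in the answer passes the dotted key "c.d" inside the $elemMatch body instead of dropping $elemMatch altogether.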

Related

MongoDB - How do index prefixes work?

I have read this documentation: "Sort and Non-prefix Subset of an Index".
With that information, I am trying to answer this MongoDB mock test question:
You have the following indexes on the things collection:
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "test.things"
},
{
"v" : 1,
"key" : {
"a" : 1
},
"name" : "a_1",
"ns" : "test.things"
},
{
"v" : 1,
"key" : {
"c" : 1,
"b" : 1,
"a" : 1
},
"name" : "c_1_b_1_a_1",
"ns" : "test.things"
}
]
Question:
Which of the following queries will require that you load every document into RAM in order to fulfill the query? Assume that no data is being written during the query. Check all that apply.
db.things.find( { b : 1 } ).sort( { c : 1, a : 1 } )
db.things.find( { c : 1 } ).sort( { a : 1, b : 1 } )
db.things.find( { a : 1 } ).sort( { b : 1, c : 1 } )
The answer they give is...
db.things.find( { b: 1} ).sort( {c: 1, a: 1} )
Can someone help me understand why the other two options are not correct, i.e. how they use an index or index prefix? My understanding is that the sort part has to match the order of the indexed fields. Also, the suggested correct answer does not seem to follow the rule in the documentation either.
I believe, given the options and answer, the emphasis is to find:
Which of the following queries will require that you load every document into RAM in order to fulfill the query?
So the sorting is a red herring.
find({ b: 1 }) can't use any of the indexes provided
find({ c: 1 }) can use index c_1_b_1_a_1 since it matches the prefix
find({ a: 1 }) can use index a_1
Since options #2 and #3 can use an index, they will not load every document in order to sort them, just the ones found via the index. Option #1 will have to do a full collection scan to find documents where b is 1.
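The reasoning above can be sketched as a deliberately simplified check in plain JavaScript. The index definitions are copied from the question; usableIndexes is a hypothetical helper, and MongoDB's real query planner considers many more cases than this leading-prefix rule.

```javascript
// Simplified model only: an index is treated as usable for an equality
// filter when the filtered fields form a leading prefix of its keys.
const indexes = {
  _id_: ["_id"],
  a_1: ["a"],
  c_1_b_1_a_1: ["c", "b", "a"],
};

// Hypothetical helper: names of the indexes whose leading keys are
// exactly the filtered fields (in any order).
function usableIndexes(filterFields) {
  return Object.entries(indexes)
    .filter(([, keys]) => {
      const prefix = keys.slice(0, filterFields.length);
      return (
        prefix.length === filterFields.length &&
        filterFields.every((field) => prefix.includes(field))
      );
    })
    .map(([name]) => name);
}

console.log(usableIndexes(["b"])); // []
console.log(usableIndexes(["c"])); // [ 'c_1_b_1_a_1' ]
console.log(usableIndexes(["a"])); // [ 'a_1' ]
```

Under this model only the filter on b finds no usable index, which matches the answer given for the mock test.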

mongodb $elemMatch in query returns all sub docs

db.aaa.insert({"_id":1, "m":[{"_id":1,"a":1},{"_id":2,"a":2}]})
db.aaa.find({"_id":1,"m":{$elemMatch:{"_id":1}}})
{ "_id" : 1, "m" : [ { "_id" : 1, "a" : 1 }, { "_id" : 2, "a" : 2 } ] }
Using $elemMatch as a query operator, it returns all sub docs in 'm'! Strange!
Using it as a projection operator:
db.aaa.find({"_id":1},{"m":{$elemMatch:{"_id":1}}})
{ "_id" : 1, "m" : [ { "_id" : 1, "a" : 1 } ] }
This is OK. Following this logic, using it as a query operator in an update should change all sub docs in 'm'. So I ran:
db.aaa.update({"_id":1,"m":{$elemMatch:{"_id":1}}},{$set:{"m.$.a":3}})
db.aaa.find()
{ "_id" : 1, "m" : [ { "_id" : 1, "a" : 3 }, { "_id" : 2, "a" : 2 } ] }
It behaves like the second example (the projection operator). This really confuses me.
Can someone explain?
It isn't strange, it's how it works.
You are using $elemMatch to match an element within an array contained in your document. That means it matches the "document" and not the "array element", so it does not selectively display only the array element that was matched.
What you can do, as you did with the $set operator, is use the positional $ operator to refer to the matched "position" from the query side:
db.aaa.find({"_id":1,"m":{$elemMatch:{"_id":1}}},{ "m.$": 1 })
And that will show you only one element of the array. But it is of course still an array in the result shown, and you cannot cast it to a different type.
The other part of the usage is that this will only match once, and only the first match will be assigned to the positional operator.
So perhaps the most succinct explanation is that you are matching the "document that contains" the properties of the sub-document you specified in your query, and not just the "sub-document" itself.
See the documentation for more:
http://docs.mongodb.org/manual/reference/operator/projection/positional/
http://docs.mongodb.org/manual/reference/operator/query/elemMatch/
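The difference between the query side and the projection side can be sketched with a small in-memory model in plain JavaScript. This is only an illustration of the behaviour described above, not driver code; elementMatches, queryElemMatch, and projectElemMatch are hypothetical helpers.

```javascript
// Hypothetical in-memory model; it imitates the behaviour, not the driver.
const doc = { _id: 1, m: [{ _id: 1, a: 1 }, { _id: 2, a: 2 }] };

// True when the element has every key/value pair listed in the condition.
function elementMatches(element, condition) {
  return Object.entries(condition).every(
    ([key, value]) => element[key] === value
  );
}

// Query side: decides only whether the whole document is selected.
function queryElemMatch(document, field, condition) {
  return document[field].some((el) => elementMatches(el, condition));
}

// Projection side: keeps only the first matching element, still wrapped
// in an array; the field is left out when nothing matches.
function projectElemMatch(document, field, condition) {
  const first = document[field].find((el) => elementMatches(el, condition));
  return first === undefined
    ? { _id: document._id }
    : { _id: document._id, [field]: [first] };
}

console.log(queryElemMatch(doc, "m", { _id: 1 })); // true
console.log(JSON.stringify(projectElemMatch(doc, "m", { _id: 1 })));
// {"_id":1,"m":[{"_id":1,"a":1}]}
```

The query helper returns a yes/no answer for the whole document, which is why find() with $elemMatch in the query shows every element of 'm', while the projection helper trims the array to the first match.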

mongodb query by sub-field

How can I query all documents with {"module" : "B"}?
The following query doesn't work:
db.XXX.find({ "_id" : { "module" : "B" } });
Thanks a ton!
The data looks like:
{
"_id" : {"module" : "A","date" : ISODate("2013-03-18T07:00:00Z")},
"value" : {"count" : 1.0}
}
{
"_id" : {"module" : "B","date" : ISODate("2013-03-18T08:00:00Z")},
"value" : {"count" : 2.0}
}
Try:
db.XXX.find({ "_id.module" : "B" });
The difference is your original query would be trying to match on that entire subdocument (i.e. where _id is a subdocument containing a "module" field with value "B" and nothing else)
Reference: MongoDB Dot Notation
Use dot notation:
db.XXX.find({ "_id.module" : "B" })
To match on one field inside the subdocument rather than on the whole subdocument:
db.bios.find(
{
'_id.module': 'B'
}
)
The query uses dot notation to access fields in a subdocument.
Reference link
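As a rough illustration of why the original query returns nothing while the dot-notation query works, here is a minimal in-memory sketch in plain JavaScript. It is a model of the two matching rules, not MongoDB code: exactMatch and dotMatch are hypothetical helpers, and the dates are simplified to strings.

```javascript
// Hypothetical in-memory model of the two matching rules.
// Dates are simplified to plain strings for the example.
const docs = [
  { _id: { module: "A", date: "2013-03-18T07:00:00Z" }, value: { count: 1 } },
  { _id: { module: "B", date: "2013-03-18T08:00:00Z" }, value: { count: 2 } },
];

// { "_id": { "module": "B" } }: the whole subdocument must be equal
// (same fields, same order), so the extra "date" field makes it fail.
function exactMatch(doc, field, subdocument) {
  return JSON.stringify(doc[field]) === JSON.stringify(subdocument);
}

// { "_id.module": "B" }: only the named inner field is compared.
function dotMatch(doc, path, expected) {
  const value = path
    .split(".")
    .reduce((v, key) => (v == null ? undefined : v[key]), doc);
  return value === expected;
}

console.log(docs.filter((d) => exactMatch(d, "_id", { module: "B" })).length); // 0
console.log(docs.filter((d) => dotMatch(d, "_id.module", "B")).length);        // 1
```

The exact-match model would succeed only if the full subdocument, including the date, were supplied in the query.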

MongoDB aggregation framework sort by length of array

Given the following data set:
{ "_id" : ObjectId("510458b188ce1d16e616129b"), "codes" : [ "oxtbyr", "xstute" ], "name" : "Ciao Mambo", "permalink" : "ciaomambo", "visits" : 1 }
{ "_id" : ObjectId("510458b188ce1d16e6161296"), "codes" : [ "zpngwh", "odszfy", "vbvlgr" ], "name" : "Anthony's at Spokane Falls", "permalink" : "anthonysatspokanefalls", "visits" : 0 }
How can I convert this python/pymongo sort into something that will work with the MongoDB aggregation framework? I'm sorting results based on the number of codes within the codes array.
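# Pair each restaurant name with the number of codes it has
z = [(x['name'], len(x['codes'])) for x in restaurants]
# Sort by the code count, largest first
sorted_by_second = sorted(z, key=lambda tup: tup[1], reverse=True)
for x in sorted_by_second:
    print(x[0], x[1])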
This works in python, I just want to know how to accomplish the same goal on the MongoDB query end of things.
> db.z.aggregate([
{ $unwind: '$codes' },
{ $group: { _id: '$_id', name: { $first: '$name' }, count: { $sum: 1 } } },
{ $sort: { count: -1 } }
])
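To make the stages easier to follow, here is a minimal in-memory sketch in plain JavaScript that imitates unwinding, grouping with a count, and sorting. It is a model of the idea rather than the aggregation framework itself; the _id values are simplified to numbers and only the fields needed for the example are kept.

```javascript
// Hypothetical in-memory model of $unwind -> $group -> $sort.
// _id values are simplified to numbers for the example.
const restaurants = [
  { _id: 1, name: "Ciao Mambo", codes: ["oxtbyr", "xstute"] },
  { _id: 2, name: "Anthony's at Spokane Falls", codes: ["zpngwh", "odszfy", "vbvlgr"] },
];

// $unwind: one output row per array element.
const unwound = restaurants.flatMap((r) =>
  r.codes.map((code) => ({ _id: r._id, name: r.name, codes: code }))
);

// $group with { $sum: 1 }: count the rows per original _id.
const counts = new Map();
for (const row of unwound) {
  const entry = counts.get(row._id) || { _id: row._id, name: row.name, count: 0 };
  entry.count += 1;
  counts.set(row._id, entry);
}

// $sort by count, largest first, as in the Python version.
const sorted = [...counts.values()].sort((x, y) => y.count - x.count);
for (const r of sorted) console.log(r.name, r.count);
// Anthony's at Spokane Falls 3
// Ciao Mambo 2
```

One consequence visible in the model: a document whose codes array is empty produces no unwound rows, so it drops out of the result instead of appearing with a count of 0.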

Get "data from collection b not in collection a" in a MongoDB shell query

I have two MongoDB collections that share a common _id. Using the mongo shell, I want to find all documents in one collection that do not have a matching _id in the other collection.
Example:
> db.Test.insert({ "_id" : ObjectId("4f08a75f306b428fb9d8bb2e"), "foo" : 1 })
> db.Test.insert({ "_id" : ObjectId("4f08a766306b428fb9d8bb2f"), "foo" : 2 })
> db.Test.insert({ "_id" : ObjectId("4f08a767306b428fb9d8bb30"), "foo" : 3 })
> db.Test.insert({ "_id" : ObjectId("4f08a769306b428fb9d8bb31"), "foo" : 4 })
> db.Test.find()
{ "_id" : ObjectId("4f08a75f306b428fb9d8bb2e"), "foo" : 1 }
{ "_id" : ObjectId("4f08a766306b428fb9d8bb2f"), "foo" : 2 }
{ "_id" : ObjectId("4f08a767306b428fb9d8bb30"), "foo" : 3 }
{ "_id" : ObjectId("4f08a769306b428fb9d8bb31"), "foo" : 4 }
> db.Test2.insert({ "_id" : ObjectId("4f08a75f306b428fb9d8bb2e"), "bar" : 1 });
> db.Test2.insert({ "_id" : ObjectId("4f08a766306b428fb9d8bb2f"), "bar" : 2 });
> db.Test2.find()
{ "_id" : ObjectId("4f08a75f306b428fb9d8bb2e"), "bar" : 1 }
{ "_id" : ObjectId("4f08a766306b428fb9d8bb2f"), "bar" : 2 }
Now I want some query or queries that returns the two documents in Test where the _id's do not match any document in Test2:
{ "_id" : ObjectId("4f08a767306b428fb9d8bb30"), "foo" : 3 }
{ "_id" : ObjectId("4f08a769306b428fb9d8bb31"), "foo" : 4 }
I've tried various combinations of $not, $ne, $or, $in but just can't get the right combination and syntax. Also, I don't mind if db.Test2.find({}, {"_id": 1}) is executed first, saved to some variable, which is then used in a second query (though I can't get that to work either).
Update: Zachary's answer pointing to the $nin answered the key part of the question. For example, this works:
> db.Test.find({"_id": {"$nin": [ObjectId("4f08a75f306b428fb9d8bb2e"), ObjectId("4f08a766306b428fb9d8bb2f")]}})
{ "_id" : ObjectId("4f08a767306b428fb9d8bb30"), "foo" : 3 }
{ "_id" : ObjectId("4f08a769306b428fb9d8bb31"), "foo" : 4 }
But (acknowledging that this is not scalable, and trying it anyway because that is not an issue in this situation) I still can't combine the two queries in the shell. This is the closest I can get, which is obviously less than ideal:
vals = db.Test2.find({}, {"_id": 1}).toArray()
db.Test.find({"_id": {"$nin": [ObjectId(vals[0]._id), ObjectId(vals[1]._id)]}})
Is there a way to return just the values in the find command so that vals can be used directly as the array input to $nin?
In MongoDB 3.2, the following code seems to work:
db.collectionb.aggregate([
{
$lookup: {
from: "collectiona",
localField: "collectionb_fk",
foreignField: "collectiona_fk",
as: "matched_docs"
}
},
{
$match: {
"matched_docs": { $eq: [] }
}
}
]);
Based on this example: https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/#use-lookup-with-an-array
Answering your follow-up. I'd use map().
Given this:
> b1 = {i: 1}
> db.b.save(b1)
> db.b.save({i: 2})
> db.a.save({_id: b1._id})
All you need is:
> vals = db.a.find({}, {_id: 1}).map(function(a){return a._id;})
> db.b.find({_id: {$nin: vals}})
which returns
{ "_id" : ObjectId("4f08c60d6b5e49fa3f6b46c1"), "i" : 2 }
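The map() plus $nin idea can be sketched in plain JavaScript with arrays standing in for the two collections. This is an in-memory model only, not shell code; the ids are invented strings, used because ObjectId instances would need to be compared by their string form in plain JavaScript.

```javascript
// Hypothetical in-memory model; plain arrays stand in for the collections.
// The string ids are invented for the example.
const a = [{ _id: "id1" }];
const b = [
  { _id: "id1", i: 1 },
  { _id: "id2", i: 2 },
];

// Counterpart of .map(function (doc) { return doc._id; }): bare values only.
const vals = a.map((doc) => doc._id);

// Counterpart of { _id: { $nin: vals } }: keep documents whose _id is absent.
const seen = new Set(vals);
const missing = b.filter((doc) => !seen.has(doc._id));

console.log(JSON.stringify(missing)); // [{"_id":"id2","i":2}]
```

The key step is that map() yields the bare _id values rather than whole documents, which is the shape $nin expects.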
You will have to save the _ids from collection A so that you can exclude them when querying collection B; you can do that with $nin. See Advanced Queries for all of the MongoDB operators.
Your end query, using the example you gave, would look something like:
db.Test.find({"_id": {"$nin": [ObjectId("4f08a75f306b428fb9d8bb2e"),
ObjectId("4f08a766306b428fb9d8bb2f")]}})
Note that this approach won't scale. If you need a solution that scales, you should be setting a flag in collections A and B indicating if the _id is in the other collection and then query off of that instead.
Updated for second part:
The second part is impossible. MongoDB does not support joins or any sort of cross querying between collections in a single query. Querying one collection, saving the results, and then querying the second is your only choice, unless you embed the data in the documents themselves as I mentioned earlier.
I wrote a script that marks every document in the second collection that appears in the first collection, and then processed the second collection's unmarked documents.
var first = db.firstCollection.aggregate([ {'$unwind':'$secondCollectionField'} ])
while (first.hasNext()) {
  var doc = first.next();
  db.secondCollection.update( {_id: doc.secondCollectionField}, {$set: {firstCollectionField: doc._id}} );
}
Then process the documents in the second collection that have no mark:
db.secondCollection.find({"firstCollectionField":{$exists:false}})