How to query nested document in mongodb? - mongodb

I have a document with a nested array of review documents:
{
"_id" : ObjectId("53a5753937c2f0ef6dcd9006"),
"product" : "Super Duper-o-phonic",
"price" : 11000000000,
"reviews" : [
{
"user" : "fred",
"comment" : "Great!",
"rating" : 5
},
{
"user" : "Tom",
"comment" : "Great again!",
"rating" : 5
},
{
"user" : "Tom",
"comment" : "I agree with fred somewhat",
"rating" : 4
}
]
}
I want to find only those reviews whose rating is 5.
The final query should select the product price and the two documents from reviews whose rating is 5.
The last query I tried is:
db.testData.find({'reviews':{$elemMatch:{'rating':{$gte:5}}}}).pretty()
It's strange but it doesn't work.
How to do this in mongodb?

If you only want a single sub-doc from reviews whose rating is 5, you can use the $ positional projection operator to include the first element of reviews that matches your query:
db.test.find({'reviews.rating': 5}, {product: 1, price: 1, 'reviews.$': 1})
If you want all reviews elements with a rating of 5 (instead of just the first) you can use aggregate instead of find:
db.test.aggregate([
// Only include docs with at least one 5 rating review
{$match: {'reviews.rating': 5}},
// Duplicate the docs, one per reviews element
{$unwind: '$reviews'},
// Only include the ones where rating = 5
{$match: {'reviews.rating': 5}},
// Only include the following fields in the output
{$project: {product: 1, price: 1, reviews: 1}}])
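If the server is MongoDB 3.2 or newer, a $filter expression inside $project is another option that keeps one result document per product instead of one per review. The sketch below is not part of the answer above: the pipeline is written as a plain object (it would be passed to db.test.aggregate), and keepFiveStarReviews is a hypothetical helper that only emulates the effect on the sample document in plain JavaScript, with no database connection.

```javascript
// Sample document from the question (the _id is omitted for brevity).
const doc = {
  product: "Super Duper-o-phonic",
  price: 11000000000,
  reviews: [
    { user: "fred", comment: "Great!", rating: 5 },
    { user: "Tom", comment: "Great again!", rating: 5 },
    { user: "Tom", comment: "I agree with fred somewhat", rating: 4 },
  ],
};

// Pipeline that would be passed to db.test.aggregate(pipeline) in the shell.
const pipeline = [
  // Only include docs with at least one 5 rating review
  { $match: { "reviews.rating": 5 } },
  // Keep product and price, and trim reviews down to the 5 rating ones
  {
    $project: {
      product: 1,
      price: 1,
      reviews: {
        $filter: {
          input: "$reviews",
          as: "review",
          cond: { $eq: ["$$review.rating", 5] },
        },
      },
    },
  },
];

// Hypothetical in-memory emulation of the $project/$filter stage.
function keepFiveStarReviews(d) {
  return {
    product: d.product,
    price: d.price,
    reviews: d.reviews.filter((r) => r.rating === 5),
  };
}

const result = keepFiveStarReviews(doc);
console.log(JSON.stringify(result, null, 2));
```

Compared with the $unwind version, this returns the reviews as an array on a single document, which is closer to the shape the question asks for.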

Take a look here: MongoDB - how to query for a nested item inside a collection?
Just in case you thought about this:
If you try to accomplish this with $elemMatch, it will just return the first matching review.
http://docs.mongodb.org/manual/reference/operator/projection/elemMatch/
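To make the difference concrete, here is a minimal plain-JavaScript sketch (no MongoDB involved) of the two roles $elemMatch can play. The helper names matchesQuery and projectFirstMatch are made up for illustration and only approximate the server behaviour: as a query condition, $elemMatch decides whether the whole document is returned; as a projection, it keeps at most the first matching array element.

```javascript
// Sample document from the question (the _id is omitted for brevity).
const doc = {
  product: "Super Duper-o-phonic",
  price: 11000000000,
  reviews: [
    { user: "fred", comment: "Great!", rating: 5 },
    { user: "Tom", comment: "Great again!", rating: 5 },
    { user: "Tom", comment: "I agree with fred somewhat", rating: 4 },
  ],
};

// Query-side $elemMatch: a yes/no test on the whole document.
// A matching document comes back with ALL of its reviews.
function matchesQuery(d) {
  return d.reviews.some((r) => r.rating >= 5);
}

// Projection-side $elemMatch: only the first matching element is kept.
function projectFirstMatch(d) {
  const first = d.reviews.find((r) => r.rating >= 5);
  return first ? { reviews: [first] } : {};
}

const found = [doc].filter(matchesQuery);
const projected = projectFirstMatch(doc);

console.log(found[0].reviews.length); // 3
console.log(projected.reviews.length); // 1
```

This is why the query in the question returns the rating 4 review as well: the condition selected the document, not the array elements.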

Related

Is two-way referencing more efficient in Mongo for a 1 to N relationship?

I had a discussion at work about two-way referencing in a 1 to N relationship. According to this post on the MongoDB blog, you can do it. We wouldn't need atomic updates at all, so no problem there. Following the example in the article, in our case you can only create or delete a task but not change the task owner.
My argument is that two-way referencing is probably more efficient for fetching data from both sides, as we will more often need to display the owner with their tasks and less often just the tasks, in different parts of the program. My colleague says there won't be an efficiency gain and the data duplication is not worth it.
Do you have any info about the efficiency of this approach?
Denormalizing the data helps when writes are rare and reads are frequent. Here the efficiency depends on how the data is retrieved. If retrieving data from the collections requires following the relationship in both directions, and the two-way references already exist, then they certainly make the query more efficient.
Student collection
{ _id:1, name: "Joseph", courses:[1, 3, 4]}
{ _id:2, name: "Mary", courses:[1, 3]}
{ _id:3, name: "Catherine", courses:[1, 2, 4]}
{ _id:4, name: "Robert", courses:[2, 4]}
Course Collection
{ _id:1, name: "Math101", students: [1, 2, 3]}
{ _id:2, name: "Science101", students: [3, 4]}
{ _id:3, name: "History101", students: [1, 2]}
{ _id:4, name: "Astronomy101", students: [1, 3, 4]}
Consider the above example of students and courses, where two-way referencing is used: the courses array in the Student collection lists the courses each student takes, and the students array in the Course collection lists the students taking each course.
If we want to list the students who are studying Math101, the query would be
db.courses.aggregate([{$match: {name:"Math101"}},
{$unwind:"$students"},
{$lookup:{from:"students",
localField:"students",
foreignField:"_id",
as:"result"}}])
$match, $unwind, and $lookup are used in the aggregation pipeline to achieve the result: $match reduces the data (it is good to place this stage at the start of the pipeline), $unwind unwinds the students array in the Course collection, and $lookup looks into the Student collection to get the student details.
The result after executing the above aggregation query on our sample collections is
{
"_id" : 1,
"name" : "Math101",
"students" : 1,
"result" : [
{
"_id" : 1,
"name" : "Joseph",
"courses" : [
1,
3,
4
]
}
]
}
{
"_id" : 1,
"name" : "Math101",
"students" : 2,
"result" : [
{
"_id" : 2,
"name" : "Mary",
"courses" : [
1,
3
]
}
]
}
{
"_id" : 1,
"name" : "Math101",
"students" : 3,
"result" : [
{
"_id" : 3,
"name" : "Catherine",
"courses" : [
1,
2,
4
]
}
]
}
The efficiency of two-way referencing depends purely on what we retrieve, so design your schema to align closely with your expected results.
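As a rough illustration of that trade-off, the plain-JavaScript sketch below answers "which courses does student 1 take?" from both sides of the relationship, using the sample data above held in memory. The function names are hypothetical, and the shell queries in the comments are the approximate equivalents, not measured for performance.

```javascript
const students = [
  { _id: 1, name: "Joseph", courses: [1, 3, 4] },
  { _id: 2, name: "Mary", courses: [1, 3] },
  { _id: 3, name: "Catherine", courses: [1, 2, 4] },
  { _id: 4, name: "Robert", courses: [2, 4] },
];

const courses = [
  { _id: 1, name: "Math101", students: [1, 2, 3] },
  { _id: 2, name: "Science101", students: [3, 4] },
  { _id: 3, name: "History101", students: [1, 2] },
  { _id: 4, name: "Astronomy101", students: [1, 3, 4] },
];

// Using the reference stored on the student side.
// Roughly: db.courses.find({ _id: { $in: student.courses } })
function coursesViaStudentRefs(studentId) {
  const student = students.find((s) => s._id === studentId);
  return courses
    .filter((c) => student.courses.includes(c._id))
    .map((c) => c.name);
}

// Using only the reference stored on the course side.
// Roughly: db.courses.find({ students: studentId })
function coursesViaCourseRefs(studentId) {
  return courses
    .filter((c) => c.students.includes(studentId))
    .map((c) => c.name);
}

console.log(coursesViaStudentRefs(1)); // [ 'Math101', 'History101', 'Astronomy101' ]
console.log(coursesViaCourseRefs(1)); // [ 'Math101', 'History101', 'Astronomy101' ]
```

Both directions give the same answer, so the second reference is redundant data. Whether it pays off depends on which lookups are frequent and which fields are indexed.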

Multiple update in a document in MongoDB

I am trying to update multiple nested documents in a document in mongoDB.
Say my data is:
{
"_id" : "ObjectId(7df78ad8902c)",
"title" : "Test",
"img_url" : "[{s: 1, v:1}, {s: 2, v: 2}, {s: 3, v: 3}]",
"tags" : "['mongodb', 'database', 'NoSQL']",
"likes" : "100"
}
I want to update v to 200 for s = 1 and s = 2 in the img_url list.
It is easy to update v for any single s.
Is there any way to update multiple nested documents satisfying some criteria?
I tried:
db.test.update({ "_id" : ObjectId("7df78ad8902c"), "img_url.s": {$in : ["1", "2"]}}, {$set: { "img_url.$.v" : 200 } });
and
db.test.update({ "_id" : ObjectId("7df78ad8902c"), "img_url.s": {$in : ["1", "2"]}}, {$set: { "img_url.$.v" : 200 } }, {mulit: true});
Some sources are suggesting it is not possible to do so.
Multiple update of embedded documents' properties
https://jira.mongodb.org/browse/SERVER-1243
Am I missing something ?
In the specific example you have here, you are specifying an _id, which means you will update only the one document with that _id.
To update img_url, try without the _id; something like this:
db.test.update({}, {"$set":{"img_url.0":{s:1, v:400}}}, {multi:true})
db.test.update({}, {"$set":{"img_url.1":{s:2, v:400}}}, {multi:true})
0 and 1 in img_url are the array indexes for s:1 and s:2.
In order to update based on specific criteria, you need to put the condition in the first argument. For example, to increment likes by 1 in all documents that have likes greater than 100 (assuming likes is an int):
db.people.update( { likes: {$gt:100} }, {$inc :{likes: 1}}, {multi: true} )
hope that helps
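On MongoDB 3.6 or later, the filtered positional operator $[<identifier>] together with the arrayFilters option can update every matching array element in one statement, without hard-coding array indexes. The sketch below is an addition and not part of the answer above. It assumes img_url is stored as a real array with numeric s and v (the question shows it as a quoted string), and setVForMatchingS is a hypothetical helper that emulates the effect in plain JavaScript.

```javascript
// Assumption: img_url is an actual array of { s, v } objects with numeric values.
const doc = {
  title: "Test",
  img_url: [
    { s: 1, v: 1 },
    { s: 2, v: 2 },
    { s: 3, v: 3 },
  ],
};

// Arguments for the shell call:
//   db.test.update({ _id: <the document's _id> }, update, options)
const update = { $set: { "img_url.$[elem].v": 200 } };
const options = { arrayFilters: [{ "elem.s": { $in: [1, 2] } }] };

// Hypothetical in-memory emulation: set v on every element whose s is listed.
function setVForMatchingS(d, sValues, newV) {
  return {
    ...d,
    img_url: d.img_url.map((e) =>
      sValues.includes(e.s) ? { ...e, v: newV } : e
    ),
  };
}

const updated = setVForMatchingS(doc, [1, 2], 200);
console.log(updated.img_url);
```

Note that the values in $in are numbers here. The attempts in the question use the strings "1" and "2", which would not match numeric s values.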

MongoDB aggregation and paging

I have documents that each contain my own internal id field and the date when the document was added. There can be several documents with the same id (different versions of the same document), but their dates will always be different. I want a query that returns only one document out of all versions of the same document (same id field), the one that was relevant at a specified date, and I want to display the results with paging (50 rows per page). Is there any way to do this in MongoDB (query documents by some field, group them by the id field, sort by the date field and take only the first, all with paging)?
Please see the example: some of these are different documents, like documents A, B and C, and some are versions of the same document;
_id: 1, 2 and 3 are all versions of the same document A.
Document A {
_id : 1,
"id" : "A",
"author" : "value",
"date" : "2015-11-05"
}
Document A {
_id : 2,
"id" : "A",
"author" : "value",
"date" : "2015-11-06"
}
Document A {
_id : 3,
"id" : "A",
"author" : "value",
"date" : "2015-11-07"
}
Document B {
_id : 4,
"id" : "B",
"author" : "value",
"date" : "2015-11-06"
}
Document B {
_id : 5,
"id" : "B",
"author" : "value",
"date" : "2015-11-07"
}
Document C {
_id : 6,
"id" : "C",
"author" : "value",
"date" : "2015-11-07"
}
I want to query all documents that have "value" in the "author" field,
and from those bring only one document per id, the one with the latest date as of
the specified date, for example 2015-11-08. So I expect the result to be:
_id : 3, _id : 5, _id : 6
And also paging, for example 10 documents in each page.
Thanks !!!!!
Two documents can't have the same _id. There is a unique index on _id by default.
Because of that, you need a compound _id field that includes the date:
{
"_id":{
docId: yourFormerIdValue,
date: new ISODate()
}
// other fields
}
To get the version valid at a specified date, the query becomes rather easy:
db.yourColl.find({
// use dot notation: matching on a whole embedded document is an exact comparison
"_id.docId": idToFind,
// get only the version valid up to a specific date...
"_id.date": { "$lte": someISODate }
})
// ...sort the results descending...
.sort({ "_id.date": -1 })
// ...and get only the first and therefore newest entry
.limit(1)
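The query above returns one version of a single document. For the "latest version of every document, with paging" part of the question, an aggregation along the following lines is one possible approach. This is a sketch under assumptions that are not confirmed by the answer: documents keep the shape shown in the question (unique _id, shared id, ISO-formatted date strings that compare correctly as text), and page and pageSize are illustrative variables. latestVersions is a hypothetical helper that emulates the pipeline in plain JavaScript on the sample data.

```javascript
const docs = [
  { _id: 1, id: "A", author: "value", date: "2015-11-05" },
  { _id: 2, id: "A", author: "value", date: "2015-11-06" },
  { _id: 3, id: "A", author: "value", date: "2015-11-07" },
  { _id: 4, id: "B", author: "value", date: "2015-11-06" },
  { _id: 5, id: "B", author: "value", date: "2015-11-07" },
  { _id: 6, id: "C", author: "value", date: "2015-11-07" },
];

const page = 0; // zero-based page number (assumption)
const pageSize = 10;

// Pipeline that would be passed to db.yourColl.aggregate(pipeline).
const pipeline = [
  { $match: { author: "value", date: { $lte: "2015-11-08" } } },
  { $sort: { date: -1 } }, // newest version first
  { $group: { _id: "$id", latest: { $first: "$$ROOT" } } }, // one per id
  { $sort: { _id: 1 } }, // stable order so paging is repeatable
  { $skip: page * pageSize },
  { $limit: pageSize },
];

// Hypothetical in-memory emulation of the pipeline above.
function latestVersions(all, author, asOf, pageNo, size) {
  const latestById = new Map();
  for (const d of all) {
    if (d.author !== author || d.date > asOf) continue;
    const best = latestById.get(d.id);
    if (!best || d.date > best.date) latestById.set(d.id, d);
  }
  return [...latestById.values()]
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))
    .slice(pageNo * size, (pageNo + 1) * size);
}

const firstPage = latestVersions(docs, "value", "2015-11-08", page, pageSize);
console.log(firstPage.map((d) => d._id)); // [ 3, 5, 6 ]
```

The second $sort matters for paging: $group does not guarantee an output order, so without it consecutive pages could overlap or skip documents.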

Sorting with condition in mongodb

Here is the example collection:
[
{'name': 'element1', ratings: {'user1': 1, 'user2':2}},
{'name': 'element2', ratings: {'user1': 2, 'user2':1}}
]
I want to sort for user1: ['element2', 'element1'] ,
and for user2: ['element1', 'element2'].
In other words, I want to place the results with the maximum rating for the current user at the top.
The collection is very big, which is why the search must use indexes. The collection structure can be modified; this is just an example.
Those are sorts on different fields. The first is a sort on { "ratings.user1" : -1 } and the second is a sort on { "ratings.user2" : -1 }; you will need an index on each field to support each sort. You can't scale a set up like that beyond a few users. I don't know what the entire use case is for these documents, but if the core requirement is to sort elements for a user based on the user's ratings, I would restructure the collection so that a single document represents a rating of a particular element by a particular user:
{
"_id" : ObjectId(...),
"element" : "element1",
"user" : "user1",
"rating" : 1
},
{
"_id" : ObjectId(...),
"element" : "element1",
"user" : "user2",
"rating" : 2
}
If you create an index on { "user" : 1, "rating" : -1 }, you can perform indexed queries for the ratings of elements by a particular user sorted descending by rating:
db.ratings.find({ "user" : "user1" }).sort({ "rating" : -1 })
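To show the restructuring end to end, the plain-JavaScript sketch below flattens the original documents into one rating document per element and user, then sorts them per user. It runs in memory only. toRatingDocs and elementsForUser are hypothetical helper names, and the index mentioned in the comment is the one suggested above.

```javascript
// Original shape from the question.
const elements = [
  { name: "element1", ratings: { user1: 1, user2: 2 } },
  { name: "element2", ratings: { user1: 2, user2: 1 } },
];

// Flatten into one document per (element, user) rating.
function toRatingDocs(els) {
  return els.flatMap((el) =>
    Object.entries(el.ratings).map(([user, rating]) => ({
      element: el.name,
      user,
      rating,
    }))
  );
}

// In-memory equivalent of
//   db.ratings.find({ user: user }).sort({ rating: -1 })
// which the index { user: 1, rating: -1 } is meant to support.
function elementsForUser(ratingDocs, user) {
  return ratingDocs
    .filter((r) => r.user === user)
    .sort((a, b) => b.rating - a.rating)
    .map((r) => r.element);
}

const ratingDocs = toRatingDocs(elements);
console.log(elementsForUser(ratingDocs, "user1")); // [ 'element2', 'element1' ]
console.log(elementsForUser(ratingDocs, "user2")); // [ 'element1', 'element2' ]
```

With this shape a single compound index serves every user, instead of one index per user field.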

MongoDB Why this error : can't append to array using string field name: comments

I have a DB structure like below:
{
"_id" : 1,
"comments" : [
{
"_id" : 2,
"content" : "xxx"
}
]
}
I push a new subdocument into the comments field. It works:
db.test.update(
{"_id" : 1, "comments._id" : 2},
{$push : {"comments.$.comments" : {_id : 3, content:"xxx"}}}
)
After that, the DB structure is:
{
"_id" : 1,
"comments" : [
{
"_id" : 2,
"comments" : [
{
"id" : 3,
"content" : "xxx"
}
],
"content" : "xxx"
}
]
}
But when I push a new subdocument into the comment whose id is 3, there is an error:
db.test.update(
{"_id" : 1, "comments.comments.id" : 3},
{$push : {"comments.comments.$.comments" : {id : 4, content:"xxx"}}}
)
error message:
can't append to array using string field name: comments
Well, it makes total sense if you think about it. MongoDB has the advantage, and the disadvantage, of resolving certain things implicitly.
When you query the database for a specific regular field like this:
{ field : "value" }
the query is straightforward. It would not obviously apply when the value is part of an array, but Mongo resolves that for you, so if the structure is:
{ field : ["value", "anothervalue"] }
Mongo iterates through all the elements and matches "value" against each one, and you don't have to think about it. It works perfectly, but at only one level, because it's impossible to guess what you want to do if you have multiple levels.
In your case the first query works because it falls into this single-level case:
db.test.update(
{"_id" : 1, "comments._id" : 2},
{$push : {"comments.$.comments" : {_id : 3, content:"xxx"}}}
)
It matches _id at the first level and comments._id at the second level; it gets an array as a result, but Mongo is able to resolve it.
But in the second case, think about what you are asking for. Let's isolate the where clause:
{"_id" : 1, "comments.comments.id" : 3},
"Give me from the main collection the records with _id:1" (one doc)
"And the comments whose nested comments have id=3" (array * array)
The first level is resolved easily. The second is not, because comments already returns an array; one more level down is an array of arrays, so Mongo gets an array of arrays as a result, and it's not possible to push a document into all the records of that array.
One solution is to narrow your where clause so that it identifies a unique document in comments (it could be the first one), but that is not a good solution because you never know the position of the document you're looking for. Using the shell, I think the only way to be accurate is to do it in two steps. The following query works (it is not a real solution) and "solves" the multiple-array part by fixing it to the first record:
db.test.update(
{"_id" : 1, "comments.0.comments._id" : 3},
{$push : {"comments.0.comments.$.comments" : {id : 4, content:"xxx"}}}
)
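On MongoDB 3.6 or later, filtered positional operators with arrayFilters can address both array levels without hard-coding the index 0. The sketch below is an addition, not part of the answer above. It assumes the nested comment uses the field name id, as in the stored structure shown in the question, and pushReply is a hypothetical helper that emulates the update in plain JavaScript.

```javascript
// Document as stored after the first (working) update in the question.
const doc = {
  _id: 1,
  comments: [
    {
      _id: 2,
      content: "xxx",
      comments: [{ id: 3, content: "xxx" }],
    },
  ],
};

// Arguments for the shell call: db.test.update({ _id: 1 }, update, options)
// "outer" selects the top-level comment that contains the target,
// "inner" selects the nested comment with id 3.
const update = {
  $push: {
    "comments.$[outer].comments.$[inner].comments": { id: 4, content: "xxx" },
  },
};
const options = {
  arrayFilters: [{ "outer.comments.id": 3 }, { "inner.id": 3 }],
};

// Hypothetical in-memory emulation: find the nested comment and append to it.
function pushReply(d, targetId, reply) {
  for (const outer of d.comments) {
    for (const inner of outer.comments || []) {
      if (inner.id === targetId) {
        // Like $push, create the array when the field is absent.
        inner.comments = inner.comments || [];
        inner.comments.push(reply);
      }
    }
  }
  return d;
}

pushReply(doc, 3, { id: 4, content: "xxx" });
console.log(JSON.stringify(doc, null, 2));
```

Deeply nested comment trees get harder to update with every level, so a flatter schema (for example, one document per comment with a parent reference) may be worth considering if nesting is unbounded.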