mongodb: inc high level document and embedded document - mongodb

I want to increment two values, one in the top-level document and one in an embedded document:
{
studentId: "x1"
numberOfAttending: 2
courses: [
{
courseId:"y1"
numberOfAttending: 1
},
{
courseId:"y2"
numberOfAttending: 1
}
]
}
How can I increment the number of attending for both the student and the course (as an upsert)? And can I do it with a single update query?

That's going to be tough since courses is an array. You'll need to know the index of the course you want to update, then do something like:
{ '$inc' : {numberOfAttending : 1, 'courses.1.numberOfAttending' : 1}}
Have you thought about switching it to a single embedded doc with courseId as a key for each course? If so, you can run a command like this to increment both. This doesn't depend on position so it's going to be less fragile:
{ '$inc' : { numberOfAttending : 1, 'courses.y2.numberOfAttending' : 1}}
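As a rough illustration of the keyed layout, the sketch below builds that update document in plain JavaScript. The helper name buildAttendanceInc is hypothetical, and it assumes courses is stored as an embedded document keyed by courseId, as suggested above.

```javascript
// Hypothetical helper: builds the $inc document for one student/course pair.
// Assumes `courses` is an embedded document keyed by courseId (not an array).
function buildAttendanceInc(courseId, amount) {
  var inc = { numberOfAttending: amount };
  // Dot notation reaches into the embedded document for this course
  inc["courses." + courseId + ".numberOfAttending"] = amount;
  return { $inc: inc };
}

var update = buildAttendanceInc("y2", 1);
console.log(JSON.stringify(update));
```

In the shell, the result could then be passed to something like db.students.update({ studentId: "x1" }, update, { upsert: true }), where the collection name students is an assumption. Note that this only works if a courseId never contains a dot or starts with a dollar sign.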

Related

Execution time of a query - MongoDB

I have two collections: coach and team.
Coach collection contains information about coaches like name, surname, age and an array coached_Team that contains the _id of the team that a coach coached.
The team collection contains data about teams like _id, common name, official name, country, championship....
If I want to find, for example, the official name of all teams coached by Allegri, I have to do two queries, the first on coach collection:
>var x = db.coach.find({surname:"Allegri"},{_id:0, "coached_Team.team_id":1})
>var AllegriTeams
>while(x.hasNext()) AllegriTeams=x.next()
{
"coached_Team" : [
{
"team_id" : "Juv.26"
},
{
"team_id" : "Mil.74"
},
{
"team_id" : "Cag.00"
}
]
}
>AllegriTeams=AllegriTeams.coached_Team
[
{
"team_id" : "Juv.26"
},
{
"team_id" : "Mil.74"
},
{
"team_id" : "Cag.00"
}
]
And then I have to execute three queries on team collection:
> db.team.find({ _id:AllegriTeams[0].team_id}, {official_name:1,_id:0})
{official_name : "Juventus Football Club S.p.A."}
> db.team.find({ _id:AllegriTeams[1].team_id}, {official_name:1,_id:0})
{official_name : "Associazione Calcio Milan S.p.A"}
> db.team.find({ _id:AllegriTeams[2].team_id}, {official_name:1,_id:0})
{official_name:"Cagliari Calcio S.p.A"}
Now consider that I have about 100k documents in each of the team and coach collections. The first query, on the coach collection, takes about 71 ms plus the time of the while loop. The three queries on the team collection, measured with cursor.explain("executionStats"), each report 0 ms. I don't understand why these queries take 0 ms.
I need executionTimeMillis of these three queries to get the execution time of the overall query "find official names of all teams coached by Allegri". I want to add the execution time of the query on the coach collection (71 ms) to the execution time of these three. If the time of these three queries is 0, what can I say about the execution time of the overall query?
I think the more important observation here is that 71ms is a long time for a simple fetch of one item. Looks like your "surname" field needs an index. The other "three" queries are simple lookups of a primary key, which is why they are relatively fast.
db.coach.createIndex({ "surname": 1 })
If that surname is actually "unique" then add that too:
db.coach.createIndex({ "surname": 1 },{ "unique": true })
You can also simplify your "three" queries into one by mapping the array and applying the $in operator:
var teamIds = [];
db.coach.find(
{ "surname": "Allegri" },
{ "_id": 0, "coached_Team.team_id": 1 }
).forEach(function(coach) {
// Collect the team ids from every matching coach document
teamIds = coach.coached_Team.map(function(team) {
return team.team_id;
}).concat(teamIds);
});
// Fetch all matching teams with a single query
db.team.find(
{ "_id": { "$in": teamIds } },
{ "official_name": 1, "_id": 0 }
).forEach(function(team) {
printjson(team);
});
The overall execution time then drops considerably, and the overhead of multiple operations is reduced to just the two queries required.
Also remember that, despite what the execution plan stats show, the more queries you make to and from the server, the longer the overall real-time execution will be, since each request and each retrieval of data adds time. So it is best to keep things as minimal as possible.
Therefore, if you need this information regularly, it is even more logical to store the "coach name" on the "team" itself (and index that data), which leads to the fastest possible response with only a single query operation.
It's easy to get caught up in observing execution stats. But really, think of what is "best" and "fastest" as a pattern for the sort of queries you want to do.
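To make the "store the coach name on the team" idea concrete, here is a minimal in-memory sketch. The field name coach_surnames and the fourth sample team are assumptions for illustration only; the filter function stands in for a single query such as db.team.find({ coach_surnames: "Allegri" }, { official_name: 1, _id: 0 }).

```javascript
// Hypothetical denormalized team documents: each team stores the surnames
// of its coaches in a `coach_surnames` array (an assumed field name).
var teams = [
  { _id: "Juv.26", official_name: "Juventus Football Club S.p.A.", coach_surnames: ["Allegri"] },
  { _id: "Mil.74", official_name: "Associazione Calcio Milan S.p.A", coach_surnames: ["Allegri"] },
  { _id: "Cag.00", official_name: "Cagliari Calcio S.p.A", coach_surnames: ["Allegri"] },
  { _id: "Xyz.00", official_name: "Example Club (made up)", coach_surnames: ["Someone"] }
];

// In-memory stand-in for one query on the team collection:
// keep teams whose coach_surnames array contains the surname.
function officialNamesCoachedBy(teamDocs, surname) {
  return teamDocs
    .filter(function (team) {
      return team.coach_surnames.indexOf(surname) !== -1;
    })
    .map(function (team) {
      return team.official_name;
    });
}

console.log(officialNamesCoachedBy(teams, "Allegri"));
```

With such a layout, an index like db.team.createIndex({ "coach_surnames": 1 }) would presumably serve the lookup, at the cost of keeping the copied names up to date whenever a coach changes teams.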

MongoDB findOne query depends of result

Let's say that I have a collection of something, and a record can look like this:
{
'_id' : MongoId('xxxxxx'),
'dir' : '/home/',
'category' : 'catname',
'someData' : 'very important info'
}
Is it possible to make only one query, like collection.findOne({????}, {'someData' : 1});, to find the record matching this rule:
find by dir; if not found, search by category name; if there is still
nothing, find the record whose category name is 'default'
Or at least, can I say in this query that I want to match dir first and, if nothing is found, match category?
collection.findOne({
$or : [
{'dir' : 'someCondition'},
{'category' : 'someCondition'}
]
});
Try this:
db.collection.aggregate([
{ '$project':
{
//Keep existing fields
'_id':1,
'dir':1,
'category':1,
'someData':1,
//Compute some new values
'sameDir': {'$eq':['$dir', 'SOMEDIR']},
'sameCategory': {'$eq':['$category', 'SOMECATEGORY']},
'defaultCategory': {'$eq':['$category', 'default']},
}
},
{ '$sort':
{
'sameDir': -1,
'sameCategory': -1,
'defaultCategory': -1,
}
},
{ '$limit': 1 }
])
The $sort will keep true values first, so a directory match will come first, followed by a category match, finally followed by a default category. The $limit:1 will keep the one on top (ie. the best match). Of course, make sure to input your own SOMEDIR and SOMECATEGORY values.
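The priority order can also be checked by hand with a plain-JavaScript mirror of the same idea. This is only a sketch: pickBestMatch and the sample records are hypothetical, and it ranks by tier (dir match, then category match, then 'default') without reproducing the pipeline's tie-breaking within a tier.

```javascript
// Plain-JavaScript mirror of the priority order: a dir match wins, then a
// category match, then the 'default' category. Returns null if nothing fits.
function pickBestMatch(docs, dir, category) {
  function score(doc) {
    if (doc.dir === dir) return 3;
    if (doc.category === category) return 2;
    if (doc.category === "default") return 1;
    return 0;
  }
  var best = null;
  var bestScore = 0;
  docs.forEach(function (doc) {
    var s = score(doc);
    if (s > bestScore) {
      best = doc;
      bestScore = s;
    }
  });
  return best;
}

// Made-up sample records for illustration
var sampleDocs = [
  { dir: "/var/", category: "default", someData: "fallback" },
  { dir: "/tmp/", category: "catname", someData: "by category" },
  { dir: "/home/", category: "other", someData: "by dir" }
];

console.log(pickBestMatch(sampleDocs, "/home/", "catname").someData);
```

Unlike the pipeline, this runs on the client after fetching the candidates, so it is mainly useful for verifying what the aggregation should return on a small data set.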

Use MongoDB aggregation to find set intersection of two sets within the same document

I'm trying to use the Mongo aggregation framework to find where there are records that have different unique sets within the same document. An example will best explain this:
Here is a document that is not my real data, but conceptually the same:
db.house.insert(
{
houseId : 123,
rooms: [{ name : 'bedroom',
owns : [
{name : 'bed'},
{name : 'cabinet'}
]},
{ name : 'kitchen',
owns : [
{name : 'sink'},
{name : 'cabinet'}
]}],
uses : [{name : 'sink'},
{name : 'cabinet'},
{name : 'bed'},
{name : 'sofa'}]
}
)
Notice that there are two hierarchies with similar items. It is also possible to use items that are not owned. I want to find documents like this one: where there is a house that uses something that it doesn't own.
So far I've built up the structure using the aggregate framework like below. This gets me to 2 sets of distinct items. However I haven't been able to find anything that could give me the result of a set intersection. Note that a simple count of set size will not work due to something like this: ['couch', 'cabinet'] compare to ['sofa', 'cabinet'].
{'$unwind':'$uses'}
{'$unwind':'$rooms'}
{'$unwind':'$rooms.owns'}
{'$group' : {_id:'$houseId',
use:{'$addToSet':'$uses.name'},
own:{'$addToSet':'$rooms.owns.name'}}}
produces:
{ _id : 123,
use : ['sink', 'cabinet', 'bed', 'sofa'],
own : ['bed', 'cabinet', 'sink']
}
How do I then find the set intersection of use and own in the next stage of the pipeline?
You were not very far from the full solution with aggregation framework - you needed one more thing before the $group step and that is something that would allow you to see if all the things that are being used match up with something that is owned.
Here is the full pipeline
> db.house.aggregate(
{'$unwind':'$uses'},
{'$unwind':'$rooms'},
{'$unwind':'$rooms.owns'},
{$project: { _id:0,
houseId:1,
uses:"$uses.name",
isOkay:{$cond:[{$eq:["$uses.name","$rooms.owns.name"]}, 1, 0]}
}
},
{$group: { _id:{house:"$houseId",item:"$uses"},
hasWhatHeUses:{$sum:"$isOkay"}
}
},
{$match:{hasWhatHeUses:0}})
and its output on your document
{
"result" : [
{
"_id" : {
"house" : 123,
"item" : "sofa"
},
"hasWhatHeUses" : 0
}
],
"ok" : 1
}
Explanation - once you unwrap both arrays you now want to flag the elements where used item is equal to owned item and give them a non-0 "score". Now when you regroup things back by houseId you can check if any used items didn't get a match. Using 1 and 0 for score allows you to do a sum and now a match for item which has sum 0 means it was used but didn't match anything in "owned". Hope you enjoyed this!
So here is a solution not using the aggregation framework. This uses the $where operator and javascript. This feels much more clunky to me, but it seems to work so I wanted to put it out there if anyone else comes across this question.
db.house.find({'$where':
function() {
var ownSet = {};
var useSet = {};
for (var i=0;i<obj.uses.length;i++){
useSet[obj.uses[i].name] = true;
}
for (var i=0;i<obj.rooms.length;i++){
var room = obj.rooms[i];
for (var j=0;j<room.owns.length;j++){
ownSet[room.owns[j].name] = true;
}
}
for (var prop in ownSet) {
if (ownSet.hasOwnProperty(prop)) {
if (!useSet[prop]){
return true;
}
}
}
for (var prop in useSet) {
if (useSet.hasOwnProperty(prop)) {
if (!ownSet[prop]){
return true;
}
}
}
return false
}
})
For MongoDB 2.6+ Only
As of MongoDB 2.6, there are set operations available in the project pipeline stage. The way to answer this problem with the new operations is:
db.house.aggregate([
{'$unwind':'$uses'},
{'$unwind':'$rooms'},
{'$unwind':'$rooms.owns'},
{'$group' : {_id:'$houseId',
use:{'$addToSet':'$uses.name'},
own:{'$addToSet':'$rooms.owns.name'}}},
{'$project': {int:{$setIntersection:["$use","$own"]}}}
]);
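Since the original goal was to find items that are used but not owned, a set difference may be closer to what is wanted than an intersection. The 2.6 set operators also include $setDifference; the sketch below shows a stage that could replace the final $project, plus a plain-JavaScript equivalent for checking the result by hand. The names usedNotOwned and setDifference (the function) are assumptions.

```javascript
// Stage that could replace the final $project above (MongoDB 2.6+):
// keeps the elements of "$use" that do not appear in "$own".
var usedButNotOwnedStage = {
  $project: { usedNotOwned: { $setDifference: ["$use", "$own"] } }
};

// Plain-JavaScript equivalent of the set difference, ignoring duplicates,
// useful for verifying the expected result on sample data.
function setDifference(a, b) {
  return a.filter(function (item, i) {
    return b.indexOf(item) === -1 && a.indexOf(item) === i;
  });
}

var use = ["sink", "cabinet", "bed", "sofa"];
var own = ["bed", "cabinet", "sink"];
console.log(setDifference(use, own));
```

For the sample house this leaves only 'sofa'. A following $match stage could then keep only the houses where the resulting array is not empty.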

MongoDB: Doing $inc on multiple keys

I need help incrementing value of all keys in participants without having to know name of the keys inside of it.
> db.conversations.findOne()
{
"_id" : ObjectId("4faf74b238ba278704000000"),
"participants" : {
"4f81eab338ba27c011000001" : NumberLong(2),
"4f78497938ba27bf11000002" : NumberLong(2)
}
}
I've tried with something like
$mongodb->conversations->update(array('_id' => new \MongoId($objectId)), array('$inc' => array('participants' => 1)));
to no avail...
You need to redesign your schema. It is never a good idea to have "random key names". Even though MongoDB is schemaless, it still means you need to have defined key names. You should change your schema to:
{
"_id" : ObjectId("4faf74b238ba278704000000"),
"participants" : [
{ _id: "4f81eab338ba27c011000001", count: NumberLong(2) },
{ _id: "4f78497938ba27bf11000002", count: NumberLong(2) }
]
}
Sadly, even with that, you can't update all embedded counts in one command. There is currently an open feature request for that: https://jira.mongodb.org/browse/SERVER-1243
In order to still update everything, you should:
1. query the document
2. update all the counts on the client side
3. store the document again
In order to prevent race conditions with that, have a look at "Compare and Swap" and following paragraphs.
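The steps above, combined with a compare-and-swap style guard, can be sketched roughly as follows. The helper buildIncrementAll is hypothetical, it assumes the redesigned array schema shown earlier, and it uses plain numbers instead of NumberLong for simplicity.

```javascript
// Hypothetical helper: given the document as it was read, returns a filter
// and an update for a compare-and-swap style write. The filter only matches
// if `participants` is still exactly what was read, so a concurrent change
// makes the update a no-op instead of overwriting newer data.
function buildIncrementAll(doc, amount) {
  var updated = doc.participants.map(function (p) {
    return { _id: p._id, count: p.count + amount };
  });
  return {
    filter: { _id: doc._id, participants: doc.participants },
    update: { $set: { participants: updated } }
  };
}

// Made-up sample document using the array schema
var conversation = {
  _id: "c1",
  participants: [
    { _id: "4f81eab338ba27c011000001", count: 2 },
    { _id: "4f78497938ba27bf11000002", count: 2 }
  ]
};

var op = buildIncrementAll(conversation, 1);
console.log(JSON.stringify(op.update));
```

In the shell this would be used as db.conversations.update(op.filter, op.update); if the write reports that nothing was updated, the document changed in the meantime, so it should be read again and the operation retried.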
It is not possible to update all nested elements in one single operation in the current version of MongoDB, so I can only advise using a "foreach {}" loop.
Read the related topic: How to Update Multiple Array Elements in mongodb
I hope this feature will be implemented in next version.

Querying and grouping in mongoDb?

Part 1:
I have (student) collection:
{
sname : "",
studentId: "123"
age: "",
gpa: "",
}
I'm trying to get only two keys from it:
{
sname : "",
studentId: "123"
}
So I need to eliminate age and gpa to have only name and studentId. How could I do that?
Part2:
Then I have 'subject' collection :
{
subjectName : "Math"
studentId : "123"
teacherName: ""
}
I need to match/combine the previous keys (in part1) with the correct studentId so I will end up with something like this :
{
sname : "",
studentId: "123",
subjectName : "Math"
}
How can I do this, and is that the right way to think about getting the result? I tried to read about group and mapReduce, but I didn't find a clear example.
To answer your first question, you can do this:
db.student.find({}, {"sname":1, "studentId":1});
The first {} in that is the limiting query, which in this case includes the entire collection. The second half specifies keys with a 1 or 0 depending on whether or not you want them back. Don't mix include and excludes in a single query though. Except for a couple special cases, mongo won't accept it.
Your second question is more difficult. What you're asking for is a join and mongo doesn't support that. There is no way to connect the two collections on studentId. You'll need to find all the students that you want, then use those studentIds to find all the matching subjects. Then you'll need to merge the two results in your own code. You can do this through whatever driver you're using, or you can do this in javascript in the shell itself, but either way, you'll have to merge them with your own code.
Edit:
Here's an example of how you could do this in the shell with the output going to a collection called "out".
db.student.find({}, {"sname":1, "studentId":1}).forEach(
function (st) {
db.subject.find({"studentId":st.studentId}, {"subjectName":1}).forEach(
function (sub) {
db.out.insert({"sname":st.sname, "studentId":st.studentId, "subjectName":sub.subjectName});
}
);
}
);
If this isn't data that changes all that often, you could just drop the "out" collection and repopulate it periodically with this shell script. Then your code could query directly from "out". If the data does change frequently, you'll want to do this merging in your code on the fly.
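For the on-the-fly case, the merge itself is straightforward once both result sets are in memory. The following is a minimal sketch in plain JavaScript; joinStudentsWithSubjects and the sample data are hypothetical, and the two arrays stand in for the results of the student and subject queries.

```javascript
// Merge students with their subjects by studentId, entirely in memory.
// `students` and `subjects` stand in for the results of the two queries.
function joinStudentsWithSubjects(students, subjects) {
  // Index students by studentId for constant-time lookups
  var byId = {};
  students.forEach(function (st) {
    byId[st.studentId] = st;
  });
  return subjects
    .filter(function (sub) {
      return Object.prototype.hasOwnProperty.call(byId, sub.studentId);
    })
    .map(function (sub) {
      var st = byId[sub.studentId];
      return { sname: st.sname, studentId: st.studentId, subjectName: sub.subjectName };
    });
}

// Made-up sample data for illustration
var sampleStudents = [{ sname: "A", studentId: "123" }, { sname: "B", studentId: "456" }];
var sampleSubjects = [
  { subjectName: "Math", studentId: "123" },
  { subjectName: "Art", studentId: "999" }
];

console.log(JSON.stringify(joinStudentsWithSubjects(sampleStudents, sampleSubjects)));
```

Subjects without a matching student are dropped here, which is one possible choice; students without subjects are also omitted, so this behaves like an inner join.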
Another, and possibly better, option is to include the "subject" data in the "student" collection or vice versa. This will result in a more mongodb friendly structure. If you run into this joining problem frequently, mongo may not be the way to go and a relational database may be better suited to your needs.
Mongo's find() method lets you include or exclude certain fields from the results.
Check out Field Selection in the docs for more info. You could do either:
db.student.find({}, { 'sname': 1, 'studentId': 1 });
db.student.find({}, { 'age': 0, 'gpa': 0 });
For relating your student and subject together, you could either look up which subjects a student has separately, like this:
db.subject.find({ studentId: "123" });
Or embed subject data with each student, and retrieve it together with the student document:
{
sname : "Roland Browning",
studentId: "123",
age: 14,
gpa: "B",
subjects: [ { name : "French", teacher: "Mr Bronson" }, ... ]
}