MongoDB Update Deep Array - mongodb

I have the following document in my MongoDB collection named music.
I want to update the document where:
the genre is Grunge,
the band name is Nirvana,
the album name is Nevermind,
and the track order is 1,
and change that track's name to "Smells Like Teen Spirit!".
I've tried playing with the positional operator, but can't quite figure this out.
{
genre : "Grunge",
bands : [ {
name : "Nirvana",
albums : [ {
name : "Nevermind",
tracks : [ {
name : "Smell Like Teen Spirit",
order : 1,
duration : 301
},
{
name : "In Bloom",
order : 2,
duration : 254
} ]
},
{
name : "In Utero",
tracks : [ {
name : "Server the Servants",
order : 1,
duration : 216
},
{
name : "Scentless Apprentice",
order : 2,
duration : 254
} ]
} ]
},
{
name : "Karma++ : A Nirvina Tribute Band",
albums : [ {
name : "Nevermind",
tracks : [ {
name : "Smell Like Teen Spirit",
order : 1,
duration : 301
},
{
name : "In Bloom",
order : 2,
duration : 254
} ]
},
{
name : "In Utero",
tracks : [ {
name : "Server the Servants",
order : 1,
duration : 216
},
{
name : "Scentless Apprentice",
order : 2,
duration : 254
} ]
} ]
} ]
}

Unfortunately, at present it is only possible to use a single "$" positional operator per update. This limits the update to a single embedded array, similar to the example in the documentation: http://www.mongodb.org/display/DOCS/Updating#Updating-The%24positionaloperator
(From your post, it looks like you have already found this, but I have included the link for the benefit of any other users reading this post.)
There is a feature request for matching nested arrays with the positional operator, and it is slated for version 2.3.0 (although this is subject to change):
https://jira.mongodb.org/browse/SERVER-831 "Positional Operator Matching Nested Arrays"
For the time being, to make the update you will have to know the position of the subdocument in two out of the following three arrays: the position of the band in the "bands" array, the position of the album in the "albums" array, or the position of the track in the "tracks" array. The "$" operator can then stand in for the remaining one:
db.music.update({genre : "Grunge", "bands.name" : "Nirvana"}, {$set:{"bands.$.albums.0.tracks.0.name":"Smells Like Teen Spirit!"}})
db.music.update({genre : "Grunge", "bands.0.albums.name" : "Nevermind"}, {$set:{"bands.0.albums.$.tracks.0.name":"Smells Like Teen Spirit!"}})
or
db.music.update({genre : "Grunge", "bands.0.albums.0.tracks.order" : 1}, {$set:{"bands.0.albums.0.tracks.$.name":"Smells Like Teen Spirit!"}})
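If none of the positions are known up front, one workaround is to read the document first, look the positions up on the client, and build the dotted path from them. The following is only a minimal sketch of that idea in plain JavaScript: the helper name buildTrackNamePath is hypothetical, the sample document is a trimmed copy of the one in the question, and the positions could change between the read and the update if other writers modify the arrays.

```javascript
// Hypothetical helper: find the array positions of the band, album and track
// in a document shaped like the one in the question, and build the dotted
// path for a $set update. Returns null if any level is not found.
function buildTrackNamePath(doc, bandName, albumName, trackOrder) {
  const b = doc.bands.findIndex((band) => band.name === bandName);
  if (b === -1) return null;
  const albums = doc.bands[b].albums;
  const a = albums.findIndex((album) => album.name === albumName);
  if (a === -1) return null;
  const t = albums[a].tracks.findIndex((track) => track.order === trackOrder);
  if (t === -1) return null;
  return `bands.${b}.albums.${a}.tracks.${t}.name`;
}

// Trimmed copy of the sample document (in the shell this would come from
// something like db.music.findOne({ genre: "Grunge" })).
const doc = {
  genre: "Grunge",
  bands: [
    {
      name: "Nirvana",
      albums: [
        {
          name: "Nevermind",
          tracks: [
            { name: "Smell Like Teen Spirit", order: 1 },
            { name: "In Bloom", order: 2 },
          ],
        },
      ],
    },
  ],
};

const path = buildTrackNamePath(doc, "Nirvana", "Nevermind", 1);
const update = { $set: { [path]: "Smells Like Teen Spirit!" } };

// In the shell, the update document would then be passed along as:
// db.music.update({ genre: "Grunge" }, update)
console.log(JSON.stringify(update));
```

The update itself then contains no "$" at all, so the one-positional-operator limit does not apply to it.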

Related

How to find and return the page that a result is on in MongoDB aggregation pipeline?

I have a MongoDB aggregation pipeline that performs the following steps:
1. Sorts a list of user stats objects by timestamp
2. Groups the results by user ID
3. Sorts by a specified stat's name
4. Pages the results via skip and limit stages
In plain English, this pipeline returns a page from a list of user stats sorted by a specified stat. Each user can have multiple stats objects, so I group to return only the most recent stats object for each user.
In the mongo shell, this looks like:
db.getCollection("stats").aggregate(
[
{ "$sort" : { "Timestamp" : -1.0 } },
{
"$group" : {
"_id" : "$UserId",
"UserId" : { "$last" : "$UserId" },
"StatsOverall" : { "$last" : "$StatsOverall" },
"Timestamp" : { "$last" : "$Timestamp" }
}
},
{ "$sort" : { "StatsOverall.Rank" : -1.0 } },
{ "$skip" : specifiedPageNumber },
{ "$limit" : specifiedNumResultsPerPage }
]
);
This works fine.
I now want to modify this query so that I can search for a user by name and get back the entire page that user is on. (This is for a leaderboard.) So, if the user is on page 5 of the leaderboard, I want to return the entirety of page 5.
However, I'm having trouble seeing a solution that doesn't require me to either load all of the users into memory and page them there (awful idea), or go back and forth to the database iterating through pages (almost as awful).
Is there some way I can modify my aggregation pipeline to do all this at the database level?
EDIT: As requested, added some sample data and the expected result.
Sample data looks something like this. I've omitted some fields that aren't relevant. The initial data is a collection of users' stats, where each user can have more than one object. My existing pipeline returns the single most recent stats object for each user, sorted by a specified stat name.
{
"_id" : "5c611e71ab0ffc430410e0ba",
"UserId" : "5c611e71ab0ffc430410e0ba",
"StatsOverall" : {
"Rank" : NumberInt(1000),
"GamesLost" : NumberInt(30),
"GamesWon" : NumberInt(50)
},
"Timestamp" : "2019-02-10T21:35:06.599Z"
}
// ----------------------------------------------
{
"_id" : "5c6238658966ae5860795879",
"UserId" : "5c6238658966ae5860795879",
"StatsOverall" : {
"Rank" : NumberInt(413),
"GamesLost" : NumberInt(2),
"GamesWon" : NumberInt(141)
},
"Timestamp" : "2019-02-10T21:35:06.599Z"
}
// many objects like this
The expected result looks like this:
{
"_id" : "5c611e71ab0ffc430410e0ba",
"UserId" : "5c611e71ab0ffc430410e0ba",
"StatsOverall" : {
"Rank" : NumberInt(1000),
"GamesLost" : NumberInt(30),
"GamesWon" : NumberInt(50)
},
"Timestamp" : "2019-02-10T21:35:06.599Z"
}
It returns the exact same type of object, sorted the same way as the existing pipeline; however, I want to return only the page that the user is on. In the example result, assume the page size is just 1 result per page. So, the result would contain the one page that the user with the given UserId is on. In my sample result, that ID would be 5c611e71ab0ffc430410e0ba.
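To pin down the expected behaviour, here is a small in-memory sketch in plain JavaScript. It is not the database-level solution being asked for; it only illustrates which rows should come back. The function name pageContainingUser is hypothetical, the rows are assumed to be already grouped to one stats object per user, and page numbers are assumed to be zero-based.

```javascript
// Illustration only: given one (already grouped) stats row per user, return
// the zero-based page number and the rows of the page containing userId.
function pageContainingUser(rows, userId, pageSize) {
  // Same ordering as the pipeline's second $sort: Rank descending.
  const sorted = [...rows].sort(
    (x, y) => y.StatsOverall.Rank - x.StatsOverall.Rank
  );
  const position = sorted.findIndex((row) => row.UserId === userId);
  if (position === -1) return null; // user not on the leaderboard
  const pageNumber = Math.floor(position / pageSize);
  const start = pageNumber * pageSize;
  return { pageNumber, results: sorted.slice(start, start + pageSize) };
}

// Trimmed copies of the two sample rows above.
const rows = [
  { UserId: "5c6238658966ae5860795879", StatsOverall: { Rank: 413 } },
  { UserId: "5c611e71ab0ffc430410e0ba", StatsOverall: { Rank: 1000 } },
];

const page = pageContainingUser(rows, "5c611e71ab0ffc430410e0ba", 1);
console.log(JSON.stringify(page));
```

The key quantity is the user's position in the sorted list; once that is known, the page follows from integer division by the page size.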

Error in mongodb query to get movie based on id

> db.movmodels.findOne()
{
"_id" : ObjectId("55320b0e0e9e0d9d0540593c"),
"username" : "punk",
"favMovies" : [
{
"alternate_ids" : {
"imdb" : "0137523"
},
"abridged_cast" : [
{
"characters" : [
"Tyler"
],
"id" : "162652627",
"name" : "Brad Pitt"
},
{
"characters" : [
"Narrator"
],
"id" : "162660884",
"name" : "Edward Norton"
},
{
"characters" : [
"Robert"
],
"id" : "162676383",
"name" : "Meat Loaf"
},
{
"characters" : [
"Angel Face"
],
"id" : "162653925",
"name" : "Jared Leto"
},
{
"characters" : [
"Boss"
],
"id" : "770706064",
"name" : "Zach Grenier"
}
],
"synopsis" : "",
"ratings" : {
"audience_score" : 96,
"audience_rating" : "Upright",
"critics_score" : 80,
"critics_rating" : "Certified Fresh"
},
"release_dates" : {
"dvd" : "2000-06-06",
"theater" : "1999-10-15"
},
"critics_consensus" : "",
"runtime" : 139,
"mpaa_rating" : "R",
"year" : 1999,
"title" : "Fight Club",
"id" : "13153"
}
],
"__v" : 0
}
This is my data in MongoDB.
As I am new to MongoDB, I wanted to know the query to get a movie with a particular id.
I need to get the movie based on its id so that I can remove it from my database. The query that I tried is:
db.movmodels.findOne({username:"punk"},{favMovies:{id:13153}})
but this gives me an error:
2015-04-18T05:41:26.221-0400 E QUERY Error: error: {
"$err" : "Can't canonicalize query: BadValue Unsupported projection option: favMovies: { id: 13153.0 }",
"code" : 17287
}
at Error (<anonymous>)
at DBQuery.next (src/mongo/shell/query.js:259:15)
at DBCollection.findOne (src/mongo/shell/collection.js:188:22)
at (shell):1:14 at src/mongo/shell/query.js:259
There are several problems with your query:
The second parameter to find() is a projection, not part of the query. What you want is to supply a single query document that has two properties: {"username" : "punk", favMovies : { ... } }
However, you also don't want to compare against the entire favMovies sub-document; you only want to match on one of its properties, the id. That requires 'reaching into the object' using dot notation: {username:"punk", "favMovies.id" : 13153}.
That will probably still not work, because 13153 is not the same as "13153": the former is a number, while the latter is a string, and your id is stored as a string.
db.movmodels.findOne({username:"punk", "favMovies.id" : "13153"})
Keep in mind, however, that this will find the entire document for the user named "punk". I'm not sure what exactly your data structure should look like, but it appears you'll have to $pull the movie from the user. In general, I'd say you're embedding too much data into the user, but that's hard to tell without knowing the exact use case.
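The type mismatch and the $pull mentioned above can be sketched as follows. This is plain JavaScript that only emulates the equality match in memory and builds the update document; the helper name matchesFavId is hypothetical, and the actual shell call is shown as a comment because it needs a running server.

```javascript
// Trimmed copy of the user document from the question.
const user = {
  username: "punk",
  favMovies: [{ title: "Fight Club", id: "13153" }],
};

// Emulates an equality match on "favMovies.id": the value's type must agree
// with the stored type, so a number never matches a string.
const matchesFavId = (doc, id) => doc.favMovies.some((m) => m.id === id);

console.log(matchesFavId(user, 13153)); // false (number vs. stored string)
console.log(matchesFavId(user, "13153")); // true

// Update document that removes the matching movie from the favMovies array.
const filter = { username: "punk" };
const update = { $pull: { favMovies: { id: "13153" } } };

// In the shell this would be passed along as:
// db.movmodels.update(filter, update)
console.log(JSON.stringify(update));
```

Note that $pull takes a condition on the array elements themselves, so the field is written as id rather than as the dotted path favMovies.id.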
Here you go:
If you just want to get the first user who has this favorite movie (note that the id is stored as a string):
db.movmodels.findOne({"favMovies.id": "13153"});
And if you want to know whether a particular user has that movie as a favorite:
db.movmodels.findOne({"favMovies.id": "13153", username:"punk"});
The second argument to findOne is a projection; it is only used to select which fields are returned.
You can also use the $elemMatch projection operator (not to be confused with the $elemMatch query operator):
db.movmodels.find({username:"punk"},{favMovies:{$elemMatch:{id:"13153"}}});
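What this projection returns can be emulated in memory. The sketch below is plain JavaScript, not the server's implementation; the helper name projectFirstMatch is hypothetical. It mirrors the documented behaviour that only the first matching array element is returned, and that the array field is left out when nothing matches.

```javascript
// Emulates an $elemMatch projection on one array field: keep _id, and keep
// only the first array element that satisfies the predicate.
function projectFirstMatch(doc, field, predicate) {
  const out = { _id: doc._id };
  const first = doc[field].find(predicate);
  if (first !== undefined) out[field] = [first];
  return out;
}

// Trimmed user document with a second, made-up favorite for illustration.
const punk = {
  _id: "55320b0e0e9e0d9d0540593c",
  username: "punk",
  favMovies: [
    { title: "Fight Club", id: "13153" },
    { title: "Some Other Movie", id: "99999" }, // hypothetical entry
  ],
};

const projected = projectFirstMatch(punk, "favMovies", (m) => m.id === "13153");
console.log(JSON.stringify(projected));
```

The rest of the user document (username and so on) is not part of the result, because an inclusion-style projection only returns _id and the projected field.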
If you want to find the user document that has a movie with id "13153" in its 'favMovies' array, write the query as below:
db.movmodels.findOne({username:"punk",'favMovies.id':"13153"})
And if you want to find the document with _id 55320b0e0e9e0d9d0540593c, write the following query:
db.movmodels.findOne({username:"punk",'_id':ObjectId("55320b0e0e9e0d9d0540593c")})

Resolving MongoDB DBRef array using Mongo Native Query and working on the resolved documents

My MongoDB database is made up of 2 main collections:
1) Maps
{
"_id" : ObjectId("542489232436657966204394"),
"fileName" : "importFile1.json",
"territories" : [
{
"$ref" : "territories",
"$id" : ObjectId("5424892224366579662042e9")
},
{
"$ref" : "territories",
"$id" : ObjectId("5424892224366579662042ea")
}
]
},
{
"_id" : ObjectId("542489262436657966204398"),
"fileName" : "importFile2.json",
"territories" : [
{
"$ref" : "territories",
"$id" : ObjectId("542489232436657966204395")
}
],
"uploadDate" : ISODate("2012-08-22T09:06:40.000Z")
}
2) Territories, which are referenced in "Map" objects :
{
"_id" : ObjectId("5424892224366579662042e9"),
"name" : "Afghanistan",
"area" : 653958
},
{
"_id" : ObjectId("5424892224366579662042ea"),
"name" : "Angola",
"area" : 1252651
},
{
"_id" : ObjectId("542489232436657966204395"),
"name" : "Unknown",
"area" : 0
}
My objective is to list every map with its cumulative area and number of territories. I am trying the following query:
db.maps.aggregate(
{'$unwind':'$territories'},
{'$group':{
'_id':'$fileName',
'numberOf': {'$sum': '$territories.name'},
'locatedArea':{'$sum':'$territories.area'}
}
})
However the results show 0 for each of these values :
{
"result" : [
{
"_id" : "importFile2.json",
"numberOf" : 0,
"locatedArea" : 0
},
{
"_id" : "importFile1.json",
"numberOf" : 0,
"locatedArea" : 0
}
],
"ok" : 1
}
I probably did something wrong when trying to access to the member variables of Territory (name and area), but I couldn't find an example of such a case in the Mongo doc. area is stored as an integer, and name as a string.
Yes indeed, the field "territories" holds an array of database references and not the actual documents. DBRefs are objects that contain the information needed to locate the actual documents.
You can see this clearly in the above example by running the following query in the mongo shell:
db.maps.find({"_id":ObjectId("542489232436657966204394")}).forEach(function(doc){print(doc.territories[0]);})
It will print the DBRef object rather than the document itself:
Output: DBRef("territories", ObjectId("5424892224366579662042e9"))
So '$sum': '$territories.name' and '$sum': '$territories.area' both give you 0, since the DBRef has no fields named name or area.
You therefore need to resolve each reference to a document before using something like $territories.name.
To achieve what you want, you can make use of the cursor's map() function, since neither aggregation nor map-reduce supports subqueries, and you already have a self-contained map document with references to its territories.
Steps to achieve:
a) get each map
b) resolve the `DBRef`.
c) calculate the total area, and the number of territories.
d) make and return the desired structure.
Mongo shell script:
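db.maps.find().map(function(doc) {
    // Collect the referenced ids. All DBRefs in this array are assumed to
    // point to the same collection, whose name is kept in refName.
    var refName = null;
    var territory_refs = doc.territories.map(function(terr_ref) {
        refName = terr_ref.$ref;
        return terr_ref.$id;
    });
    var areaSum = 0;
    // Resolve the references with a single query against that collection.
    // (db.refName would look for a collection literally named "refName".)
    if (refName !== null) {
        db.getCollection(refName).find({
            "_id" : {
                $in : territory_refs
            }
        }).forEach(function(i) {
            areaSum += i.area;
        });
    }
    return {
        "id" : doc.fileName,
        "noOfTerritories" : territory_refs.length,
        "areaSum" : areaSum
    };
})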
Output:
[
{
"id" : "importFile1.json",
"noOfTerritories" : 2,
"areaSum" : 1906609
},
{
"id" : "importFile2.json",
"noOfTerritories" : 1,
"areaSum" : 0
}
]
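The same resolve-then-sum logic can be tried without a database. The following plain JavaScript sketch uses the sample data from the question, with the DBRefs modelled as plain objects and the ObjectIds as strings; both are simplifications for illustration, and the function name summarize is hypothetical.

```javascript
// In-memory stand-in for the "territories" collection, keyed by _id.
const territories = new Map([
  ["5424892224366579662042e9", { name: "Afghanistan", area: 653958 }],
  ["5424892224366579662042ea", { name: "Angola", area: 1252651 }],
  ["542489232436657966204395", { name: "Unknown", area: 0 }],
]);

// In-memory stand-in for the "maps" collection; each reference is a plain
// object with the same $ref and $id fields a DBRef carries.
const maps = [
  {
    fileName: "importFile1.json",
    territories: [
      { $ref: "territories", $id: "5424892224366579662042e9" },
      { $ref: "territories", $id: "5424892224366579662042ea" },
    ],
  },
  {
    fileName: "importFile2.json",
    territories: [{ $ref: "territories", $id: "542489232436657966204395" }],
  },
];

// Resolve each reference, then count the references and sum the areas.
function summarize(map) {
  const resolved = map.territories
    .map((ref) => territories.get(ref.$id))
    .filter((t) => t !== undefined); // skip dangling references
  return {
    id: map.fileName,
    noOfTerritories: map.territories.length,
    areaSum: resolved.reduce((sum, t) => sum + t.area, 0),
  };
}

const summary = maps.map(summarize);
console.log(JSON.stringify(summary, null, 2));
```

As in the shell script, the count is taken from the references while the area is taken from the resolved documents, so a dangling reference is counted but contributes no area.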
Map-reduce functions should not, and cannot, be used to resolve DBRefs on the server side.
See what the documentation has to say:
The map function should not access the database for any reason.
The map function should be pure, or have no impact outside of the
function (i.e. side effects.)
The reduce function should not access the database, even to perform
read operations. The reduce function should not affect the outside
system.
Moreover, even if a reduce function were used (which could not work anyway), it would never be called for your problem, since grouping by "fileName" or "ObjectId" always yields exactly one document per key in your dataset:
MongoDB will not call the reduce function for a key that has only a
single value

Queries on arrays with timestamps

I have documents that look like this:
{
"_id" : ObjectId( "5191651568f1f6000282b81f" ),
"updated_at" : "2013-05-16T09:46:16.199660",
"activities" : [
{
"worker_name" : "image",
"completed_at" : "2013-05-13T21:34:59.293711"
},
{
"worker_name" : "image",
"completed_at" : "2013-05-16T07:33:22.550405"
},
{
"worker_name" : "image",
"completed_at" : "2013-05-16T07:41:47.845966"
}
]
}
and I would like to find only those documents where the updated_at time is greater than the last activities.completed_at time (the array is in time order).
I currently have this, but it matches any activities[].completed_at:
{
"activities.completed_at" : {"$gte" : "updated_at"}
}
Thanks!
Update:
Well, I have different workers, and each has its own "completed_at".
I'll have to invert activities as follows:
activities: {
image: {
last: {
completed_at: t3
},
items: [
{completed_at: t0},
{completed_at: t1},
{completed_at: t2},
{completed_at: t3}
]
}
}
and use this query:
{
"activities.image.last.completed_at" : {"$gte" : "updated_at"}
}
Assuming that you don't know how many activities you have (it would be easy if you always had exactly 3 activities, for example, since you could address the last one with dot notation as activities.2.completed_at), and since there's no $last positional operator, the short answer is that you cannot do this efficiently.
Instead, when the activities are inserted, I would update the record's updated_at value (or another field) in the same operation. Then it becomes a trivial problem.
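A minimal sketch of that suggestion, in plain JavaScript: every time an activity is appended, the same update also records its completion time in a top-level field. The field name last_completed_at and both function names are hypothetical, and the comparison relies on the assumption that all timestamps are ISO-8601 strings in the same format and time zone, so that they sort lexicographically.

```javascript
// Builds an update that appends an activity and records its completion time
// in a top-level field ("last_completed_at" is a hypothetical field name).
function buildActivityUpdate(workerName, completedAt) {
  return {
    $push: {
      activities: { worker_name: workerName, completed_at: completedAt },
    },
    $set: { last_completed_at: completedAt },
  };
}

const update = buildActivityUpdate("image", "2013-05-16T07:41:47.845966");
// In the shell this would be applied with something like:
// db.collection.update({ _id: someId }, update)
console.log(JSON.stringify(update));

// With the extra field in place, the check no longer has to look into the
// array at all. Same-format ISO-8601 strings compare correctly as strings.
function updatedAfterLastActivity(doc) {
  return doc.updated_at > doc.last_completed_at;
}

const sample = {
  updated_at: "2013-05-16T09:46:16.199660",
  last_completed_at: "2013-05-16T07:41:47.845966",
};
console.log(updatedAfterLastActivity(sample));
```

Because $push and $set are applied in one update, the denormalized field cannot drift away from the array as long as every writer goes through the same code path.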

Map reduce in mongodb

I have mongo documents in this format.
{"_id" : 1,"Summary" : {...},"Examples" : [{"_id" : 353,"CategoryId" : 4},{"_id" : 239,"CategoryId" : 28}, ... ]}
{"_id" : 2,"Summary" : {...},"Examples" : [{"_id" : 312,"CategoryId" : 2},{"_id" : 121,"CategoryId" : 12}, ... ]}
How can I map/reduce them to get a hash like:
{ [ result[categoryId] : count_of_examples , .....] }
I.e. count of examples of each category.
I have 30 categories in total, all specified in the Categories collection.
If you can use 2.1 (the dev version of the upcoming 2.2 release), then you can use the Aggregation Framework, and it would look something like this:
db.collection.aggregate( [
{$project:{"CatId":"$Examples.CategoryId","_id":0}},
{$unwind:"$CatId"},
{$group:{_id:"$CatId","num":{$sum:1} } },
{$project:{CategoryId:"$_id",NumberOfExamples:"$num",_id:0 }}
] );
The first step projects the subfield of Examples (CategoryId) into a top-level field of the document (not necessary, but it helps with readability). Then we unwind the array, which creates a separate document for each value of CatId. Next we do a "group by" and count them (I assume each instance of CategoryId is one example, right?). Last, we use projection again to relabel the fields and make the result look like this:
"result" : [
{
"CategoryId" : 12,
"NumberOfExamples" : 1
},
{
"CategoryId" : 2,
"NumberOfExamples" : 1
},
{
"CategoryId" : 28,
"NumberOfExamples" : 1
},
{
"CategoryId" : 4,
"NumberOfExamples" : 1
}
],
"ok" : 1
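For readers who want to check the counting logic without a server, the unwind and group steps can be emulated in plain JavaScript. This is only an in-memory sketch using the four example entries visible in the question (the elided "..." entries are left out), not a replacement for the pipeline.

```javascript
// The two sample documents, restricted to the entries shown in the question.
const docs = [
  { _id: 1, Examples: [{ _id: 353, CategoryId: 4 }, { _id: 239, CategoryId: 28 }] },
  { _id: 2, Examples: [{ _id: 312, CategoryId: 2 }, { _id: 121, CategoryId: 12 }] },
];

// "Unwind" every Examples array and count occurrences per CategoryId,
// which is what $unwind followed by $group with {$sum: 1} does.
const counts = new Map();
for (const doc of docs) {
  for (const example of doc.Examples) {
    counts.set(example.CategoryId, (counts.get(example.CategoryId) || 0) + 1);
  }
}

// Relabel the fields the same way the final $project stage does.
const result = [...counts].map(([CategoryId, NumberOfExamples]) => ({
  CategoryId,
  NumberOfExamples,
}));

console.log(JSON.stringify(result));
```

Each category appears once in this sample, so every count is 1, which matches the pipeline output shown above (the order of the groups may differ).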