MongoDB database migration with embedded query - mongodb

Currently in my database I have messages objects set up as the following.
{
"name" : "System",
"message" : "Sean Callahan has entered the room.",
"time" : 1406479167270,
"type" : "system_message",
"room" : "helloroom",
"_id" : "4yeHzhHAQmGJNtHww"
}
I want to migrate my data so that every message has a roomId that points to the appropriate room. Currently this is done with the room attribute, and I now see the fault in my ways for various reasons.
My room objects are setup something like this.
{
"_id": xxxxxxxxx,
"room_name": "testingroom"
}
So I was hoping there was a way to run a one-liner that would add the correct roomId to every existing message, based on the room attribute that is currently set.
I was thinking something along the lines of:
db.messages.update({}, {$set: {roomId: db.rooms.findOne({room_name: room})._id}})
As of now, I am getting "room is not defined", which makes perfect sense. But I can't seem to get it right, and this may just not be possible in a one-line query.

As you discovered, this isn't possible in a one-line query since you need to join data from two collections.
Here's an example of how to add the missing field in the mongo shell:
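db.messages.find(
  { roomId: { $exists: false } }
).forEach(function(message) {
  // Look up the room whose name matches this message's room attribute
  var room = db.rooms.findOne({ room_name: message.room });
  // findOne returns null when no room matches, so guard before reading _id
  if (room && room._id) {
    db.messages.update(
      { _id: message._id },
      { $set: { roomId: room._id } }
    )
  }
})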
You could tidy this up with some error checking, and for updates on a large collection consider using the Bulk Update API (only available in MongoDB 2.6+).
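For a large collection, one way to cut the number of round trips is to read the rooms once into a lookup table and build all the updates in a single pass. The following is only a sketch in plain JavaScript over in-memory arrays: buildRoomIdUpdates is a hypothetical helper, and the updateOne operation shape is an assumption that follows what db.collection.bulkWrite() accepts in MongoDB 3.2+, not the 2.6 Bulk API mentioned above.

```javascript
// Hypothetical helper: turn messages without a roomId into update operations.
// `messages` and `rooms` are plain arrays, e.g. the result of find().toArray().
function buildRoomIdUpdates(messages, rooms) {
  // Index rooms by name once instead of calling findOne() per message.
  var idByName = Object.create(null);
  rooms.forEach(function (room) {
    idByName[room.room_name] = room._id;
  });

  var ops = [];
  messages.forEach(function (message) {
    // Skip messages that are already migrated or whose room is unknown.
    if (message.roomId !== undefined) return;
    if (!(message.room in idByName)) return;
    ops.push({
      updateOne: {
        filter: { _id: message._id },
        update: { $set: { roomId: idByName[message.room] } }
      }
    });
  });
  return ops;
}
```

Assuming both collections fit in memory, the result could be passed to db.messages.bulkWrite(...) in a recent shell. Messages whose room name has no matching room are left untouched so they can be inspected separately.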

Related

Query an array of embedded documents in mongodb

I'm having a little trouble writing a query that needs to compare a given value against a certain field in all embedded documents within an array. I will give an example to make the issue less abstract.
Let's say I want to use MongoDB to store the last queries that users on my network have entered into different online search engines. An entry in the collection would have a structure like this :
{
'_id' : 'zinfandel',
'last_search' : [
{
'engine' : 'google.com',
'query' : 'why is the sky blue'
},
{
'engine' : 'bing.com',
'query' : 'what is love'
},
{ 'engine' : 'yahoo.com',
'query' : 'how to tie a tie'
}
]
}
Now let's say user username enters a new query into a certain engine. The code that stores this query in the DB needs to find out whether there already exists an entry for the engine that the user used. If yes, this entry is to be updated with the new query. If not, a new entry should be created. My idea is to do a $push only if there is no entry for the given engine and do a $set otherwise. For this purpose, I tried to write my push like this :
db.mycollection.update(
{ '_id' : username , search.$.engine : { '$ne' : engine } },
{ '$push' : { 'search.$.engine' : engine, 'search.$.query' : query } }
)
However, this pushes a new embedded document even if there already was an entry for the given engine. The problem seems to be that the $ne operator doesn't work with arrays like I expect it to work. What I need is a way to make sure that not a single embedded document in the array has an "engine" entry that matches the specified engine.
Does anyone have an idea how to do that? Please tell me if I need to further clarify the question ...
You can push the item into the array with the following command:
db.mycollection.update({
_id: "zinfandel",
"last_search.engine": {
$nin: ["notwellknownengine.com"]
}
}, {
$push: {
"last_search": {
"engine" : "notwellknownengine.com",
"query" : "stackoveflow.com"
}
}
});
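The command above covers only the "push if absent" half of the question. A possible way to cover the "set if present" half as well is to pair it with a second update that uses the positional $ operator. The sketch below only builds the two filter/update pairs; buildSearchUpserts is a hypothetical helper name, and the field names follow the last_search document shown in the question.

```javascript
// Hypothetical helper: build the two updates for "set if present, push if absent".
function buildSearchUpserts(userId, engine, query) {
  return {
    // Matches only when an entry for `engine` exists; `$` targets that entry.
    setExisting: {
      filter: { _id: userId, "last_search.engine": engine },
      update: { $set: { "last_search.$.query": query } }
    },
    // Matches only when no entry for `engine` exists, so the push cannot duplicate.
    pushMissing: {
      filter: { _id: userId, "last_search.engine": { $ne: engine } },
      update: { $push: { last_search: { engine: engine, query: query } } }
    }
  };
}
```

In the shell, each pair would be passed to db.mycollection.update(filter, update), running setExisting first and pushMissing second. If the entry already exists, only the first update matches; if it does not, only the second one does.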

MongoDB: Doing $inc on multiple keys

I need help incrementing value of all keys in participants without having to know name of the keys inside of it.
> db.conversations.findOne()
{
"_id" : ObjectId("4faf74b238ba278704000000"),
"participants" : {
"4f81eab338ba27c011000001" : NumberLong(2),
"4f78497938ba27bf11000002" : NumberLong(2)
}
}
I've tried with something like
$mongodb->conversations->update(array('_id' => new \MongoId($objectId)), array('$inc' => array('participants' => 1)));
to no avail...
You need to redesign your schema. It is never a good idea to have "random key names". Even though MongoDB is schemaless, it still means you need to have defined key names. You should change your schema to:
{
"_id" : ObjectId("4faf74b238ba278704000000"),
"participants" : [
{ _id: "4f81eab338ba27c011000001", count: NumberLong(2) },
{ _id: "4f78497938ba27bf11000002", count: NumberLong(2) }
]
}
Sadly, even with that, you can't update all embedded counts in one command. There is currently an open feature request for that: https://jira.mongodb.org/browse/SERVER-1243
In order to still update everything, you should:
query the document
update all the counts on the client side
store the document again
In order to prevent race conditions with that, have a look at "Compare and Swap" and following paragraphs.
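The three steps above, combined with a compare-and-swap style filter, might look roughly like the sketch below. It assumes the redesigned schema (participants as an array of { _id, count }) and plain numeric counts; NumberLong values would need explicit conversion first. buildIncrementAll is a hypothetical helper, not a driver function.

```javascript
// Hypothetical helper: given a conversation document that was just read,
// build an update that increments every count on the client side.
function buildIncrementAll(doc, amount) {
  var updated = doc.participants.map(function (p) {
    return { _id: p._id, count: p.count + amount };
  });
  return {
    // The filter repeats the participants we read; if another writer
    // changed them in the meantime, nothing matches and we must retry.
    filter: { _id: doc._id, participants: doc.participants },
    update: { $set: { participants: updated } }
  };
}
```

The filter and update would then be handed to the collection's update call. If that call reports no modified document, the usual reaction is to read the document again and repeat.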
It is not possible to update all nested elements in a single operation in the current version of MongoDB, so I would advise using a forEach loop.
Read the related topic: How to Update Multiple Array Elements in mongodb
I hope this feature will be implemented in the next version.

Add new field to all documents in a nested array

I have a database of person documents. Each has a field named photos, which is an array of photo documents. I would like to add a new 'reviewed' flag to each of the photo documents and initialize it to false.
This is the query I am trying to use:
db.person.update({ "_id" : { $exists : true } }, {$set : {photos.reviewed : false} }, false, true)
However I get the following error:
SyntaxError: missing : after property id (shell):1
Is this possible, and if so, what am I doing wrong in my update?
Here is a full example of the 'person' document:
{
"_class" : "com.foo.Person",
"_id" : "2894",
"name" : "Pixel Spacebag",
"photos" : [
{
"_id" : null,
"thumbUrl" : "http://site.com/a_s.jpg",
"fullUrl" : "http://site.com/a.jpg"
},
{
"_id" : null,
"thumbUrl" : "http://site.com/b_s.jpg",
"fullUrl" : "http://site.com/b.jpg"
}]
}
Bonus karma for anyone who can tell me a cleaner way to update "all documents" without using the query { "_id" : { $exists : true } }
For those who are still looking for the answer: this is possible as of MongoDB 3.6 with the all positional operator $[] (see the docs):
db.getCollection('person').update(
{},
{ $set: { "photos.$[].reviewed" : false } },
{ multi: true}
)
Is this possible, and if so, what am I doing wrong in my update?
No. In general MongoDB is only good at doing updates on top-level objects.
The exception here is the $ positional operator. From the docs: Use this to find an array member and then manipulate it.
However, in your case you want to modify all members in an array. So that is not what you need.
Bonus karma for anyone who can tell me a cleaner way to update "all documents"
Try db.coll.update(query, update, false, true), this will issue a "multi" update. That last true is what makes it a multi.
Is this possible,
You have two options here:
Write a for loop to perform the update. It will basically be a nested for loop, one to loop through the data, the other to loop through the sub-array. If you have a lot of data, you will want to write this in your driver of choice (and possibly multi-thread it).
Write your code to handle reviewed as nullable. Write the data such that if it comes across a photo with reviewed undefined then it must be false. Then you can set the field appropriately and commit it back to the DB.
Method #2 is something you should get used to. As your data grows and you add fields, it becomes difficult to "back-port" all of the old data. This is similar to the problem of issuing a schema change in SQL when you have 1B items in the DB.
Instead just make your code resistant against the null and learn to treat it as a default.
Again though, this is still not the solution you seek.
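Both options can be sketched in a few lines of plain JavaScript. The helper names isReviewed and withReviewedDefault are hypothetical; the document shape is the person example from the question.

```javascript
// Option 2: treat a missing `reviewed` flag as false when reading.
function isReviewed(photo) {
  return photo.reviewed === true;
}

// Option 1 (inner loop): return a copy of the person in which every
// photo carries an explicit `reviewed` flag. The input is not modified.
function withReviewedDefault(person) {
  var photos = (person.photos || []).map(function (photo) {
    var copy = {};
    Object.keys(photo).forEach(function (key) { copy[key] = photo[key]; });
    if (copy.reviewed === undefined) copy.reviewed = false;
    return copy;
  });
  var result = {};
  Object.keys(person).forEach(function (key) { result[key] = person[key]; });
  result.photos = photos;
  return result;
}
```

For the outer loop, the shell could iterate db.person.find() and write withReviewedDefault(p).photos back with a $set on photos for each document. Note that such a write replaces the whole array, so it can overwrite a concurrent change to the same document.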
You can do this
(null, {$set : {"photos.reviewed" : false} }, false, true)
The first parameter is null : no specification = any item in the collection.
"photos.reviewed" should be declared as string to update subfield.
You can do like this:
db.person.update({}, {$set: {"name.surname": null}}, false, true);
Old topic now, but this just worked fine with Mongo 3.0.6:
db.users.update({ _id: ObjectId("55e8969119cee85d216211fb") },
{ $set: {"settings.pieces": "merida"} })
In my case user entity looks like
{ _id: 32, name: "foo", ..., settings: { ..., pieces: "merida", ...} }

Multiple update of embedded documents' properties

I have the following collection:
{
"Milestones" : [
{ "ActualDate" : null,
"Index": 0,
"Name" : "milestone1",
"TargetDate" : ISODate("2011-12-13T22:00:00Z"),
"_id" : ObjectId("4ee89ae7e60fc615c42e28d1")},
{ "ActualDate" : null,
"Index" : 0,
"Name" : "milestone2",
"TargetDate" : ISODate("2011-12-13T22:00:00Z"),
"_id" : ObjectId("4ee89ae7e60fc615c42e28d2") } ]
,
"Name" : "a", "_id" : ObjectId("4ee89ae7e60fc615c42e28ce")
}
I want to update specific embedded documents: those in the project with a given _id, whose Milestones._id is in a given list and whose ActualDate is null.
In .NET my code looks like:
var query = Query.And(new[] { Query.EQ("_id", ObjectId.Parse(projectId)),
Query.In("Milestones._id", new BsonArray(values.Select(ObjectId.Parse))),
Query.EQ("Milestones.ActualDate", BsonNull.Value) });
var update = Update.Set("Milestones.$.ActualDate", DateTime.Now.Date);
Coll.Update(query, update, UpdateFlags.Multi, SafeMode.True);
Or in native code:
db.Projects.update({ "_id" : ObjectId("4ee89ae7e60fc615c42e28ce"), "Milestones._id" : { "$in" : [ObjectId("4ee89ae7e60fc615c42e28d1"), ObjectId("4ee89ae7e60fc615c42e28d2"), ObjectId("4ee8a648e60fc615c41d481e")] }, "Milestones.ActualDate" : null },{ "$set" : { "Milestones.$.ActualDate" : ISODate("2011-12-13T22:00:00Z") } }, false, true)
But only the first item is being updated.
This is not possible at the moment. The multi flag in update means updating multiple root documents. The positional operator can match only one nested array item. There is a feature request for this in the MongoDB JIRA; you can vote it up and wait.
For now, the only solutions are to load the document, update it as you wish and save it back, or to issue a separate atomic update for each nested array id.
From documentation at mongodb.org:
Currently the $ operator only applies to the first matched item in the query
As answered by Andrew Orsich, this is not possible for the moment, at least not as you wish. But loading the document, modifying the array then saving it back will work. The risk is that some other process could modify the array in the meantime, so you would overwrite its changes. To avoid this, you can use optimistic locking, especially if the array is not modified every second.
load the document, including a new attribute: milestones_version
modify the array as needed
save back to mongodb, but now add a query constraint on the milestones_version, and increment it:
db.Projects.findAndModify({
query: {
_id: your_project_id,
milestones_version: expected_milestones_version
},
update: {
$set: {
Milestones: modified_milestones
},
$inc: {
milestones_version: 1
}
},
new: 1
})
If another process modified the milestones array (and hence the milestones_version) before we did, then this command will do nothing and simply return null. We just need to reload the document and try again. If the array is not modified every second, then this will be very rare and will not have any impact on performance.
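The reload-and-retry step could be wrapped in a small loop. In the sketch below, store is a hypothetical stand-in for the collection, not a real driver object: load plays the role of findOne, and saveIfVersion plays the role of the findAndModify call above, returning null on a version mismatch.

```javascript
// `store` is a hypothetical interface:
//   store.load(id)                               -> the current document
//   store.saveIfVersion(id, version, milestones) -> the saved document,
//                                                   or null if the version moved on
function updateMilestones(store, id, modify, maxAttempts) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var doc = store.load(id);
    var modified = modify(doc.Milestones);
    var saved = store.saveIfVersion(id, doc.milestones_version, modified);
    if (saved !== null) return saved; // our version was still current
    // Another writer got in first: reload and try again.
  }
  throw new Error("gave up after " + maxAttempts + " attempts");
}
```

Capping the number of attempts is a design choice of this sketch; it keeps a heavily contended document from looping forever and lets the caller decide how to react.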
The main problem with this solution is that you have to edit every Project, one by one (no multi: true). You could still write a javascript function and have it run on the server though.
According to their JIRA page "This new feature is available starting with the MongoDB 3.5.12 development version, and included in the MongoDB 3.6 production version"
https://jira.mongodb.org/browse/SERVER-1243

Querying and grouping in mongoDb?

Part 1:
I have (student) collection:
{
sname : "",
studentId: "123"
age: "",
gpa: "",
}
I'm trying to get only two keys from it:
{
sname : "",
studentId: "123"
}
So I need to eliminate age and gpa and keep only sname and studentId. How could I do that?
Part2:
Then I have 'subject' collection :
{
subjectName : "Math"
studentId : "123"
teacherName: ""
}
I need to match/combine the previous keys (in part1) with the correct studentId so I will end up with something like this :
{
sname : "",
studentId: "123",
subjectName : "Math"
}
How can I do this, and is that the right way to think about getting the result? I tried to read about group and mapReduce but I didn't find a clear example.
To answer your first question, you can do this:
db.student.find({}, {"sname":1, "studentId":1});
The first {} is the query filter, which in this case matches the entire collection. The second argument lists keys with a 1 or 0 depending on whether or not you want them back. Don't mix includes and excludes in a single query though. Apart from a few special cases, such as excluding _id, mongo won't accept it.
Your second question is more difficult. What you're asking for is a join and mongo doesn't support that. There is no way to connect the two collections on studentId. You'll need to find all the students that you want, then use those studentIds to find all the matching subjects. Then you'll need to merge the two results in your own code. You can do this through whatever driver you're using, or you can do this in javascript in the shell itself, but either way, you'll have to merge them with your own code.
Edit:
Here's an example of how you could do this in the shell with the output going to a collection called "out".
db.student.find({}, {"sname":1, "studentId":1}).forEach(
function (st) {
db.subject.find({"studentId":st.studentId}, {"subjectName":1}).forEach(
function (sub) {
db.out.insert({"sname":st.sname, "studentId":st.studentId, "subjectName":sub.subjectName});
}
);
}
);
If this isn't data that changes all that often, you could just drop the "out" collection and repopulate it periodically with this shell script. Then your code could query directly from "out". If the data does change frequently, you'll want to do this merging in your code on the fly.
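When the merge is done on the fly in application code, it can avoid one subject query per student by indexing the students first. This is a minimal sketch over in-memory arrays; joinStudentsWithSubjects is a hypothetical helper, and the field names are the ones from the question.

```javascript
// Hypothetical helper: client-side join of students and subjects on studentId.
function joinStudentsWithSubjects(students, subjects) {
  // Index students by studentId so each subject is matched in one lookup.
  var byId = Object.create(null);
  students.forEach(function (st) {
    byId[st.studentId] = st;
  });

  var out = [];
  subjects.forEach(function (sub) {
    var st = byId[sub.studentId];
    if (!st) return; // subject without a known student
    out.push({
      sname: st.sname,
      studentId: st.studentId,
      subjectName: sub.subjectName
    });
  });
  return out;
}
```

The two input arrays would come from the same projected find() calls as in the shell script above. A student with several subjects yields one output row per subject, matching what the nested forEach writes to "out".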
Another, and possibly better, option is to include the "subject" data in the "student" collection or vice versa. This will result in a more mongodb friendly structure. If you run into this joining problem frequently, mongo may not be the way to go and a relational database may be better suited to your needs.
Mongo's find() operator lets you include or exclude certain fields from the results.
Check out Field Selection in the docs for more info. You could do either:
db.student.find({}, { 'sname': 1, 'studentId': 1 });
db.student.find({}, { 'age': 0, 'gpa': 0 });
For relating your student and subject together, you could either lookup which subjects a student has separately, like this:
db.subject.find({ studentId: "123" });
Or embed subject data with each student, and retrieve it together with the student document:
{
sname : "Roland Browning",
studentId: "123",
age: 14,
gpa: "B",
subjects: [ { name : "French", teacher: "Mr Bronson" }, ... ]
}