Query an array of embedded documents in MongoDB

I'm having a little trouble writing a query that needs to compare a given value against a certain field in all embedded documents within an array. I will give an example to make the issue less abstract.
Let's say I want to use MongoDB to store the last queries that users on my network have entered into different online search engines. An entry in the collection would have a structure like this :
{
'_id' : 'zinfandel',
'last_search' : [
{
'engine' : 'google.com',
'query' : 'why is the sky blue'
},
{
'engine' : 'bing.com',
'query' : 'what is love'
},
{ 'engine' : 'yahoo.com',
'query' : 'how to tie a tie'
}
]
}
Now let's say user username enters a new query into a certain engine. The code that stores this query in the DB needs to find out whether there already exists an entry for the engine that the user used. If yes, this entry is to be updated with the new query. If not, a new entry should be created. My idea is to do a $push only if there is no entry for the given engine and do a $set otherwise. For this purpose, I tried to write my push like this :
db.mycollection.update(
{ '_id' : username , search.$.engine : { '$ne' : engine } },
{ '$push' : { 'search.$.engine' : engine, 'search.$.query' : query } }
)
However, this pushes a new embedded document even if there already was an entry for the given engine. The problem seems to be that the $ne operator doesn't work with arrays like I expect it to work. What I need is a way to make sure that not a single embedded document in the array has an "engine" entry that matches the specified engine.
Does anyone have an idea how to do that? Please tell me if I need to further clarify the question ...

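Query the array field with dot notation. A $nin (or $ne) condition on "last_search.engine" matches only documents in which no embedded document has that engine, so the $push in the following command runs only when the entry is missing: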
db.mycollection.update({
_id: "zinfandel",
"last_search.engine": {
$nin: ["notwellknownengine.com"]
}
}, {
$push: {
"last_search": {
"engine" : "notwellknownengine.com",
"query" : "stackoveflow.com"
}
}
});
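The asker's full goal (update the entry if the engine already exists, append it otherwise) needs two updates, because one of them uses $set and the other $push. A minimal sketch, assuming a hypothetical helper named buildSearchUpdates that only builds the filter and update documents; the new query text is made up for illustration:

```javascript
// Hypothetical helper: builds the two update commands for one user/engine/query.
function buildSearchUpdates(username, engine, query) {
  return [
    // 1) An entry for this engine exists: overwrite its query in place.
    //    The positional $ refers to the array element matched by the filter.
    {
      filter: { _id: username, "last_search.engine": engine },
      update: { $set: { "last_search.$.query": query } }
    },
    // 2) No array element has this engine: append a new entry.
    {
      filter: { _id: username, "last_search.engine": { $ne: engine } },
      update: { $push: { last_search: { engine: engine, query: query } } }
    }
  ];
}

// In the mongo shell each pair would be passed to update(), for example:
//   buildSearchUpdates("zinfandel", "bing.com", "what is mongodb")
//     .forEach(function (op) { db.mycollection.update(op.filter, op.update); });
const searchOps = buildSearchUpdates("zinfandel", "bing.com", "what is mongodb");
console.log(JSON.stringify(searchOps, null, 2));
```

At most one of the two filters matches a given document, so running both is safe. If no document with that _id exists yet, neither update matches and the user document has to be created separately.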

multi updating a key along the documents of a collection using pymongo

I have lots of documents inside a collection.
The structure of each of the documents inside the collection is as it follows:
{
"_id" : ObjectId(....),
"valor" : {
"AB" : {
"X" : 0.0,
"Y" : 142.6,
},
"FJ" : {
"X" : 0.2,
"Y" : 3.33
....
The collection has currently about 200 documents and I have noticed that one of the keys inside valor has the wrong name. In this case we will say "FJ" shall be "JOF" in all the docs of the collection.
I'm pretty sure it is possible to change the key in all the docs using the update function of pymongo. The problem I am facing is that the online documentation at https://docs.mongodb.com/v3.0/reference/method/db.collection.update/ only explains how to change the values. I would like the values to stay as they are and change only the keys.
This is what I have tried:
def multi_update(spec_key,key_updte):
    rdo=col.update((valor.spec_key),{"$set":(valor.key_updte)},multi=True)
    return rdo
print(multi_update('FJ','JOF'))
But it outputs name 'valor' is not defined. I thought I should use valor.specific_key to access the corresponding JSON.
How can I update only a key across all the docs of the collection?
You have two problems. First, valor is not an identifier in your Python code, it's a field name of a MongoDB document. You need to quote it in single or double quotes in Python in order to make it a string and use it in a PyMongo update expression.
Your second problem is that a classic update document cannot set one field to the value of another, so $set is the wrong tool for changing a key. For a straight rename, the $rename update operator accepts dotted paths such as "valor.FJ". Alternatively, you can reshape all the documents in your collection using the aggregate command with a $project stage and store the results in a second collection using an $out stage, which is the approach shown here.
Here's a complete example to play with:
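from pymongo import MongoClient

db = MongoClient().test
collection = db.collection
collection.delete_many({})
collection.insert_one({
    "valor": {
        "AB": {"X": 0.0, "Y": 142.6},
        "FJ": {"X": 0.2, "Y": 3.33}}})
# Copy every document into "collection2", exposing valor.FJ as valor.JOF.
# Only the keys listed under "valor" are kept, so list every key you need.
collection.aggregate([{
    "$project": {
        "valor": {
            "AB": "$valor.AB",
            "JOF": "$valor.FJ"
        }
    }
}, {
    "$out": "collection2"
}])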
This is the dangerous part. First, check that "collection2" has all the documents you want, in the desired shape. Then:
collection.drop()
db.collection2.rename("collection")
import pprint
pprint.pprint(collection.find_one())
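For a pure key rename, the $rename update operator is an alternative that avoids the second collection. A minimal sketch, assuming a hypothetical helper named buildRenameUpdate; it only builds the update document, which would then be sent with a multi-document update:

```javascript
// Hypothetical helper: builds a $rename update for a key nested under `parent`.
function buildRenameUpdate(parent, oldKey, newKey) {
  const spec = {};
  // $rename takes dotted paths, so "valor.FJ" becomes "valor.JOF".
  spec[parent + "." + oldKey] = parent + "." + newKey;
  return { $rename: spec };
}

const renameUpdate = buildRenameUpdate("valor", "FJ", "JOF");
console.log(JSON.stringify(renameUpdate));
// prints {"$rename":{"valor.FJ":"valor.JOF"}}

// In the mongo shell:  db.collection.update({}, renameUpdate, { multi: true })
// In PyMongo the same document would be passed to collection.update_many({}, ...).
```

Documents that lack the old key are left unchanged by $rename, so the update can be repeated safely.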

Override existing Docs in production MongoDB

I recently changed one of my fields from an object to an array of objects.
In production I have only 14 documents with this field, so I decided to convert those fields.
Are there any best practices for doing that?
Since this is production, I need to do it in the best way possible.
I have the document IDs of the affected documents, like ['xxx','yyy','zzz',...........]
My document structure is like
_id:"xxx",option1:{"op1":"value1","op2":"value2"},option2:"some value"
and I want to change it to this (converting the object to an array of objects):
_id:"xxx",option1:[{"op1":"value1","op2":"value2"},
{"op1":"value1","op2":"value2"}
],option2:"some value"
Can I use upsert? If so, how?
Since you need to create the new value of the field based on the old value, you should retrieve each document with a query like
db.collection.find({ "_id" : { "$in" : [<array of _id's>] } })
then iterate over the results and $set the value of the field to its new value:
db.collection.find({ "_id" : { "$in" : [<array of _id's>] } }).forEach(function(doc) {
var oldVal = doc.option1
var newVal = compute_newVal_from_oldVal(oldVal)
db.collection.update({ "_id" : doc._id }, { "$set" : { "option1" : newVal } })
})
The document structure is rather schematic, so I omitted the actual code to create newVal from oldVal. An upsert is not needed here because the documents already exist.
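Once option1 is an array, you can append further embedded documents with a $push update:
db.collectionname.update({_id:"xxx"},{$push:{option1:{"op1":"value1","op2":"value2"}}})
This appends the document to the option1 array. Note that $push is rejected while option1 still holds a plain object, so convert the field to an array first. Hope it helps.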
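The compute_newVal_from_oldVal placeholder in the forEach loop above could be as small as wrapping the old object in an array. A sketch under that assumption; wrapInArray is a hypothetical name, and what the real array should contain depends on your data:

```javascript
// Hypothetical stand-in for compute_newVal_from_oldVal.
function wrapInArray(oldVal) {
  // Already an array: the document was converted earlier, leave it alone.
  if (Array.isArray(oldVal)) return oldVal;
  // Missing field: start with an empty array.
  if (oldVal === undefined || oldVal === null) return [];
  // Plain object: make it the single element of the new array.
  return [oldVal];
}

console.log(JSON.stringify(wrapInArray({ op1: "value1", op2: "value2" })));
// prints [{"op1":"value1","op2":"value2"}]
```

Returning arrays unchanged makes the conversion safe to rerun if the migration is interrupted partway through the 14 documents.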

MongoDB database migration with embedded query

Currently in my database I have messages objects set up as the following.
{
"name" : "System",
"message" : "Sean Callahan has entered the room.",
"time" : 1406479167270,
"type" : "system_message",
"room" : "helloroom",
"_id" : "4yeHzhHAQmGJNtHww"
}
I basically want to migrate my data so that every message has a roomId that points to the appropriate room. Currently this is done with the room attribute, which I now see is flawed for various reasons.
My room objects are setup something like this.
{
"_id": xxxxxxxxx,
"room_name": "testingroom"
}
So I was hoping there was a way to run a one-liner that would just add the correct roomId to every current message based on the current room attribute that is set
I was thinking something along the lines of..
db.messages.update({}, {$set: {roomId: db.rooms.findOne({room_name: room})._id}})
As of now, I am getting room is not defined, which makes perfect sense. But I can't seem to get it right, and this may just not be possible in a one-line query.
As you discovered, this isn't possible in a one-line query since you need to join data from two collections.
Here's an example of how to add the missing field in the mongo shell:
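db.messages.find(
{ roomId: { $exists: false } }
).forEach(function(message) {
// Look up the room whose name matches this message's room attribute
var room = db.rooms.findOne({ room_name: message.room });
// findOne returns null when no room matches, so check before reading _id
if (room && room._id) {
db.messages.update(
{ _id: message._id },
{ $set: { roomId: room._id } }
)
}
})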
You could tidy this up with some error checking, and for updates on a large collection consider using the Bulk Update API (only available in MongoDB 2.6+).
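For a larger collection, the per-message findOne can be replaced by one lookup table and one batch of writes. A minimal sketch, assuming a hypothetical helper named buildRoomIdOps that works on plain arrays, and assuming a shell new enough to offer bulkWrite (MongoDB 3.2+); the room id "room1" is made up:

```javascript
// Hypothetical helper: turns messages and rooms into bulkWrite operations.
function buildRoomIdOps(messages, rooms) {
  // Index the rooms by name so each message needs only a map lookup.
  const idByName = new Map();
  rooms.forEach(function (room) { idByName.set(room.room_name, room._id); });

  const ops = [];
  messages.forEach(function (message) {
    if (message.roomId !== undefined) return;    // already migrated
    if (!idByName.has(message.room)) return;     // no matching room: skip
    ops.push({
      updateOne: {
        filter: { _id: message._id },
        update: { $set: { roomId: idByName.get(message.room) } }
      }
    });
  });
  return ops;
}

const sampleOps = buildRoomIdOps(
  [{ _id: "4yeHzhHAQmGJNtHww", room: "helloroom" }],
  [{ _id: "room1", room_name: "helloroom" }]
);
console.log(JSON.stringify(sampleOps));

// In the mongo shell (bulkWrite rejects an empty list, hence the length check):
//   var ops = buildRoomIdOps(db.messages.find({ roomId: { $exists: false } }).toArray(),
//                            db.rooms.find().toArray());
//   if (ops.length) db.messages.bulkWrite(ops);
```

Messages whose room name has no matching room are skipped, which leaves them easy to find afterwards with the same $exists query.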

Add new field to all documents in a nested array

I have a database of person documents. Each has a field named photos, which is an array of photo documents. I would like to add a new 'reviewed' flag to each of the photo documents and initialize it to false.
This is the query I am trying to use:
db.person.update({ "_id" : { $exists : true } }, {$set : {photos.reviewed : false} }, false, true)
However I get the following error:
SyntaxError: missing : after property id (shell):1
Is this possible, and if so, what am I doing wrong in my update?
Here is a full example of the 'person' document:
{
"_class" : "com.foo.Person",
"_id" : "2894",
"name" : "Pixel Spacebag",
"photos" : [
{
"_id" : null,
"thumbUrl" : "http://site.com/a_s.jpg",
"fullUrl" : "http://site.com/a.jpg"
},
{
"_id" : null,
"thumbUrl" : "http://site.com/b_s.jpg",
"fullUrl" : "http://site.com/b.jpg"
}]
}
Bonus karma for anyone who can tell me a cleaner way to update "all documents" without using the query { "_id" : { $exists : true } }
For those still looking for an answer: this is possible since MongoDB 3.6 with the all-positional operator $[] (see the docs):
db.getCollection('person').update(
{},
{ $set: { "photos.$[].reviewed" : false } },
{ multi: true}
)
"Is this possible, and if so, what am I doing wrong in my update?"
No, not in a single update on versions before MongoDB 3.6. In general those versions are only good at doing updates on top-level objects.
The exception here is the $ positional operator. From the docs: Use this to find an array member and then manipulate it.
However, in your case you want to modify all members in an array. So that is not what you need.
"Bonus karma for anyone who can tell me a cleaner way to update 'all documents'"
Try db.coll.update(query, update, false, true), this will issue a "multi" update. That last true is what makes it a multi.
"Is this possible,"
You have two options here:
1. Write a for loop to perform the update. It will basically be a nested for loop, one to loop through the data, the other to loop through the sub-array. If you have a lot of data, you will want to write this in your driver of choice (and possibly multi-thread it).
2. Write your code to handle reviewed as nullable. Write the code such that if it comes across a photo with reviewed undefined, it treats the value as false. Then you can set the field appropriately and commit it back to the DB.
Method #2 is something you should get used to. As your data grows and you add fields, it becomes difficult to "back-port" all of the old data. This is similar to the problem of issuing a schema change in SQL when you have 1B items in the DB.
Instead just make your code resistant against the null and learn to treat it as a default.
Again though, this is still not the solution you seek.
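The two options above can be sketched as follows. Both function names are hypothetical, and the code works on a plain in-memory person object; writing the result back is shown only as a shell comment:

```javascript
// Option 1: add reviewed = false to every photo that lacks the flag.
function addReviewedFlag(person) {
  (person.photos || []).forEach(function (photo) {
    if (photo.reviewed === undefined) photo.reviewed = false;
  });
  return person;
}

// Option 2: leave old data alone and treat a missing flag as false when reading.
function isReviewed(photo) {
  return photo.reviewed === true;
}

const person = {
  _id: "2894",
  photos: [
    { thumbUrl: "http://site.com/a_s.jpg" },
    { thumbUrl: "http://site.com/b_s.jpg", reviewed: true }
  ]
};
console.log(person.photos.map(isReviewed));
console.log(JSON.stringify(addReviewedFlag(person).photos));

// Outer loop in the mongo shell (the inner loop is inside addReviewedFlag):
//   db.person.find().forEach(function (p) {
//     db.person.update({ _id: p._id }, { $set: { photos: addReviewedFlag(p).photos } });
//   });
```

The shell loop rewrites the whole photos array, so a photo added by another writer between the find and the update would be lost; option 2 avoids that risk because it never writes.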
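You can do this:
(null, {$set : {"photos.reviewed" : false} }, false, true)
The first parameter is null: no specification means any item in the collection.
"photos.reviewed" must be quoted as a string to address a subfield; the unquoted dotted key is what triggers the SyntaxError. Quoting only fixes the syntax, though. Because photos is an array, this path does not add the flag to each element.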
You can do it like this:
db.person.update({}, { $set: { "name.surname": null } }, false, true);
Old topic now, but this just worked fine with Mongo 3.0.6:
db.users.update({ _id: ObjectId("55e8969119cee85d216211fb") },
{ $set: {"settings.pieces": "merida"} })
In my case user entity looks like
{ _id: 32, name: "foo", ..., settings: { ..., pieces: "merida", ...} }

Multiple update of embedded documents' properties

I have the following collection:
{
"Milestones" : [
{ "ActualDate" : null,
"Index": 0,
"Name" : "milestone1",
"TargetDate" : ISODate("2011-12-13T22:00:00Z"),
"_id" : ObjectId("4ee89ae7e60fc615c42e28d1")},
{ "ActualDate" : null,
"Index" : 0,
"Name" : "milestone2",
"TargetDate" : ISODate("2011-12-13T22:00:00Z"),
"_id" : ObjectId("4ee89ae7e60fc615c42e28d2") } ]
,
"Name" : "a", "_id" : ObjectId("4ee89ae7e60fc615c42e28ce")
}
I want to update specific embedded documents: those in the project with the given _id whose Milestones._id is in a given list and whose ActualDate is null.
In .NET my code looks like:
var query = Query.And(new[] { Query.EQ("_id", ObjectId.Parse(projectId)),
Query.In("Milestones._id", new BsonArray(values.Select(ObjectId.Parse))),
Query.EQ("Milestones.ActualDate", BsonNull.Value) });
var update = Update.Set("Milestones.$.ActualDate", DateTime.Now.Date);
Coll.Update(query, update, UpdateFlags.Multi, SafeMode.True);
Or in native code:
db.Projects.update({ "_id" : ObjectId("4ee89ae7e60fc615c42e28ce"), "Milestones._id" : { "$in" : [ObjectId("4ee89ae7e60fc615c42e28d1"), ObjectId("4ee89ae7e60fc615c42e28d2"), ObjectId("4ee8a648e60fc615c41d481e")] }, "Milestones.ActualDate" : null },{ "$set" : { "Milestones.$.ActualDate" : ISODate("2011-12-13T22:00:00Z") } }, false, true)
But only the first item is being updated.
This is not possible at the moment. The multi flag in update means updating multiple root documents. The positional operator can match only one nested array item. There is a feature request for this in the MongoDB JIRA; you can vote for it and wait.
For now, the only options are to load the document, update it as you wish and save it back, or to issue a separate atomic update for each nested array item _id.
From the documentation at mongodb.org: "Currently the $ operator only applies to the first matched item in the query."
As answered by Andrew Orsich, this is not possible for the moment, at least not as you wish. But loading the document, modifying the array then saving it back will work. The risk is that some other process could modify the array in the meantime, so you would overwrite its changes. To avoid this, you can use optimistic locking, especially if the array is not modified every second.
1. load the document, including a new attribute: milestones_version
2. modify the array as needed
3. save back to mongodb, but now add a query constraint on the milestones_version, and increment it:
db.Projects.findAndModify({
query: {
_id: your_project_id,
milestones_version: expected_milestones_version
},
update: {
$set: {
Milestones: modified_milestones
},
$inc: {
milestones_version: 1
}
},
new: 1
})
If another process modified the milestones array (and hence the milestones_version) before we did, then this command will do nothing and simply return null. We just need to reload the document and try again. If the array is not modified every second, then this will be very rare and will not have any impact on performance.
The main problem with this solution is that you have to edit every Project, one by one (no multi: true). You could still write a javascript function and have it run on the server though.
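The reload-and-retry part of this optimistic locking scheme can be sketched without a database. In the sketch below, makeStore is a hypothetical in-memory stand-in for the collection, and its compareAndSave method mimics the findAndModify call above: it writes only when the version still matches and returns null otherwise. The retry limit is an assumption added for illustration.

```javascript
// Hypothetical in-memory stand-in for one project document in the collection.
function makeStore(doc) {
  let current = JSON.parse(JSON.stringify(doc));
  return {
    load: function () { return JSON.parse(JSON.stringify(current)); },
    // Mimics findAndModify with a version constraint plus $set and $inc.
    compareAndSave: function (expectedVersion, milestones) {
      if (current.milestones_version !== expectedVersion) return null;
      current.Milestones = milestones;
      current.milestones_version += 1;
      return JSON.parse(JSON.stringify(current));
    }
  };
}

// Load, modify, save; on a version conflict reload and try again.
function updateMilestones(store, modify, maxAttempts) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const doc = store.load();
    const saved = store.compareAndSave(doc.milestones_version, modify(doc.Milestones));
    if (saved !== null) return saved;
  }
  throw new Error("gave up after " + maxAttempts + " attempts");
}

const store = makeStore({
  milestones_version: 0,
  Milestones: [{ Name: "milestone1", ActualDate: null }]
});
const result = updateMilestones(store, function (milestones) {
  return milestones.map(function (m) {
    return Object.assign({}, m, { ActualDate: "2011-12-13" });
  });
}, 3);
console.log(result.milestones_version);
// prints 1
```

Against a real server, load would be a findOne and compareAndSave the findAndModify shown earlier; the modify callback has to be safe to run more than once, since a conflict causes it to be called again on fresh data.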
According to their JIRA page "This new feature is available starting with the MongoDB 3.5.12 development version, and included in the MongoDB 3.6 production version"
https://jira.mongodb.org/browse/SERVER-1243
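The feature referred to there is the filtered positional operator $[<identifier>] together with the arrayFilters option, which lets a single update touch every matching array element. A minimal sketch for MongoDB 3.6+, assuming a hypothetical helper named buildMilestoneUpdate; plain strings stand in for the ObjectId values so the snippet runs outside the shell:

```javascript
// Hypothetical helper: builds filter, update and options for an arrayFilters update.
function buildMilestoneUpdate(projectId, milestoneIds, actualDate) {
  return {
    filter: { _id: projectId },
    // $[m] applies the $set to every element that passes the array filter for "m".
    update: { $set: { "Milestones.$[m].ActualDate": actualDate } },
    options: {
      arrayFilters: [
        { "m._id": { $in: milestoneIds }, "m.ActualDate": null }
      ]
    }
  };
}

const milestoneOp = buildMilestoneUpdate(
  "4ee89ae7e60fc615c42e28ce",
  ["4ee89ae7e60fc615c42e28d1", "4ee89ae7e60fc615c42e28d2"],
  new Date("2011-12-13T22:00:00Z")
);
console.log(JSON.stringify(milestoneOp, null, 2));

// In the mongo shell, with real ObjectId values in place of the strings:
//   db.Projects.update(milestoneOp.filter, milestoneOp.update, milestoneOp.options)
```

Because the element conditions live in arrayFilters, the query part only has to select the project document.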