Override existing Docs in production MongoDB - mongodb

I have recently changed one of my fields from object to array of objects.
In my production I have only 14 documents with this field, so I decided to change those fields.
Is there any best practices to do that?
As it is in my production I need to do it in a best way possible?
I got the document Id's of those collections.like ['xxx','yyy','zzz',...........]
my doc structure is like
_id:"xxx",option1:{"op1":"value1","op2":"value2"},option2:"some value"
and I want to change it like(converting object to array of objects)
_id:"xxx",option1:[{"op1":"value1","op2":"value2"},
{"op1":"value1","op2":"value2"}
],option2:"some value"
Can I use upsert? If so How to do it?

Since you need to create the new value of the field based on the old value, you should retrieve each document with a query like
db.collection.find({ "_id" : { "in" : [<array of _id's>] } })
then iterate over the results and $set the value of the field to its new value:
db.collection.find({ "_id" : { "in" : [<array of _id's>] } }).forEach(function(doc) {
oldVal = doc.option1
newVal = compute_newVal_from_oldVal(oldVal)
db.collection.update({ "_id" : doc._id }, { "$set" : { "option" : newVal } })
})
The document structure is rather schematic, so I omitted putting in actual code to create newVal from oldVal.

Since it is an embedded document type you could use push query
db.collectionname.update({_id:"xxx"},{$push:{option1:{"op1":"value1","op2":"value2"}}})
This will create document inside embedded document.Hope it helps

Related

How do I update values in a nested array?

I would like to preface this with saying that english is not my mother tongue, if any of my explanations are vague or don't make sense, please let me know and I will attempt to make them clearer.
I have a document containing some nested data. Currently product and customer are arrays, I would prefer to have them as straight up ObjectIDs.
{
"_id" : ObjectId("5bab713622c97440f287f2bf"),
"created_at" : ISODate("2018-09-26T13:44:54.431Z"),
"prod_line" : ObjectId("5b878e4c22c9745f1090de66"),
"entries" : [
{
"order_number" : "123",
"product" : [
ObjectId("5ba8a0e822c974290b2ea18d")
],
"customer" : [
ObjectId("5b86a20922c9745f1a6408d4")
],
"quantity" : "14"
},
{
"order_number" : "456",
"product" : [
ObjectId("5b878ed322c9745f1090de6c")
],
"customer" : [
ObjectId("5b86a20922c9745f1a6408d5")
],
"quantity" : "12"
}
]
}
I tried using the following query to update it, however that proved unsuccessful as Mongo didn't behave quite as I had expected.
db.Document.find().forEach(function(doc){
doc.entries.forEach(function(entry){
var entry_id = entry.product[0]
db.Document.update({_id: doc._id}, {$set:{'product': entry_id}});
print(entry_id)
})
})
With this query it sets product in the root of the object, not quite what I had hoped for. What I was hoping to do was to iterate through entries and change each individual product and customer to be only their ObjectId and not an array. Is it possible to do this via the mongo shell or do I have to look for another way to accomplish this? Thanks!
In order to accomplish your specified behavior, you just need to modify your query structure a bit. Take a look here for the specific MongoDB documentation on how to accomplish this. I will also propose an update to your code below:
db.Document.find().forEach(function(doc) {
doc.entries.forEach(function(entry, index) {
var productElementKey = 'entries.' + index + '.product';
var productSetObject = {};
productSetObject[productElementKey] = entry.product[0];
db.Document.update({_id: doc._id}, {$set: productSetObject});
print(entry_id)
})
})
The problem that you were having is that you were not updating the specific element within the entries array, but rather adding a new key to the top-level of the document named product. Generally, in order to set the value of an inner document within an array, you need to specify the array key first (entries in this case) and the inner document key second (product in this case). Since you are trying to set specific elements within the entries array, you need to also specify the index in your query object, I have specified above.
In order to update the customer key in the inner documents, simply switch out the product for customer in my above code.
You're trying to add a property 'product' directly into your document with this line
db.Document.update({_id: doc._id}, {$set:{'product': entry_id}});
Try to modify all your entries first, then update your document with this new array of entries.
db.Document.find().forEach(function(doc){
let updatedEntries = [];
doc.entries.forEach(function(entry){
let newEntry = {};
newEntry["order_number"] = entry.order_number;
newEntry["product"] = entry.product[0];
newEntry["customer"] = entry.customer[0];
newEntry["quantity"] = entry.quantity;
updatedEntries.push(newEntry);
})
db.Document.update({_id: doc._id}, {$set:{'entries': updatedEntries}});
})
You'll need to enumerate all the documents and then update the documents one and a time with the value store in the first item of the array for product and customer from each entry:
db.documents.find().snapshot().forEach(function (elem) {
elem.entries.forEach(function(entry){
db.documents.update({
_id: elem._id,
"entries.order_number": entry.order_number
}, {
$set: {
"entries.$.product" : entry.product[0],
"entries.$.customer" : entry.customer[0]
}
}
);
});
});
Instead of doing 2 updates each time you could possibly use the filtered positional operator to do all updates to all arrays items within one update query.

MongoDB get object id by finding on another column value

I am new to querying dbs and especially mongodb.If I run :
db.<customers>.find({"contact_name: Anny Hatte"})
I get:
{
"_id" : ObjectId("55f7076079cebe83d0b3cffd"),
"company_name" : "Gap",
"contact_name" : "Anny Hatte",
"email" : "ahatte#gmail.com"
}
I wish to get the value of the "_id" attribute from this query result. How do I achieve that?
Similarly, if I have another collection, named items, with the following data:
{
"_id" : ObjectId("55f7076079cebe83d0b3d009"),
"_customer" : ObjectId("55f7076079cebe83d0b3cfda"),
"school" : "St. Patrick's"
}
Here, the "_customer" field is the "_id" of the customer collection (the previous collection). I wish to get the "_id", the "_customer" and the "school" field values for the record where "_customer" of items-collection equals "_id" of customers-collection.
How do I go about this?
I wish to get the value of the "_id" attribute from this query result.
How do I achieve that?
The find() method returns a cursor to the results, which you can iterate and retrieve the documents in the result set. You can do this using forEach().
var cursor = db.customers.find({"contact_name: Anny Hatte"});
cursor.forEach(function(customer){
//access all the attributes of the document here
var id = customer._id;
})
You could make use of the aggregation pipeline's $lookup stage that has been introduced as part of 3.2, to look up and fetch the matching rows in some other related collection.
db.customers.aggregate([
{$match:{"contact_name":"Anny Hatte"}},
{$lookup:{
"from":"items",
"localField":"_id",
"foreignField":"_customer",
"as":"items"
}}
])
In case you are using a previous version of mongodb where the stage is not supported, then, you would need to fire an extra query to lookup the items collection, for each customer.
db.customers.find(
{"contact_name":"Anny Hatte"}).map(function(customer){
customer["items"] = [];
db.items.find({"_customer":customer._id}).forEach(function(item){
customer.items.push(item);
})
return customer;
})

How to check if a portion of an _id from one collection appears in another

I have a collection where the _id is of the form [message_code]-[language_code] and another where the _id is just [message_code]. What I'd like to do is find all documents from the first collection where the message_code portion of the _id does not appear in the second collection.
Example:
> db.colA.find({})
{ "_id" : "TRM1-EN" }
{ "_id" : "TRM1-ES" }
{ "_id" : "TRM2-EN" }
{ "_id" : "TRM2-ES" }
> db.colB.find({})
{ "_id" : "TRM1" }
I want a query that will return TRM2-EN and TRM-ES from colA. Of course in my live data, there are thousands of records in each collection.
According to this question which is trying to do something similar, we have to save the results from a query against colB and use it in an $in condition in a query against colA. In my case, I need to strip the -[language_code] portion before doing this comparison, but I can't find a way to do so.
If all else fails, I'll just create a new field in colA that contains only the message code, but is there a better way do it?
Edit:
Based on Michael's answer, I was able to come up with this solution:
var arr = db.colB.distinct("_id")
var regexs = arr.map(function(elm){
return new RegExp(elm);
})
var result = db.colA.find({_id : {$nin : regexs}}, {_id : true})
Edit:
Upon closer inspection, the above method doesn't work after all. In the end, I just had to add the new field.
Disclaimer: This is a little hack it may not end well.
Get distinct _id using collection.distinct method.
Build a regular expression array using Array.prototype.map()
var arr = db.colB.distinct('_id');
arr.map(function(elm, inx, tab) {
tab[inx] = new RegExp(elm);
});
db.colA.find({ '_id': { '$nin': arr }})
I'd add a new field to colA since you can index it and if you have hundreds of thousands of documents in each collection splitting the strings will be painfully slow.
But if you don't want to do that you could make use of the aggregation framework's $substr operator to extract the [message-code] then do a $match on the result.

using an Object (subdocument) with varying fields as _id

Our (edX) original Mongo persistence representation uses a bson dictionary (aka object or subdocument) as the _id value (see, mongo/base.py). This id is missing a field.
Can some documents' _id values have more subfields than others without totally screwing up indexing?
What's the best way to handle existing documents without the additional field? Remove and replace them? Try to query w/ new _id format and if fails fall over to query w/o the new field? Try to query with both new and old _id format in one query?
To be more specific, the current format is
{'_id': {
'tag': 'i4x', // yeah, it's always this fixed value
'org': your_school_x,
'course': a_catalog_number,
'category': the_xblock_type,
'name': uniquifier_within_course
}}
I need to add 'run': the_session_or_term_for_course_run or 'course_id': org/course/run.
Documents within a collection need not have values for _id that are of the same structure. Hence, it is perfectly acceptable to have the following documents within a collection:
> db.foo.find()
{ "_id" : { "a" : 1 } }
{ "_id" : { "a" : 1, "b" : 2 } }
{ "_id" : { "c" : 1, "b" : 2 } }
Note that because the index is on only _id, only queries that specify a value for _id will use the index:
db.foo.find({_id:1}) // will use the index on _id
db.foo.find({_id:{state:"Alaska"}) // will use the index on _id
db.foo.find({"_id.a":1}) // will NOT use the index on _id
Note also that only a complete match of the "value" of _id will return a document. So this returns no documents for the collection above:
db.foo.find({_id:{c:1}})
Hence, for your case, you are welcome to add fields to the object that is the value for the _id key. And it does not matter that all documents have a different structure. But if you are hoping to query the collection by_id and have it be efficient, you are going to need to add indexes for all relevant sub parts that might be used in isolation. That is not super efficient.
_id is no different than any other key in this regard.

Mongoose: Saving as associative array of subdocuments vs array of subdocuments

I have a set of documents I need to maintain persistence for. Due to the way MongoDB handle's multi-document operations, I need to embed this set of documents inside a container document in order to ensure atomicity of my operations.
The data lends itself heavily to key-value pairing. Is there any way instead of doing this:
var container = new mongoose.Schema({
// meta information here
subdocs: [{key: String, value: String}]
})
I can instead have subdocs be an associative array (i.e. an object) that applies the subdoc validations? So a container instance would look something like:
{
// meta information
subdocs: {
<key1>: <value1>,
<key2>: <value2>,
...
<keyN>: <valueN>,
}
}
Thanks
Using Mongoose, I don't believe that there is a way to do what you are describing. To explain, let's take an example where your keys are dates and the values are high temperatures, to form pairs like { "2012-05-31" : 88 }.
Let's look at the structure you're proposing:
{
// meta information
subdocs: {
"2012-05-30" : 80,
"2012-05-31" : 88,
...
"2012-06-15": 94,
}
}
Because you must pre-define schema in Mongoose, you must know your key names ahead of time. In this use case, we would probably not know ahead of time which dates we would collect data for, so this is not a good option.
If you don't use Mongoose, you can do this without any problem at all. MongoDB by itself excels at inserting values with new key names into an existing document:
> db.coll.insert({ type : "temperatures", subdocuments : {} })
> db.coll.update( { type : "temperatures" }, { $set : { 'subdocuments.2012-05-30' : 80 } } )
> db.coll.update( { type : "temperatures" }, { $set : { 'subdocuments.2012-05-31' : 88 } } )
{
"_id" : ObjectId("5238c3ca8686cd9f0acda0cd"),
"subdocuments" : {
"2012-05-30" : 80,
"2012-05-31" : 88
},
"type" : "temperatures"
}
In this case, adding Mongoose on top of MongoDB takes away some of MongoDB's native flexibility. If your use case is well suited by this feature of MongoDB, then using Mongoose might not be the best choice.
you can achieve this behavior by using {strict: false} in your mongoose schema, although you should check the implications on the validation and casting mechanism of mongoose.
var flexibleSchema = new Schema( {},{strict: false})
another way is using schema.add method but i do not think this is the right solution.
the last solution i see is to get all the array to the client side and use underscore.js or whatever library you have. but it depends on your app, size of docs, communication steps etc.