MongoDB - update a property with incremental values

I have a collection called dealers as follows.
{
"_id" : ObjectId("5b9ba196cd5f5af83bb0dc71"),
"name" : "SOME COMPANY",
"outletID" : "GS0061920",
"companyID" : "GC0050397"
}
I have about 5000 documents in this collection where outletID and companyID properties are empty.
I would like to update both of these properties for all documents in this collection with incremental values.
I'm not sure how I should go about this.
Note: MongoDB 3.4 is in use.

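Assuming you are working in the mongo shell, you can use the following:
> var counter = 1;
> db.dealers.find({ "outletID" : null, "companyID" : null }).forEach(
    function(elem) {
        // null matches fields that are null or missing; update exactly
        // the document being iterated by matching on its _id
        db.dealers.update(
            { "_id" : elem._id },
            { $set : { outletID : counter, companyID : counter } }
        );
        counter++;
    });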
The mongo shell is an interactive JavaScript interface, so the task is written in plain JavaScript. counter is a simple variable that supplies the next value and is incremented after each update.
The $set operator replaces the value of a field with the specified value. You can find more details in its official documentation.
I created some sample data and tested the function against it, so it should work for you as well.
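If the new values should look like the sample document ("GS0061920", "GC0050397") rather than bare numbers, the operations can be built up front and sent in one batch. The following is only a sketch: the helper names makeId and buildIncrementOps are made up for illustration, and the "GS"/"GC" prefixes and the 7-digit zero padding are assumptions read off the single sample document in the question.

```javascript
// Zero-pad n to `width` digits and prepend a prefix.
// Written without padStart so it also runs in older shells.
function makeId(prefix, n, width) {
  var s = String(n);
  while (s.length < width) {
    s = "0" + s;
  }
  return prefix + s;
}

// Build one updateOne operation per document, in the shape that
// bulkWrite() accepts. The prefixes and width are assumptions.
function buildIncrementOps(docs, start) {
  return docs.map(function (doc, i) {
    var n = start + i;
    return {
      updateOne: {
        filter: { _id: doc._id },
        update: {
          $set: {
            outletID: makeId("GS", n, 7),
            companyID: makeId("GC", n, 7)
          }
        }
      }
    };
  });
}
```

In the shell this could be used roughly as db.dealers.bulkWrite(buildIncrementOps(db.dealers.find({}, { _id : 1 }).toArray(), 1)), which is reasonable for about 5000 documents because the whole list fits comfortably in memory.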

Related

pymongo update creates a new record without upserting

I have an issue where I do an update on a document, however, the update creates a new document and I'm not upserting in my update.
This is my testing code.
I do a find to see if the document exists by checking if "lastseen" doesn't exist:
result = DATA_Collection.find({"sessionID":"12345","lastseen":{"$exists":False}})
if result.count() == 1:
    DATA_Collection.update({"sessionID":"12345"},{"$set":{"lastseen":"2021-05-07"}})
When I do an aggregate check to find duplicates I get a few, one example below.
> db.DATA_Collection.find({ "sessionID" : "237a5fb8" })
{ "_id" : ObjectId("60bdf7b05c961b4d27d33bde"), "sessionID" : "237a5fb8", "firstseen" : ISODate("1970-01-19T20:51:09Z"), "lastseen" : ISODate("2021-06-07T12:34:20Z") }
{ "_id" : ObjectId("60bdf7fa7d35ea0f046a2514"), "sessionID" : "237a5fb8", "firstseen" : ISODate("1970-01-19T20:51:09Z") }
I remove all the records in the collection and rerun the script, the same happens again.
Any advice will be much appreciated.
Firstly, your pymongo calls are deprecated: use update_one() or update_many() instead of update(), and count_documents() instead of count().
Secondly, double-check that you are referencing the same collection throughout, as you mention both DATA_Collection and VPN_DATA.
How are you defining a "duplicate"? Unless you create a unique index on the field(s), the records won't be duplicates as they have different _id fields.
You need something like:
record = db.VPN_DATA.find_one({'sessionID': '12345', 'lastseen': {'$exists': False}})
if record is not None:
    db.VPN_DATA.update_one({'_id': record.get('_id')}, {'$set': {'lastseen': '2021-05-07'}})

Override existing Docs in production MongoDB

I have recently changed one of my fields from an object to an array of objects.
In production I have only 14 documents with this field, so I decided to convert those fields in place.
Are there any best practices for doing that?
Since this is production, I need to do it in the best way possible.
I have the document IDs of the affected documents, like ['xxx','yyy','zzz',...........]
My document structure is like
_id:"xxx",option1:{"op1":"value1","op2":"value2"},option2:"some value"
and I want to change it to this (converting the object to an array of objects):
_id:"xxx",option1:[{"op1":"value1","op2":"value2"},
{"op1":"value1","op2":"value2"}
],option2:"some value"
Can I use upsert? If so, how?
Since you need to create the new value of the field based on the old value, you should retrieve each document with a query like
db.collection.find({ "_id" : { "$in" : [<array of _id's>] } })
then iterate over the results and $set the field to its new value:
db.collection.find({ "_id" : { "$in" : [<array of _id's>] } }).forEach(function(doc) {
    var oldVal = doc.option1
    var newVal = compute_newVal_from_oldVal(oldVal)
    // write back to the same field that was read
    db.collection.update({ "_id" : doc._id }, { "$set" : { "option1" : newVal } })
})
The document structure is rather schematic, so I omitted putting in actual code to create newVal from oldVal.
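One possible body for the conversion step is sketched here. The function name toArrayOfObjects is hypothetical, and the sketch assumes the migration should simply wrap the existing object in a one-element array; the question's target shows two entries but does not say where the second one comes from, so that part is left out.

```javascript
// Convert the old single-object value into an array of objects.
// Assumption: wrapping the existing object is the desired migration.
function toArrayOfObjects(oldVal) {
  if (Array.isArray(oldVal)) {
    return oldVal; // already migrated, leave untouched
  }
  if (oldVal === null || oldVal === undefined) {
    return []; // field missing or null: start with an empty array
  }
  return [oldVal]; // wrap the single object
}
```

Because the function returns arrays unchanged, rerunning the migration on an already converted document does not nest the array a second time, which is a useful property when touching production data.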
If option1 is already an array, you can append another embedded document to it with $push ($push fails on a field that still holds a plain object):
db.collectionname.update({_id:"xxx"},{$push:{option1:{"op1":"value1","op2":"value2"}}})
This adds a new embedded document to the option1 array. Hope it helps.

How to query against ObjectId when not in mongo shell

I'm working on paging functionality using a range query. I'm using this test query in the mongo shell:
> var params = {$query: {_id: {$lt: ObjectId("52b06166eff887999c6efbd9")}}, $orderby: {_id: -1}, $maxScan: 3}
> params
{
"$query" : {
"_id" : {
"$lt" : ObjectId("52b06166eff887999c6efbd9")
}
},
"$orderby" : {
"_id" : -1
},
"$maxScan" : 3
}
> db.events.find(params)
I'd like to be able to pass the serialized params object to a web service (as a URL query string). However, the ObjectId class is only available inside the shell. Is there a way to specify an ObjectId as part of a query when not in the shell? I've tried the following as the value of $lt without success:
'ObjectId("52b06166eff887999c6efbd9")'
'new ObjectId("52b06166eff887999c6efbd9")'
{"$oid" : "52b06166eff887999c6efbd9"}
Generally speaking, this conversion is handled by whatever MongoDB driver or ODM you use. With an ODM such as Mongoose, which casts query values according to the schema, you can query on _id with a plain string and without calling ObjectId() yourself.
Mongoose / Node.js Example:
People.find({ _id : "Valid ObjectID String" }, function(e, person) {
console.log(e, person);
});
If you still need the ObjectId helper, native drivers generally expose their own ObjectId type that you can reference.
In your examples you are passing the ObjectId as a string (first two examples) or as a dictionary (third example), so it does not work.
Instead, pass just the string '52b06166eff887999c6efbd9' as a parameter and construct a real ObjectId from it on the server once you receive it. For example, in PHP you can construct it with new MongoId('your string');
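The server-side step can be sketched as follows. The names isObjectIdHex and buildPagingQuery are invented for this illustration, and the ObjectId constructor is passed in as a parameter so the sketch does not assume a particular driver; with the Node.js driver it would typically be something like function (hex) { return new ObjectId(hex); } using the ObjectId exported by the mongodb package.

```javascript
// An ObjectId is 12 bytes, i.e. exactly 24 hexadecimal characters.
function isObjectIdHex(s) {
  return typeof s === "string" && /^[0-9a-fA-F]{24}$/.test(s);
}

// Turn the raw string from the URL into a range query on _id.
// toObjectId is supplied by the caller (driver-specific).
function buildPagingQuery(beforeIdHex, toObjectId) {
  if (!isObjectIdHex(beforeIdHex)) {
    throw new Error("invalid ObjectId string: " + beforeIdHex);
  }
  return { _id: { $lt: toObjectId(beforeIdHex) } };
}

// Stand-in constructor so the sketch runs without a driver installed.
function fakeObjectId(hex) {
  return { hex: hex };
}

var query = buildPagingQuery("52b06166eff887999c6efbd9", fakeObjectId);
```

Validating the string before constructing the ObjectId keeps malformed input from the query string out of the database call.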

Get position of selected document in collection [mongoDB]

How to get position (index) of selected document in mongo collection?
E.g.
this document: db.myCollection.find({"id":12345})
has index 3 in myCollection
myCollection:
id: 12340, name: 'G'
id: 12343, name: 'V'
id: 12345, name: 'A'
id: 12348, name: 'N'
If your requirement is to find the position of the document irrespective of any order, that is not possible, as MongoDB does not store documents in a specific order.
However, if you want to know the index based on some ordered field, such as the id field from the question, you can use the following method.
If that field holds strictly auto-incremented values, count all the documents whose value is less than the one you are looking for, say n; then n + 1 is the index of the document based on that field.
n = db.myCollection.find({"id": { "$lt" : 12345}}).count() ;
This would also be valid if documents are deleted from the collection.
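The counting rule can be checked against the sample data from the question without a database. This is a plain in-memory sketch; the function name positionById is made up, and the array stands in for the collection.

```javascript
// Position (1-based) of the document with the given id, defined as
// "number of documents with a smaller id, plus one".
function positionById(docs, id) {
  var exists = docs.some(function (d) { return d.id === id; });
  if (!exists) {
    return -1; // no such document
  }
  var smaller = docs.filter(function (d) { return d.id < id; }).length;
  return smaller + 1;
}

var myCollection = [
  { id: 12340, name: "G" },
  { id: 12343, name: "V" },
  { id: 12345, name: "A" },
  { id: 12348, name: "N" }
];

console.log(positionById(myCollection, 12345)); // 3
```

The result does not depend on the order in which the documents are stored, only on the values of the id field, which is why the same idea works with count() on the server.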
As far as I know, there is no single command to do this, and it is impossible in the general case (see Derick's answer). However, using count() on a query over an ordered id field seems to work. Warning: this assumes that there is a reliably ordered field, which is difficult to achieve with concurrent writers. This example uses _id, which will only work in the single-writer case:
MongoDB shell version: 2.0.1
connecting to: test
> use so_test
switched to db so_test
> db.example.insert({name: 'A'})
> db.example.insert({name: 'B'})
> db.example.insert({name: 'C'})
> db.example.insert({name: 'D'})
> db.example.insert({name: 'E'})
> db.example.insert({name: 'F'})
> db.example.find()
{ "_id" : ObjectId("4fc5f040fb359c680edf1a7b"), "name" : "A" }
{ "_id" : ObjectId("4fc5f046fb359c680edf1a7c"), "name" : "B" }
{ "_id" : ObjectId("4fc5f04afb359c680edf1a7d"), "name" : "C" }
{ "_id" : ObjectId("4fc5f04dfb359c680edf1a7e"), "name" : "D" }
{ "_id" : ObjectId("4fc5f050fb359c680edf1a7f"), "name" : "E" }
{ "_id" : ObjectId("4fc5f053fb359c680edf1a80"), "name" : "F" }
> db.example.find({_id: ObjectId("4fc5f050fb359c680edf1a7f")})
{ "_id" : ObjectId("4fc5f050fb359c680edf1a7f"), "name" : "E" }
> db.example.find({_id: {$lte: ObjectId("4fc5f050fb359c680edf1a7f")}}).count()
5
>
This should also be fairly fast if the queried field is indexed. The example is in mongo shell, but count() should be available in all driver libs as well.
This method may be very slow, but it is straightforward. You can pass any query you like to find(); the loop walks over all matching documents and checks each one against a condition. Here I compare the _id field, but you can check any other field or combination of fields.
var docIndex = 0;
var position = -1;
db.url_list.find({},{"_id":1}).forEach(function(doc){
    docIndex++;
    // returning from the callback does not stop forEach, so record only the first match
    if(position === -1 && "5801ed58a8242ba30e8b46fa"==doc["_id"]){
        position = docIndex;
        print('document position is...' + position);
    }
});
There is no way for MongoDB to return this, because it does not keep documents in any particular order in the database, just as MySQL, for example, does not number its rows.
The ObjectID trick from jhonkola will only work if only one client creates new elements, as the ObjectIDs are generated on the client side, with the first part being a timestamp. There is no guaranteed order if different clients talk to the same server. Still, I would not rely on this.
I also don't quite understand what you are trying to do though, so perhaps mention that in your question? I can then update the answer.
Restructure your collection to include the position of each entry, i.e. {'id': 12340, 'name': 'G', 'position': 1}, then search the collection (myCollection) using the desired position as the query.
The queries I use that return the entire collection all use sort to get a reproducible order; find().sort().forEach() works with the script above to get the correct index.

Multiple update of embedded documents' properties

I have the following collection:
{
"Milestones" : [
{ "ActualDate" : null,
"Index": 0,
"Name" : "milestone1",
"TargetDate" : ISODate("2011-12-13T22:00:00Z"),
"_id" : ObjectId("4ee89ae7e60fc615c42e28d1")},
{ "ActualDate" : null,
"Index" : 0,
"Name" : "milestone2",
"TargetDate" : ISODate("2011-12-13T22:00:00Z"),
"_id" : ObjectId("4ee89ae7e60fc615c42e28d2") } ]
,
"Name" : "a", "_id" : ObjectId("4ee89ae7e60fc615c42e28ce")
}
I want to update specific embedded documents: those in the project with the specified _id, whose Milestones._id is in a given list and whose ActualDate is null.
In .NET my code looks like:
var query = Query.And(new[] { Query.EQ("_id", ObjectId.Parse(projectId)),
Query.In("Milestones._id", new BsonArray(values.Select(ObjectId.Parse))),
Query.EQ("Milestones.ActualDate", BsonNull.Value) });
var update = Update.Set("Milestones.$.ActualDate", DateTime.Now.Date);
Coll.Update(query, update, UpdateFlags.Multi, SafeMode.True);
Or in the mongo shell:
db.Projects.update({ "_id" : ObjectId("4ee89ae7e60fc615c42e28ce"), "Milestones._id" : { "$in" : [ObjectId("4ee89ae7e60fc615c42e28d1"), ObjectId("4ee89ae7e60fc615c42e28d2"), ObjectId("4ee8a648e60fc615c41d481e")] }, "Milestones.ActualDate" : null },{ "$set" : { "Milestones.$.ActualDate" : ISODate("2011-12-13T22:00:00Z") } }, false, true)
But only the first item is being updated.
This is not possible at the moment. The multi flag in update means that multiple root documents are updated. The positional operator can match only one nested array item. There is a feature request for this in the MongoDB JIRA; you can vote it up and wait.
For now, the only solutions are to load the document, update it as you wish and save it back, or to issue a separate atomic update for each nested array item id.
From documentation at mongodb.org:
Currently the $ operator only applies to the first matched item in the query
As answered by Andrew Orsich, this is not possible for the moment, at least not as you wish. But loading the document, modifying the array then saving it back will work. The risk is that some other process could modify the array in the meantime, so you would overwrite its changes. To avoid this, you can use optimistic locking, especially if the array is not modified every second.
1. load the document, including a new attribute: milestones_version
2. modify the array as needed
3. save back to mongodb, but now add a query constraint on the milestones_version, and increment it:
db.Projects.findAndModify({
query: {
_id: your_project_id,
milestones_version: expected_milestones_version
},
update: {
$set: {
Milestones: modified_milestones
},
$inc: {
milestones_version: 1
}
},
new: 1
})
If another process modified the milestones array (and hence the milestones_version) before we did, then this command will do nothing and simply return null. We just need to reload the document and try again. If the array is not modified every second, then this will be very rare and will not have any impact on performance.
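The reload-and-retry loop can be sketched without a database. Everything here is illustrative: makeStore is an in-memory stand-in for the collection, saveIfVersion imitates the findAndModify call with the milestones_version constraint (returning null on a version mismatch), and updateWithRetry and its maxAttempts limit are assumptions of this sketch rather than anything prescribed by MongoDB.

```javascript
// In-memory stand-in for a collection holding one project document.
function makeStore(doc) {
  var current = JSON.parse(JSON.stringify(doc));
  return {
    load: function () {
      return JSON.parse(JSON.stringify(current));
    },
    // Imitates findAndModify with a version constraint: applies the
    // change and bumps the version, or returns null if it is stale.
    saveIfVersion: function (expectedVersion, milestones) {
      if (current.milestones_version !== expectedVersion) {
        return null;
      }
      current.Milestones = JSON.parse(JSON.stringify(milestones));
      current.milestones_version = expectedVersion + 1;
      return JSON.parse(JSON.stringify(current));
    }
  };
}

// Load, modify, and save; on a version conflict, reload and try again.
function updateWithRetry(store, modify, maxAttempts) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var doc = store.load();
    var modified = modify(doc.Milestones);
    var saved = store.saveIfVersion(doc.milestones_version, modified);
    if (saved !== null) {
      return saved;
    }
  }
  throw new Error("gave up after " + maxAttempts + " conflicting writes");
}
```

Capping the number of attempts keeps a heavily contended document from causing an endless loop; the caller can decide what to do when the limit is reached.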
The main problem with this solution is that you have to edit every Project, one by one (no multi: true). You could still write a javascript function and have it run on the server though.
According to their JIRA page "This new feature is available starting with the MongoDB 3.5.12 development version, and included in the MongoDB 3.6 production version"
https://jira.mongodb.org/browse/SERVER-1243
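On MongoDB 3.6 or later, the update from the question can be expressed with the filtered positional operator $[<identifier>] and the arrayFilters option. The helper below only builds the three arguments; its name buildMilestoneUpdate and the identifier m are choices made for this sketch.

```javascript
// Build filter, update, and options for updating every matching
// element of the Milestones array in one project document.
function buildMilestoneUpdate(projectId, milestoneIds, actualDate) {
  return {
    filter: { _id: projectId },
    update: { $set: { "Milestones.$[m].ActualDate": actualDate } },
    options: {
      // Only array elements satisfying this filter are modified.
      arrayFilters: [
        { "m._id": { $in: milestoneIds }, "m.ActualDate": null }
      ]
    }
  };
}
```

In the shell, with ObjectId values for the ids, this could be applied as var spec = buildMilestoneUpdate(...); db.Projects.update(spec.filter, spec.update, spec.options). Unlike the plain $ operator, every array element that satisfies the filter is updated, not just the first match.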