Getting an error while updating a collection attribute name in MongoDB

So I have the following structure in a MongoDB collection:
{ "_id" : ObjectId("516c48631f6c263a24fbbe7a"), "oldname" : 1, "name" : "somename" }
and I want to rename oldname to newname so it will look like this:
{ "_id" : ObjectId("516c48631f6c263a24fbbe7a"), "newname" : 1, "name" : "somename" }
so I am running this command:
db.element_type.update({}, {$rename: {'oldname': 'newname'}}, false, true);
But it is giving me this error
failing update: objects in a capped ns cannot grow

The problem, per the error message, is that you're trying to update a capped collection, presumably with a newname that is longer than oldname, so the documents would have to grow.
You can read about capped collections in the docs. They're designed to maintain their order, which is why you're running into this.
If you must use a capped collection, perhaps you should remove and re-insert instead of updating.
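If rebuilding the collection is an option, one workaround is to copy the documents into a new, uncapped collection with the field renamed, then swap the collections. A minimal mongo shell sketch (element_type_new is an illustrative name):

db.element_type.find().forEach(function (doc) {
    doc.newname = doc.oldname;   // copy the value under the new field name
    delete doc.oldname;          // drop the old field
    db.element_type_new.insertOne(doc);
});
// then, if appropriate:
// db.element_type.drop();
// db.element_type_new.renameCollection("element_type");

Note that the new collection created this way is a plain (uncapped) collection; create it first with db.createCollection("element_type_new", { capped: true, size: ... }) if you still need the capped behavior.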

pymongo update creates a new record without upserting

I have an issue where I do an update on a document; however, the update creates a new document even though I'm not upserting in my update.
This is my testing code.
I do a find to check whether the document exists and "lastseen" is not yet set:
result = DATA_Collection.find({"sessionID": "12345", "lastseen": {"$exists": False}})
if result.count() == 1:
    DATA_Collection.update({"sessionID": "12345"}, {"$set": {"lastseen": "2021-05-07"}})
When I do an aggregation check for duplicates I find a few; one example is below.
> db.DATA_Collection.find({ "sessionID" : "237a5fb8" })
{ "_id" : ObjectId("60bdf7b05c961b4d27d33bde"), "sessionID" : "237a5fb8", "firstseen" : ISODate("1970-01-19T20:51:09Z"), "lastseen" : ISODate("2021-06-07T12:34:20Z") }
{ "_id" : ObjectId("60bdf7fa7d35ea0f046a2514"), "sessionID" : "237a5fb8", "firstseen" : ISODate("1970-01-19T20:51:09Z") }
I remove all the records in the collection and rerun the script, but the same happens again.
Any advice will be much appreciated.
Firstly, your pymongo commands are deprecated: use update_one() or update_many() instead of update(), and count_documents() instead of count().
Secondly, double-check you are referencing the same collection throughout, as you mention both DATA_Collection and VPN_DATA.
How are you defining a "duplicate"? Unless you create a unique index on the field(s), the records won't be duplicates as they have different _id fields.
You need something like:
record = db.VPN_DATA.find_one({'sessionID': '12345', 'lastseen': {'$exists': False}})
if record is not None:
    db.VPN_DATA.update_one({'_id': record.get('_id')}, {'$set': {'lastseen': '2021-05-07'}})
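If the requirement is that a sessionID can never appear twice, a unique index enforces that at the database level, and a second insert fails instead of creating a duplicate. A minimal sketch from the mongo shell (collection and field names taken from the snippet above):

db.VPN_DATA.createIndex({ sessionID: 1 }, { unique: true })

With the index in place, the application can catch the duplicate-key error on insert rather than checking with a find first.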

Why is MongoDB not indexing my collection?

I have created a collection, added just a name field, and applied the following index.
db.names.createIndex({"name":1})
Even after applying the index I see the following result:
db.names.find()
{ "_id" : ObjectId("57d14139eceab001a19f7e82"), "name" : "kkkk" } {
"_id" : ObjectId("57d1413feceab001a19f7e83"), "name" : "aaaa" } {
"_id" : ObjectId("57d14144eceab001a19f7e84"), "name" : "zzzz" } {
"_id" : ObjectId("57d14148eceab001a19f7e85"), "name" : "dddd" } {
"_id" : ObjectId("57d1414ceceab001a19f7e86"), "name" : "rrrrr" }
What am I missing here?
Khans...
The way you built your index is correct; however, building an ascending index on name won't return the results in ascending order.
If you need results ordered by name, you have to use:
db.names.find().sort({ name: 1 })
What happens when you build an index is that, when you search for data, the mongod process performs the search behind the scenes in an ordered fashion for faster results.
Please note: if you just want to see output in sorted order, you don't even need an index.
You won't be able to tell whether an index has been successfully created (unless there is a considerable speed improvement) by running a find() command.
Instead, use db.names.getIndexes() to see if the index has been created (if you're building the index in the background, it may take some time to appear in the index list).
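You can also ask the query planner whether a given query actually uses the index. A minimal mongo shell sketch (the value "dddd" is illustrative):

db.names.find({ name: "dddd" }).explain("executionStats")

If the index exists and is applicable, the winning plan shows an IXSCAN stage on { name: 1 } instead of a COLLSCAN over the whole collection.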

What's the difference between replaceOne() and updateOne() in MongoDB?

MongoDB bulk operations have two options:
Bulk.find.updateOne()
Adds a single document update operation to a bulk operations list. The operation can either replace an existing document or update specific fields in an existing document.
Bulk.find.replaceOne()
Adds a single document replacement operation to a bulk operations list. Use the Bulk.find() method to specify the condition that determines which document to replace. The Bulk.find.replaceOne() method limits the replacement to a single document.
According to the documentation, both of these methods can replace a matching document. Do I understand correctly that updateOne() is the more general-purpose method, which can either replace the document exactly like replaceOne() does, or just update specific fields?
With replaceOne() you can only replace the entire document, while updateOne() allows for updating fields.
Since replaceOne() replaces the entire document, fields in the old document that are not contained in the new one will be lost. With updateOne() new fields can be added without losing the fields in the old document.
For example if you have the following document:
{
    "_id" : ObjectId("0123456789abcdef01234567"),
    "my_test_key3" : 3333
}
Using:
replaceOne({"_id" : ObjectId("0123456789abcdef01234567")}, { "my_test_key4" : 4})
results in:
{
    "_id" : ObjectId("0123456789abcdef01234567"),
    "my_test_key4" : 4.0
}
Using:
updateOne({"_id" : ObjectId("0123456789abcdef01234567")}, {$set: { "my_test_key4" : 4}})
results in:
{
    "_id" : ObjectId("0123456789abcdef01234567"),
    "my_test_key3" : 3333.0,
    "my_test_key4" : 4.0
}
Note that with updateOne() you can use the update operators on documents.
replaceOne() replaces the entire document, while updateOne() allows for updating or adding fields. With updateOne() you also have access to the update operators, which can reliably perform partial updates on documents. For example, two clients can "simultaneously" increment a value on the same field in the same document and both increments will be captured, while with a replace one client may overwrite the other, potentially losing one of the increments.
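A minimal mongo shell sketch of that point, assuming an illustrative counters collection. Each $inc is applied atomically on the server, so concurrent increments are both captured:

db.counters.updateOne({ _id: "pageviews" }, { $inc: { n: 1 } })  // client A
db.counters.updateOne({ _id: "pageviews" }, { $inc: { n: 1 } })  // client B

A read-modify-replace cycle, by contrast, can lose a write: if both clients findOne() the same document and each calls replaceOne() with n incremented locally, the second replace silently overwrites the first.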
db.collection.replaceOne() and db.collection.updateOne() both modify a single matching document. A further practical difference is in what travels over the wire: replaceOne() has to send the entire new document to the server, whereas updateOne() sends only the update operators for the fields being changed, not the whole document.

Event streaming via MongoDB: get last inserted events

I am consuming data from an existing database that stores system events. My service should check this database on a timer, see whether new events have been created, and then fetch and handle them. Something like a simple queue implementation.
The question is: how can I get the new docs each time I check the database? I can't use event timestamps, because events come into the database from different sources and there is no ordering among them. So I need to rely on insertion order only.
There are a couple of options.
The first, and easiest if it matches your use case, is to use a capped collection. A capped collection is a collection with a pre-defined size that acts as a sort of ring buffer: once the collection is full, it starts overwriting the oldest documents. To iterate over the collection you simply create a "tailable" cursor (sketched below). You will need some way of identifying the last document processed; even a simple "done" flag in the document could work, but it would have to exist when the document is inserted. If you truly can't modify the documents in any way, then you could save off the last processed document somewhere and use a coarse timestamp to approximate the start position, looking for that last document before processing newer ones.
The only real issue with this solution is that you will be limited in the number of documents the collection can hold, and it won't grow over time. There are also limits on the write operations you can perform on the documents (they can't grow), but it does not sound like you are modifying them.
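A minimal sketch of the tailable cursor in the legacy mongo shell, assuming an illustrative capped collection named events:

var cursor = db.events.find()
                      .addOption(DBQuery.Option.tailable)
                      .addOption(DBQuery.Option.awaitData);
while (cursor.hasNext()) {
    var doc = cursor.next();   // awaitData makes the server wait briefly for new documents
    printjson(doc);            // process the event and record its _id as the last-processed marker
}

The cursor stays open at the end of the capped collection and returns new documents as they are inserted, which is what makes the collection behave like a simple queue.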
The second option, which is more complex, is to use the oplog. For a standalone configuration you will still need to pass the --replSet option so that the oplog is created and used; you just never initiate the replica set. In a sharded configuration you will need to track each replica set separately. The oplog contains a document for each insert, update, and delete done to all collections/documents on the server. Each entry contains a timestamp, an operation, and an id (at a minimum). Here are examples of each.
Insert
{ "ts" : { "t" : 1362958492000, "i" : 1 },
"h" : NumberLong("5915409566571821368"), "v" : 2,
"op" : "i",
"ns" : "test.test",
"o" : { "_id" : "513d189c8544eb2b5e000001" } }
Delete
{ ... "op" : "d", ..., "b" : true,
"o" : { "_id" : "513d189c8544eb2b5e000001" } }
Update
{ ... "op" : "u", ...,
"o2" : { "_id" : "513d189c8544eb2b5e000001" },
"o" : { "$set" : { "i" : 1 } } }
The timestamps are generated on the server and are guaranteed to be monotonically increasing, which allows you to quickly find the documents of interest.
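A minimal mongo shell sketch of resuming from the oplog, assuming a replica set member; lastTs stands for the Timestamp saved from the previous run (its value here is illustrative, and the ns is taken from the entries above):

var oplog = db.getSiblingDB("local").oplog.rs;
var lastTs = Timestamp(1362958492, 1);   // last processed optime, saved from the previous run
oplog.find({ ts: { $gt: lastTs }, ns: "test.test" }).forEach(function (entry) {
    printjson(entry);   // handle the insert/update/delete entry
});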
This option is the most robust but requires some work on your part.
I wrote some demo code to create a "watcher" on a collection that is almost what you want. You can find that code on GitHub. Specifically look at the code in the com.allanbank.mongodb.demo.coordination package.
HTH, Rob
You can actually use timestamps if your _id is of type ObjectId:
prefix = Math.floor((new Date( 2013 , 03 , 11 )).getTime()/1000).toString(16)
db.foo.find( { _id : { $gt : new ObjectId( prefix + "0000000000000000" ) } } )
This way it doesn't matter where the event came from or when it happened; all that matters is when the document's insertion was recorded (i.e., whether its _id timestamp is later than the previous timer run).
Of course, MongoDB is schema-less, so you could instead set a field such as isNew to true on insert, and flip it to false as your query/cursor processes each document.
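A minimal mongo shell sketch of that flag-based variant (events and isNew are illustrative names):

db.events.find({ isNew: true }).forEach(function (doc) {
    // ... handle the event ...
    db.events.updateOne({ _id: doc._id }, { $set: { isNew: false } });  // mark as processed
});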

Multiple update of embedded documents' properties

I have the following collection:
{
    "Milestones" : [
        { "ActualDate" : null,
          "Index" : 0,
          "Name" : "milestone1",
          "TargetDate" : ISODate("2011-12-13T22:00:00Z"),
          "_id" : ObjectId("4ee89ae7e60fc615c42e28d1") },
        { "ActualDate" : null,
          "Index" : 0,
          "Name" : "milestone2",
          "TargetDate" : ISODate("2011-12-13T22:00:00Z"),
          "_id" : ObjectId("4ee89ae7e60fc615c42e28d2") }
    ],
    "Name" : "a",
    "_id" : ObjectId("4ee89ae7e60fc615c42e28ce")
}
I want to update particular embedded documents: those in a document with a specified _id, whose Milestones._id is in a given list, and whose ActualDate is null.
In .NET my code looks like:
var query = Query.And(new[] {
    Query.EQ("_id", ObjectId.Parse(projectId)),
    Query.In("Milestones._id", new BsonArray(values.Select(ObjectId.Parse))),
    Query.EQ("Milestones.ActualDate", BsonNull.Value)
});
var update = Update.Set("Milestones.$.ActualDate", DateTime.Now.Date);
Coll.Update(query, update, UpdateFlags.Multi, SafeMode.True);
Or in native code:
db.Projects.update({ "_id" : ObjectId("4ee89ae7e60fc615c42e28ce"), "Milestones._id" : { "$in" : [ObjectId("4ee89ae7e60fc615c42e28d1"), ObjectId("4ee89ae7e60fc615c42e28d2"), ObjectId("4ee8a648e60fc615c41d481e")] }, "Milestones.ActualDate" : null },{ "$set" : { "Milestones.$.ActualDate" : ISODate("2011-12-13T22:00:00Z") } }, false, true)
But only the first item is being updated.
This is not possible at the moment: the multi flag in update means updating multiple root documents, and the positional operator can match only one nested array item. There is a feature request for this in the MongoDB JIRA (see the link at the end of this thread); you can vote it up and wait.
For now the only solutions are to load the document, update it as you wish, and save it back, or to run a separate atomic update for each nested array id (see the sketch below).
From the documentation at mongodb.org:
Currently the $ operator only applies to the first matched item in the query.
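A minimal mongo shell sketch of the second workaround, one atomic update per nested id (the _id values are taken from the question):

[ ObjectId("4ee89ae7e60fc615c42e28d1"),
  ObjectId("4ee89ae7e60fc615c42e28d2") ].forEach(function (milestoneId) {
    db.Projects.update(
        { "_id" : ObjectId("4ee89ae7e60fc615c42e28ce"),
          "Milestones._id" : milestoneId,
          "Milestones.ActualDate" : null },
        { "$set" : { "Milestones.$.ActualDate" : new Date() } }   // one matched element per update
    );
});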
As answered by Andrew Orsich, this is not possible for the moment, at least not as you wish. But loading the document, modifying the array then saving it back will work. The risk is that some other process could modify the array in the meantime, so you would overwrite its changes. To avoid this, you can use optimistic locking, especially if the array is not modified every second.
load the document, including a new attribute: milestones_version
modify the array as needed
save back to mongodb, but now add a query constraint on the milestones_version, and increment it:
db.Projects.findAndModify({
    query: {
        _id: your_project_id,
        milestones_version: expected_milestones_version
    },
    update: {
        $set: { Milestones: modified_milestones },
        $inc: { milestones_version: 1 }
    },
    new: 1
})
If another process modified the milestones array (and hence the milestones_version) before we did, then this command will do nothing and simply return null. We just need to reload the document and try again. If the array is not modified every second, then this will be very rare and will not have any impact on performance.
The main problem with this solution is that you have to edit every Project, one by one (no multi: true). You could still write a javascript function and have it run on the server though.
According to the JIRA ticket: "This new feature is available starting with the MongoDB 3.5.12 development version, and included in the MongoDB 3.6 production version"
https://jira.mongodb.org/browse/SERVER-1243
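With that feature, the filtered positional operator $[<identifier>] together with arrayFilters can update every matching array element in one command. A minimal mongo shell sketch on MongoDB 3.6+, reusing the ids from the question:

db.Projects.updateMany(
    { "_id" : ObjectId("4ee89ae7e60fc615c42e28ce") },
    { "$set" : { "Milestones.$[m].ActualDate" : new Date() } },
    { arrayFilters: [
        { "m._id": { $in: [ ObjectId("4ee89ae7e60fc615c42e28d1"),
                            ObjectId("4ee89ae7e60fc615c42e28d2") ] },
          "m.ActualDate": null }   // only milestones not yet dated
    ] }
);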