pymongo update creates a new record without upserting - mongodb

I have an issue where I run an update on a document, but the update creates a new document even though I'm not upserting.
This is my testing code.
I do a find to see if the document exists by checking if "lastseen" doesn't exist:
result = DATA_Collection.find({"sessionID":"12345","lastseen":{"$exists":False}})
if result.count() == 1:
    DATA_Collection.update({"sessionID":"12345"},{"$set":{"lastseen":"2021-05-07"}})
When I do an aggregate check to find duplicates I get a few, one example below.
> db.DATA_Collection.find({ "sessionID" : "237a5fb8" })
{ "_id" : ObjectId("60bdf7b05c961b4d27d33bde"), "sessionID" : "237a5fb8", "firstseen" : ISODate("1970-01-19T20:51:09Z"), "lastseen" : ISODate("2021-06-07T12:34:20Z") }
{ "_id" : ObjectId("60bdf7fa7d35ea0f046a2514"), "sessionID" : "237a5fb8", "firstseen" : ISODate("1970-01-19T20:51:09Z") }
When I remove all the records in the collection and rerun the script, the same thing happens again.
Any advice will be much appreciated.

Firstly, your pymongo commands are deprecated (and removed in PyMongo 4): use update_one() or update_many() instead of update(), and count_documents() instead of count().
Secondly, double-check that you are referencing the same collection throughout, as you mention both DATA_Collection and VPN_DATA.
How are you defining a "duplicate"? Unless you create a unique index on the field(s), the records won't be duplicates as they have different _id fields.
You need something like:
record = db.VPN_DATA.find_one({'sessionID': '12345', 'lastseen': {'$exists': False}})
if record is not None:
    db.VPN_DATA.update_one({'_id': record.get('_id')}, {'$set': {'lastseen': '2021-05-07'}})
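As a variation on the snippet above, the check and the update can be folded into a single filtered update_one() call, so nothing can change between the find and the update. This is a minimal sketch: the helper name mark_last_seen is hypothetical, and `collection` is assumed to be a pymongo Collection.

```python
def mark_last_seen(collection, session_id, last_seen):
    """Set lastseen on one session document that does not have it yet.

    `collection` is assumed to be a pymongo Collection. update_one() only
    inserts when upsert=True is passed, so no new document is created here.
    """
    result = collection.update_one(
        {"sessionID": session_id, "lastseen": {"$exists": False}},
        {"$set": {"lastseen": last_seen}},
    )
    # matched_count is 0 when no document was waiting for a lastseen value.
    return result.matched_count == 1
```

If this returns False, either the session does not exist or it already has a lastseen value; in neither case is a document added.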

Related

Issue with cosmos DB collection order

I'm trying to order my collection using the following query:
db.getCollection('trip').find().sort({'itinerary.0.timestamp': 1})
The result is not being correctly sorted, however I exported the full collection to a local mongoDB database and the same query works like a charm. In order to perform that sort in cosmos DB I had to create the index 'itinerary.0.timestamp'.
data example:
{
"_id" : ObjectId("6087104ca68f171ce7715448"),
"tripId" : NumberLong(38533184),
"itinerary" : [
{
"transId" : NumberLong(39800097),
"timestamp" : NumberLong(1619372446291)
},
{
"transId" : NumberLong(39800576),
"timestamp" : NumberLong(1619372446321)
},
],
"results" : [],
"tripTimeSent" : ISODate("2021-04-29T14:44:53.253Z")
}
What am I missing?
Thanks!!
The solution was to create a new field, itiTimestamp, outside the array containing the value 'itinerary.0.timestamp'. Then just order by itiTimestamp
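A rough Python sketch of that workaround, run on plain dictionaries rather than against Cosmos DB: the helper name with_iti_timestamp is hypothetical, and the local sorted() call only stands in for sort({'itiTimestamp': 1}) on the server.

```python
def with_iti_timestamp(trip):
    """Return a shallow copy of `trip` with itiTimestamp taken from itinerary[0]."""
    itinerary = trip.get("itinerary") or []
    updated = dict(trip)
    # Trips with an empty itinerary get None so the field is always present.
    updated["itiTimestamp"] = itinerary[0]["timestamp"] if itinerary else None
    return updated


trips = [
    {"tripId": 2, "itinerary": [{"timestamp": 1619372446321}]},
    {"tripId": 1, "itinerary": [{"timestamp": 1619372446291},
                                {"timestamp": 1619372446400}]},
]
prepared = [with_iti_timestamp(t) for t in trips]
# Ordering by the top-level field no longer needs the array path.
ordered = sorted(prepared, key=lambda t: t["itiTimestamp"])
print([t["tripId"] for t in ordered])
```

The new field has to be kept in sync whenever the first itinerary entry changes.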
It's true that you need to create an index for the sort field. Here's the doc related:
To apply a sort to a query, you must create an index on the fields
used in the sort operation.
==========================================
I tested this on my side: after creating a wildcard index on itinerary, the sort query could be executed, but the results were still not sorted correctly. I also tried the approaches from this answer (new BasicDBObject("labels.0.value", 1)) and this one (db.testCollection.find().sort({"someArray.0": 1})); neither works for the date format the OP provided.
But when I added a property "score":[20,55,80] to each item in the collection, I found that sorting by score directly sorts by the first array item.
I assume that this feature isn't supported yet.

MongoDB get object id by finding on another column value

I am new to querying databases, especially MongoDB. If I run:
db.customers.find({"contact_name": "Anny Hatte"})
I get:
{
"_id" : ObjectId("55f7076079cebe83d0b3cffd"),
"company_name" : "Gap",
"contact_name" : "Anny Hatte",
"email" : "ahatte#gmail.com"
}
I wish to get the value of the "_id" attribute from this query result. How do I achieve that?
Similarly, if I have another collection, named items, with the following data:
{
"_id" : ObjectId("55f7076079cebe83d0b3d009"),
"_customer" : ObjectId("55f7076079cebe83d0b3cfda"),
"school" : "St. Patrick's"
}
Here, the "_customer" field is the "_id" of the customer collection (the previous collection). I wish to get the "_id", the "_customer" and the "school" field values for the record where "_customer" of items-collection equals "_id" of customers-collection.
How do I go about this?
I wish to get the value of the "_id" attribute from this query result.
How do I achieve that?
The find() method returns a cursor over the results, which you can iterate to retrieve the documents in the result set. You can do this using forEach().
var cursor = db.customers.find({"contact_name": "Anny Hatte"});
cursor.forEach(function(customer){
    // access all the attributes of the document here
    var id = customer._id;
})
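For comparison, a minimal Python sketch of the same lookup, assuming `customers` is a pymongo Collection; the helper name find_customer_id is hypothetical.

```python
def find_customer_id(customers, contact_name):
    """Return the _id of the first customer with this contact name, or None."""
    # The projection {"_id": 1} asks the server to return only the _id field.
    doc = customers.find_one({"contact_name": contact_name}, {"_id": 1})
    return doc["_id"] if doc is not None else None
```

find_one() returns None when nothing matches, so the caller should handle that case.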
You could make use of the aggregation pipeline's $lookup stage that has been introduced as part of 3.2, to look up and fetch the matching rows in some other related collection.
db.customers.aggregate([
    {$match:{"contact_name":"Anny Hatte"}},
    {$lookup:{
        "from":"items",
        "localField":"_id",
        "foreignField":"_customer",
        "as":"items"
    }}
])
In case you are using a previous version of mongodb where the stage is not supported, then, you would need to fire an extra query to lookup the items collection, for each customer.
db.customers.find(
    {"contact_name":"Anny Hatte"}).map(function(customer){
        customer["items"] = [];
        db.items.find({"_customer":customer._id}).forEach(function(item){
            customer.items.push(item);
        })
        return customer;
    })
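The same one-extra-query-per-customer join could look roughly like this in Python. It is only a sketch: `customers` and `items` are assumed to be pymongo Collection objects, and the function name customers_with_items is hypothetical.

```python
def customers_with_items(customers, items, contact_name):
    """Attach the matching items documents to each matching customer."""
    results = []
    for customer in customers.find({"contact_name": contact_name}):
        # One extra query per customer: items whose _customer is this _id.
        customer["items"] = list(items.find({"_customer": customer["_id"]}))
        results.append(customer)
    return results
```

Each returned customer carries its _id, and every entry under "items" carries its own _id, _customer and school fields.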

Getting error while Updating Collection attribute name in the MongoDB

So I have following structure of MongoDB collection
{ "_id" : ObjectId("516c48631f6c263a24fbbe7a"), "oldname" : 1, "name" : "somename" }
and I want to rename the oldname field to newname so it will look like,
{ "_id" : ObjectId("516c48631f6c263a24fbbe7a"), "newname" : 1, "name" : "somename" }
so I am writing this command,
db.element_type.update({}, {$rename: {'oldname': 'newname'}}, false, true);
But it is giving me this error
failing update: objects in a capped ns cannot grow
The problem, per the error message, is that you're trying to update a capped collection, presumably with a newname that is longer than the oldname.
You can read about capped collections in the docs. They're designed to maintain their order, which is why you're running into this.
If you must use a capped collection, perhaps you should remove and re-insert instead of updating.
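One way to read "re-insert" is to copy the documents, with the field renamed on the client, into another collection. The sketch below assumes exactly that: `source` and `target` are pymongo Collection objects, `target` is a hypothetical destination collection, and both helper names are made up for illustration.

```python
def renamed(doc, old, new):
    """Return a copy of `doc` with key `old` renamed to `new`, if present."""
    copy = dict(doc)
    if old in copy:
        copy[new] = copy.pop(old)
    return copy


def copy_with_rename(source, target, old, new):
    """Read every document from `source` and insert renamed copies into `target`."""
    docs = [renamed(d, old, new) for d in source.find()]
    if docs:  # insert_many() rejects an empty list
        target.insert_many(docs)
    return len(docs)
```

For large collections the documents would need to be copied in batches rather than loaded into one list.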

Get position of selected document in collection [mongoDB]

How to get position (index) of selected document in mongo collection?
E.g.
this document: db.myCollection.find({"id":12345})
has index 3 in myCollection
myCollection:
id: 12340, name: 'G'
id: 12343, name: 'V'
id: 12345, name: 'A'
id: 12348, name: 'N'
If your requirement is to find the position of the document irrespective of any order, that is not possible, as MongoDB does not store documents in a specific order.
However, if you want to know the index based on some field, say _id, you can use this method.
If you strictly follow auto-increments in your _id field, you can count all the documents whose value is less than that _id, say n; then n + 1 is the index of the document based on _id.
n = db.myCollection.find({"id": { "$lt" : 12345}}).count() ;
This would also be valid if documents are deleted from the collection.
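With pymongo, the same count-based idea might look like the sketch below, assuming `collection` is a pymongo Collection and the documents carry the ordered `id` field from the question; the helper name position_by_id is hypothetical.

```python
def position_by_id(collection, target_id):
    """Return the 1-based position of the document with this id."""
    # n documents have a smaller id, so the target sits at position n + 1.
    n = collection.count_documents({"id": {"$lt": target_id}})
    return n + 1
```

For the sample data in the question, two documents have an id below 12345, which gives position 3.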
As far as I know, there is no single command to do this, and it is impossible in the general case (see Derick's answer). However, using count() on a query over an ordered id field seems to work. Warning: this assumes that there is a reliably ordered field, which is difficult to achieve with concurrent writers. This example uses _id, which will only work in the single-writer case:
MongoDB shell version: 2.0.1
connecting to: test
> use so_test
switched to db so_test
> db.example.insert({name: 'A'})
> db.example.insert({name: 'B'})
> db.example.insert({name: 'C'})
> db.example.insert({name: 'D'})
> db.example.insert({name: 'E'})
> db.example.insert({name: 'F'})
> db.example.find()
{ "_id" : ObjectId("4fc5f040fb359c680edf1a7b"), "name" : "A" }
{ "_id" : ObjectId("4fc5f046fb359c680edf1a7c"), "name" : "B" }
{ "_id" : ObjectId("4fc5f04afb359c680edf1a7d"), "name" : "C" }
{ "_id" : ObjectId("4fc5f04dfb359c680edf1a7e"), "name" : "D" }
{ "_id" : ObjectId("4fc5f050fb359c680edf1a7f"), "name" : "E" }
{ "_id" : ObjectId("4fc5f053fb359c680edf1a80"), "name" : "F" }
> db.example.find({_id: ObjectId("4fc5f050fb359c680edf1a7f")})
{ "_id" : ObjectId("4fc5f050fb359c680edf1a7f"), "name" : "E" }
> db.example.find({_id: {$lte: ObjectId("4fc5f050fb359c680edf1a7f")}}).count()
5
>
This should also be fairly fast if the queried field is indexed. The example is in mongo shell, but count() should be available in all driver libs as well.
This method is straightforward, though it might be very slow. You can pass any query as usual; I simply loop over all the documents and check each one against a condition to find the matching record. Here I check the _id field, but you can use any other field or combination of fields.
var docIndex = 0;
db.url_list.find({},{"_id":1}).forEach(function(doc){
    docIndex++;
    if("5801ed58a8242ba30e8b46fa"==doc["_id"]){
        print('document position is...' + docIndex);
        return false;
    }
});
There is no way for MongoDB to return this, as it does not keep documents in order in the database, just as MySQL, for example, doesn't number rows.
The ObjectID trick from jhonkola will only work if only one client creates new elements, as the ObjectIDs are generated on the client side, with the first part being a timestamp. There is no guaranteed order if different clients talk to the same server. Still, I would not rely on this.
I also don't quite understand what you are trying to do though, so perhaps mention that in your question? I can then update the answer.
Restructure your collection to include the position of each entry, i.e. {'id': 12340, 'name': 'G', 'position': 1}, then search the collection (myCollection) using the desired position as the query.
The queries I use that return the entire collection all use sort to get a reproducible order; find().sort().forEach() works with the script above to get the correct index.
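The sort-then-iterate idea can be sketched in plain Python on the sample data from the question. The list below stands in for the documents a sorted query would return, and the helper name position_of is hypothetical.

```python
def position_of(docs, target_id, key="id"):
    """Return the 1-based position of target_id once docs are sorted by key."""
    ordered = sorted(docs, key=lambda d: d[key])
    for position, doc in enumerate(ordered, start=1):
        if doc[key] == target_id:
            return position
    return None  # no document has this id


my_collection = [
    {"id": 12348, "name": "N"},
    {"id": 12340, "name": "G"},
    {"id": 12345, "name": "A"},
    {"id": 12343, "name": "V"},
]
print(position_of(my_collection, 12345))  # prints 3
```

The position is only meaningful relative to the chosen sort key; a different key gives a different answer.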

Multiple update of embedded documents' properties

I have the following collection:
{
"Milestones" : [
{ "ActualDate" : null,
"Index": 0,
"Name" : "milestone1",
"TargetDate" : ISODate("2011-12-13T22:00:00Z"),
"_id" : ObjectId("4ee89ae7e60fc615c42e28d1")},
{ "ActualDate" : null,
"Index" : 0,
"Name" : "milestone2",
"TargetDate" : ISODate("2011-12-13T22:00:00Z"),
"_id" : ObjectId("4ee89ae7e60fc615c42e28d2") } ]
,
"Name" : "a", "_id" : ObjectId("4ee89ae7e60fc615c42e28ce")
}
I want to update specific documents: those that have the specified _id, a Milestones._id in a given list, and a null ActualDate.
In .NET my code looks like:
var query = Query.And(new[] { Query.EQ("_id", ObjectId.Parse(projectId)),
Query.In("Milestones._id", new BsonArray(values.Select(ObjectId.Parse))),
Query.EQ("Milestones.ActualDate", BsonNull.Value) });
var update = Update.Set("Milestones.$.ActualDate", DateTime.Now.Date);
Coll.Update(query, update, UpdateFlags.Multi, SafeMode.True);
Or in native code:
db.Projects.update({ "_id" : ObjectId("4ee89ae7e60fc615c42e28ce"), "Milestones._id" : { "$in" : [ObjectId("4ee89ae7e60fc615c42e28d1"), ObjectId("4ee89ae7e60fc615c42e28d2"), ObjectId("4ee8a648e60fc615c41d481e")] }, "Milestones.ActualDate" : null },{ "$set" : { "Milestones.$.ActualDate" : ISODate("2011-12-13T22:00:00Z") } }, false, true)
But only the first item is being updated.
This is not possible at the moment. The multi flag in update means updating multiple root documents. The positional operator can match only one nested array item. There is a request for this feature in the MongoDB JIRA; you can vote it up and wait.
For now, the only solutions are to load the document, update it as you wish, and save it back, or to issue a separate atomic update for each nested array item id.
From documentation at mongodb.org:
Currently the $ operator only applies to the first matched item in the
query
As answered by Andrew Orsich, this is not possible for the moment, at least not as you wish. But loading the document, modifying the array then saving it back will work. The risk is that some other process could modify the array in the meantime, so you would overwrite its changes. To avoid this, you can use optimistic locking, especially if the array is not modified every second.
1. load the document, including a new attribute: milestones_version
2. modify the array as needed
3. save back to mongodb, but now add a query constraint on the milestones_version, and increment it:
db.Projects.findAndModify({
    query: {
        _id: your_project_id,
        milestones_version: expected_milestones_version
    },
    update: {
        $set: {
            Milestones: modified_milestones
        },
        $inc: {
            milestones_version: 1
        }
    },
    new: 1
})
If another process modified the milestones array (and hence the milestones_version) before we did, then this command will do nothing and simply return null. We just need to reload the document and try again. If the array is not modified every second, then this will be very rare and will not have any impact on performance.
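The reload-and-retry loop described above might be sketched in Python as follows. The assumptions are that `projects` is a pymongo Collection, that every project document already carries a milestones_version field, and that the function name and the max_attempts limit are hypothetical choices for illustration.

```python
def set_actual_dates(projects, project_id, milestone_ids, actual_date, max_attempts=5):
    """Set ActualDate on the selected milestones, retrying on version conflicts."""
    wanted = set(milestone_ids)
    for _ in range(max_attempts):
        project = projects.find_one({"_id": project_id})
        if project is None:
            return None  # no such project
        version = project["milestones_version"]
        milestones = project["Milestones"]
        for milestone in milestones:
            if milestone["_id"] in wanted and milestone.get("ActualDate") is None:
                milestone["ActualDate"] = actual_date
        # The write only matches if nobody bumped the version in the meantime.
        updated = projects.find_one_and_update(
            {"_id": project_id, "milestones_version": version},
            {"$set": {"Milestones": milestones}, "$inc": {"milestones_version": 1}},
        )
        if updated is not None:
            return milestones
        # Version conflict: loop to reload the document and try again.
    raise RuntimeError("too many concurrent modifications")
```

Bounding the number of attempts keeps a heavily contended document from looping forever.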
The main problem with this solution is that you have to edit every Project, one by one (no multi: true). You could still write a javascript function and have it run on the server though.
According to their JIRA page "This new feature is available starting with the MongoDB 3.5.12 development version, and included in the MongoDB 3.6 production version"
https://jira.mongodb.org/browse/SERVER-1243
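Assuming the feature referred to is the filtered positional operator $[<identifier>] with arrayFilters, the original update could be sketched with pymongo as below. `projects` is assumed to be a pymongo Collection on a 3.6+ server, and the function name and the identifier `m` are arbitrary choices for this sketch.

```python
def set_actual_date(projects, project_id, milestone_ids, actual_date):
    """Set ActualDate on every listed milestone that has no ActualDate yet."""
    return projects.update_one(
        {"_id": project_id},
        # $[m] targets every array element that satisfies the filter for m.
        {"$set": {"Milestones.$[m].ActualDate": actual_date}},
        array_filters=[
            {"m._id": {"$in": list(milestone_ids)}, "m.ActualDate": None}
        ],
    )
```

This updates all matching milestones of the project in one call, without loading the document first.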