MongoDB: Can't insert the same document twice

In my pymongo code, inserting the same doc twice raises an error:
document = {"auteur" : "romain",
"text" : "premier post",
"tag" : "test2",
"date" : datetime.datetime.utcnow()}
collection.insert_one(document)
collection.insert_one(document)
raises:
DuplicateKeyError: E11000 duplicate key error collection: test.myCollection index: _id_ dup key: { : ObjectId('5aa282eff1dba231beada9e3') }
Inserting two documents with different content works fine.
According to https://docs.mongodb.com/manual/reference/method/db.collection.createIndex/#options it seems I should do something about the index options:
unique boolean
Optional. Creates a unique index so that the collection will not accept insertion or update of documents where the index key value matches an existing value in the index.
Specify true to create a unique index. The default value is false.
The option is unavailable for hashed indexes.

Adding to Peba's answer, you can use the .copy() method of python dictionary to avoid the mutation of the document itself.
document = {"auteur" : "romain",
"text" : "premier post",
"tag" : "test2",
"date" : datetime.datetime.utcnow()}
collection.insert_one(document.copy())
collection.insert_one(document.copy())
This way, each insert_one call gets a shallow copy of the document, which also keeps your code more Pythonic.
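One caveat: .copy() is shallow, so nested mutable values (lists, embedded documents) are still shared between the copies. When documents nest, copy.deepcopy avoids that; a quick illustration (the tags list here is made up for the example):

```python
import copy

document = {"auteur": "romain", "tags": ["test2"]}

shallow = document.copy()
deep = copy.deepcopy(document)

# The shallow copy shares the nested list with the original document...
shallow["tags"].append("shared")
# ...so document["tags"] now contains "shared" as well,
# while the deep copy still holds only ["test2"].
```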

Inserting a document implicitly generates an _id, and PyMongo adds it to the dictionary you passed in.
So after the first insert the document has mutated to
document = {"_id" : ObjectId('random_id_here'),
"auteur" : "romain",
"text" : "premier post",
"tag" : "test2",
"date" : datetime.datetime.utcnow()}
Trying to insert said document again will result in an error due to the duplicated _id.
You can create a new document with the same values and insert it.
document = {"auteur" : "romain",
"text" : "premier post",
"tag" : "test2",
"date" : datetime.datetime.utcnow()}
collection.insert_one(document)
document = {"auteur" : "romain",
"text" : "premier post",
"tag" : "test2",
"date" : datetime.datetime.utcnow()}
collection.insert_one(document)
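Alternatively, if you'd rather reuse the same dictionary, you can drop the driver-generated _id before the second insert so the server assigns a fresh one. A sketch (strip_id is a hypothetical helper, not part of PyMongo):

```python
def strip_id(document):
    """Return a shallow copy of the document without any _id,
    so the server generates a fresh one on insert."""
    doc = dict(document)
    doc.pop("_id", None)
    return doc

# Usage against a pymongo collection:
# collection.insert_one(document)            # adds _id to the dict
# collection.insert_one(strip_id(document))  # inserts a fresh copy
```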

Related

MongoDb _id field rename

I have a collection named 'summary' in Azure Cosmos DB, and the _id field of my collection is 'orderId'. I have millions of records in this collection. I now want to rename the _id field 'orderId' to 'purchaseOrderId' (per the business domain design). The collection has an '_id.orderId' index. One straightforward approach is to drop the collection and reload it with the new id field name, but that is costly and time-consuming since it requires reloading millions of records. So is there any way to achieve this by updating the _id field name (by retrieving each existing record and performing a rename update) with Spring MongoTemplate or MongoDB driver 3.11.1?
old id field name : 'orderId',
recommended id name : 'purchaseOrderId',
Existing index : '_id.orderId',
Mongo db version: 3.6
Mongo document structure
{
"_id" : {
"orderId" : 10164
},
"countryCode" : null,
"sequenceNumber" : "5693",
"deptNumber" : "92",
"type" : "20",
"addrNumber" : 12,
"venNumber" : 0,
"shipPtDescr" : " ",
"whsNumber" : "6001",
"purchId" : 1006,
"statCode" : "C",
"groceryId" : "N",
"openToBuyMonth" : 12,
"updateSource" : "MF",
"authorizedDate" : null,
"deposit" : null,
"cost" : null,
"boardCode" : null,
"authorizedBy" : null,
...
..
...
}
Unfortunately, _id is an immutable field in MongoDB: it does not allow changing the _id of a document after you have inserted it.
Here is the behavior of the _id field, as described in the documentation:
_id Field Once set, you cannot update the value of the _id field nor can you replace an existing document with a replacement document that has a different _id field value.
As a work-around, let MongoDB create its own _id field while you add another field (say, custom_id, outside the _id field) that your application will reference and use.
{
"_id" : Object("xxxxxxxxxxxxxxxx"),
"custom_id" : {
"orderId" : 10164
},
"countryCode" : null,
"sequenceNumber" : "5693",
...
...
}
Then you can rename the field with:
db.collection.updateMany( {}, { $rename: { "custom_id.orderId": "custom_id.purchaseOrderId" } } )
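A migration along those lines can be sketched in Python. The pure transformation is shown as a function (to_custom_id is a hypothetical helper); the reinsert loop and collection names are illustrative assumptions:

```python
def to_custom_id(doc):
    """Move the compound _id into a custom_id field and drop _id,
    so the server generates a fresh ObjectId when the document is
    reinserted."""
    out = dict(doc)
    out["custom_id"] = out.pop("_id")
    return out

# Migration loop (sketch): copy every document into a new collection,
# then apply the $rename shown above to the new collection.
# for doc in db.summary.find():
#     db.summary_v2.insert_one(to_custom_id(doc))
```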

Adding to a double-nested array in MongoDB

I have a double-nested array in my MongoDB schema and I'm trying to add an entirely new element to a second-level nested array using $push. I'm getting the error: cannot use the part (...) to traverse the element.
Documents have the following structure:
{
"_id" : ObjectId("5d8e37eb46c064790a28a467"),
"org-name" : "Manchester University NHS Foundation Trust",
"domain" : "mft.nhs.uk",
"subdomains" : [ {
"name" : "careers.mft.nhs.uk",
"firstSeen" : "2017-10-06 11:32:00",
"history" : [
{
"a_rr" : "80.244.185.184",
"timestamp" : ISODate("2019-09-27T17:24:57.148Z"),
"asn" : 61323,
"asn_org" : "Ukfast.net Limited",
"city" : null,
"country" : "United Kingdom",
"shodan" : {
"ports" : [
{
"port" : 443,
"versions" : [
"TLSv1",
"-SSLv2",
"-SSLv3",
"TLSv1.1",
"TLSv1.2",
"-TLSv1.3"
],
"cpe" : "cpe:/a:apache:http_server:2.4.18",
"product" : "Apache httpd"
}
],
"timestamp" : ISODate("2019-09-27T17:24:58.538Z")
}
}
]
}
]
}
What I'm attempting to do is refresh the details held in the history array and add another entire entry representing the most recently collected data for the subdomain name.
The net result is that I will have multiple entries in the history array, each one timestamped with the date the data was refreshed. That way I have a historical record of changes to any of the data held.
I've read that I can't use $push on a double-nested array, but the other advice about using arrayFilters all appears to relate to updating an existing entry in an array rather than simply appending an entirely new document - unless I'm missing something!
I'm using PyMongo and would simply like to build a new dictionary containing all of the data elements and append it to the history.
Thanks!
Straightforward in pymongo:
record = db.mycollection.find_one()
record['subdomains'][0]['history'].append({'another': 'record'})
db.mycollection.replace_one({'_id': record['_id']}, record)
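For what it's worth, on MongoDB 3.6+ you can append to the nested array server-side with $push plus an array filter, which avoids the non-atomic read-modify-write above. A sketch (history_push is a hypothetical helper; the query filter is an assumption):

```python
def history_push(subdomain_name, entry):
    """Build the update document and array filter that append `entry`
    to the history array of the subdomain element whose name matches
    `subdomain_name` (requires MongoDB 3.6+)."""
    update = {"$push": {"subdomains.$[s].history": entry}}
    array_filters = [{"s.name": subdomain_name}]
    return update, array_filters

# Usage with pymongo:
# update, filters = history_push("careers.mft.nhs.uk", new_entry)
# db.mycollection.update_one({"domain": "mft.nhs.uk"}, update,
#                            array_filters=filters)
```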

MongoDB aggregation and paging

I have documents with my internal id field inside each document, plus the date when the document was added. There can be several documents with the same id (different versions of the same document), but their dates will always differ. In a query I want to bring back only the one version of each document (same id field) that was relevant at a specified date, and I want to display the results with paging (50 rows per page). So, is there any way to do this in MongoDB (query documents by some field, group them by the id field, sort by the date field and take only the first, all with paging)?
Please see the example. Some are different documents, like documents A, B and C, and some are versions of the same document:
_id 1, 2 and 3 are all versions of the same document A.
Document A {
_id : 1,
"id" : "A",
"author" : "value",
"date" : "2015-11-05"
}
Document A {
_id : 2,
"id" : "A",
"author" : "value",
"date" : "2015-11-06"
}
Document A {
_id : 3,
"id" : "A",
"author" : "value",
"date" : "2015-11-07"
}
Document B {
_id : 4,
"id" : "B",
"author" : "value",
"date" : "2015-11-06"
}
Document B {
_id : 5,
"id" : "B",
"author" : "value",
"date" : "2015-11-07"
}
Document C {
_id : 6,
"id" : "C",
"author" : "value",
"date" : "2015-11-07"
}
I want to query all documents that have "value" in the "author" field, and from those bring back only the one version of each with the latest date up to the specified date, for example 2015-11-08. So I expect the result to be:
_id : 3, _id : 5, _id : 6
And also paging, for example 10 documents per page.
Thanks!
Two documents can't have the same _id. There is a unique index on _id by default.
Given that, you need a compound _id field which includes the date:
{
"_id":{
docId: yourFormerIdValue,
date: new ISODate()
}
// other fields
}
To get the version valid at a specified date, query the embedded fields with dot notation (an exact match on the whole _id subdocument would treat the $lte operator literally instead of applying it):
db.yourColl.find({
// match the document...
"_id.docId": idToFind,
// ...and only the versions valid up to a specific date...
"_id.date": { "$lte": someISODate }
})
// ...sort the results descending...
.sort({ "_id.date": -1 })
// ...and get only the first and therefore newest entry
.limit(1)
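To get the grouping and paging the question asks for in one round trip, an aggregation pipeline is another option. A sketch using the question's field names (id, author, date) and assuming MongoDB 3.4+ for $replaceRoot:

```python
def latest_versions_pipeline(author, as_of, page, page_size):
    """Pipeline that keeps, per logical id, only the newest version
    dated on or before `as_of`, then pages the results."""
    return [
        {"$match": {"author": author, "date": {"$lte": as_of}}},
        # Sort newest-first so $first picks the latest version per id.
        {"$sort": {"date": -1}},
        {"$group": {"_id": "$id", "doc": {"$first": "$$ROOT"}}},
        {"$replaceRoot": {"newRoot": "$doc"}},
        # Re-sort for a stable page order, then page.
        {"$sort": {"_id": 1}},
        {"$skip": page * page_size},
        {"$limit": page_size},
    ]

# Usage with pymongo:
# list(db.docs.aggregate(latest_versions_pipeline("value", "2015-11-08", 0, 10)))
```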

MongoDB: get documents which were inserted after some document

I have a document and I need to query the MongoDB database to return all the documents which were inserted after the current document.
Is it possible and how to do that query?
If you do not override the default _id field, you can use that ObjectId (see the MongoDB docs) to make a comparison by time. For instance, the following query will find all the documents that were inserted after curDoc was inserted (assuming none overwrite the _id field):
>db.test.find({ _id : {$gt : curDoc._id}})
Note that these timestamps are not super granular, if you would like a finer grained view of the time that documents are inserted I encourage you to add your own timestamp field to the documents you are inserting and use that field to make such queries.
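The coarse granularity comes from the ObjectId layout: its first four bytes are the creation time in whole seconds since the Unix epoch. A minimal sketch of decoding that timestamp without the bson library:

```python
from datetime import datetime, timezone

def objectid_generation_time(oid_hex):
    """Decode the creation timestamp embedded in a 24-character
    ObjectId hex string (first 4 bytes = seconds since the epoch)."""
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# The ObjectId from the first question on this page decodes to March 2018:
# objectid_generation_time("5aa282eff1dba231beada9e3")
```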
If you are storing an insert timestamp as one of the fields, you can query like below:
> db.foo.find()
{ "_id" : ObjectId("514bf8bbbe11e483111af213"), "Name" : "abc", "Insert_time" : ISODate("2013-03-22T06:22:51.422Z") }
{ "_id" : ObjectId("514bf8c5be11e483111af214"), "Name" : "xyz", "Insert_time" : ISODate("2013-03-22T06:23:01.310Z") }
{ "_id" : ObjectId("514bf8cebe11e483111af215"), "Name" : "pqr", "Insert_time" : ISODate("2013-03-22T06:23:10.006Z") }
{ "_id" : ObjectId("514bf8eabe11e483111af216"), "Name" : "ijk", "Insert_time" : ISODate("2013-03-22T06:23:38.410Z") }
>
Here Insert_time corresponds to the time the document was inserted, and the following query will give you the documents inserted after a particular Insert_time:
> db.foo.find({Insert_time:{$gt:ISODate("2013-03-22T06:22:51.422Z")}})
{ "_id" : ObjectId("514bf8c5be11e483111af214"), "Name" : "xyz", "Insert_time" : ISODate("2013-03-22T06:23:01.310Z") }
{ "_id" : ObjectId("514bf8cebe11e483111af215"), "Name" : "pqr", "Insert_time" : ISODate("2013-03-22T06:23:10.006Z") }
{ "_id" : ObjectId("514bf8eabe11e483111af216"), "Name" : "ijk", "Insert_time" : ISODate("2013-03-22T06:23:38.410Z") }
>

Can't append to array using string field name [$] when performing update on array fields

I am attempting to perform a MongoDB update on each element in an array of records.
An example schema is below:
{
"_id" : ObjectId("508710f16dc636ec07000022"),
"summary" : "",
"uid" : "ABCDEF",
"username" : "bigcheese",
"name" : "Name of this document",
"status_id" : 0,
"rows" : [
{
"score" : 12,
"status_id" : 0,
"uid" : 1
},
{
"score" : 51,
"status_id" : 0,
"uid" : 2
}
]
}
So far I have been able to perform single updates like this:
db.mycollection.update({"uid":"ABCDEF","rows.uid":1}, {$set:{"rows.$.status_id":1}},false,false)
However, I am struggling as to how to perform an update that will update all array records to a status_id of 1 (for instance).
Below is how I imagine it should work:
db.mycollection.update({"uid":"ABCDEF"}, {$set:{"rows.$.status_id":1}},false,true)
However I get the error:
can't append to array using string field name [$]
I have tried for quite a while with no luck. Any pointers?
You can't do the sort of 'wildcard' update of array elements that you're looking for. I think the best you can do is simultaneously set each element's status_id value like this:
db.mycollection.update(
{"uid":"ABCDEF"},
{$set:{
"rows.0.status_id":1,
"rows.1.status_id":1
}}, false, true);
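On MongoDB 3.6 and newer there is a better option: the all-positional operator $[] updates every array element in one statement, so the update no longer has to enumerate indexes. A sketch in pymongo (collection and field names taken from the question; set_all_rows_status is a hypothetical helper):

```python
def set_all_rows_status(status_id):
    """Build a $set update that uses the all-positional operator $[]
    to set status_id on every element of the rows array
    (requires MongoDB 3.6+)."""
    return {"$set": {"rows.$[].status_id": status_id}}

# Usage with pymongo:
# db.mycollection.update_many({"uid": "ABCDEF"}, set_all_rows_status(1))
```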