ReplaceOne throws duplicate key exception - mongodb

My app receives data from a remote server and calls ReplaceOne with Upsert = true to either insert a new document or replace the existing document with a given key. (The key is anonymized with * below.) The code runs in a single thread only.
However, occasionally, the app crashes with the following error:
Unhandled Exception: MongoDB.Driver.MongoWriteException: A write operation resulted in an error.
E11000 duplicate key error collection: ****.orders index: _id_ dup key: { : "****-********-********-************" } ---> MongoDB.Driver.MongoBulkWriteException`1[MongoDB.Bson.BsonDocument]: A bulk write operation resulted in one or more errors.
E11000 duplicate key error collection: ****.orders index: _id_ dup key: { : "****-********-********-************" }
at MongoDB.Driver.MongoCollectionImpl`1.BulkWrite(IEnumerable`1 requests, BulkWriteOptions options, CancellationToken cancellationToken)
at MongoDB.Driver.MongoCollectionBase`1.ReplaceOne(FilterDefinition`1 filter, TDocument replacement, UpdateOptions options, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at MongoDB.Driver.MongoCollectionBase`1.ReplaceOne(FilterDefinition`1 filter, TDocument replacement, UpdateOptions options, CancellationToken cancellationToken)
at Dashboard.Backend.AccountMonitor.ProcessOrder(OrderField& order)
at Dashboard.Backend.AccountMonitor.OnRtnOrder(Object sender, OrderField& order)
at XAPI.Callback.XApi._OnRtnOrder(IntPtr ptr1, Int32 size1)
at XAPI.Callback.XApi.OnRespone(Byte type, IntPtr pApi1, IntPtr pApi2, Double double1, Double double2, IntPtr ptr1, Int32 size1, IntPtr ptr2, Int32 size2, IntPtr ptr3, Int32 size3)
Aborted (core dumped)
My question is: how can a duplicate key occur when I use ReplaceOne with the Upsert = true option?
The app is working in the following environment and runtime:
.NET Command Line Tools (1.0.0-preview2-003121)
Product Information:
Version: 1.0.0-preview2-003121
Commit SHA-1 hash: 1e9d529bc5
Runtime Environment:
OS Name: ubuntu
OS Version: 16.04
OS Platform: Linux
RID: ubuntu.16.04-x64
And MongoDB.Driver 2.3.0-rc1.

Upsert works based on the filter query. If the filter matches no document, the server tries to insert the replacement document.
If the filter matches a document, the server replaces that document.
In your case it could have gone either way (insert or replace), so check the data to work out which scenario applies.
Insert scenario:
If _id is not present in the filter criteria or the replacement document, the upsert generates it automatically, so _id itself should not cause a uniqueness violation. If other fields are covered by a unique index, those can.
Replace scenario:
The field you are trying to update probably has a unique index defined on it. Check the indexes on the collection and the fields they cover.
From the replaceOne() documentation for the upsert option:
Optional. When true, replaceOne() either:
Inserts the document from the replacement parameter if no document matches the filter.
Replaces the document that matches the filter with the replacement document.
To avoid multiple upserts, ensure that the query fields are uniquely indexed.
Defaults to false.
MongoDB will add the _id field to the replacement document if it is not specified in either the filter or replacement documents. If _id is present in both, the values must be equal.
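To illustrate the _id rule from the documentation, here is a minimal C# sketch. It assumes the orders collection holds BsonDocument values keyed by a string _id, as in the question. OrderUpserts and UpsertById are hypothetical names, and the UpdateOptions overload is the one from the 2.x driver used in the question (newer driver versions take ReplaceOptions instead). The helper filters on _id and forces the same _id into the replacement, so the two values can never disagree.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

public static class OrderUpserts
{
    // Hypothetical helper: replaces the order with the given id,
    // or inserts it if no such order exists yet.
    public static ReplaceOneResult UpsertById(
        IMongoCollection<BsonDocument> orders, string orderId, BsonDocument fields)
    {
        // Work on a copy so the caller's document is left untouched,
        // then set _id to exactly the value the filter uses.
        var replacement = fields.DeepClone().AsBsonDocument;
        replacement["_id"] = orderId;

        var filter = Builders<BsonDocument>.Filter.Eq("_id", orderId);
        return orders.ReplaceOne(filter, replacement, new UpdateOptions { IsUpsert = true });
    }
}
```

If the filter targets a field other than _id, the replacement must not carry an _id that differs from the one already stored in the matched document.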

I could not get IsUpsert = true to work correctly because of a unique index on the same field used for the filter, which led to this error: E11000 duplicate key error collection. A retry, as suggested in this Jira ticket, is not a great workaround.
What did seem to work was a Try/Catch block with InsertOne and then ReplaceOne without any options.
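// requires: using MongoDB.Bson; using MongoDB.Driver;
try
{
    // insert into MongoDB
    BsonDocument document = BsonDocument.Parse(obj.ToString());
    collection.InsertOne(document);
}
catch (MongoWriteException ex)
    when (ex.WriteError != null && ex.WriteError.Category == ServerErrorCategory.DuplicateKey)
{
    // Only duplicate-key failures land here; other write errors still propagate.
    // Parse again so the replacement does not carry the _id that InsertOne generated.
    BsonDocument document = BsonDocument.Parse(obj.ToString());
    var filter = Builders<BsonDocument>.Filter.Eq("data.order_no", obj.data.order_no);
    collection.ReplaceOne(filter, document);
}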

You have not given enough information, but the scenario is probably the following:
You receive data from the server, the ReplaceOne command matches no record and tries to insert a new one, but the document contains a key that is unique and already exists in the collection. Review your data and adjust it before trying to update or insert it.

I can co-sign on this one:
public async Task ReplaceOneAsync(T item)
{
    try
    {
        await _mongoCollection.ReplaceOneAsync(x => x.Id.Equals(item.Id), item, new UpdateOptions { IsUpsert = true });
    }
    catch (MongoWriteException)
    {
        var count = await _mongoCollection.CountAsync(x => x.Id.Equals(item.Id)); // lands here - and count == 1 !!!
    }
}

There is a bug in older MongoDB drivers; v2.8.1, for example, has this problem. Updating the driver should make the problem go away. Note that a newer driver may also require a newer, compatible server version.

Related

MongoDB cursor contains retrieved documents before iterating over it?

When I execute the find method and get a cursor, does the cursor already contain the retrieved documents before I iterate over it?
I have this method:
public async Task<Post> GetPostByIdAsync(ObjectId id, CancellationToken cancellationToken)
{
    using var cursor = await _Collection.FindAsync(post => post.Id == id, cancellationToken: cancellationToken);
    return await cursor.SingleAsync(cancellationToken);
}
First, I call FindAsync, which returns a cursor, and then I call SingleAsync, which returns the document. I have some questions about this.
At what point is the query executed: when I call FindAsync or when I call SingleAsync?
Once I have executed find and hold a cursor, does the cursor contain the retrieved documents before I iterate over it?
If a cursor covers 100 documents but I only iterate over the first 20, are the other 80 documents still queried and retrieved from the server?
Why are getting the cursor and iterating over it both async operations?
public async Task<Post> GetPostByIdAsync(ObjectId id, CancellationToken cancellationToken)
{
    var query = _Collection.AsQueryable().Where(post => post.Id == id);
    return await query.SingleAsync(cancellationToken);
}
If I call AsQueryable because I want to use LINQ, I have only one async method. Is any server operation running synchronously and blocking the thread?
I see that IMongoQueryable<T> is a subtype of IAsyncCursorSource<T>. What is the difference between a cursor and a cursor source?
When a find is issued and the result set exceeds the default (or specified) batch size, the response to the find includes:
The first batch of documents
A cursor id to retrieve the next batch
If you know you will only process a certain number of documents, use limit to restrict the result set to that many documents.
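As a rough sketch of how those batches surface in the C# driver: the Post class is reduced to an Id here, and PostQueries and GetFirstPostsAsync are hypothetical names. As far as I understand the driver, FindAsync sends the find command and each MoveNextAsync exposes one batch, asking the server for more only when another batch exists.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

// Stand-in for the question's Post type (assumption: only the id matters here).
public class Post
{
    public ObjectId Id { get; set; }
}

public static class PostQueries
{
    // Hypothetical helper: reads at most `count` posts, one batch at a time.
    public static async Task<List<Post>> GetFirstPostsAsync(
        IMongoCollection<Post> collection, int count, CancellationToken cancellationToken)
    {
        // Limit caps the result set on the server, so documents beyond
        // `count` are never sent to the client.
        var options = new FindOptions<Post> { Limit = count };

        var posts = new List<Post>();
        using (var cursor = await collection.FindAsync(
            Builders<Post>.Filter.Empty, options, cancellationToken))
        {
            // Each successful MoveNextAsync makes one batch available in cursor.Current.
            while (await cursor.MoveNextAsync(cancellationToken))
            {
                posts.AddRange(cursor.Current);
            }
        }
        return posts;
    }
}
```

Both steps are async because either one may involve a round trip to the server: the initial find, and any later request for a further batch.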

Mongo's bulkWrite with updateOne + upsert works the first time but gives duplicate key error subsequent times

I'm using the Mongo-php-library to insert many documents into my collection (using bulkWrite). I want each document to be updated if it already exists or inserted if it doesn't, so I'm using "upsert = true".
The code works fine the first time I run it (it inserts the documents), but the second time I run it it gives me this error:
Fatal error: Uncaught MongoDB\Driver\Exception\BulkWriteException: E11000 duplicate key error collection: accounts.posts index: postid dup key: { id: "2338...
I can't see anything wrong with my code. I have already gone through all SO posts but none helped.
This is my code:
// I prepare the array $posts_operations with all updateOne operations
// where $data is an object that contains all the document elements I want to insert
$posts_operations = array();
foreach ($this->posts as $id => $data) {
    array_push($posts_operations, array('updateOne' => [['id' => $id], ['$set' => $data], ['upsert' => true]]));
}
// Then I execute the method bulkWrite to run all the updateOne operations
$insertPosts = $account_posts->bulkWrite($posts_operations);
It works the first time (when it inserts), but then it doesn't the second time (when it should update).
I have a unique index set up in the collection for 'id'.
Thanks so much for your help.
Ok I was able to fix it. I believe this might be a bug and I've reported it already in the Github repo.
The problem occurred only when "id" was a string of numbers. Once I converted "id" (the field that I was indexing) to an integer it works perfectly.

How can I upsert a record and array element at the same time?

That is meant to be read as a dual upsert operation, upsert the document then the array element.
So MongoDB is a denormalized store for me (we're event sourced) and one of the things I'm trying to deal with is the concurrent nature of that. The problem is this:
Events can come in out of order, so each update to the database needs to be an upsert.
I need to be able to not only upsert the parent document but an element in an array property of that document.
For example:
If the document doesn't exist, create it. All events in this stream have the document's ID but only part of the information depending on the event.
If the document does exist, then update it. This is the easy part. The update command is just written as UpdateOneAsync and as an upsert.
If the event is actually to update a list, then that list element needs to be upserted. So if the document doesn't exist, it needs to be created and the list item will be upserted (resulting in an insert); if the document does exist, then we need to find the element and update it as an upsert, so if the element exists then it is updated otherwise it is inserted.
If at all possible, having it execute as a single atomic operation would be ideal, but if it can only be done in multiple steps, then so be it. I'm getting a number of mixed examples on the net due to the large change in the 2.x driver. Not sure what I'm looking for beyond the UpdateOneAsync. Currently using 2.4.x. Explained examples would be appreciated. TIA
Note:
Reiterating that this is a question regarding the MongoDB C# driver 2.4.x
Took some tinkering, but I got it.
var notificationData = new NotificationData
{
    ReferenceId = e.ReferenceId,
    NotificationId = e.NotificationId,
    DeliveredDateUtc = e.SentDate.DateTime
};
var matchDocument = Builders<SurveyData>.Filter.Eq(s => s.SurveyId, e.EntityId);
// first upsert the document to make sure that you have a collection to write to
var surveyUpsert = new UpdateOneModel<SurveyData>(
    matchDocument,
    Builders<SurveyData>.Update
        .SetOnInsert(f => f.SurveyId, e.EntityId)
        .SetOnInsert(f => f.Notifications, new List<NotificationData>())) { IsUpsert = true };
// then push a new element if none of the existing elements match
var noMatchReferenceId = Builders<SurveyData>.Filter
    .Not(Builders<SurveyData>.Filter.ElemMatch(s => s.Notifications, n => n.ReferenceId.Equals(e.ReferenceId)));
var insertNewNotification = new UpdateOneModel<SurveyData>(
    matchDocument & noMatchReferenceId,
    Builders<SurveyData>.Update
        .Push(s => s.Notifications, notificationData));
// then update the element that does match the reference ID (if any)
var matchReferenceId = Builders<SurveyData>.Filter
    .ElemMatch(s => s.Notifications, Builders<NotificationData>.Filter.Eq(n => n.ReferenceId, notificationData.ReferenceId));
var updateExistingNotification = new UpdateOneModel<SurveyData>(
    matchDocument & matchReferenceId,
    Builders<SurveyData>.Update
        // apparently the mongo C# driver will convert any negative index into an index symbol ('$')
        .Set(s => s.Notifications[-1].NotificationId, e.NotificationId)
        .Set(s => s.Notifications[-1].DeliveredDateUtc, notificationData.DeliveredDateUtc));
// execute these as a batch and in order
var result = await _surveyRepository.DatabaseCollection
    .BulkWriteAsync(
        new [] { surveyUpsert, insertNewNotification, updateExistingNotification },
        new BulkWriteOptions { IsOrdered = true })
    .ConfigureAwait(false);
The post linked as being a dupe was absolutely helpful, but it was not the answer. There were a few things that needed to be discovered.
The 'second statement' in the linked example didn't work correctly, at least when translated literally. To get it to work, I had to match on the element and then invert the logic by wrapping it in the Not() filter.
In order to use 'this index' on the match, you have to use a negative index on the array. As it turns out, the C# driver will convert any negative index to the '$' character when the query is rendered.
In order to ensure they are run in order, you must include bulk write options with IsOrdered set to true.

Why would this upsert fail with a duplicate id exception?

This one keeps cropping up from time to time. I have the operation done as an upsert, but every so often the service crashes because it runs into this error, and I don't understand how it's even possible. I try to do the upsert using the SurveyId as the key on which to match:
await _surveyRepository.DatabaseCollection.UpdateOneAsync(
    Builders<SurveyData>.Filter.Eq(survey => survey.SurveyId, surveyData.SurveyId),
    Builders<SurveyData>.Update
        .Set(survey => survey.SurveyLink, surveyData.SurveyLink)
        .Set(survey => survey.ClientId, surveyData.ClientId)
        .Set(survey => survey.CustomerFirstName, surveyData.CustomerFirstName)
        .Set(survey => survey.CustomerLastName, surveyData.CustomerLastName)
        .Set(survey => survey.SurveyGenerationDateUtc, surveyData.SurveyGenerationDateUtc)
        .Set(survey => survey.PortalUserId, surveyData.PortalUserId)
        .Set(survey => survey.PortalUserFirst, surveyData.PortalUserFirst)
        .Set(survey => survey.PortalUserLast, surveyData.PortalUserLast)
        .Set(survey => survey.Tags, surveyData.Tags),
    new UpdateOptions { IsUpsert = true })
    .ConfigureAwait(false);
And I'll occasionally get this error:
Message: A write operation resulted in an error. E11000 duplicate key error collection: surveys.surveys index: SurveyId dup key: { : "" }
The id is a string representation of a Guid and is set to unique in mongo.
So why would this happen? It is my understanding that if it finds the key, it'll update the defined properties, and if not, it'll insert. Is that not correct? Because, that is the effect that I need.
C# driver version is 2.4.1.18
This happens because according to this Jira ticket:
During an update with upsert:true option, two (or more) threads may attempt an upsert operation using the same query predicate and, upon not finding a match, the threads will attempt to insert a new document. Both inserts will (and should) succeed, unless the second causes a unique constraint violation.
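A retry is the usual way to cope with the race the ticket describes. The following is only a sketch, assuming the 2.x C# driver; UpsertRetry and OnDuplicateKeyAsync are hypothetical names, not part of the driver. It reruns the upsert when the server reports a duplicate key, on the assumption that the competing insert has finished by then, so the filter now matches and the operation becomes a plain update.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Driver;

public static class UpsertRetry
{
    // Hypothetical helper: runs an upsert delegate and retries it
    // when the server rejects it with a duplicate key error.
    public static async Task<T> OnDuplicateKeyAsync<T>(Func<Task<T>> upsert, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await upsert().ConfigureAwait(false);
            }
            catch (MongoWriteException ex)
                when (attempt < maxAttempts
                      && ex.WriteError != null
                      && ex.WriteError.Category == ServerErrorCategory.DuplicateKey)
            {
                // Another writer inserted the same key first. Loop again:
                // the filter should now find that document.
                // On the last attempt the filter is false, so the exception propagates.
            }
        }
    }
}
```

It could wrap the call from the question as await UpsertRetry.OnDuplicateKeyAsync(() => collection.UpdateOneAsync(filter, update, new UpdateOptions { IsUpsert = true })), where collection, filter and update stand for the question's own objects.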
It is my understanding that if it finds the key, it'll update the defined properties, and if not, it'll insert. Is that not correct
Yes, that's what upsert does. The newly inserted document will contain all fields from the criteria part (in your case SurveyId) as well as the fields from the update modification part (all other specified fields) of your update query.
You need to set upsert=false in your query. Then it will only update documents matching the criteria, and nothing will be written if no match is found.

spring-data mongodb exclude fields from update

How can I make sure that specific fields are inserted on creation but can optionally be excluded when updating the object?
I'm essentially looking for something like the following:
mongoOperations.save(theObject, <fields to ignore>)
From what I see, the recently introduced @ReadOnlyProperty will ignore the property both for inserts and updates.
I was able to get the desired behavior by implementing my Custom MongoTemplate and overriding its doUpdate method as follows:
@Override
protected WriteResult doUpdate(String collectionName, Query query,
        Update originalUpdate, Class<?> entityClass, boolean upsert, boolean multi) {
    Update updateViaSet = new Update();
    DBObject dbObject = originalUpdate.getUpdateObject();
    Update filteredUpdate = Update.fromDBObject(dbObject, "<fields to ignore>");
    for (String key : filteredUpdate.getUpdateObject().keySet()) {
        Object val = filteredUpdate.getUpdateObject().get(key);
        System.out.println(key + "::" + val);
        updateViaSet.set(key, filteredUpdate.getUpdateObject().get(key));
    }
    return super
            .doUpdate(collectionName, query, updateViaSet, entityClass, upsert, multi);
}
But the issue is that now it will use Mongo $set form of updates for everything, not just for specific cases.
Please advise if there is any simpler (and correct) way to achieve this.
While creating an update object, use $setOnInsert instead of $set. It is available in spring mongo as well.
If an update operation with upsert: true results in an insert of a document, then $setOnInsert assigns the specified values to the fields in the document. If the update operation does not result in an insert, $setOnInsert does nothing.