Why would this upsert fail with a duplicate id exception?

This one keeps cropping up from time to time. I have the operation done as an upsert, but every so often the service crashes because it runs into this error, and I don't understand how it's even possible. I do the upsert using the SurveyId as the key on which to match:
await _surveyRepository.DatabaseCollection.UpdateOneAsync(
        Builders<SurveyData>.Filter.Eq(survey => survey.SurveyId, surveyData.SurveyId),
        Builders<SurveyData>.Update
            .Set(survey => survey.SurveyLink, surveyData.SurveyLink)
            .Set(survey => survey.ClientId, surveyData.ClientId)
            .Set(survey => survey.CustomerFirstName, surveyData.CustomerFirstName)
            .Set(survey => survey.CustomerLastName, surveyData.CustomerLastName)
            .Set(survey => survey.SurveyGenerationDateUtc, surveyData.SurveyGenerationDateUtc)
            .Set(survey => survey.PortalUserId, surveyData.PortalUserId)
            .Set(survey => survey.PortalUserFirst, surveyData.PortalUserFirst)
            .Set(survey => survey.PortalUserLast, surveyData.PortalUserLast)
            .Set(survey => survey.Tags, surveyData.Tags),
        new UpdateOptions { IsUpsert = true })
    .ConfigureAwait(false);
And I'll occasionally get this error:
Message: A write operation resulted in an error. E11000 duplicate key error collection: surveys.surveys index: SurveyId dup key: { : "" }
The id is a string representation of a Guid and is set to unique in mongo.
So why would this happen? It is my understanding that if it finds the key, it'll update the defined properties, and if not, it'll insert. Is that not correct? Because, that is the effect that I need.
C# driver version is 2.4.1.18

According to this Jira ticket, this happens because of a race between concurrent upserts:
During an update with upsert:true option, two (or more) threads may attempt an upsert operation using the same query predicate and, upon not finding a match, the threads will attempt to insert a new document. Both inserts will (and should) succeed, unless the second causes a unique constraint violation.

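It is my understanding that if it finds the key, it'll update the defined properties, and if not, it'll insert. Is that not correct?
Yes, that is what an upsert does. The newly inserted document contains the fields from the filter (in your case SurveyId) as well as the fields set in the update part of the query.
Because the lookup and the insert are not one atomic step, two concurrent upserts for the same new SurveyId can both miss, both try to insert, and the unique index rejects the second one. If you only ever need to modify existing documents, set upsert=false: the update then touches only matching documents and changes nothing (it does not fail) when no match is found. If you do need the insert, catch the duplicate key error and retry the upsert; on the retry the document exists, so it is updated.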
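The race described in the Jira quote can be absorbed on the client by retrying the upsert when the server reports a duplicate key. The following is a minimal sketch, not the driver's own mechanism: UpsertRetry, RetryAsync, IsDuplicateKey, UpsertAsync and the default of three attempts are hypothetical names and values chosen for illustration. It assumes a MongoDB.Driver 2.x release where MongoWriteException exposes WriteError.Category.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Driver;

// Hypothetical helper class; none of these members are part of the driver.
public static class UpsertRetry
{
    // True when the exception is a unique-index violation (server error E11000).
    public static bool IsDuplicateKey(Exception ex)
    {
        var writeException = ex as MongoWriteException;
        return writeException != null
            && writeException.WriteError != null
            && writeException.WriteError.Category == ServerErrorCategory.DuplicateKey;
    }

    // Runs the operation and retries it while shouldRetry returns true,
    // up to maxAttempts attempts in total. The last failure propagates.
    public static async Task<TResult> RetryAsync<TResult>(
        Func<Task<TResult>> operation,
        Func<Exception, bool> shouldRetry,
        int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation().ConfigureAwait(false);
            }
            catch (Exception ex) when (attempt < maxAttempts && shouldRetry(ex))
            {
                // A concurrent writer won the insert. The document now exists,
                // so the next attempt matches the filter and updates it.
            }
        }
    }

    // Upsert that tolerates the concurrent-insert race by retrying on duplicate key.
    public static Task<UpdateResult> UpsertAsync<TDocument>(
        IMongoCollection<TDocument> collection,
        FilterDefinition<TDocument> filter,
        UpdateDefinition<TDocument> update)
    {
        return RetryAsync(
            () => collection.UpdateOneAsync(filter, update, new UpdateOptions { IsUpsert = true }),
            IsDuplicateKey);
    }
}
```

With this in place, the call from the question would pass its filter and update definitions to UpsertRetry.UpsertAsync instead of calling UpdateOneAsync directly. The retry is kept generic so that the decision of what counts as retryable stays in one small predicate.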

Related

Mongo's bulkWrite with updateOne + upsert works the first time but gives duplicate key error subsequent times

I'm using the Mongo-php-library to insert many documents into my collection (using bulkWrite). I want each document to be updated if it already exists or inserted if it doesn't, so I'm using "upsert = true".
The code works fine the first time I run it (it inserts the documents), but the second time I run it it gives me this error:
Fatal error: Uncaught MongoDB\Driver\Exception\BulkWriteException: E11000 duplicate key error collection: accounts.posts index: postid dup key: { id: "2338...
I can't see anything wrong with my code. I have already gone through all SO posts but none helped.
This is my code:
// I prepare the array $posts_operations with all updateOne operations,
// where $data is an object that contains all the document elements I want to insert
$posts_operations = array();
foreach ($this->posts as $id => $data) {
    array_push($posts_operations, array('updateOne' => [['id' => $id], ['$set' => $data], ['upsert' => true]]));
}
// Then I execute the method bulkWrite to run all the updateOne operations
$insertPosts = $account_posts->bulkWrite($posts_operations);
It works the first time (when it inserts), but then it doesn't the second time (when it should update).
I have a unique index set up in the collection for 'id'.
Thanks so much for your help.
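OK, I was able to fix it. I believe this might be a bug, and I've already reported it in the GitHub repo.
The problem occurred only when "id" was a string of digits. Once I converted "id" (the indexed field) to an integer, it worked perfectly. A likely explanation is PHP rather than the library: PHP casts array keys that look like decimal integers to int, so $id in the filter was an integer while the "id" stored from $data was a string. The filter therefore never matched, and every upsert attempted an insert.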

How to use Where condition inside Include in entity framework LINQ?

My sample code lines are,
var question = context.EXTests
    .Include(i => i.EXTestSections.Where(t => t.Status != (int)Status.InActive))
    .Include(i => i.EXTestQuestions)
    .FirstOrDefault(p => p.Id == testId);
Include does not support a Where clause here. How can I modify the code above?
You have a sequence of ExTests. Every ExTest has zero or more ExTestSections. Every ExTest also has a property ExTestQuestions, which is probably also a sequence. Finally, every ExTest is identified by an Id.
You want a query that returns the first ExTest whose Id equals testId, including all its ExTestQuestions and some of its ExTestSections: only those whose status is not InActive.
Use Select instead of Include
One of the slower parts of database queries is the transfer of the data from the DBMS to your process. Hence it is wise to limit it to only the data you actually plan to use.
It seems that you have designed a one-to-many relation between ExTests and its ExTestSections: every ExTest has zero or more ExTestSections and every ExTestSection belongs to exactly one ExTest. In databases this is done by giving the ExTestSection a foreign key to the ExTest that it belongs to. It might be that you've designed a many-to-many relation. The principle remains the same.
If you ask for an ExTest with its hundred ExTestSections, you get the Id of the ExTest plus a hundred copies of the ExTestSection foreign key, so the same value is sent 101 times. What a waste.
So if you query data from the database, only query for the data you actually plan to use.
Use Include if you plan to update the queried data, otherwise use Select
Back to your question
var result = myDbContext.EXTests
    .Where(exTest => exTest.Id == testId)
    .Select(exTest => new
    {
        // only select the properties you plan to use
        Id = exTest.Id,
        Name = exTest.Name,
        Result = exTest.Result,
        // ... other properties

        ExTestSections = exTest.EXTestSections
            .Where(exTestSection => exTestSection.Status != (int)Status.InActive)
            .Select(exTestSection => new
            {
                // again: select only those properties you actually plan to use
                Id = exTestSection.Id,
                // foreign key not needed, you know it equals ExTest primary key
                // ExTestId = exTestSection.ExTestId
                // ... other ExTestSection properties you plan to use
            })
            .ToList(),

        ExTestQuestions = exTest.EXTestQuestions
            .Select( ...) // only the properties you'll use
    })
    .FirstOrDefault();
I've moved the test for equality with testId into a Where. This would even allow you to omit the Id from the Select: you already know it equals testId, so there is no point in transferring it.

How can I upsert a record and array element at the same time?

That is meant to be read as a dual upsert operation: upsert the document, then the array element.
So MongoDB is a denormalized store for me (we're event sourced) and one of the things I'm trying to deal with is the concurrent nature of that. The problem is this:
Events can come in out of order, so each update to the database needs to be an upsert.
I need to be able to upsert not only the parent document but also an element in an array property of that document.
For example:
If the document doesn't exist, create it. All events in this stream have the document's ID but only part of the information depending on the event.
If the document does exist, then update it. This is the easy part. The update command is just written as UpdateOneAsync and as an upsert.
If the event is actually to update a list, then that list element needs to be upserted. So if the document doesn't exist, it needs to be created and the list item will be upserted (resulting in an insert); if the document does exist, then we need to find the element and update it as an upsert, so if the element exists then it is updated otherwise it is inserted.
If at all possible, having it execute as a single atomic operation would be ideal, but if it can only be done in multiple steps, then so be it. I'm getting a number of mixed examples on the net due to the large change in the 2.x driver. Not sure what I'm looking for beyond the UpdateOneAsync. Currently using 2.4.x. Explained examples would be appreciated. TIA
Note:
Reiterating that this is a question regarding the MongoDB C# driver 2.4.x
Took some tinkering, but I got it.
var notificationData = new NotificationData
{
    ReferenceId = e.ReferenceId,
    NotificationId = e.NotificationId,
    DeliveredDateUtc = e.SentDate.DateTime
};
var matchDocument = Builders<SurveyData>.Filter.Eq(s => s.SurveyId, e.EntityId);
// first upsert the document to make sure that you have a collection to write to
var surveyUpsert = new UpdateOneModel<SurveyData>(
    matchDocument,
    Builders<SurveyData>.Update
        .SetOnInsert(f => f.SurveyId, e.EntityId)
        .SetOnInsert(f => f.Notifications, new List<NotificationData>())) { IsUpsert = true };
// then push a new element if none of the existing elements match
var noMatchReferenceId = Builders<SurveyData>.Filter
    .Not(Builders<SurveyData>.Filter.ElemMatch(s => s.Notifications, n => n.ReferenceId.Equals(e.ReferenceId)));
var insertNewNotification = new UpdateOneModel<SurveyData>(
    matchDocument & noMatchReferenceId,
    Builders<SurveyData>.Update
        .Push(s => s.Notifications, notificationData));
// then update the element that does match the reference ID (if any)
var matchReferenceId = Builders<SurveyData>.Filter
    .ElemMatch(s => s.Notifications, Builders<NotificationData>.Filter.Eq(n => n.ReferenceId, notificationData.ReferenceId));
var updateExistingNotification = new UpdateOneModel<SurveyData>(
    matchDocument & matchReferenceId,
    Builders<SurveyData>.Update
        // apparently the mongo C# driver will convert any negative index into an index symbol ('$')
        .Set(s => s.Notifications[-1].NotificationId, e.NotificationId)
        .Set(s => s.Notifications[-1].DeliveredDateUtc, notificationData.DeliveredDateUtc));
// execute these as a batch and in order
var result = await _surveyRepository.DatabaseCollection
    .BulkWriteAsync(
        new [] { surveyUpsert, insertNewNotification, updateExistingNotification },
        new BulkWriteOptions { IsOrdered = true })
    .ConfigureAwait(false);
The post linked as being a dupe was absolutely helpful, but it was not the answer. There were a few things that needed to be discovered.
The 'second statement' in the linked example didn't work correctly, at least when translated literally. To get it to work, I had to match on the element and then invert the logic by wrapping it in the Not() filter.
In order to use 'this index' on the match, you have to use a negative index on the array. As it turns out, the C# driver will convert any negative index to the '$' character when the query is rendered.
In order to ensure they are run in order, you must include bulk write options with IsOrdered set to true.

ReplaceOne throws duplicate key exception

My app receives data from a remote server and calls ReplaceOne with Upsert = true to either insert a new document or replace the existing document with a given key. (The key is anonymized with * below.) The code only runs in a single thread.
However, occasionally, the app crashes with the following error:
Unhandled Exception: MongoDB.Driver.MongoWriteException: A write operation resulted in an error.
E11000 duplicate key error collection: ****.orders index: _id_ dup key: { : "****-********-********-************" } ---> MongoDB.Driver.MongoBulkWriteException`1[MongoDB.Bson.BsonDocument]: A bulk write operation resulted in one or more errors.
E11000 duplicate key error collection: ****.orders index: _id_ dup key: { : "****-********-********-************" }
at MongoDB.Driver.MongoCollectionImpl`1.BulkWrite(IEnumerable`1 requests, BulkWriteOptions options, CancellationToken cancellationToken)
at MongoDB.Driver.MongoCollectionBase`1.ReplaceOne(FilterDefinition`1 filter, TDocument replacement, UpdateOptions options, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at MongoDB.Driver.MongoCollectionBase`1.ReplaceOne(FilterDefinition`1 filter, TDocument replacement, UpdateOptions options, CancellationToken cancellationToken)
at Dashboard.Backend.AccountMonitor.ProcessOrder(OrderField& order)
at Dashboard.Backend.AccountMonitor.OnRtnOrder(Object sender, OrderField& order)
at XAPI.Callback.XApi._OnRtnOrder(IntPtr ptr1, Int32 size1)
at XAPI.Callback.XApi.OnRespone(Byte type, IntPtr pApi1, IntPtr pApi2, Double double1, Double double2, IntPtr ptr1, Int32 size1, IntPtr ptr2, Int32 size2, IntPtr ptr3, Int32 size3)
Aborted (core dumped)
My question is, why is it possible to have dup key when I use ReplaceOne with Upsert = true options?
The app is working in the following environment and runtime:
.NET Command Line Tools (1.0.0-preview2-003121)
Product Information:
Version: 1.0.0-preview2-003121
Commit SHA-1 hash: 1e9d529bc5
Runtime Environment:
OS Name: ubuntu
OS Version: 16.04
OS Platform: Linux
RID: ubuntu.16.04-x64
And MongoDB.Driver 2.3.0-rc1.
Upsert works based on the filter query. If the filter doesn't match any document, it tries to insert the replacement document.
If the filter finds a document, it replaces that document.
In your case it could have gone either way (insert or replace). Please check the data to work out which scenario applies.
Insert scenario:
The _id is generated automatically by the upsert if it is not specified in the filter or the replacement document, so a generated _id shouldn't cause a uniqueness issue. If other fields are part of a unique index, they could.
Replace scenario:
The field that you are trying to update may have a unique index defined on it. Please check the indexes on the collection and their fields.
Optional. When true, replaceOne() either:
Inserts the document from the replacement parameter if no document matches the filter.
Replaces the document that matches the filter with the replacement document.
To avoid multiple upserts, ensure that the query fields are uniquely indexed.
Defaults to false.
MongoDB will add the _id field to the replacement document if it is not specified in either the filter or replacement documents. If _id is present in both, the values must be equal.
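The documentation excerpt recommends a unique index on the query fields. As a rough sketch of how that could look with the C# driver: UniqueIndexSetup, BuildUniqueIndex and EnsureUniqueIndex are hypothetical names, and the sketch assumes a driver release that offers the CreateIndexModel overload of Indexes.CreateOne (older 2.x releases take the keys and options directly instead). The _id field needs none of this, since it always has a unique index.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical helper class, shown only to illustrate the index definition.
public static class UniqueIndexSetup
{
    // Builds the definition of an ascending unique index on a single field.
    public static CreateIndexModel<BsonDocument> BuildUniqueIndex(string fieldName)
    {
        var keys = Builders<BsonDocument>.IndexKeys.Ascending(fieldName);
        return new CreateIndexModel<BsonDocument>(keys, new CreateIndexOptions { Unique = true });
    }

    // Sends the index definition to the server and returns the index name.
    public static string EnsureUniqueIndex(IMongoCollection<BsonDocument> collection, string fieldName)
    {
        return collection.Indexes.CreateOne(BuildUniqueIndex(fieldName));
    }
}
```

A unique index makes the second of two racing inserts fail with E11000 instead of silently creating a duplicate, so it is usually paired with a retry of the upsert.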
I could not get IsUpsert = true to work correctly due to a unique index on the same field used for the filter, leading to this error: E11000 duplicate key error collection. A retry, as suggested in this Jira ticket, is not a great workaround.
What did seem to work was a Try/Catch block with InsertOne and then ReplaceOne without any options.
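try
{
    // insert into MongoDB
    BsonDocument document = BsonDocument.Parse(obj.ToString());
    collection.InsertOne(document);
}
catch (MongoWriteException ex) when (ex.WriteError != null && ex.WriteError.Category == ServerErrorCategory.DuplicateKey)
{
    // the document already exists: replace it instead.
    // Parse again so the replacement does not carry an _id the failed insert may have generated.
    BsonDocument document = BsonDocument.Parse(obj.ToString());
    var filter = Builders<BsonDocument>.Filter.Eq("data.order_no", obj.data.order_no);
    collection.ReplaceOne(filter, document);
}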
There isn't enough information in the question, but the scenario is probably the following:
You receive data from the server, the replaceOne command doesn't match any record and tries to insert a new one, but the document contains a key that is unique and already exists in the collection. Review your data and adjust it before trying to update or insert it.
I can co-sign on this one:
public async Task ReplaceOneAsync(T item)
{
    try
    {
        await _mongoCollection.ReplaceOneAsync(x => x.Id.Equals(item.Id), item, new UpdateOptions { IsUpsert = true });
    }
    catch (MongoWriteException)
    {
        var count = await _mongoCollection.CountAsync(x => x.Id.Equals(item.Id)); // lands here - and count == 1 !!!
    }
}
Older MongoDB drivers have a bug that causes this; v2.8.1, for example, is affected. Updating the driver should make the problem go away. Note that a newer driver may also require a newer, compatible database version.

MongoDb replacing a document and inserting when non-existent

I want to replace a document if it already exists, and insert it if it doesn't.
How can I do that in mongoDb?
I need something like this, but in one query:
find by a "where statement"
if exists, replace whole document
else, insert
Thank you!
You could also use the save operation, which was much faster than update in my tests (about 70x) and suits your purpose; just remember to pass in the whole document.
Use collection update.
In the example below, the first update call will "insert or replace" the document. Note that when a replacement document is inserted this way, the new document contains only the replacement's fields, so add the name field to the replacement if you need it stored. The second update call will insert the document or just update Joe's job, leaving the rest of the document intact. The difference is the "$set" operation.
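<?php
// Replaces the whole document (or inserts it if no document matches)
$c->update(
    array("name" => "joe"),
    array("username" => "joe312", "job" => "Codemonkey"),
    array("upsert" => true));
// Updates only the job field (or inserts if no document matches).
// '$set' must be single-quoted, or PHP would try to interpolate a variable named $set
$c->update(
    array("name" => "joe"),
    array('$set' => array("job" => "Bartender")),
    array("upsert" => true));
?>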