"Failed to target upsert by query :: could not extract exact shard key"; nested exception is com.mongodb.MongoWriteException - mongodb

We recently moved from MongoDB Java driver (core/sync) 3.12.1 to 4.4.0, from Spring Data MongoDB 2.2.5.RELEASE to 3.3.0, and to Spring Boot 2.6.2; our MongoDB server version is 4.2.5.
Since the upgrade, we get the above exception when running upsert queries on sharded collections.
There is a way to fix this by including the shard key in the query filter, but that is not feasible for us, so we tried adding the @Sharded annotation to our DTOs instead, since different collections use different shard keys.
We still get the error above. We are also unsure what "a full copy of the entity" means in the following statement:
update/upsert operations replacing/upserting a single existing document as long as the given UpdateDefinition holds a full copy of the entity.
Other queries work fine, and upserts also work once the shard key is added to the query filter, but that change is not feasible for us; we need a quick solution.
Please help; we have not been able to find a solution on any platform. Thanks in advance!

So here's the deal:
You cannot upsert on a sharded collection in MongoDB UNLESS you include the full shard key in the filter provided for the update operation.
To quote MongoDB Docs:
For a db.collection.update() operation that includes upsert: true and is on a sharded collection, you must include the full shard key in the filter:
- For an update operation.
- For a replace document operation (starting in MongoDB 4.2).
If you are using MongoDB 4.4+ then you have a workaround as mentioned below:
However, starting in version 4.4, documents in a sharded collection can be missing the shard key fields. To target a document that is missing the shard key, you can use the null equality match in conjunction with another filter condition (such as on the _id field). For example:
{ _id: <value>, <shardkeyfield>: null } // _id of the document missing shard key
Ref: https://docs.mongodb.com/manual/reference/method/db.collection.update/#behavior
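To make the two filter shapes concrete (collection and field names here are hypothetical, assuming a collection sharded on itemId), these are the filters you would pass to the update operation, built as plain objects:

```javascript
// Normal case: the filter carries the full shard key, so mongos can
// target the upsert to a single shard.
const targetedFilter = { _id: "a2", itemId: "a2" };

// MongoDB 4.4+ workaround for documents missing the shard key field:
// a null equality match on the shard key combined with another filter
// condition such as _id.
const missingKeyFilter = { _id: "a2", itemId: null };

console.log(JSON.stringify(targetedFilter));
console.log(JSON.stringify(missingKeyFilter));
```

Either shape would then be used as the first argument of updateOne with { upsert: true }.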

Related

Is it possible / how to perform upsert on a sharded MongoDB cluster

I’d like to perform an upsert operation on a sharded MongoDB cluster, but I get the exact same error for every variation I try. The same writes work perfectly against a local standalone Mongo, but on the sharded cluster they fail.
My collection has a unique (obviously) id in the _id field and another (also unique) id in the itemId field, which is the shard key. I’m using Reactive Spring Data, but I’ve managed to reproduce the same error in the CLI of this sharded Mongo instance.
The error was Failed to target upsert by query :: could not extract exact shard key. My queries were variations of this query:
db.MyCollection.updateOne({"_id": "a2","itemId": "a2"},
{"$set": {"itemId": "a3"}},
{upsert: true},
{multi: false})
Also, when I don’t update the shard key field, the error remains the same:
db.MyCollection.updateOne({"_id": "a2","itemId": "a2"},
{"$set": {"itemColor": "brown"}},
{upsert: true},
{multi: false})
updateMany() / multi: true variations didn't work either. This page wasn't very helpful either.
I've read suggestions to add the @Sharded annotation so that Spring will automatically add the shard key to the filter, but since I already add it manually, I didn't find that helpful.
Any reason why this can happen? Or what does this error even mean?
I'm using MongoDB version 5.0.

In MongoDB, can I create unsharded collection on each shard for $lookup?

I have been trying to use $lookup on sharded collections through mongos, which isn't allowed.
If I create an unsharded collection, I know it is by default only created on the primary shard. However, routing every $lookup from the other shards to the primary shard is not efficient.
Therefore, what I have been thinking of is to create the same collection on each shard, then insert into each shard directly using the same shard rule as in the config.
Then, if I use $lookup from the sharded collection onto that local collection, it will achieve my goal.
While searching about this, I found comments on the JIRA ticket SERVER-29159 describing the same issue.
Is there a way to achieve what I have just described?
From a logical point of view it should be achievable, but since the way to connect to shards is through the routers, I believe it is not possible unless MongoDB offers such a feature on the routers... At least please tell me if it is not possible, if you know Mongo well.
P.S. I am using spring-data-mongodb as a client.
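For reference, the join being attempted corresponds to a $lookup stage roughly like the following (collection and field names are made up; this is only a sketch of the pipeline shape, built as a plain object):

```javascript
// Hypothetical pipeline: documents in a sharded "orders" collection are
// joined against a small unsharded "products" collection. This is the
// kind of join the question wants to run from each shard locally.
const pipeline = [
  { $match: { status: "open" } },
  { $lookup: {
      from: "products",        // the unsharded (primary-shard) collection
      localField: "productId",
      foreignField: "_id",
      as: "product"
  } }
];

console.log(JSON.stringify(pipeline[1].$lookup.from));
```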

Sharded collections and MongoDB

I've created a sharded collection on Cosmos (for use with the C# MongoDB driver) through the portal, using Data Explorer -> New Collection, with the shard key set at that point.
I've set the shard key to partitionId.
As an example when trying to insert this document into a collection named "data":
db.data.insert({partitionId:"test"})
I receive the error Command insert failed: document does not contain shard key.
Edit:
There seem to be issues when creating the sharded collection using the portal. Creating the sharded collection manually should work instead; see: https://stackoverflow.com/a/48202411/5405453
Original:
From the docs:
The shard key determines the distribution of the collection's documents among the cluster's shards. The shard key is either an indexed field or indexed compound fields that exists in every document in the collection.
On creation of the sharded collection, you provided a key to be used as the shard key. Every document you insert must then contain that key. See here.
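As a sketch of that rule (the helper and field names are made up for illustration, not part of any driver), a client-side guard that mirrors the server's "document does not contain shard key" check could look like:

```javascript
// Hypothetical pre-insert guard: verifies a document carries the shard
// key field(s) the collection was created with.
function hasShardKey(doc, shardKeyFields) {
  return shardKeyFields.every((field) => field in doc);
}

console.log(hasShardKey({ partitionId: "test" }, ["partitionId"])); // true
console.log(hasShardKey({ value: 42 }, ["partitionId"]));           // false
```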

Elasticsearch doesn't index new MongoDB documents

My problem is with Elasticsearch: I have 1564 documents indexed in Elasticsearch and 1564 documents in MongoDB (after my last populate operation, run in Symfony with the Elastica bundle: php app/console foq:elastica:populate),
but when I add a document manually, the number of indexed documents remains 1564 when it should be 1565.
Did I miss something?
The functionality to update Elasticsearch indexes when Doctrine entities are modified is documented in the README file under Realtime, selective index update. The configuration option is listeners, which falls under the persistence option you should already have defined per model.

Duplicate documents on _id (in mongo)

I have a sharded Mongo collection with over 1.5 million documents. I use the _id field as the shard key, and the values in this field are integers (rather than ObjectIds).
I do a lot of write operations on this collection, using the Perl driver (insert, update, remove, save) and mongoimport.
My problem is that somehow, I have duplicate documents on the same _id. From what I've read, this shouldn't be possible.
I've removed the duplicates, but others still appear.
Do you have any ideas where could they come from, or what should I start looking at?
(Also, I've tried to replicate this on a smaller, test collection, but no duplicates are inserted, no matter what write operation I perform).
This actually isn't a problem with the Perl driver .. it is related to the characteristics of sharding. MongoDB is only able to enforce uniqueness among the documents located on a single shard, so the default shard-key index does not require uniqueness.
In the MongoDB: Configuring Sharding documentation there is specific mention that:
When you shard a collection, you must specify the shard key. If there is data in the collection, mongo will require an index to be created upfront (it speeds up the chunking process); otherwise, an index will be automatically created for you.
You can use the {unique: true} option to ensure that the underlying index enforces uniqueness so long as the unique index is a prefix of the shard key.
If the "unique: true" option is not used, the shard key does not have to be unique.
How have you implemented generating the integer Ids?
If you use a system like the one suggested on the MongoDB website, you should be fine. For reference:
function counter(name) {
    var ret = db.counters.findAndModify({
        query: { _id: name },
        update: { $inc: { next: 1 } },
        "new": true,
        upsert: true
    });
    return ret.next;
}
db.users.insert({_id:counter("users"), name:"Sarah C."}) // _id : 1
db.users.insert({_id:counter("users"), name:"Bob D."}) // _id : 2
If you are generating your ids by reading the most recent record in the document store, incrementing the number in the Perl code, and then inserting the incremented number, you could be running into timing issues.
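The race can be sketched without a database at all (illustrative code only, not the Perl driver): if two writers read the current maximum _id before either one inserts, both compute the same next id, which is exactly what the atomic findAndModify counter above avoids.

```javascript
// Simulated document store with two existing documents.
const store = [{ _id: 1 }, { _id: 2 }];

// Naive id generation: read the current maximum and add one.
function naiveNextId() {
  return Math.max(...store.map((d) => d._id)) + 1;
}

// Both writers read the maximum before either inserts...
const idA = naiveNextId();
const idB = naiveNextId();

// ...so both insert with the same _id, producing a duplicate.
store.push({ _id: idA }, { _id: idB });

console.log(idA === idB); // true
```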