How will MongoRepository.findAll() work on a sharded collection? - mongodb

I have a collection which is sharded by 2 fields, and my requirement is to find all the data irrespective of the shard key.
I'm using Spring Data MongoDB.
Will MongoRepository.findAll() work on a sharded collection, and will it fetch data from all the shards?
I'm calling MongoRepository.findAll() and expect it to return all the data from every shard of the collection.
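
For context, a query with no filter on the shard key cannot be targeted at a single shard, so a router such as mongos sends it to every shard and merges the results. The following is a minimal toy model of that scatter-gather behaviour in plain JavaScript, not Spring Data or driver code; the `cluster` object, its shard names, the sample documents and the `findAll` function are all invented for illustration.

```javascript
// Toy cluster: each shard holds a slice of the collection.
// Shard names and documents are invented for illustration.
const cluster = {
  shardA: [{ _id: 1, region: "eu" }, { _id: 2, region: "eu" }],
  shardB: [{ _id: 3, region: "us" }],
  shardC: [{ _id: 4, region: "ap" }, { _id: 5, region: "ap" }],
};

// An unfiltered query cannot be targeted, so every shard is asked
// and the per-shard results are merged into one list.
function findAll(shardsByName) {
  return Object.values(shardsByName).flat();
}

console.log(findAll(cluster).length); // 5
```

In this model the caller never names a shard, which mirrors how an application talks only to the router rather than to individual shards.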

Related

Sharded collections and MongoDb

I've created a sharded collection on Cosmos (for use with the C# MongoDB driver) through the portal. I created it using Data Explorer -> New Collection, and the shard key was set at that point.
I've set the shard key to be partitionId.
As an example when trying to insert this document into a collection named "data":
db.data.insert({partitionId:"test"})
I receive the error Command insert failed: document does not contain shard key.
Edit:
There seem to be issues when creating the sharded collection using the portal. Manually creating the sharded collection should work, see: https://stackoverflow.com/a/48202411/5405453
Original:
From the docs:
The shard key determines the distribution of the collection’s
documents among the cluster’s shards. The shard key is either an
indexed field or indexed compound fields that exists in every document
in the collection.
When you created the sharded collection, you provided a field to be used as the shard key. Any document you insert afterwards has to contain that field. See here.
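
The rule quoted above can be sketched as a client-side pre-check. This is only an illustration: `missingShardKeyFields` is a hypothetical helper, not part of any MongoDB driver, and it assumes the server rejects documents that lack a shard key field, as the error in the question suggests.

```javascript
// Hypothetical helper: returns the shard key fields that a document lacks.
// Dotted paths such as "a.b" are resolved through nested objects.
function missingShardKeyFields(doc, shardKeyFields) {
  return shardKeyFields.filter((path) => {
    let value = doc;
    for (const part of path.split(".")) {
      if (value === null || typeof value !== "object" || !(part in value)) {
        return true; // field is absent somewhere along the path
      }
      value = value[part];
    }
    return false; // every segment of the path was found
  });
}

const shardKey = ["partitionId"]; // shard key field from the question

console.log(missingShardKeyFields({ partitionId: "test" }, shardKey)); // []
console.log(missingShardKeyFields({ name: "no key here" }, shardKey)); // [ 'partitionId' ]
```

The document from the question passes this check, which is consistent with the edit above: the rejection seems to come from how the portal created the collection, not from the document itself.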

How does MongoDB distribute data across a cluster

I've read about sharding a collection in MongoDB. MongoDB lets me shard a collection explicitly by calling the shardCollection method. There I can choose whether I want it to be range-sharded or hash-sharded.
My question is, what would happen if I didn't call the shardCollection method, and I had say 100 nodes?
Would MongoDB keep the collections intact and distribute them across the cluster?
Would MongoDB keep all the collections in a single node?
Do I completely not understand how this works?
A database can have a mixture of sharded and unsharded collections. Sharded collections are partitioned and distributed across shards in the cluster. As at MongoDB 3.4, each database has a primary shard where the unsharded collections are stored. If your deployment has a number of databases this may result in some distribution of unsharded collections, but there is no balancing activity for unsharded data. For more information on expected behaviours, see the Sharding section in the MongoDB manual.
If you are interested in distribution of unsharded collections within a sharded database, there is a relevant feature request you can watch/upvote in the MongoDB issue tracker: SERVER-939: Ability to distribute collections in a single DB.

Mongodb sharded cluster $in VS $or

I have a MongoDB sharded cluster with the shard key "my_key".
I need to fetch a batch of documents (about 10-500 items) with different my_key values from a collection.
For example:
db.test.find({my_key: {$in:[1,3,5,67,45,56...]}})
Mongos knows where the chunks for each 'my_key' value are stored.
Can mongos split my query into smaller queries sent only to the shards where the documents are stored? Or will mongos send this query to ALL shards?
And the same question about $or:
db.test.find({$or:[{my_key: 1},{my_key: 3},{my_key: 5}...]})
I have run tests.
If $in contains values from only one shard, mongos will send a SINGLE_SHARD query.
If $in contains values from several shards, mongos will send a SHARD_MERGE query only to the shards that contain the needed data (not to the whole cluster).
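
The behaviour observed in these tests can be modelled with a small sketch. It is a toy model and not how mongos is implemented: the chunk table, the shard names, the range boundaries and the functions `shardFor` and `targetShards` are all assumptions made up for illustration.

```javascript
// Toy chunk table: each entry maps a half-open range [min, max) of my_key to a shard.
// Shard names and range boundaries are invented for illustration.
const chunks = [
  { min: -Infinity, max: 10, shard: "shardA" },
  { min: 10, max: 50, shard: "shardB" },
  { min: 50, max: 100, shard: "shardC" },
  { min: 100, max: Infinity, shard: "shardD" },
];

// Returns the name of the shard whose chunk range contains the key.
function shardFor(key) {
  return chunks.find((c) => key >= c.min && key < c.max).shard;
}

// Groups the $in values by owning shard, so only those shards need to be contacted.
function targetShards(inValues) {
  const plan = {};
  for (const key of inValues) {
    const shard = shardFor(key);
    if (!plan[shard]) plan[shard] = [];
    plan[shard].push(key);
  }
  return plan;
}

// All values fall into one chunk range: a single shard is targeted.
console.log(Object.keys(targetShards([1, 3, 5])));
// Values span three ranges: three shards are targeted, shardD is never contacted.
console.log(Object.keys(targetShards([1, 3, 5, 67, 45, 56])));
```

The first call corresponds to the SINGLE_SHARD case and the second to the SHARD_MERGE case, where results from the targeted shards are merged and the remaining shard is left alone.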

Is MongoDB sharding possible on collections?

Is it possible to shard only specific collections? If yes, then how?
What is the difference between sharding a database and sharding collections?
MongoDB shards collections. You enable sharding on a database, but just enabling sharding on the database will not distribute data across shards. To distribute data across shards you need to tell MongoDB which collection to distribute. So you have to shard your collection, and then only that collection will be spread across the shards.
Remember, MongoDB distributes data on the basis of the collections that are sharded. If you have 2 collections in your database and you shard one of them, then the data of the sharded collection will be spread out across the shards, but the other collection will have all its data on one shard.
In plain language, MongoDB doesn't shard a whole database automatically. MongoDB sharding works at the collection level.
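
The two-collection case described above can be sketched as a toy placement model. Everything here is an assumption for illustration: the shard names, the collection names `logs` and `orders`, the `customerId` shard key, and the `toyHash` function. A real cluster assigns documents by chunk ranges over the shard key (or over its hash), not by this simple modulo.

```javascript
const shards = ["shard0", "shard1", "shard2"];
const primaryShard = "shard0"; // assumed home of all unsharded collections in this database

// Simple deterministic string hash, used only to spread documents in this toy model.
function toyHash(value) {
  let h = 0;
  for (const ch of String(value)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return h;
}

// An unsharded collection stays on the primary shard;
// a sharded collection is spread out according to its shard key value.
function placeDocument(collection, doc) {
  if (!collection.shardKey) return primaryShard;
  return shards[toyHash(doc[collection.shardKey]) % shards.length];
}

const logs = { name: "logs" };                            // never sharded
const orders = { name: "orders", shardKey: "customerId" }; // sharded on customerId

for (const customerId of [0, 1, 2]) {
  console.log(
    customerId,
    placeDocument(logs, { customerId }),
    placeDocument(orders, { customerId })
  );
}
```

Every `logs` document lands on the same shard, while `orders` documents end up on different shards depending on their key, which is the collection-level distinction the answer describes.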

About MongoDB sharding

I have a collection like this:
{ name: aaa,
  nick: bbbb,
  item: [{item_id: 123456, price: 13.00},
         {item_id: 833457, price: 12.00},
         .....
        ]
}
I have sharded the collection; the shard key is name and item.item_id. Suppose I save a very large number of items into the collection.
I have a question:
If this data were saved as flat documents in a collection, with no embedded documents, it would be sharded. But I have saved the data as embedded documents (as in the example above). Will it be sharded or not?
Thanks.