I have a cluster with three shards using MongoDB 4.2. I have a collection (users) that, before sharding, can be verified to contain 600000 documents:
mongos> db.users.count()
600000
Next, I shard it with the usual commands (first the database, then the collection):
mongos> sh.enableSharding("app")
mongos> sh.shardCollection("app.users", {"name.first": 1})
getting, after a couple of minutes or so, an even distribution of chunks among the shards:
chunks:
shard0000 3
shard0001 2
shard0002 3
So far so good.
However, if I get a count just after this, I get a weird value, higher than the number of documents in the collection:
mongos> db.users.count()
994243
mongos> db.users.find({}).count()
994243
Moreover, the getShardDistribution() result on the collection is also weird, showing the total number of documents all in one of the shards (which makes no sense, as part of them have been migrated to the other two shards):
mongos> db.users.getShardDistribution()
Shard shard0000 at localhost:27018
data : 95.85MiB docs : 236611 chunks : 3
estimated data per chunk : 31.95MiB
estimated docs per chunk : 78870
Shard shard0001 at localhost:27019
data : 64.06MiB docs : 157632 chunks : 2
estimated data per chunk : 32.03MiB
estimated docs per chunk : 78816
Shard shard0002 at localhost:27020
data : 243.69MiB docs : 600000 chunks : 3
estimated data per chunk : 81.23MiB
estimated docs per chunk : 200000
Totals
data : 403.62MiB docs : 994243 chunks : 8
Shard shard0000 contains 23.74% data, 23.79% docs in cluster, avg obj size on shard : 424B
Shard shard0001 contains 15.87% data, 15.85% docs in cluster, avg obj size on shard : 426B
Shard shard0002 contains 60.37% data, 60.34% docs in cluster, avg obj size on shard : 425B
Interestingly, if I wait a while (not sure how long, but no more than 30 minutes), count() and getShardDistribution() are back to normal:
mongos> db.users.count()
600000
mongos> db.users.getShardDistribution()
Shard shard0001 at localhost:27019
data : 64.06MiB docs : 157632 chunks : 2
estimated data per chunk : 32.03MiB
estimated docs per chunk : 78816
Shard shard0002 at localhost:27020
data : 83.77MiB docs : 205757 chunks : 3
estimated data per chunk : 27.92MiB
estimated docs per chunk : 68585
Shard shard0000 at localhost:27018
data : 95.85MiB docs : 236611 chunks : 3
estimated data per chunk : 31.95MiB
estimated docs per chunk : 78870
Totals
data : 243.69MiB docs : 600000 chunks : 8
Shard shard0001 contains 26.28% data, 26.27% docs in cluster, avg obj size on shard : 426B
Shard shard0002 contains 34.37% data, 34.29% docs in cluster, avg obj size on shard : 426B
Shard shard0000 contains 39.33% data, 39.43% docs in cluster, avg obj size on shard : 424B
Why is this happening? How can I avoid this effect? (Maybe by forcing some kind of sync with a command?)
Thanks!
PS: In case it may be relevant, I'm using a testing environment setup, which uses a standalone mongod process to implement each shard. The config server uses a single-node replica set configuration.
count() provides an estimated count based on the collection's metadata, and may not be accurate. Use countDocuments() to get an accurate count.
You can read the source of getShardDistribution by typing db.users.getShardDistribution (without parentheses) in the shell. It seems to use information stored in the config database.
It is quite reasonable to expect that the statistics stored by the database aren't exactly accurate. This is because there is a cost to have them be up-to-date whenever any operation is performed anywhere in the cluster.
You seem to be looking at statistics at a point in time after some chunks have been copied from one shard to another and before these chunks are removed from the original shard. In this situation the data is stored twice in the cluster. The statistics aren't accurate in this case. To obtain an accurate count, use countDocuments.
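As a concrete check (a minimal sketch using the users collection from the question), this can be run from mongos even while the balancer is still cleaning up; countDocuments is backed by an aggregation that filters out orphaned/in-migration documents rather than by sharding metadata, so it should already report 600000 here:
mongos> db.users.countDocuments({})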
Related
We are reaching the limits of a standard replica set, and we are testing the migration to a sharded cluster.
I have created a fresh new sharded cluster on 6.0.3 with 3 shards (each shard is 2 data-bearing nodes + 1 arbiter).
I have restored a sample collection of 92 GB (about 10 million documents).
I have successfully created indexes and sharded the collection:
sh.shardCollection(
  "saba_ludu.MyCollection",
  { UniqueId: "hashed" },
  { collation: { locale: "simple" } }
)
After that, the collection was not balancing at all; all the data stayed on the primary shard. The following command was returning balancerCompliant: true.
sh.balancerCollectionStatus("saba_ludu.MyCollection")
The first odd thing: I encountered a command returning an error saying it was not available because the feature compatibility version of the cluster was too low (I never configured anything like this on the cluster...). I ran the command to move to feature compatibility version 6.0, and right after that the collection started to balance across the shards and created a lot of chunks.
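For reference, the command to raise it (run from mongos against the admin database) was:
mongos> db.adminCommand( { setFeatureCompatibilityVersion: "6.0" } )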
But I am facing another issue: the primary shard has not created chunks; there is still only one chunk.
db.getSiblingDB("saba_ludu").MyCollection.getShardDistribution();
Shard i2a-poc-mgdb-cl-03 at i2a-poc-mgdb-cl-03/i2a-poc-mgdb-cl-03-0.i2a-poc-mgdb-cl-03-svc.i2a-poc.svc.cluster.local:27017,i2a-poc-mgdb-cl-03-1.i2a-poc-mgdb-cl-03-svc.i2a-poc.svc.cluster.local:27017
{
data: '31.14GiB',
docs: 3372644,
chunks: 1,
'estimated data per chunk': '31.14GiB',
'estimated docs per chunk': 3372644
}
---
Shard i2a-poc-mgdb-cl-02 at i2a-poc-mgdb-cl-02/i2a-poc-mgdb-cl-02-0.i2a-poc-mgdb-cl-02-svc.i2a-poc.svc.cluster.local:27017,i2a-poc-mgdb-cl-02-1.i2a-poc-mgdb-cl-02-svc.i2a-poc.svc.cluster.local:27017
{
data: '30.87GiB',
docs: 3344801,
chunks: 247,
'estimated data per chunk': '127.99MiB',
'estimated docs per chunk': 13541
}
---
Shard i2a-poc-mgdb-cl-01 at i2a-poc-mgdb-cl-01/i2a-poc-mgdb-cl-01-0.i2a-poc-mgdb-cl-01-svc.i2a-poc.svc.cluster.local:27017,i2a-poc-mgdb-cl-01-1.i2a-poc-mgdb-cl-01-svc.i2a-poc.svc.cluster.local:27017
{
data: '30.86GiB',
docs: 3344803,
chunks: 247,
'estimated data per chunk': '127.94MiB',
'estimated docs per chunk': 13541
}
---
Totals
{
data: '3.114100851894496e+23GiB',
docs: 10062248,
chunks: 495,
'Shard i2a-poc-mgdb-cl-03': [
'0 % data',
'33.51 % docs in cluster',
'9KiB avg obj size on shard'
],
'Shard i2a-poc-mgdb-cl-02': [
'0 % data',
'33.24 % docs in cluster',
'9KiB avg obj size on shard'
],
'Shard i2a-poc-mgdb-cl-01': [
'0 % data',
'33.24 % docs in cluster',
'9KiB avg obj size on shard'
]
}
Does anybody know why I was facing the compatibility version issue in the first place, or why I am not able to balance the primary shard?
Thanks
In MongoDB 6.0 the balancer no longer aims for an even number of chunks per shard (as it did in releases before 6.0); the target is an even distribution of data. Each of your shards has almost exactly 1/3 (i.e. around 33%) of all the data, so this looks correct. The number of documents (around 3.3 million) is also similar on every shard.
Chunks are used to distribute the data across shards. When you enable sharding on existing data, you initially have just one single chunk. The balancer starts to split that chunk and moves the new chunks to other shards. Once all the data is evenly distributed, there is no reason to split the initial chunk any further, so on your primary shard you are left with only one big chunk.
I guess that over time the number of chunks will also increase on the primary shard.
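If you want to watch this happen, one way (a sketch, assuming MongoDB 6.0, where config.chunks references collections by their UUID) is to count chunks per shard directly in the config database:
mongos> var uuid = db.getSiblingDB("saba_ludu").getCollectionInfos({ name: "MyCollection" })[0].info.uuid
mongos> db.getSiblingDB("config").chunks.aggregate([
  { $match: { uuid: uuid } },
  { $group: { _id: "$shard", chunks: { $sum: 1 } } }
])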
I am running a MongoDB cluster with 5 shards. I have many collections in my database, and only a few of them are sharded, meaning the non-sharded collections live on the primary shard.
Is there any way to let MongoDB know that it should distribute chunks of the sharded collections in such a way as to have fewer on the primary than on the other shards? This would give the primary more room to handle requests to the non-sharded collections. Currently it is always a lot busier (CPU, disk usage) than the other shards.
Current situation:
Shard shard1 contains 19.98% data, 20.06% docs in cluster, avg obj size on shard : 123KiB
Shard shard2 contains 19.89% data, 19.91% docs in cluster, avg obj size on shard : 124KiB
Shard shard3 contains 20.00% data, 20.00% docs in cluster, avg obj size on shard : 124KiB
Shard shard4 contains 20.07% data, 19.95% docs in cluster, avg obj size on shard : 125KiB
Shard shard5 contains 20.04% data, 20.05% docs in cluster, avg obj size on shard : 124KiB
Target situation (shard 1 is primary):
Shard shard1 contains 4% data, 4% docs in cluster, avg obj size on shard : 123KiB
Shard shard2 contains 24% data, 24% docs in cluster, avg obj size on shard : 124KiB
Shard shard3 contains 24% data, 24% docs in cluster, avg obj size on shard : 124KiB
Shard shard4 contains 24% data, 24% docs in cluster, avg obj size on shard : 125KiB
Shard shard5 contains 24% data, 24% docs in cluster, avg obj size on shard : 124KiB
NOTE: I can't shard all my collections, as queries are run on them which would not work (https://docs.mongodb.com/manual/core/sharded-cluster-requirements/#sharding-operational-restrictions) and I don't have the ability to change those queries (for reasons I won't go into).
I am new to playing with MongoDB.
Because I have to store around 50 million documents, I had to set up a MongoDB sharded cluster with two replica sets (shards).
The document looks like this:
{
"_id" : "predefined_unique_id",
"appNr" : "abcde",
"modifiedDate" : ISODate("2016-09-16T13:00:57.000Z"),
"size" : NumberLong(803),
"crc32" : NumberLong(538462645)
}
The shard key is appNr (it was selected because, for query performance reasons, all documents having the same appNr have to stay within one chunk).
Usually multiple documents have the same appNr.
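For completeness, the collection was sharded with a plain ranged key on appNr, i.e. something equivalent to the following (the database name here is just a placeholder):
mongos> sh.enableSharding("mydb")
mongos> sh.shardCollection("mydb.my_collection", { appNr: 1 })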
After loading around two million records, I see the chunks are equally balanced; however, when running db.my_collection.getShardDistribution(), I get:
Shard rs0 at rs0/...
data : 733.97MiB docs : 5618348 chunks : 22
estimated data per chunk : 33.36MiB
estimated docs per chunk : 255379
Shard rs1 at rs1/...
data : 210.09MiB docs : 1734181 chunks : 19
estimated data per chunk : 11.05MiB
estimated docs per chunk : 91272
Totals
data : 944.07MiB docs : 7352529 chunks : 41
Shard rs0 contains 77.74% data, 76.41% docs in cluster, avg obj size on shard : 136B
Shard rs1 contains 22.25% data, 23.58% docs in cluster, avg obj size on shard : 127B
My question is: what settings should I change in order to get the data equally distributed between the shards? I would also like to understand how the data gets split into chunks. I have defined a ranged shard key and a chunk size of 264 MB.
MongoDB uses the shard key associated with the collection to partition the data into chunks. A chunk consists of a subset of the sharded data. Each chunk has an inclusive lower bound and an exclusive upper bound based on the shard key.
[Diagram: the shard key value space segmented into smaller ranges, or chunks.]
The mongos routes writes to the appropriate chunk based on the shard key value. MongoDB splits chunks when they grow beyond the configured chunk size. Both inserts and updates can trigger a chunk split.
The smallest range a chunk can represent is a single unique shard key
value. A chunk that only contains documents with a single shard key
value cannot be split.
Chunk Size will have a major impact on the shards.
The default chunk size in MongoDB is 64 megabytes. We can increase or reduce the chunk size, but modification of the chunk size should be done only after considering the items below:
Small chunks lead to a more even distribution of data at the expense of more frequent migrations. This creates expense at the query routing (mongos) layer.
Large chunks lead to fewer migrations. This is more efficient both from the networking perspective and in terms of internal overhead at the query routing layer. But, these efficiencies come at the expense of a potentially uneven distribution of data.
Chunk size affects the Maximum Number of Documents Per Chunk to Migrate.
Chunk size affects the maximum collection size when sharding an existing collection. Post-sharding, chunk size does not constrain collection size.
Referring to this information and your shard key "appNr", this would have happened because of the chunk size.
Try lowering the chunk size from 264 MB (which you have currently) and see whether there is a change in the document distribution. This is a trial-and-error approach, though, and it can take a considerable amount of time and several iterations.
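For example, the chunk size is stored in the config database and can be changed from mongos like this (a sketch; 64 is the default in MB for this version, pick whatever value you want to test with):
mongos> use config
mongos> db.settings.save({ _id: "chunksize", value: 64 })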
Reference: https://docs.mongodb.com/v3.2/core/sharding-data-partitioning/
Hope it helps!
I'll post my findings here - maybe they will have some further use.
The MongoDB documentation says that "when a chunk grows beyond the specified chunk size" it gets split.
I think the documentation is not fully accurate or rather incomplete.
When MongoDB does auto-splitting, the splitVector command asks the primary shard for splitting points and then splits accordingly. This first happens when roughly 20% of the specified chunk size is reached and, if no splitting points are found, it retries at 40%, 60% and so on, so splitting should not wait for the maximum size.
In my case, this worked fine for the first half of the chunks, but for the second half the split happened only after the maximum chunk size was exceeded. I still have to investigate why the split didn't happen earlier, as I see no reason for this behaviour.
After splitting into chunks, the balancer starts. It divides the chunks equally across shards, without considering chunk size (a chunk with 0 documents is equal to a chunk with 100 documents in this regard). The chunks are moved in the order of their creation.
My problem was that the second half of the chunks was almost twice the size of the first half. Therefore, as the balancer always moved the first half of the collection's chunks to the other shard, the cluster became unbalanced.
I found a much better explanation here.
In order to fix it, I changed the shard key to a hashed one.
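Concretely, since the shard key cannot be altered in place in this version, that meant re-creating the collection and sharding it with a hashed key, roughly like this (the database name is illustrative):
mongos> sh.shardCollection("mydb.my_collection", { appNr: "hashed" })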
I'm studying sharding with MongoDB and I have the following structure:
1 mongod for my config server, with just 1 member in its replica set
2 shards, each with 2 members in its replica set
1 mongos
I have one database named erp and 3 collections, pessoas, produtos and contatos.
So I have added my collections using:
sh.shardCollection("erp.<collection>", { id: 1 }, true)
I began with the collection pessoas; it has 2000 documents and they are distributed this way:
mongos> db.pessoas.getShardDistribution()
Shard rs1 at rs1/desenv1:27019,desenv1:27020
data : 57KiB docs : 1497 chunks : 36
estimated data per chunk : 1KiB
estimated docs per chunk : 41
Shard rs3 at rs3/desenv1:27022,desenv1:27023
data : 19KiB docs : 503 chunks : 36
estimated data per chunk : 541B
estimated docs per chunk : 13
Totals
data : 77KiB docs : 2000 chunks : 72
Shard rs1 contains 75.27% data, 74.85% docs in cluster, avg obj size on shard : 39B
Shard rs3 contains 24.72% data, 25.15% docs in cluster, avg obj size on shard : 38B
After this I added the collection produtos and gave it 1001 records, so why is this collection distributed this way:
mongos> db.produtos.getShardDistribution()
Shard rs1 at rs1/desenv1:27019,desenv1:27020
data : 67KiB docs : 1001 chunks : 1
estimated data per chunk : 67KiB
estimated docs per chunk : 1001
Totals
data : 67KiB docs : 1001 chunks : 1
Shard rs1 contains 100% data, 100% docs in cluster, avg obj size on shard : 69B
Questions:
Why is only replica set "rs1" getting data? The same thing happens with the collection contatos: only replica set "rs1" gets the data and I can't distribute it to the other shard.
Why does this happen and what am I doing wrong?
How do I distribute the data equally? For example, with 2000 records, 1000 records in one shard and 1000 in the other shard.
If you guys need more information just tell me.
Thanks
MongoDB balances the shards by the number of chunks, not documents (see https://docs.mongodb.com/manual/core/sharding-balancer-administration/). Therefore, from the output you provided, the cluster is balanced: shard rs1 contains 36 chunks, and shard rs3 also contains 36 chunks for the pessoas collection.
If the number of documents is not balanced, that means your inserts are going into a small number of chunks (or even a single chunk in the worst case), and are not distributed across all the chunks. This is typically caused by using a monotonically increasing shard key.
Please see Shard Keys for more information about this subject, and how to avoid this situation. Note that shard key selection is very important, since once a shard key is selected, it cannot be changed anymore. The only way to change the shard key of a collection is to dump the collection, and change the shard key during the restore process.
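Purely as an illustration (the existing collections would have to be dropped and restored to change the key), a hashed shard key on the same field avoids the monotonically increasing pattern; note, however, that a hashed shard key cannot also enforce uniqueness the way { id: 1 } with unique: true does:
mongos> sh.shardCollection("erp.pessoas", { id: "hashed" })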
I have a sharded cluster set up for my app, but unfortunately one of the shards is taking 17 GB of data while the others take about 3 GB each on average. What could be the issue?
sh.status() gives me huge output. Shared here: https://www.dropbox.com/s/qqsucbm6q9egbhf/shard.txt?dl=0
The shard distribution details for my problematic collection are below.
mongos> db.MyCollection_1_100000.getShardDistribution()
Shard shard_0 at shard_0/mongo-11.2816.mongodbdns.com:27000,mongo-12.2816.mongodbdns.com:27000,mongo-13.2816.mongodbdns.com:27000,mongo-3.2816.mongodbdns.com:27003
data : 143.86MiB docs : 281828 chunks : 4
estimated data per chunk : 35.96MiB
estimated docs per chunk : 70457
Shard shard_1 at shard_1/mongo-10.2816.mongodbdns.com:27000,mongo-11.2816.mongodbdns.com:27002,mongo-19.2816.mongodbdns.com:27001,mongo-9.2816.mongodbdns.com:27005
data : 107.66MiB docs : 211180 chunks : 3
estimated data per chunk : 35.88MiB
estimated docs per chunk : 70393
Shard shard_2 at shard_2/mongo-14.2816.mongodbdns.com:27000,mongo-3.2816.mongodbdns.com:27000,mongo-4.2816.mongodbdns.com:27000,mongo-6.2816.mongodbdns.com:27002
data : 107.55MiB docs : 210916 chunks : 3
estimated data per chunk : 35.85MiB
estimated docs per chunk : 70305
Shard shard_3 at shard_3/mongo-14.2816.mongodbdns.com:27004,mongo-18.2816.mongodbdns.com:27002,mongo-6.2816.mongodbdns.com:27000,mongo-8.2816.mongodbdns.com:27000
data : 107.99MiB docs : 211506 chunks : 3
estimated data per chunk : 35.99MiB
estimated docs per chunk : 70502
Shard shard_4 at shard_4/mongo-12.2816.mongodbdns.com:27001,mongo-13.2816.mongodbdns.com:27001,mongo-17.2816.mongodbdns.com:27002,mongo-6.2816.mongodbdns.com:27003
data : 107.92MiB docs : 211440 chunks : 3
estimated data per chunk : 35.97MiB
estimated docs per chunk : 70480
Shard shard_5 at shard_5/mongo-17.2816.mongodbdns.com:27001,mongo-18.2816.mongodbdns.com:27001,mongo-19.2816.mongodbdns.com:27000
data : 728.64MiB docs : 1423913 chunks : 4
estimated data per chunk : 182.16MiB
estimated docs per chunk : 355978
Shard shard_6 at shard_6/mongo-10.2816.mongodbdns.com:27001,mongo-14.2816.mongodbdns.com:27005,mongo-3.2816.mongodbdns.com:27001,mongo-8.2816.mongodbdns.com:27003
data : 107.52MiB docs : 211169 chunks : 3
estimated data per chunk : 35.84MiB
estimated docs per chunk : 70389
Shard shard_7 at shard_7/mongo-17.2816.mongodbdns.com:27000,mongo-18.2816.mongodbdns.com:27000,mongo-19.2816.mongodbdns.com:27003,mongo-9.2816.mongodbdns.com:27003
data : 107.87MiB docs : 211499 chunks : 3
estimated data per chunk : 35.95MiB
estimated docs per chunk : 70499
Shard shard_8 at shard_8/mongo-19.2816.mongodbdns.com:27002,mongo-4.2816.mongodbdns.com:27002,mongo-8.2816.mongodbdns.com:27001,mongo-9.2816.mongodbdns.com:27001
data : 107.83MiB docs : 211154 chunks : 3
estimated data per chunk : 35.94MiB
estimated docs per chunk : 70384
Shard shard_9 at shard_9/mongo-10.2816.mongodbdns.com:27002,mongo-11.2816.mongodbdns.com:27003,mongo-12.2816.mongodbdns.com:27002,mongo-13.2816.mongodbdns.com:27002
data : 107.84MiB docs : 211483 chunks : 3
estimated data per chunk : 35.94MiB
estimated docs per chunk : 70494
Totals
data : 1.69GiB docs : 3396088 chunks : 32
Shard shard_0 contains 8.29% data, 8.29% docs in cluster, avg obj size on shard : 535B
Shard shard_1 contains 6.2% data, 6.21% docs in cluster, avg obj size on shard : 534B
Shard shard_2 contains 6.2% data, 6.21% docs in cluster, avg obj size on shard : 534B
Shard shard_3 contains 6.22% data, 6.22% docs in cluster, avg obj size on shard : 535B
Shard shard_4 contains 6.22% data, 6.22% docs in cluster, avg obj size on shard : 535B
Shard shard_5 contains 42% data, 41.92% docs in cluster, avg obj size on shard : 536B
Shard shard_6 contains 6.19% data, 6.21% docs in cluster, avg obj size on shard : 533B
Shard shard_7 contains 6.21% data, 6.22% docs in cluster, avg obj size on shard : 534B
Shard shard_8 contains 6.21% data, 6.21% docs in cluster, avg obj size on shard : 535B
Shard shard_9 contains 6.21% data, 6.22% docs in cluster, avg obj size on shard : 534B
I have 150+ similar collections where I have divided the data by user_id,
e.g. MyCollection_1_100000
MyCollection_100001_200000
MyCollection_200001_300000
Here I have put the data for user IDs ranging from 1 to 100000 in MyCollection_1_100000, and likewise for the other collections.
The shard key for all 150+ collections is a sequential number, but it is hashed. It was applied the following way:
db.MyCollection_1_100000.ensureIndex({"column": "hashed"})
sh.shardCollection("dbName.MyCollection_1_100000", { "column": "hashed" })
Please suggest corrective steps to get rid of this unbalanced shard problem.
Unsharded Collections
Shard 5 is the primary shard in your cluster, which means it holds all unsharded collections and therefore grows bigger in size. You should check for that. See here.
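A quick way to check (a sketch) is to look at which shard is the primary for each database and which collections are actually sharded; roughly speaking, anything not listed in config.collections lives entirely on the primary shard:
mongos> use config
mongos> db.databases.find({}, { _id: 1, primary: 1 })
mongos> db.collections.find({}, { _id: 1 })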
Chunk Split
As Markus pointed out, distribution is done by chunks and not by documents. Chunks may grow up to the defined chunk size; when they exceed it, they are split and redistributed. In your case there seems to be at least one collection where one shard holds one more chunk than the others. The reason could be either that the chunk has not yet reached its chunk size limit (check db.settings.find({ _id: "chunksize" }) in the config database; the default size is 64 MB, see also here), or that the chunk cannot be split because the range it represents cannot be split further automatically. You should check the ranges using the sh.status(true) command (the output of the ranges is omitted for some collections in the large output you posted).
However, you may split the chunk manually.
There is also a quite good answer on the dba forum.
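A manual split could look like the sketch below; sh.splitFind splits the chunk containing the matching document at its median point (the value 50000 is just an example, and for a hashed shard key you may need sh.splitAt with an explicit hashed boundary instead):
mongos> sh.splitFind("dbName.MyCollection_1_100000", { column: 50000 })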
Shard Key
If you have no unsharded collections, the problem may be the shard key itself. MongoDB suggests using a shard key with high cardinality and a high degree of randomness. Without knowing the value range of your column field, I assume its cardinality is rather low (say, 1000 distinct values) compared to, for example, a timestamp (a distinct value for almost every single entry, giving a LOT of different values).
Further, the data should be evenly distributed. Let's say you have 10 possible column values, but there are a lot more entries with one particular value: all of those entries would be written to the same shard. For example:
entries.count({column: "A"}) = 10 -> shard 0
entries.count({column: "B"}) = 10 -> shard 1
...
entries.count({column: "F"}) = 100 -> shard 5
The sh.status() command should give you some more information about the chunks.
Using the object id or a timestamp, both of which are monotonically increasing values, will lead to data being written to the same chunk as well.
So MongoDB suggests using a compound key, which leads to a higher cardinality (value range of field1 x value range of field2). In your case you could combine the column name with a timestamp.
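Declared at sharding time, such a compound key might look like the following (field and collection names are illustrative only; as noted below, this cannot be applied to the collections you already have):
mongos> sh.shardCollection("dbName.entries", { column: 1, created_at: 1 })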
But either way, you're out of luck with your current installation, as you cannot change the shard key afterwards.
DB Design
The verbose output you posted also indicates that you have several dbs/collections with the same schema or purpose, which looks to me like a form of manual partitioning. Is there a particular reason for this? This could also affect the distribution of data in the cluster, as every collection starts out being filled on the primary shard. There is at least one collection with just a single chunk on the primary, and some with 3 or 4 chunks in total, all having at least one chunk on the primary (i.e. the z_best_times_*).
Preferably you should have only a single collection per purpose and probably use a compound shard key (e.g. with a hashed timestamp in addition).