MongoDB doesn't scale properly when adding a new shard with a collection already filled

My MongoDB sharded cluster's ingestion performance doesn't scale up when I add a new shard.
I have a small cluster set up with 1 mongos + 1 config replica set (3 nodes) + N shard replica sets (3 nodes each).
Mongos is on a dedicated Kubernetes node, and each mongod process hosting a shard has its own dedicated k8s node, while the config server processes run wherever they happen to be deployed.
The cluster is used mainly for GridFS file hosting, with a typical file being around 100 MB.
I am doing stress tests with 1, 2 and 3 shards to see if it scales properly, and it doesn't.
If I start a brand-new cluster with 2 shards and run my test, it ingests files at (approximately) twice the speed I had with 1 shard. But if I start the cluster with 1 shard, run the test, then add 1 more shard (2 shards total) and run the test again, the ingestion speed is roughly the same as it was with 1 shard.
Looking at where chunks go, when I start the cluster immediately with 2 shards the load is evenly balanced between the shards.
If I start with 1 shard and add a second one after some insertions, the chunks tend to all land on the old shard, and the balancer has to move them to the second shard later.
Quick facts:
chunksize 1024 MB
sharding key is GridFS file_id, hashed

This is due to how hashed sharding and balancing work.
In an empty collection (from Shard an Empty Collection):
The sharding operation creates empty chunks to cover the entire range of the shard key values and performs an initial chunk distribution. By default, the operation creates 2 chunks per shard and migrates across the cluster.
So if you execute sh.shardCollection() on a cluster with x shards, it will create 2 chunks per shard and distribute them across the shards, totalling 2x chunks across the cluster. Since the collection is empty, moving the chunks around takes little time. Your ingestion will then be distributed evenly across the shards (assuming other things hold, e.g. good cardinality of the hashed field).
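As a rough sketch of what that looks like in the mongo shell (the filestore database name and GridFS collection below are hypothetical, matching the setup described in the question):

    // Shard the (still empty) GridFS chunks collection on a hashed files_id,
    // as in the question's setup. Namespace is hypothetical.
    sh.enableSharding("filestore")
    sh.shardCollection("filestore.fs.chunks", { files_id: "hashed" })

    // On an empty collection with a hashed key, initial chunks are pre-created
    // and spread over whatever shards exist at this moment.
    sh.status()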
Now if you add a new shard after the chunks were created, that shard starts empty and the balancer will start to send chunks to it using the Migration Thresholds. In a populated collection, this process may take a while to finish.
If you start another ingestion while the balancer is still moving chunks around (chunks which may no longer be empty), the cluster is now doing two different jobs at the same time: 1) ingestion, and 2) balancing.
When you're doing this with 1 shard and add another shard, it's likely that the chunks you're ingesting into are still located in shard 1 and haven't moved to the new shard yet, so most data will go into that shard.
Thus you should wait until the cluster is balanced after adding the new shard before doing another ingestion. After it's balanced, the ingestion load should be more evenly distributed.
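For example, before starting the next ingestion you can check that balancing has finished and see how the collection is spread (reusing the hypothetical filestore.fs.chunks namespace from the sketch above):

    // Is a balancing round still in progress?
    sh.isBalancerRunning()

    // Per-shard document, data and chunk counts for the sharded collection
    db.getSiblingDB("filestore").getCollection("fs.chunks").getShardDistribution()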
Note: since your shard key is file_id, I'm assuming that each file is approximately the same size (~100 MB). If some files are much larger than others, some chunks will be busier than others as well.

Related

Inserting data into an empty sharded database in Mongo when the balancer is not enabled results in all data going to one shard

We have 2 MongoDB shard servers (a 3-member replica set each).
We created a sharded collection and inserted 200k documents. The balancer was disabled during that window; we enabled it after the first test and started inserting again.
In the first test all data was inserted into one shard, and we got lots of warnings in the mongod log:
splitChunk cannot find chunk [{ articleId: MinKey, sessionId: MinKey },{ articleId: "59830791", sessionId: "fb0ccc50-3d6a-4fc9-aa66-e0ccf87306ea" }) to split, the chunk boundaries may be stale
The reason mentioned in the log is a possible low-cardinality shard key.
After the second and third tests, when the balancer was on, data was balanced across both shards.
We did one more test with the balancer stopped again; this time data was going into both shards even though the balancer was off (the page IDs and reader IDs were repeated from the old tests, along with some new IDs for both).
Could you please explain how this mechanism works, since data should go to both shards no matter whether the balancer is ON or OFF when the key's cardinality is good?
The shard key is: (pageid) and (unique readerid)
Below are the insertion stats:
Pages read in the duration: 200k
Unique page IDs: 2,000
Unique sessions reading pages in the duration: 70,000
Thanks in Advance!
When you enable sharding for a database, a primary shard gets assigned to that database.
If you insert data with the balancer disabled, all the data will go into the primary shard. The chunk auto-split process will calculate split points as your data grows, and chunks will get created.
Since your balancer is disabled, all the chunks will remain on the same shard.
If the balancer is enabled, it will balance those chunks between the shards, which results in better data distribution.
We did one more test with the balancer stopped again; this time data was going into both shards even though the balancer was off (the page IDs and reader IDs were repeated from the old tests, along with some new IDs for both).
The data is already distributed in chunks, and these chunks are well distributed between the 2 shards. If the range of your shard key values is also distributed evenly among the chunks, then any new document will go into its respective chunk, which leads to even data distribution.
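As a quick illustration (mydb.pages below is a hypothetical namespace standing in for your collection), you can check the balancer state and how the documents ended up spread across the two shards:

    // Is the balancer enabled at all, and is a migration running right now?
    sh.getBalancerState()
    sh.isBalancerRunning()

    // Per-shard distribution of documents, data size and chunks
    db.getSiblingDB("mydb").pages.getShardDistribution()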

MongoDB bulk data load on sharded cluster: any specific settings I can take advantage of?

I have to insert 150 million records into a MongoDB sharded cluster. The cluster comprises 5 three-member replica sets (primary, secondary, arbiter). The records are uniform, about 1 KB each.
Given that I know the number and size of records, is there anything I should do with respect to shard configuration to optimize this data load? I'm thinking number/size of chunks, etc?

How to balance data when write overload is too heavy

I deployed a sharded cluster of two shards with MongoDB version 3.0.3.
Unfortunately, I chose a monotonically increasing shard key, like:
{insertTime: 1}
When the data size was small and the write speed was slow, the balancer could keep the data balanced between the two shards. But as the data grew and our write speed got much faster, the balancing became too slow to keep up.
Now the disk of one of the two shards, shard2, is nearly full.
How can I solve this problem without stopping our service and application?
I strongly suggest that you change your shard key while it's not too late, to avoid the predictable death of your cluster.
When a shard key increases monotonically, all write operations are sent to a single shard. The chunk receiving those writes will grow and then split, and you will keep hammering one of the resulting chunks until it splits again. At some point the cluster won't be balanced anymore, chunk moves will be triggered, and your cluster will slow down even more.
MongoDB generates ObjectId values upon document creation to produce a unique identifier for the object. However, the most significant bits of data in this value represent a time stamp, which means that they increment in a regular and predictable pattern. Even though this value has high cardinality, when using this, any date, or other monotonically increasing number as the shard key, all insert operations will be storing data into a single chunk, and therefore, a single shard. As a result, the write capacity of this shard will define the effective write capacity of the cluster.
You do not benefit from the good parts of sharding with this shard key; it actually performs worse than a single node would.
You should read this to select your new shard key and avoid the typical anti-patterns: http://docs.mongodb.org/manual/tutorial/choose-a-shard-key/
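One commonly suggested alternative to a monotonically increasing key is a hashed shard key, which spreads the increasing values across chunks. This is only a sketch (the namespace is hypothetical, and since the shard key of an existing collection cannot simply be changed in place, the data would have to be copied into a newly sharded collection):

    // Hypothetical new collection sharded on a hashed version of insertTime,
    // so monotonically increasing timestamps no longer all target one chunk.
    sh.shardCollection("mydb.events_v2", { insertTime: "hashed" })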
You could add a shard to the cluster to increase capacity.
From the docs:
You add shards to a sharded cluster after you create the cluster or any time that you need to add capacity to the cluster. If you have not created a sharded cluster, see Deploy a Sharded Cluster.
When adding a shard to a cluster, always ensure that the cluster has enough capacity to support the migration required for balancing the cluster without affecting legitimate production traffic.
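A sketch of what adding a shard looks like from a mongos (the replica set name and hostnames are placeholders):

    // Add a new shard by its replica set seed list
    sh.addShard("shard03rs/shard03-a.example.net:27017,shard03-b.example.net:27017")

The balancer will then start migrating chunks to the new shard, subject to the migration thresholds mentioned earlier.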

Migrating chunks from primary shard to the other single shard takes too long

Each chunk move takes about 30-40 mins.
The shard key is a random-looking but monotonically increasing integer string (a long sequence of digits). A "hashed" index is created for that field.
There are 150M documents each about 1.5Kb in size. The sharded collection has 10 indexes (some of them compound).
I have a total of ~11k chunks reported in sh.status(). So far only 42 of them have been transferred to the other shard.
The system consists of one mongos, one config server, one primary (mongod) shard and one other (mongod) shard, all on the same server, which has 8 cores and 32 GB of RAM.
I know the ideal is to use separate machines, but none of the CPUs were being utilized, so I thought it was good for a start.
What is your comment?
What do I need to investigate?
Is it normal?
As said on the mongodb documentation : " Sharding is the process of storing data records across multiple machines and is MongoDB’s approach to meeting the demands of data growth. As the size of the data increases, a single machine may not be sufficient to store the data nor provide an acceptable read and write throughput. Sharding solves the problem with horizontal scaling. With sharding, you add more machines to support data growth and the demands of read and write operations."
You should definitely not have your shards on the same machine. It is useless. The point of sharding is that you scale horizontally, so if you shard on the same machine you are just killing your throughput.
Your database will be faster without sharding if you have one machine.
To avoid data loss, before turning to sharding you should use: RAID (not RAID 0) and replica sets, and only then sharding.

MongoDB sharding, how does it rebalance when adding new nodes?

I'm trying to understand MongoDB and the concept of sharding. If we start with 2 nodes and partition say, customer data, based on last name where A thru M data is stored on node 1 and N thru Z data is stored on node 2. What happens when we want to scale out and add more nodes? I just don't see how that will work.
Having 2 nodes doesn't mean the data is partitioned into 2 chunks. It can be partitioned into, let's say, 10 chunks, with 6 of them on server 1 and the rest on server 2.
When you add another server, MongoDB is able to redistribute those chunks among the nodes of the new configuration.
You can read more in official docs:
http://www.mongodb.org/display/DOCS/Sharding+Introduction
http://www.mongodb.org/display/DOCS/Choosing+a+Shard+Key
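As a rough illustration of how that chunk distribution can be inspected (test.customers is a hypothetical namespace, and on older MongoDB versions the config.chunks collection is keyed by the ns field):

    // Count chunks per shard for a given collection, from the config database
    db.getSiblingDB("config").chunks.aggregate([
      { $match: { ns: "test.customers" } },
      { $group: { _id: "$shard", chunkCount: { $sum: 1 } } }
    ])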
If there are multiple shards available, MongoDB will start migrating data to other shards once you have a sufficient number of chunks. This migration is called balancing and is performed by a process called the balancer. The balancer moves chunks from one shard to another. For a balancing round to occur, a shard must have at least nine more chunks than the least-populous shard. At that point, chunks will be migrated off of the crowded shard until it is even with the rest of the shards.
When you add a new node to the cluster, MongoDB redistributes those chunks among the nodes of the new configuration.
That's a short extract; to get a complete understanding of how rebalancing works when adding a new node, read chapter 2, "Understanding Sharding", of Kristina Chodorow's book "Scaling MongoDB".