mongodb: sharded collection count keeps decreasing

I have a small cluster consisting of several shards, and every shard is a replica set of 2 data-bearing nodes and 1 arbiter. Sharding is enabled on a collection, let's say generator_v1_food.
I've stopped all the programs that update the collection (these programs perform ONLY upsert and find operations, no removes at all). Then the collection count returns results like this (at 2-3 second intervals). I've also turned off the balancer. The last lines of the log (on the shard I operated on) were all about the replica set.
mongos> db.generator_v1_food.find().count()
28279890
mongos> db.generator_v1_food.find().count()
28278067
mongos> db.generator_v1_food.find().count()
28278008
...
What is happening behind the scene? Any pointers would be great.

quote:
Just because you set the balancer state to "off" does not mean it isn't still running and finishing clean-up from the last moveChunk that was performed.
You should be able to see in the changelog collection in the config DB when the last moveChunk.commit event was - that's when the moveChunk process committed the documents from some chunk being moved to the new (target) shard. But after that, the old shard needs to asynchronously delete the documents that no longer belong to it. Since the "count" is taken from metadata and does not actually query for how many documents there are "for real", it will double count documents "in flight" during balancing rounds (or any that were not properly cleaned up, or that remain from aborted balancing attempts).
Asya
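For reference, here is a rough sketch of how you could look up the recent moveChunk.commit events mentioned above (run from mongos; the limit of 5 is arbitrary):
mongos> use config
mongos> db.changelog.find({ what: "moveChunk.commit" }).sort({ time: -1 }).limit(5).pretty()
// Each entry records the namespace, the chunk bounds and the shards involved, and the
// time the migration committed; documents left behind on the donor shard after that
// point are removed asynchronously.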

Related

MongoDB sharded cluster writing more records than inserted

I have a Spark dataframe with around 43 million records, which I'm trying to write to a Mongo collection.
When I write it to an unsharded collection, the number of output records is the same as the number I'm trying to insert. But when I write the same data to a sharded collection (hashed), the number of records increases by about 3 million.
What's interesting is that the number of records keeps fluctuating even after my Spark job has completed (no other connections to it).
When I did the same with a range-sharded collection, the number of records was consistent.
(Edit: even with the range-sharded collection, it started fluctuating after a while.)
Can someone help me understand why this is happening? Also, I'm sharding my collection because I have to write about 300 billion records every day and I want to increase my write throughput, so any other suggestions would be really appreciated.
I have 3 shards, each replicated on 3 instances
I'm not using any other options in the Spark Mongo connector, only ordered=False
Edit:
The count of records seemed to stabilize after a few hours at the correct number of records; still, it would be great if someone could help me understand why Mongo exhibits this behaviour.
The confusion comes from the difference between collection metadata and logical documents while balancing is in progress.
The bottom line is you should use db.collection.countDocuments() if you need an accurate count.
Deeper explanation:
When MongoDB shards a collection it assigns a range of the documents to each shard. As you insert documents these ranges will usually grow unevenly, so the balancer process will split ranges into smaller ones when necessary to keep their data size about the same.
It also moves these chunks between shards so that each shard has about the same number of chunks.
The process of moving a chunk from one shard to another involves copying all of the documents in that range, verifying they have all been written to the new shard, and then deleting them from the old shard. This means that the documents being moved will exist on both shards for a while.
When you submit a query via mongos, each shard performs a filtering stage to exclude documents in chunks that have not been fully moved to that shard, or that have not yet been deleted after a chunk has fully moved out.
To count documents with the benefit of this filter, use db.collection.countDocuments().
Each mongod maintains metadata for each collection it holds, which includes a count of documents. This count is incremented for each insert and decremented for each delete. The metadata count can't exclude orphan documents from incomplete migrations.
The document count returned by db.collection.stats() is based on the metadata. This means that if the balancer is migrating any chunks, the copied but not yet deleted documents will be reported by both shards, so the overall count will be higher.
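To make the distinction concrete, here is a minimal sketch run against mongos (the collection name is a placeholder):
mongos> db.myCollection.stats().count       // metadata-based: sums per-shard counts, includes orphans
mongos> db.myCollection.count()             // with no filter this is also metadata-based
mongos> db.myCollection.countDocuments({})  // aggregation path: applies the shard filter, excludes orphans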

Inserting data into an empty sharded collection in Mongo when the balancer is not enabled results in all data going to one shard

We have 2 MongoDB shard servers (each a 3-member replica set).
We created a sharded collection and inserted 200k documents. The balancer was disabled during that window; we enabled it after the first test and started inserting again.
In the first test all data was inserted into one shard and we got lots of warnings in the mongo log:
splitChunk cannot find chunk [{ articleId: MinKey, sessionId: MinKey },{ articleId: "59830791", sessionId: "fb0ccc50-3d6a-4fc9-aa66-e0ccf87306ea" }) to split, the chunk boundaries may be stale
The reason mentioned in the log is a possible low-cardinality shard key.
After the second and third tests, when the balancer was on, data was balanced across both shards.
We did one more test and stopped the balancer again; in this test data was going to both shards even though the balancer was off (pageIds and reader IDs were repeated from old tests, along with some new IDs for both).
Could you please explain how this mechanism works? Data should go to both shards no matter whether the balancer is ON or OFF when the key's cardinality is good.
Shard key is: (pageid) and (unique readerid)
Below are the insertion stats:
Pages read in the duration: 200k
Unique page IDs: 2,000
Unique sessions reading pages in the duration: 70,000
Thanks in Advance!
When you enable sharding for a database, a primary shard is assigned to that database.
If you insert data with the balancer disabled, all the data will go into the primary shard. The split process will calculate split points as your data grows, and chunks will get created.
Since your balancer is disabled, all of those chunks will remain on the same shard.
If your balancer is enabled, it will migrate those chunks between the shards, which results in better data distribution.
quote: We did one more test and stopped the balancer again; in this test data was going to both shards even though the balancer was off (pageIds and reader IDs were repeated from old tests, along with some new IDs for both).
By this point the data is already split into chunks and those chunks are well distributed between the 2 shards. If the range of your shard key is also distributed evenly among the chunks, then any new document will go into its respective chunk, which leads to even data distribution.
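If you want to check this yourself, one rough way is to count chunks per shard in the config database (the namespace is a placeholder; on newer server versions config.chunks references the collection UUID rather than an ns string):
mongos> use config
mongos> db.chunks.aggregate([
...       { $match: { ns: "mydb.myCollection" } },
...       { $group: { _id: "$shard", chunkCount: { $sum: 1 } } }
...     ])
// sh.status() prints a similar per-shard chunk summary for each sharded collection.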

Why do individual shards in MongoDB report more delete operations than the corresponding mongos in a sharded cluster?

So I have a production sharded MongoDB cluster that has 8 shards (replica sets) managed by mongos. Let's say I have 20 servers which are running my application and each of the servers runs a mongos process that manages the 8 shards.
Given this setup, when I check the number of ops on each of my mongos instances on the 20 servers, I can see that my numbers of inserts and deletes are in proportion - which is in accordance with my application logic. However, when I run mongostat --discover on the individual shards, I see that deletes are nearly 4x the number of inserts, which violates both my application logic and the 1:1 ratio indicated by mongos. Straightforward intuition suggests that mongos would write to only one shard, so the average ratio of inserts to deletes across individual shards should be the same as that on mongos (which the application writes to directly), unless mongos does something different internally with the shards.
Could anyone point me to any relevant info on why this would happen, or let me know if something could possibly be wrong with my infra?
Thanks
The reason for this is that I was running the remove() queries against mongos without specifying my shard key. In that case, mongos does not know which shard to direct the query to and thus broadcasts the query to all the shards, effectively performing more delete operations than a targeted query would.
Check the documentation for more information.
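As an illustration (collection and field names here are made up), compare a remove that includes the shard key with one that does not:
// Shard key in the filter: mongos can route the remove to the single shard
// that owns that key range.
mongos> db.myCollection.remove({ myShardKey: "abc123", status: "expired" })
// No shard key in the filter: mongos broadcasts the remove to every shard,
// so each shard records its own delete operations and the per-shard totals
// add up to far more than the single logical operation seen on mongos.
mongos> db.myCollection.remove({ status: "expired" })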

MongoDB shard removal taking too long

I have set up a 9-node MongoDB sharded cluster. I need to remove one of the nodes, so I invoked:
mongos> db.runCommand( { removeShard: "clustnode9:27017" } )
Keep in mind my dataset contains about 25 million documents, so approximately 2.7M documents per node.
For some reason the drain operation is taking a very long time. I can see that it's moving forward every half hour or so if I invoke:
mongos> db.collection.getShardDistribution()
I can see that my 9th shard is draining; it goes from 8.34% to 8.33%, and so on, but it's taking such a long time. Additionally, I actually need to remove not just this one shard but 3 more, and MongoDB will not allow me to remove more than one shard at a time, so I am concerned about how long the removal operation is taking and I'm looking for ways to speed up the drainage...
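One way to check drain progress more directly than getShardDistribution() is to re-issue the removeShard command; while a shard is draining, the command reports status instead of starting a new removal (shard name copied from the question above):
mongos> db.adminCommand( { removeShard: "clustnode9:27017" } )
// While draining, the response contains "msg" : "draining ongoing", "state" : "ongoing",
// and a "remaining" document with the number of chunks (and databases) still to be
// moved off the shard.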

How to remove chunks from mongodb shard

I have a collection where the shard key is a UUID (hexadecimal string). The collection is huge: 812 million documents, about 9600 chunks on 2 shards. For some reason I initially stored documents which had an integer instead of a UUID in the shard key field. Later I deleted them completely, and now all of my documents are sharded by UUID. But I am now facing a problem with chunk distribution. While I had documents with integers instead of UUIDs, the balancer created about 2700 chunks for these documents and left all of them on one shard. When I deleted all these documents, the chunks were not deleted; they stayed empty and they will always be empty because I only use UUIDs now. Since the balancer distributes chunks based on chunk count per shard, not document count or size, one of my shards takes 3 times more disk space than the other:
--- Sharding Status ---
db.click chunks:
set1 4863
set2 4784 // 2717 of them are empty
set1> db.click.count()
191488373
set2> db.click.count()
621237120
The sad thing here is that MongoDB does not provide commands to remove or merge chunks manually.
My main question is, would any of the following work to get rid of the empty chunks:
Stop the balancer. Connect to each config server, remove the ranges of the empty chunks from config.chunks, and also fix the minKey chunk to end at the beginning of the first non-empty chunk. Start the balancer.
Seems risky, but as far as I see, config.chunks is the only place where chunk information is stored.
Stop the balancer. Start a new mongod instance and connect it as a 3rd shard. Manually move all empty chunks to this new shard, then shut it down forever. Start the balancer.
Not sure, but as long as I don't use integer values in the shard key again, all queries should run fine.
Some might read this and think that the empty chunks are occupying space. That's not the case: chunks themselves take up no space; they are logical ranges of shard keys.
However, chunk balancing across shards is based on the number of chunks, not the size of each chunk.
You might want to add your voice to this ticket: https://jira.mongodb.org/browse/SERVER-2487
Since the MongoDB balancer only balances the number of chunks across shards, having too many empty chunks in a collection can cause the shards to be balanced by chunk count but severely unbalanced by data size per shard (e.g., as shown by db.myCollection.getShardDistribution()).
You need to identify the empty chunks and merge them into adjacent chunks that have data. This will eliminate the empty chunks. This is all now documented in the MongoDB docs (at least for 3.2 and above, maybe even earlier).
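As a rough sketch of the merge step (the shard key field name u and the bound values are placeholders; the real bounds must exactly match the min of the first and the max of the last chunk in a contiguous run on a single shard, taken from config.chunks):
mongos> db.adminCommand({
...       mergeChunks: "db.click",
...       bounds: [ { u: MinKey },
...                 { u: "00000000-0000-0000-0000-000000000000" } ]
...     })
// Repeat for each contiguous run of empty chunks, merging it into a neighbouring
// chunk that holds data; afterwards balancing by chunk count becomes meaningful again.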