How to shrink MongoDB's oplog.rs collection? - mongodb

After storing some binary data in MongoDB 4.2.5 (a 3-node replica set), the oplog.rs collection grew to about 700 MB. The binary data was removed and the data model restructured, but the oplog.rs collection stays the same size (as expected). I understand that it's a capped collection with a maximum size and that it will eventually reuse the space. In my case, though, I'd like to reclaim the space and start over. The database is used mostly for internal testing purposes. I don't mind losing some data from the oplog, but I do mind having a big oplog file, since the whole database is just a few MB.
Is it safe to use the emptycapped command on the oplog.rs collection in a replica set scenario? Do I need to run this command on each node? Do I need to compact the collection after the deletion (the last part of https://docs.mongodb.com/manual/tutorial/change-oplog-size/)?
Is there any other way to gracefully "reset" the oplog and free up the space?

The oplog is limited to whatever size you have defined in your configuration, or to the default if you have left it unset.
The oplog (operations log) is a special capped collection that keeps a rolling record of all operations that modify the data stored in your databases.
It fills up to the defined size as changes come through (or as no-op heartbeats are written).
If you want to reduce the size, change the oplog size in your configuration (see the sketch below). But don't forget: a larger oplog means a longer oplog window.
The oplog window tells you how long a secondary member can be offline and still catch up to the primary without doing a full resync.
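Since MongoDB 3.6 (so including the 4.2.5 setup in the question) the oplog can also be resized at runtime with the replSetResizeOplog admin command, run separately on each member. A minimal sketch, assuming a mongo shell connected to one member and the minimum allowed size of 990 MB:

// Check the current oplog size and window on this member
rs.printReplicationInfo()

// Resize the oplog on this member; size is given in megabytes,
// and 990 MB is the smallest value MongoDB accepts
db.adminCommand({ replSetResizeOplog: 1, size: 990 })

Note that shrinking the maximum size does not by itself hand disk space back to the operating system; the optional compact step at the end of the change-oplog-size tutorial linked in the question covers reclaiming it.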

Related

Mongo Collection with TTL index not reclaiming disk space

I have a mongo collection with a TTL index. I see the documents are getting evicted as expected, but I don't see the disk space getting reclaimed. Has anyone seen this kind of issue?
Let me know if you need more details.
We have discussed this with the MongoDB team, and based on all that information there are a couple of things to keep in mind; it's not that easy.
If you already have extra space in the data files, TTL will delete documents and ideally the space should be reclaimed, but it will not be.
If new documents come in, the freed space will be reused.
If the data size is going to remain constant, you need to run the compact command (see the sketch below). In a sharded cluster it has to be run on each shard.
Another option is to create a new collection, move your data into the new collection, and once done drop the previous collection.
Or take a backup of this collection, drop the collection, and then restore it.
After all of these things, there is a possibility that MongoDB is still holding on to everything, and you need to restart the cluster; once restarted it will release the storage.
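A minimal sketch of the compact option in the mongo shell, assuming a hypothetical collection named events; in a sharded cluster you would connect to each shard's mongod and run it there:

// Rewrite the collection's data files and release unused space;
// depending on version this can block operations on the collection,
// so schedule it for a maintenance window
db.runCommand({ compact: 'events' })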

Does MongoDB always write to the primary shard and then rebalance?

use vsm;
sh.enableSharding('vsm');
sh.shardCollection('vsm.pricelist', {maker_id:1});
OK, we enabled sharding for the database (vsm) and a collection in this database (pricelist).
We are trying to write about 80 million documents to the 'pricelist' collection.
We have about 2000 different maker_ids, distributed uniformly.
We have three shards, and Shard002 is the PRIMARY shard for the 'vsm' database.
We write to the 'pricelist' collection from four application nodes, each with its own mongos.
And while writing data to the 'pricelist' collection we see 100% CPU usage ONLY on Shard002!
We do see the rebalancing process, and data migrates to Shard000 and Shard003, but Shard002 still has high CPU usage and load average!
The shards are deployed on c4.xlarge EBS-optimized instances; dbdata is stored on io1 EBS volumes with 2000 IOPS.
It looks like MongoDB writes data only to one shard :( What are we doing wrong?
The problem
What you describe is usually an indication that you have chosen a poor shard key with maker_id, most likely one that is monotonically increasing.
What usually happens is that one shard is assigned the key range from x to infinity (Shard002 in your case). All new documents then get written to that shard until it holds more chunks than the current migration threshold allows. At that point the balancer kicks in and moves some chunks, but the problem is that new documents still get written to that same shard.
The solution
An easy solution for that problem is to use a hashed shard key.
Now here comes the serious problem: you cannot change the shard key of an existing collection.
So what you have to do is take a backup of the sharded collection, drop it, reshard it using the hashed maker_id, and restore the backup into the new collection.
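A minimal sketch of that backup/reshard/restore procedure, assuming mongodump and mongorestore are available and a maintenance window in which the collection can be dropped:

// From the OS shell: dump only the sharded collection
//   mongodump --db vsm --collection pricelist --out /backup

// In the mongo shell: drop the old collection and reshard it with a hashed key
db.getSiblingDB('vsm').pricelist.drop()
sh.shardCollection('vsm.pricelist', { maker_id: 'hashed' })

// From the OS shell: restore the data into the freshly sharded collection
//   mongorestore --db vsm --collection pricelist /backup/vsm/pricelist.bson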
Does MongoDB always write to the primary shard and then rebalance?
Yes, if you are relying on the auto-balancer and loading huge amounts of data into an empty collection.
In your situation, you are relying on the auto-balancer to do all the sharding/balancing work. I assume what you want is for the data to go to each shard as it is loaded, so that no single shard carries all the CPU load.
This is how sharding/auto-balancing takes place at a high level.
Create chunks of data using split http://docs.mongodb.org/manual/reference/command/split/
Move the chunks to other shards http://docs.mongodb.org/manual/reference/command/moveChunk/#dbcmd.moveChunk
Now, when the auto-balancer is ON, these two steps occur only once your data is already loaded or is being loaded.
Solution
Create an empty collection (the one your data is going to be loaded into) and execute the shard command on it.
Turn off the auto-balancer. http://docs.mongodb.org/manual/tutorial/manage-sharded-cluster-balancer/#disable-the-balancer
Manually create empty chunks using split. http://docs.mongodb.org/manual/tutorial/create-chunks-in-sharded-cluster/
Move those empty chunks to different shards. http://docs.mongodb.org/manual/tutorial/migrate-chunks-in-sharded-cluster/
Start the load. This time all your data should go directly to its respective shards.
Turn the balancer back ON (once the load is complete). http://docs.mongodb.org/manual/tutorial/manage-sharded-cluster-balancer/#enable-the-balancer
You will have to test this approach with a small data set first, but I think I have given you enough information to get started; a rough sketch of the shell commands follows.
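A minimal sketch of those steps in the mongo shell, using the vsm.pricelist collection from the question; the split points and shard names below are examples and need to be adapted to your maker_id distribution and actual shard IDs:

// 1. Shard the still-empty collection
sh.enableSharding('vsm')
sh.shardCollection('vsm.pricelist', { maker_id: 1 })

// 2. Turn the balancer off while pre-splitting
sh.stopBalancer()

// 3. Manually create empty chunks at chosen split points (example values)
sh.splitAt('vsm.pricelist', { maker_id: 700 })
sh.splitAt('vsm.pricelist', { maker_id: 1400 })

// 4. Move the empty chunks onto the other shards (example shard names)
sh.moveChunk('vsm.pricelist', { maker_id: 700 }, 'shard0000')
sh.moveChunk('vsm.pricelist', { maker_id: 1400 }, 'shard0003')

// 5. Load the data from the application, then turn the balancer back on
sh.startBalancer()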

Will updating a document's key values use more space in MongoDB?

In MongoDB, if I continuously update key values of a document in a collection, will it consume more space? If I update a value 100 thousand times, will space be wasted on the hard disk?
Basically it won't use more space, as the writes happen in place; if the new value doesn't require more space, MongoDB won't have to allocate more.
About rapid updates: MongoDB writes are lazy, so it can group multiple writes into one physical write to the disk.
You can find more info here.
Please note that if you have logging enabled it will use more disk space, but that depends on your configuration.
MongoDB's dbStats command shows the database storage usage; try using it.
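A minimal sketch of checking storage usage with dbStats in the mongo shell; the scale argument just converts the reported byte counts to megabytes:

// Data size, storage size and index size for the current database, in MB
db.runCommand({ dbStats: 1, scale: 1024 * 1024 })

// The shell helper form of the same command
db.stats(1024 * 1024)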

Is space reclaimed when I delete an index in MongoDB?

I have looked around the whole documentation but can't figure out if this actually happens or not. If I remove an index from a collection in MongoDB, does it delete the index files right away? Is space reclaimed?
No, MongoDB won't automatically release disk space after collection data or indexes are deleted. Allocating new files is relatively slow compared to other operations in a high-performance database, so MongoDB keeps all previously allocated files open and available by design.
If you need to reclaim disk space, use the repairDatabase command (sketched below), which achieves compaction as a side effect of its checking/fixing functionality.
An alternative that is available when using replica sets is to add a new member and let it sync; the data will be inserted fairly compactly into the new member's database extent files. To compact all members you would do this in a rolling fashion, and probably force the primary to step down at the end so it can be re-synced too.
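A minimal sketch of the repairDatabase option in the mongo shell; keep in mind it rewrites all of the database's files, needs free disk space roughly on the order of the current data set, and blocks the database while it runs:

// Rebuild the current database's data files; compaction happens as a side effect
db.repairDatabase()

// Equivalent command form
db.runCommand({ repairDatabase: 1 })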

TTL index on oplog or reducing the size of oplog?

I am using MongoDB with Elasticsearch for my application. Elasticsearch builds its indexes by monitoring the oplog collection. When both applications are running constantly, any changes to the collections in MongoDB are immediately indexed. The only problem I face is that if for some reason I have to delete and recreate the index, it takes ages (2 days) for the indexing to complete.
When I looked at the size of my oplog, its capacity is 40 GB by default and it is holding around 60 million transactions, which is why creating a fresh index takes so long.
What would be the best way to optimize fresh index creation?
Should I reduce the size of the oplog so that it holds fewer transactions while still not affecting my replication, or is it possible to create a TTL index (which I failed to do on several attempts) on the oplog?
I am using elasticsearch with mongodb using mongodb river https://github.com/richardwilly98/elasticsearch-river-mongodb/.
Any help to overcome the above mentioned issues is appreciated.
I am not an Elasticsearch pro, but your question:
What would be the best way to optimize fresh index creation?
does apply a little to everyone who uses third-party full-text search (FTS) technologies with MongoDB.
The first thing to note is that if you have A LOT of records, there is no easy way around this unless you are prepared to lose some of them.
The oplog isn't really a good fit for this; personally I would look at using a custom script driven by timers in the main collection, or a change table that gives you a single place to quickly query for new or updated records.
Unless you are filtering the oplog to get specific records, i.e. inserts, you could be pulling out ALL oplog records, including deletes, collection operations and even database operations. So you could try stripping out unneeded records from your oplog search; however, that then creates a new problem: the oplog has no indexes and no index updating.
This means that if you start to read it in a more targeted manner you will actually be running an unindexed query over those 60 million records, which will result in slow(er) performance.
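For illustration, a filtered oplog read might look like the sketch below; op: 'i' selects insert entries and ns is the namespace being watched (mydb.mycollection is a hypothetical name). Without an index this still scans the whole capped collection:

// Read only the insert entries for one namespace from the oplog
db.getSiblingDB('local').oplog.rs.find({ op: 'i', ns: 'mydb.mycollection' })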
The fact that the oplog has no index updating also answers another one of your questions:
is it possible to create a TTL index (which I failed to do on several attempts) on the oplog?
Nope.
As for the other one of your questions:
Should I reduce the size of the oplog so that it holds fewer transactions
Yes, but you will have a smaller replication recovery window, and on top of that you will lose records from your "fresh" index, so only a part of your data is actually indexed. I am unsure from your question whether this is a problem or not.
You can reduce the oplog for a single secondary member that no other member is syncing from. Look up rs.syncFrom and "Change the Size of the Oplog" in the MongoDB docs.
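As a small illustration of the rs.syncFrom part: you run it on a member to pick that member's sync source, which lets you steer other members away from the secondary whose oplog you plan to shrink (the host name below is hypothetical):

// Run on another member so it syncs from the primary instead of the
// secondary whose oplog is being reduced
rs.syncFrom('rs-primary.example.net:27017')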