MongoDB poor performance a few hours after data deletion

I've got 3 MongoDB (v3.4.10) servers (256 GB RAM, 1 TB HDD, 12 CPUs each) set up as a replica set. The servers are under decent load, and HDD space is being eaten up quite rapidly. I'm considering sharding the big collections, but I'm not there yet.
In the meantime, the typical scenario I face:
In the morning I see an alert that the database HDD is 92% used.
Around midday I delete a bunch of redundant data from the big collections (1M-4M entries) on the primary. I either update the collection like this:
update({}, {'$unset' : {'key_1' : true, 'key_2' : true, 'key_3' : true}}, {"multi" : 1})
or create a new collection, insert only the needed data there, and drop the old one.
In the evening (about 4-5 hours after the deletion, usually the peak of the load) Mongo response times increase dramatically, from 3-4 ms to 500 ms. This period lasts for a while, during which my application is almost down. Performance only returns to normal after I stop my application completely for 10-20 minutes and then start it again.
On days when I do not delete data, the database performs normally.
I have read a bit about the oplog and the nuances of deleting data on replicated servers. However, in my case the lag between the deletion and the performance drop is several hours.
Is there any internal Mongo process that happens hours after a massive update/insert? How should I bulk update/insert to avoid this?
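One mitigation that is often suggested for this kind of mass update (a sketch only, not from the original thread; the collection and key names are placeholders) is to break the single multi-update into smaller batches keyed on _id, pausing between batches so the cache, the oplog, and the secondaries are not churned all at once:
// Sketch: batch the $unset over _id ranges instead of one huge multi-update.
// "big_collection" and the key names are placeholders.
var lastId = null;
var batchSize = 10000;
while (true) {
    var query = (lastId === null) ? {} : { _id: { $gt: lastId } };
    var ids = db.big_collection.find(query, { _id: 1 })
                               .sort({ _id: 1 })
                               .limit(batchSize)
                               .toArray()
                               .map(function (d) { return d._id; });
    if (ids.length === 0) break;
    db.big_collection.update(
        { _id: { $in: ids } },
        { $unset: { key_1: true, key_2: true, key_3: true } },
        { multi: true }
    );
    lastId = ids[ids.length - 1];
    sleep(500); // give replication and the cache time to catch up
}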

Related

Can I prevent disk filling up by changing a Mongodb collection from TTL to Capped?

I have a MongoDB 4.4 server (single node, no replicas) for storing IoT-style data. Data is written to several collections every few seconds by my Node.js app. Documents are not updated or modified after insertion, and reads are less common than writes.
I have TTL indices on my big collections so that data older than 6 months is deleted. However, Mongo seems to consume more and more disk space. When the disk inevitably fills up, Mongo and my app stop working. I need to stop Mongo from consuming increasing amounts of disk space.
If I call stats() on my big collections, I can see that there are gigabytes of "file bytes available for reuse". But when I use db.runCommand({compact:'big_collection'}), it doesn't seem to release any space. Other people seem to have similar experiences. I wish I understood why compact isn't working.
I suspect the best alternative is to remove the TTL index and instead cap the collection to a fixed size, but I'd like to hear whether anyone has experience with such a process, or has alternative recommendations.
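Not an answer from the thread, but two shell commands relevant to this decision (the size given to convertToCapped is only an example; note that convertToCapped locks the database and carries over only the _id index):
// How much space WiredTiger reports as reusable for this collection.
var s = db.big_collection.stats();
print(s.wiredTiger["block-manager"]["file bytes available for reuse"]);

// Convert the collection to a capped collection of a fixed size (in bytes).
db.runCommand({ convertToCapped: "big_collection", size: 50 * 1024 * 1024 * 1024 });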

MongoDB degrading write performance over time

I am importing a lot of data (18 GB, 3 million documents) over time, and almost all of the data is indexed, so there is a lot of indexing going on. The system consists of a single client (a single process on a separate machine) establishing a single connection (using pymongo) and doing insertMany in batches of 1000 docs.
MongoDB setup:
single instance,
journaling enabled,
WiredTiger with default cache,
RHEL 7,
version 4.2.1
192GB RAM, 16 CPUs
1.5 TB SSD,
cloud machine.
When I start the server (after a full reboot) and insert the collection, it takes 1.5 hours. If the server has been running for a while inserting other data (from a single client) and I then delete the collection and insert the same data again, it takes 6 hours (there is still more than 60% of the disk free, and nothing else is connecting to the db). It feels like server performance degrades over time; maybe it is OS specific. Any similar experiences or ideas?
I faced a similar issue; the problem was RAM.
After a full restart the server had all of its RAM free, but after the insertions the RAM was full. Deleting the collection and inserting the same data again can take longer because some RAM is still utilised and less is free for mongo.
Try freeing up RAM and cache after you drop the collection, and check whether the same behaviour persists.
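If you want to verify this, a minimal check (not part of the original answer, and assuming the default WiredTiger engine) is to compare the cache usage reported by serverStatus() before and after dropping the collection:
// How much of the WiredTiger cache is currently in use vs. its configured maximum.
var cache = db.serverStatus().wiredTiger.cache;
print("bytes currently in the cache: " + cache["bytes currently in the cache"]);
print("maximum bytes configured:     " + cache["maximum bytes configured"]);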
As you haven't provided any specific details, I would recommend enabling profiling; this will let you examine performance bottlenecks. In the mongo shell run:
db.setProfilingLevel(2)
Then run:
db.system.profile.find( { "millis": { "$gt": 10 } }, { "millis": 1, "command": 1 }) // find operations over 10 milliseconds
Once done, reset the profiling level:
db.setProfilingLevel(0)

MongoDB cluster suddenly hangs on queries with read/write concern

We have a MongoDB cluster with 5 PSA replica sets and one sharded collection: about 3.5 TB of data and 2 billion docs on the primaries. Average insert rate: 300 rps. Average select rate: 1000 rps. MongoDB version 4.0.6. The collection has only one extra unique index, and all read queries use one of the indexes (no long-running queries).
PROBLEM. Sometimes (4 times in the last 2 months) one of the nodes stops responding to queries that specify a read concern or write concern. The same query without a read/write concern executes successfully, whether run locally or through mongos. The affected queries never complete: no errors, no timeouts, even when restarting the mongos that initiated the query. There are no errors in the mongod logs and no errors in the system logs. Restarting the node fixes the problem. MongoDB sees such a broken node as healthy; rs.status() shows that everything is ok.
We have no idea how to reproduce this problem; much more intense load testing has produced no results.
We would appreciate any help and suggestions.
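For context (an illustration, not from the original report; the collection name, filter, and concern levels are placeholders), the queries that hang are the ones that pin a concern explicitly, e.g.:
// A read with an explicit read concern...
db.runCommand({
    find: "sharded_collection",
    filter: { user_id: 12345 },
    readConcern: { level: "majority" }
});

// ...or a write with an explicit write concern.
db.sharded_collection.insert(
    { user_id: 12345, payload: "..." },
    { writeConcern: { w: "majority", wtimeout: 5000 } }
);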

Any performance gotchas when doing mass deletes from MongoDB

We are looking at writing log information to a MongoDB logging database but have essentially zero practical experience running Mongo in a production environment.
Every day we'll be writing a million+ log entries. Logs older than (say) a month need to be purged (say) daily. My concern is how Mongo will handle these deletes.
What are the potential issues with this plan with Mongo?
Do we need to chunk the deletes?
Given we'll be deleting by chronological age (ie: insert order), can I assume fragmentation will not be an issue?
Will the database need to be compacted regularly?
Potential issues: None, if you can live with eventual consistency.
No. A far better approach is to have an (ISO)Date field in your documents and set up a TTL index on it. Assuming the mentioned field holds the time at which the log entry was made, you would set up said index like:
db.yourCollection.createIndex(
    { "nameOfDateField": 1 },
    // 60 seconds * 60 minutes * 24 hours * 30 days (commercial month) = 2592000
    { "expireAfterSeconds": 2592000 }
)
This way, a background thread in mongod takes care of deleting the expired data, turning the collection into a sort of round-robin database. Fewer moving parts, less to care about. Please note that documents will not be deleted the very second they expire. Under the worst circumstances it can take up to 2 minutes from their time of expiration (iirc) before they are actually deleted; at the median, an expired document should be deleted some 30 seconds after its expiration.
Compacting does not reclaim disk space on MMAPv1, only on WiredTiger. Keep in mind that documents are never fragmented. Given that the database being compacted will be locked, I have yet to find a proper use case for the compact command. If disk space is your concern: freed space in the data files will be reused. So yes, in a worst-case scenario you can have a few additional data files allocated. Since I don't know the project's requirements and details, it is you who must decide whether reclaiming a few GB of disk space is worth locking the database for extended periods of time.
You can configure MongoDB for log file rotation:
http://docs.mongodb.org/manual/tutorial/rotate-log-files/
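A rotation can also be triggered on demand from the shell (a minimal example; see the linked tutorial for file naming and configuration details):
// Ask the server to rotate its log file.
db.adminCommand({ logRotate: 1 })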
You'd certainly be interested in the "Manage journaling" section too:
http://docs.mongodb.org/manual/tutorial/manage-journaling/
My last suggestion is about the smallfiles option:
Set to false to prevent the overhead of journaling in situations where durability is not required. To reduce the impact of the journaling on disk usage, you can leave journal enabled, and set smallfiles to true to reduce the size of the data and journal files.

MongoDB upsert operation blocks inconsistently (with syncdelay set to 0)

There is a database with 9 million rows and 3 million distinct entities. This database is loaded into MongoDB every day using the Perl driver. It runs smoothly on the first load; however, from the second, third, etc. load onwards the process is really slowed down. It blocks for long stretches every now and then.
I initially realised that this was because of the automatic flushing to disk every 60 seconds, so I tried setting syncdelay to 0, and I also tried the nojournal option. I have indexed the fields that are used for the upsert. I have also observed that the blocking is inconsistent and not always at the same point for the same line.
I have 17 GB of RAM and enough hard disk space. I am replicating on two servers with one arbiter. I do not have any significant processes running in the background. Is there a solution or explanation for such blocking?
UPDATE: The mongostat tool shows in the 'res' column that around 3.6 GB is used.
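Not from the original post, but for illustration, the upsert pattern described (written in the mongo shell, with placeholder collection and field names) would look roughly like this:
// Match on the indexed entity key, update its fields, insert if the entity is new.
db.entities.update(
    { entity_id: "E-12345" },
    { $set: { value: 42, loaded_at: new Date() } },
    { upsert: true }
);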