MongoDB very slow deletes - mongodb

I've got a small replica set of three mongod servers (16GB RAM each, at least 4 CPU cores and real HDDs) and one dedicated arbiter. The replicated data has about 100,000,000 records currently. Nearly all of this data is in one collection with an index on _id (the auto-generated Mongo ID) and date, which is a native Mongo date field. Periodically I delete old records from this collection using the date index, something like this (from the mongo shell):
db.repo.remove({"date" : {"$lt" : new Date(1362096000000)}})
This does work, but it runs very, very slowly. One of my nodes has slower I/O than the other two, having just a single SATA drive. When this node is primary, the deletes run at about 5-10 documents/sec. By using rs.stepDown() I have demoted this slower primary and forced an election to get a primary with better I/O. On that server, I am getting about 100 docs/sec.
My main question is, should I be concerned? I don't have the numbers from before I introduced replication, but I know the delete was much faster. I'm wondering if the replica set sync is causing I/O wait, or if there is some other cause. I would be totally happy with temporarily disabling sync and index updates until the delete statement finishes, but I don't know of any way to do that currently. For some reason, when I disable two of the three nodes, leaving just one node and the arbiter, the remaining node is demoted and writes are impossible (isn't the arbiter supposed to solve that?).
To give you some indication of the general performance, if I drop and recreate the date index, it takes about 15 minutes to scan all 100M docs.

This is happening because even though
db.repo.remove({"date" : {"$lt" : new Date(1362096000000)}})
looks like a single command it's actually operating on many documents - as many as satisfy this query.
When you use replication, every change operation has to be written to a special collection in the local database called oplog.rs - oplog for short.
The oplog has to have an entry for each deleted document and every one of those entries needs to be applied to the oplog on each secondary before it can also delete the same record.
One thing I can suggest that you consider is TTL indexes - they will "automatically" delete documents based on expiration date/value you set - this way you won't have one massive delete and instead will be able to spread the load more over time.

Another suggestion that may not fit you, but it was optimal solution for me:
drop indeces from collection
iterate over all entries of collection and store id's of records to delete into memory array
each time array is big enough (for me it was 10K records), i removed these records by ids
rebuild indeces
It is the fastest way, but it requires stopping the system, which was suitable for me.

Related

How to efficiently Shard a MongoDb collections that already has millions of documents?

I have a collection named order_error. Which has over 60 million documents. Today I was trying to shard it. I have 3 replica sets. Initially, no issues were there. The balancer was distributing the chunks among the clusters. But eventually, it has started to consume all Ram space and after all swap space too. Now everything is unresponsive. We can't follow this procedure in production. We need a better solution for that. How can I do the sharding in a better way?
If someone could help me with that please let me know
When you insert documents into an empty collection, then initially all date will be written to the primary shard, so it will not solve your issue.
But you can use sh.splitAt on empty collection to pre-split the it.
Note, even if the collection is empty it will take some time till chunks are distributed over all shards! When you split a chunk, then it still remains on the current shard. Check with db.collection.getShardDistribution() whether chunks are evenly distributed.

Intercept or filter out oplog transactions from MongoDB

There is a MongoDB which has interesting data I want to examine. Unfortunately, due to size concerns, once every 48 hours, the database is purged of "old" records.
I created a replica set with a secondary database system that has priority 0 and vote 0 so as not to interfere with the main database performance. This works great as I can query the secondary and get my data. However, there are many occasions that my system cannot process all the records in time and will lose some old records if I did not get to them within 48 hours.
Is there a way where I can cache the oplog on another system which I can then process at my leisure, possibly filtering out the deletes until I am ready?
I considered the slavedelay parameters, but that will affect all transactions. I also looked into Tungsten Replicate as a solution so I can essentially cache the the oplogs, however, they do not support MongoDB as a source of the data.
Is the oplog stored in plain text on the secondary such that I can read it and extract what I want from it.
Any pointers to this would be helpful, unfortunately I could not find much documentation on Oplog in the MongoDB website.
MongoDB oplog is stored as a capped collection called 'oplog.rs' in your local DB:
use local
db.oplog.rs.find()
If you want to store more old data in oplog for later use, you can try to increase the size of that collection. See http://docs.mongodb.org/manual/tutorial/change-oplog-size/
Alternatively, you can recreate oplog.rs as an uncapped collection (though this is not recommended since you will have to maually clean up oplog). Follow the same steps to increase the size above, but when recreating the oplog, use this command
db.runCommand( { create: "oplog.rs", capped: false})
Another solution is to create a cron job with the following command dump oplog into the folder YYYYMMDD:
mongodump --db local --collection oplog.rs -o $(date +%Y%m%d)
Hope that helps.
I wonder why you would do that manually. The "canonical" way to do it is to either identify the lifetime or expiration date of a record. If it is a lifetime, you'd do sth like
db.collection.insert({'foo':'bar' [...], created: ISODate("2014-10-06T09:00:05Z")})
and
db.collection.ensureIndex({'created':1},{expireAfterSeconds:172800})
By doing so, a thread called TTLMonitor will awake every minute and remove all documents which have a created field which is older than two days.
If you have a fixed expiration date for each document, you'd basically do the same:
db.collection.insert({'foo':'bar' [...], expirationDate: ISODate("2100-01-01T00:00:00Z"})
and
db.collection.ensureIndex({expirationDate:1},{expireAfterSeconds:0})
This will purge the documents in the first run of TTLMonitor after the expirationDate.
You could adjust expireAfterSeconds to a value which safely allows you to process the records before they are purged, keeping the overall size at acceptable needs and making sure that even when your application goes down during the purging work, the records are removed. (Not to mention that you don't need to maintain the purging logic yourself).
That being said and in the hope it might be useful to you, I think your problem is a conceptual one.
You have a scaling problem. You system is unable to deal with peaks, hence it occasionally is unable to process all data in time. Instead of fiddling with the internals of MongoDB (which might be quite dangerous, as #chianh correctly pointed out), you should rather scale accordingly by identifying your bottleneck and scale it according to your peaks.

TTL index on oplog or reducing the size of oplog?

I am using mongodb with elasticsearch for my application. Elasticsearch creates indexes by monitioring oplog collection. When both the applications are running constantly then any changes to the collections in mongodb are immediately indexed. The only problem I face is if for some reason I had to delete and recreate the index then it takes ages(2days) for the indexing to complete.
When I was looking at the size of my oplog by default it's capacity is 40gb and its holding around 60million transactions because of which creating a fresh index is taking a long time.
What would be the best way to optimize fresh index creation?
Is it to reduce the size of oplog so that it holds less number of transactions and still not affect my replication or is it possible to create a ttl index(which I failed to do on several attempts) on oplog.
I am using elasticsearch with mongodb using mongodb river https://github.com/richardwilly98/elasticsearch-river-mongodb/.
Any help to overcome the above mentioned issues is appreciated.
I am not a Elastic Search Pro but your question:
What would be the best way to optimize fresh index creation?
Does apply a little to all who use third party FTS techs with MongoDB.
The first thing to note is that if you have A LOT of records then there is no easy way around this unless you are prepared to lose some of them.
The oplog isn't really a good idea for this, you should probably seek out using a custom script using timers in the main collection to do this personally, or a change table giving you a single place to quickly query for new or updated records.
Unless you are filtering the oplog to get specific records, i.e. inserts, then you could be pulling out ALL oplog records including deletes, collection operations and even database operations. So you could try stripping out unneeded records from your oplog search, however, this then creates a new problem; the oplog has no indexes or index updating.
This means that if you start to read in a manner more appropiate you will actually use an unindexed query over these 60 million records. This will result in slow(er) performance.
The oplog having no index updating answers another one of your questions:
is it possible to create a ttl index(which I failed to do on several attempts) on oplog.
Nope.
As for the other one of your questions:
Is it to reduce the size of oplog so that it holds less number of transactions
Yes, but you will have a smaller recovery window of replication and not only that but you will lose records from your "fresh" index so only a part of your data is actually indexed. I am unsure, from your question, if this is a problem or not.
You can reduce the oplog for a single secondary member that no replica is synching from. Look up rs.syncFrom and "Change the Size of the Oplog" in the mongodb docs.

MongoDB fast deletion best approach

My application currently use MySQL. In order to support very fast deletion, I organize my data in partitions, according to timestamp. Then when data becomes obsolete, I just drop the whole partition.
It works great, and cleaning up my DB doesn't harm my application performance.
I would want to replace MySQL with MongoDB, and I'm wondering if there's something similiar in MongoDB, or would I just need to delete the records one by one (which, I'm afraid, will be really slow and will make my DB busy, and slow down queries response time).
In MongoDB, if your requirement is to delete data to limit the collection size, you should use a capped collection.
On the other hand, if your requirement is to delete data based on a timestamp, then a TTL index might be exactly what you're looking for.
From official doc regarding capped collections:
Capped collections automatically remove the oldest documents in the collection without requiring scripts or explicit remove operations.
And regarding TTL indexes:
Implemented as a special index type, TTL collections make it possible to store data in MongoDB and have the mongod automatically remove data after a specified period of time.
I thought, even though I am late and an answer has already been accepted, I would add a little more.
The problem with capped collections is that they regularly reside upon one shard in a cluster. Even though, in latter versions of MongoDB, capped collections are shardable they normally are not. Adding to this a capped collection MUST be allocated on the spot, so if you wish to have a long history before clearing the data you might find your collection uses up significantly more space than it should.
TTL is a good answer however it is not as fast as drop(). TTL is basically MongoDB doing the same thing, server-side, that you would do in your application of judging when a row is historical and deleting it. If done excessively it will have a detrimental effect on performance. Not only that but it isn't good at freeing up space to your $freelists which is key to stopping fragmentation in MongoDB.
drop()ing a collection will literally just "drop" the collection on the spot, instantly and gracefully giving that space back to MongoDB (not the OS) giving you absolutely no fragmentation what-so-ever. Not only that but the operation is a lot faster, 90% of the time, than most other alternatives.
So I would stick by my comment:
You could factor the data into time series collections based on how long it takes for data to become historical, then just drop() the collection
Edit
As #Zaid pointed out, even with the _id field capped collections are not shardable.
One solution to this is using TokuMX which supports partitioning:
https://www.percona.com/blog/2014/05/29/introducing-partitioned-collections-for-mongodb-applications/
Advantages over capped collections: capped collections use a fixed amount of space (even when you don't have this much data) and they can't be resized on-the-fly. Partitioned collections usage depends on data; you can add and remove partitions (for newly inserted data) as you see fit.
Advantages over TTL: TTL is slow, it just takes care of removing old data automatically. Partitions are fast - removing data is basically just a file removal.
HOWEVER: after getting acquired by Percona, development of TokuMX appears to have stopped (would love to be corrected on this point). Unfortunately MongoDB doesn't support this functionality and with TokuMX on its way out it looks like we will be stranded without proper solution.

mongodb got slow when the document count went around 100, 000 . Any performance optimization?

I run a single mongodb instance which is getting inserted with logs from an app server. the current rate of insert in production is 10 inserts per second. And its a capped collection. i DONT USE ANY INDEXES . Queries were running faster when there were small number of records. only one collection has that amount of data. even querying from collection that has very few rows has become very slow. IS there any means to improve the performance.
-Avinash
This is a very difficult question to answer because we dont know much about your configuration or your document structure.
One thing that immediately pops into my head is that you are running out of memory. 10 inserts per second doesn't mean much because we do not know how big the inserted documents are.
If you are inserting larger documents at 10 per second, you could be eating up memory, causing the operating system to push some of your records to disk.
When you query without using an index, you are forced to scan every document. If your documents have been pushed to disk by the OS, you will begin having page faults. Mongo will need to fetch pages of data off the hard disk, and load them into memory so that they can be scanned. Before doing this, the operating system will need to make room for that data in memory by flushing other parts of memory out to disk.
It sounds like you are are I/O bound and the two biggest things you can do to fix this are
Add more memory to the machine running mongod
Start using indexes so that the database does not need to do full collection scans
Use proper indexes, though that will have some effect on the efficiency of insertion in a capped collection.
It would be better if you can share the collection structure and the query you are using.