Why and in what case to issue the MongoDB repair command - mongodb

I am using MongoDB 2.4.8 on a 64-bit machine with 3 servers as a replica set, and I have currently disabled journaling on my development box.
Durability is not so important for our application, which is why I have disabled the journaling option.
I see only one advantage of journaling: in case of an unclean shutdown we don't have to issue a repair command, as journaling will take care of it.
To produce an unclean shutdown I killed the mongo replica process using kill -9 on the mongo process ID. I then just removed the mongo locks and restarted the mongo primary, secondary and arbiter servers, and everything started fine.
My question is: when should we actually issue the repair command (given that removing the locks and restarting works)?
Please excuse me if the question is too dumb; I want to know the risk of disabling journaling in production.

The repairDatabase command checks your whole database for corrupted data and discards that data so the rest becomes usable again.
This can become necessary after an unclean shutdown. In your case the shutdown didn't appear to corrupt any data (or maybe it did, but it hasn't become apparent because the data in question hasn't been accessed yet). But that doesn't mean that this will always be the case. Was your database actually doing anything at that moment? When the database is idle or only performing read operations, there is usually not much to worry about. But when it is in the middle of a large write operation, a sudden shutdown without journaling can be much more troublesome.
Another scenario where a database could be corrupted and repairDatabase could help is a physical malfunction of the storage medium or a corruption of the underlying filesystem.
Important note regarding replica-sets: When you have a replica-set, and only one node is corrupted, then you should rather remove that node and rebuild it from the other members of the replica-set. RepairDatabase will destroy any corrupted data. Restoring from a replica-set will not.

Related

MongoDB WiredTiger error: WiredTiger.turtle: handle-open: open: operation not permitted

MongoDB was working beautifully for me for several months until I had an unexpected shutdown a week or two ago. Since then, I've been getting the error in the title that snowballs into an invalid argument, then a library panic, then some fatal assertions which cause MongoDB to crash.
Now, I've done my research: the normal answers are to run the repair function and to make sure SELinux isn't screwing up the process. Neither of those have worked. The error gets thrown during WiredTiger's checkpoint process, so reads/writes to the database aren't the issue, and because it's during the checkpoint process, it guarantees that MongoDB won't stay up for more than a day.
To be clear: all the files in the database are owned by mongod:mongod, have permissions set to 600 (default, and I tried setting them to 755 to see if that fixed it, and it didn't). I'm running mongodb as a service on a CentOS 7 box, and the service file specifies that it should run as user mongod. The mongod.conf file specifies a mounted filesystem as the database, and it was happy with that until the unexpected shutdown. I'm running MongoDB version 4.0.1, so WiredTiger really doesn't like it if I disable Journaling either (disregarding the fact that I shouldn't disable it in the first place).
I feel like I've exhausted all my options, and that the only thing I can do is back up my data and reinstall MongoDB. Are there any that I've missed?
After creating a backup of my data via mongodump, shutting down mongo, removing the entire database with rm -rf 'path-to-database', rebooting mongo (without the replication config), and restoring the data with mongorestore, mongodb still crashes. This time, however, it's with an Invariant failure after the open: operation not permitted. The only conclusion I can think of is that the data itself has become corrupted in some way. Thankfully, this isn't "mission critical" data, so to speak, and I can easily obtain new data.
Unfortunately, this doesn't answer my original question of "what other options do I have?". However, I'm still posting this in case others run into this same kind of issue.
EDIT: invariant issue was caused by me forgetting to re-initialize my replication set. After fixing that, it's clean. Because of this, I no longer believe it was a data corruption issue, but a checkpoint corruption issue.
EDIT 2: So the issue arose again after about a week, and after another week of trying various debugging methods, I tried simply moving the mongo process to another server. So far, that's been working. The previous server was acting up (I couldn't even run top at one point - another process had a lock on a library file needed to run it), so here's to hoping that the current server doesn't follow suit.

MongoDB reducing db filesize

I have a replica-set.
And I run out of disk space on my secondary instances.
There is no space on disk to run db.repairDatabase()
Is there any other way to free some disk space?
I was thinking:
bring secondary down
Delete all data
run db.repairDatabase() if deleting data will allow it
Bring it back up.
Will this work?
UPDATE
Worth mentioning that I can't currently SSH to the servers. I can only use the mongo client for now.
No, that won't work - there has to be a database there to run db.repairDatabase() on. However, what works just as well is to bring the secondary down, delete the database files and then bring it back up. This will force a re-sync with the primary, which will in effect do the same thing as db.repairDatabase(), as it will recreate the data files from scratch.
However, in order to delete the datafiles you'll need to ssh in to the instance. If you cannot ssh in you have fairly significant issues that will interfere with any attempt to recover the secondary.

Is mongod --repair still a blocking task in MongoDB 1.8?

I have a 5GB database I want to compact and repair. Unfortunately, I have an active application running on that database.
I'm wondering if running a mongod --repair task with MongoDB 1.8 will block all the other write operations on the database.
I don't want to shutdown the entire application for hours...
You may want to take a look at the --journal option. It keeps a binary log of the most recent operations, and recovery from it may take much less time than a repair.
http://www.mongodb.org/display/DOCS/Durability+and+Repair
Yes, repairDatabase is a blocking operation, which means you'll need to do it during a scheduled maintenance window.
Alternately, if you are using a replica set, it's possible to repair with no down time by taking one member out of the replica set, repairing it, re-adding it to the replica set, and repeating until all are repaired. See the note in yellow at the end of this section for more info and caveats.

Is it normal for MongoDB's whole /data/db to be gone after an electrical trip that results in a crash

I have a single machine that has MongoDB and its data is at /data/db as usual.
When my machine crashed due to an electric power trip, MongoDB refused to start at launch (Mac OS X Server via LaunchAgent) and /data/db had also mysteriously disappeared!
All the log files were wiped out as well. This happened on my development SSD MBA and I thought it was just a weird SSD case. But my XServe server gets it as well when the power trips.
Am I missing some data protection articles somewhere? For sure it can't be this unreliable by just deleting /data/db!!??
MongoDB will never ever remove your database files!
In case of a crash you have to start mongod using the --repair option.
In addition, using the new journaling option of MongoDB in v1.8+ should help a lot when you run MongoDB as a standalone service.
No that is not normal.
If it won't start, it's likely mongodb is indicating that you need to run a repair because mongod.lock is present and has a certain state in /data/db. But that would mean /data/db exists.
If /data/db existed but were empty (which in this case would obviously be bad), it would start right up.
If your log(s) are missing, it sounds like a more general disk issue.
So if there is data there, check the startup messages for anything about mongod.lock. Also, with v1.8+, use journaling (although you wouldn't lose all data files even without journaling).

Reducing MongoDB database file size

I've got a MongoDB database that was once large (>3GB). Since then, documents have been deleted and I was expecting the size of the database files to decrease accordingly.
But since MongoDB keeps allocated space, the files are still large.
I read here and there that the admin command mongod --repair is used to free the unused space, but I don't have enough space on the disk to run this command.
Do you know a way I can free up the unused space?
UPDATE: with the compact command and WiredTiger it looks like the extra disk space will actually be released to the OS.
UPDATE: as of v1.9+ there is a compact command.
This command will perform a compaction "in-line". It will still need some extra space, but not as much.
MongoDB compresses the files by:
copying the files to a new location
looping through the documents and re-ordering / re-solving them
replacing the original files with the new files
You can do this "compression" by running mongod --repair or by connecting directly and running db.repairDatabase().
In either case you need the space somewhere to copy the files. Now I don't know why you don't have enough space to perform a compress, however, you do have some options if you have another computer with more space.
Export the database to another computer with Mongo installed (using mongoexport) and then import that same database (using mongoimport). This will result in a new database that is more compressed. Now you can stop the original mongod, replace its files with the new database files, and you're good to go.
Stop the current mongod and copy the database files to a bigger computer and run the repair on that computer. You can then move the new database files back to the original computer.
There is not currently a good way to "compact in place" using Mongo. And Mongo can definitely suck up a lot of space.
The best strategy right now for compaction is to run a Master-Slave setup. You can then compact the Slave, let it catch up and switch them over. I know still a little hairy. Maybe the Mongo team will come up with better in place compaction, but I don't think it's high on their list. Drive space is currently assumed to be cheap (and it usually is).
It looks like Mongo v1.9+ has support for the compact in place!
> db.runCommand( { compact : 'mycollectionname' } )
See the docs here: http://docs.mongodb.org/manual/reference/command/compact/
"Unlike repairDatabase, the compact command does not require double disk space to do its work. It does require a small amount of additional space while working. Additionally, compact is faster."
I had the same problem, and solved by simply doing this at the command line:
mongodump -d databasename
echo 'db.dropDatabase()' | mongo databasename
mongorestore dump/databasename
Compact all collections in current database
db.getCollectionNames().forEach(function (collectionName) {
    print('Compacting: ' + collectionName);
    db.runCommand({ compact: collectionName });
});
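A variant of the loop above that also records whether each compact succeeded, as a minimal sketch. It assumes a mongo shell style database handle exposing getCollectionNames() and runCommand(); the helper name compactAll is made up for illustration.

```javascript
// Hypothetical helper: run compact on every collection of the given database
// handle and return one { collection, ok, errmsg } entry per collection.
function compactAll(database) {
  return database.getCollectionNames().map(function (collectionName) {
    var result;
    try {
      result = database.runCommand({ compact: collectionName });
    } catch (err) {
      // Some shells/drivers throw on failure instead of returning { ok: 0 }.
      result = { ok: 0, errmsg: String(err) };
    }
    return {
      collection: collectionName,
      ok: result.ok === 1,
      errmsg: result.errmsg
    };
  });
}
```

In the mongo shell this could be called as printjson(compactAll(db)). Since compact blocks operations while it runs, it is probably best tried in a maintenance window first.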
If you need to run a full repair, use the repairpath option. Point it to a disk with more available space.
For example, on my Mac I've used:
mongod --config /usr/local/etc/mongod.conf --repair --repairpath /Volumes/X/mongo_repair
Update: Per MongoDB Core Server Ticket 4266, you may need to add --nojournal to avoid an error:
mongod --config /usr/local/etc/mongod.conf --repair --repairpath /Volumes/X/mongo_repair --nojournal
Starting with version 2.8 of Mongo, you can use compression. You have 3 levels of compression with the WiredTiger engine; MMAP (which is the default in 2.6) does not provide compression:
None
snappy (by default)
zlib
Here is an example of how much space you will be able to save for 16 GB of data (chart not preserved; the data is taken from the linked article).
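To pick one of these three compressors for a single collection, a minimal sketch could build the options passed to db.createCollection. This assumes MongoDB 3.0+ running WiredTiger; the helper name and the collection name "events" below are hypothetical.

```javascript
// Hypothetical helper: build createCollection options that select one of the
// three WiredTiger block compressors listed above.
function wiredTigerCollectionOptions(compressor) {
  var allowed = ["none", "snappy", "zlib"];
  if (allowed.indexOf(compressor) === -1) {
    throw new Error("unsupported compressor: " + compressor);
  }
  return {
    storageEngine: {
      wiredTiger: { configString: "block_compressor=" + compressor }
    }
  };
}
```

In the mongo shell, usage would look like db.createCollection("events", wiredTigerCollectionOptions("zlib")). The setting only applies to collections created this way; existing collections keep the compressor they were created with.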
There are two approaches, depending on the storage engine.
1. MMAP() engine:
command: db.repairDatabase()
NOTE: repairDatabase requires free disk space equal to the size of your current data set plus 2 gigabytes. If the volume that holds dbpath lacks sufficient space, you can mount a separate volume and use that for the repair. When mounting a separate volume for repairDatabase you must run repairDatabase from the command line and use the --repairpath switch to specify the folder in which to store temporary repair files.
e.g. if the DB size is 120 GB, then (120*2)+2 = 242 GB of hard disk space is required in total (the existing 120 GB of data plus 122 GB free for the repair).
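The arithmetic in the note can be written out as a small sketch in plain JavaScript. The function names are made up for illustration, and the "data size plus 2 GB" rule is taken from the note above rather than checked against any particular server version.

```javascript
// Free space repairDatabase needs on MMAPv1, per the note above:
// the size of the current data set plus 2 GB.
function repairFreeSpaceGB(dataSizeGB) {
  return dataSizeGB + 2;
}

// Total disk footprint while the repair runs: the existing data files
// plus the free space the repair needs.
function repairTotalDiskGB(dataSizeGB) {
  return dataSizeGB + repairFreeSpaceGB(dataSizeGB);
}

console.log(repairFreeSpaceGB(120)); // 122
console.log(repairTotalDiskGB(120)); // 242
```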
Another way is to do it collection by collection:
command: db.runCommand({compact: 'collectionName'})
2. WiredTiger:
It resolves itself automatically.
There has been considerable confusion over space reclamation in MongoDB, and some recommended practices are downright dangerous in certain deployment types. More details below:
TL;DR repairDatabase attempts to salvage data from a standalone MongoDB deployment that is trying to recover from disk corruption. If it recovers space, it is purely a side effect. Recovering space should never be the primary consideration for running repairDatabase.
Recover space in a standalone node
WiredTiger: For a standalone node with WiredTiger, running compact will release space to the OS, with one caveat: The compact command on WiredTiger on MongoDB 3.0.x was affected by this bug: SERVER-21833 which was fixed in MongoDB 3.2.3. Prior to this version, compact on WiredTiger could silently fail.
MMAPv1: Due to the way MMAPv1 works, there is no safe and supported method to recover space using the MMAPv1 storage engine. compact in MMAPv1 will defragment the data files, potentially making more space available for new documents, but it will not release space back to the OS.
You may be able to run repairDatabase if you fully understand the consequences of this potentially dangerous command (see below), since repairDatabase essentially rewrites the whole database by discarding corrupt documents. As a side effect, this will create new MMAPv1 data files without any fragmentation on it and release space back to the OS.
For a less adventurous method, running mongodump and mongorestore may be possible as well in an MMAPv1 deployment, subject to the size of your deployment.
Recover space in a replica set
For replica set configurations, the best and the safest method to recover space is to perform an initial sync, for both WiredTiger and MMAPv1.
If you need to recover space from all nodes in the set, you can perform a rolling initial sync. That is, perform an initial sync on each of the secondaries, before finally stepping down the primary and performing an initial sync on it. The rolling initial sync method is the safest way to perform replica set maintenance, and it also involves no downtime as a bonus.
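The ordering described above (secondaries first, primary last) can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and it assumes the members array has the shape returned by rs.status(), where each member has a name and a stateStr such as "PRIMARY" or "SECONDARY".

```javascript
// Hypothetical helper: given the `members` array from rs.status(), return
// member names in the order a rolling initial sync would visit them:
// every secondary first, the primary last. Members in other states
// (for example arbiters, which hold no data) are left out.
function rollingSyncOrder(members) {
  var secondaries = members.filter(function (m) {
    return m.stateStr === "SECONDARY";
  });
  var primaries = members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  });
  return secondaries.concat(primaries).map(function (m) {
    return m.name;
  });
}
```

In the mongo shell this could be called as rollingSyncOrder(rs.status().members). The primary still has to be stepped down before its own resync, so the list is a plan to follow by hand rather than something to automate blindly.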
Please note that the feasibility of doing a rolling initial sync also depends on the size of your deployment. For extremely large deployments, it may not be feasible to do an initial sync, and thus your options are somewhat more limited. If WiredTiger is used, you may be able to take one secondary out of the set, start it as a standalone, run compact on it, and rejoin it to the set.
Regarding repairDatabase
Please don't run repairDatabase on replica set nodes. This is very dangerous, as mentioned in the repairDatabase page and described in more details below.
The name repairDatabase is a bit misleading, since the command doesn't attempt to repair anything. The command was intended to be used when there's disk corruption on a standalone node, which could lead to corrupt documents.
The repairDatabase command could be more accurately described as "salvage database". That is, it recreates the databases by discarding corrupt documents in an attempt to get the database into a state where you can start it and salvage intact documents from it.
In MMAPv1 deployments, this rebuilding of the database files releases space to the OS as a side effect. Releasing space to the OS was never the purpose.
Consequences of repairDatabase on a replica set
In a replica set, MongoDB expects all nodes in the set to contain identical data. If you run repairDatabase on a replica set node, there is a chance that the node contains undetected corruption, and repairDatabase will dutifully remove the corrupt documents for you.
Predictably, this makes that node contain a different dataset from the rest of the set. If an update happens to hit that single document, the whole set could crash.
To make matters worse, it is entirely possible that this situation could stay dormant for a long time, only to strike suddenly with no apparent reason.
In case a large chunk of data is deleted from a collection and the collection never uses the deleted space for new documents, this space needs to be returned to the operating system so that it can be used by other databases or collections. You will need to run a compact or repair operation in order to defragment the disk space and regain the usable free space.
The behavior of the compaction process depends on the MongoDB storage engine, as follows:
db.runCommand({ compact: "collection-name" })
MMAPv1
The compaction operation defragments data files and indexes. However, it does not release space to the operating system. The operation is still useful to defragment and create more contiguous space for reuse by MongoDB, but it is of no use when free disk space is very low.
Up to 2 GB of additional disk space is required during the compaction operation.
A database level lock is held during the compaction operation.
WiredTiger
The WiredTiger engine provides compression by default which consumes less disk space than MMAPv1.
The compact process releases the free space to the operating system.
Minimal disk space is required to run the compact operation.
WiredTiger also blocks all operations on the database as it needs database level lock.
For the MMAPv1 engine, compact does not return the space to the operating system. You need to run a repair operation to release the unused space.
db.runCommand({repairDatabase: 1})
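The engine-dependent choice described in this answer can be sketched as a helper that picks the command from the storage engine name. The function name and "mycollection" are hypothetical; the sketch assumes the engine name comes from db.serverStatus().storageEngine.name (available from MongoDB 3.0), and it simply follows this answer's description rather than endorsing repairDatabase on replica set members.

```javascript
// Hypothetical helper: pick the space-reclaiming command described above,
// based on the storage engine name reported by the server.
function spaceReclaimCommand(engineName, collectionName) {
  if (engineName === "wiredTiger") {
    // On WiredTiger, compact releases free space to the operating system.
    return { compact: collectionName };
  }
  if (engineName === "mmapv1") {
    // On MMAPv1, compact only defragments; a repair is needed to shrink files.
    return { repairDatabase: 1 };
  }
  throw new Error("unknown storage engine: " + engineName);
}
```

In the mongo shell, usage might look like db.runCommand(spaceReclaimCommand(db.serverStatus().storageEngine.name, "mycollection")).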
Mongodb 3.0 and higher has a new storage engine - WiredTiger.
In my case switching engines reduced disk usage from 100 GB to 25 GB.
Database files cannot be reduced in size. While "repairing" a database, the only thing the mongo server can do is delete some of its files. If a large amount of data has been deleted, the mongo server will "release" (delete) some of its existing files during the repair.
In general compact is preferable to repairDatabase. But one advantage of repair over compact is you can issue repair to the whole cluster. compact you have to log into each shard, which is kind of annoying.
When I had the same problem, I stopped my mongo server and started it again with the command
mongod --repair
Before running the repair operation you should check that you have enough free space on your HDD (the minimum is the size of your database).
For standalone mode you could use compact or repair.
For a sharded cluster or replica set, in my experience, after running compact on the primary and then on the secondary, the size of the primary database was reduced, but not that of the secondary.
You might want to resync the member to reduce the size of the secondary database. By doing this you might find that the secondary database ends up even smaller than the primary; I guess the compact command does not really compact the collection.
So I ended up switching the primary and secondary of the replica set and resyncing the member again.
My conclusion is that the best way to reduce the size of a sharded cluster or replica set is to resync a member, switch primary and secondary, and resync again.
mongod --repair is not recommended in the case of a sharded cluster.
If you are using a sharded cluster with replica sets, use the compact command; it rewrites and defragments all data and index files of a collection.
syntax:
db.runCommand( { compact : "collection_name" } )
When used with force: true, compact runs on the primary of a replica set.
e.g. db.runCommand( { compact : "collection_name", force : true } )
Other points to consider:
- It blocks operations, so it is recommended to run it in a maintenance window.
- If the replica set members run on different servers, it needs to be executed on each member separately.
- In the case of a sharded cluster, compact needs to be executed on each shard member separately. It cannot be executed against a mongos instance.
Just one way that I was able to do it. No guarantee on the safety of your existing data. Try at your own risk.
Delete the data files directly and restart mongod.
For example, with Ubuntu (default path to data: /var/lib/mongodb), I had a couple of files with names like collection.#. I kept collection.0 and deleted all the others.
Seems an easier way if you don't have serious data in database.