I am running MongoDB on a local appliance machine, with journaling enabled.
Is it always guaranteed that MongoDB will recover automatically when journaling is enabled, even after a power outage (meaning that the database was not shut down properly)?
In what scenarios can MongoDB become corrupted even if journaling is enabled (besides filesystem corruption)?
Yes, it is guaranteed (assuming no filesystem corruption):
With journaling enabled, MongoDB creates a journal subdirectory within the directory defined by dbPath, which is /data/db by default. The journal directory holds journal files, which contain write-ahead redo logs. The directory also holds a last-sequence-number file. A clean shutdown removes all the files in the journal directory. A dirty shutdown (crash) leaves files in the journal directory; these are used to automatically recover the database to a consistent state when the mongod process is restarted.
(From Journaling in the MongoDB manual.)
This is the main point of journaling and one of the primary reasons it is used. Note that the most recent writes (roughly the last 100 ms, those not yet flushed to the journal) will likely still be lost, but the database will be in a consistent state.
From the online documentation, it seems that an election for failover only happens when a MongoDB instance is stopped and no heartbeat is detected. But in the case of a bad disk or bad disk sector, where MongoDB fails to write to the journal or a data file, how will MongoDB respond? Will the MongoDB instance crash so that failover can then happen?
In a bare-metal MongoDB setup today, how do system administrators generally detect and handle disk failure? Thanks.
Since MongoDB 4.2, Storage Node Watchdog is available in the community servers. From the linked page:
The Storage Node Watchdog monitors the following MongoDB directories to detect filesystem unresponsiveness:
The --dbpath directory
The journal directory inside the --dbpath directory if journaling is enabled
The directory of --logpath file
The directory of --auditPath file
If any of the filesystems containing the monitored directories become unresponsive, the Storage Node Watchdog terminates the mongod and exits with a status code of 61. If the mongod is the primary of a replica set, the termination initiates a failover, allowing another member to become primary.
One caveat:
If any of its monitored directories is a symlink to other volumes, the Storage Node Watchdog does not monitor the symlink target.
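Given that caveat, it may be worth checking whether any of the monitored directories is itself a symlink. The following is a minimal sketch rather than anything from the MongoDB documentation: the function name `report_symlinked_dirs` is made up for illustration, and the paths you pass in are your own.

```shell
#!/usr/bin/env bash
# Print each argument that is a symbolic link, one path per line.
# report_symlinked_dirs is a hypothetical helper name, not a MongoDB tool.
report_symlinked_dirs() {
  local dir
  for dir in "$@"; do
    # Strip one trailing slash so that "link/" is tested as the link itself.
    dir="${dir%/}"
    if [ -L "$dir" ]; then
      printf '%s\n' "$dir"
    fi
  done
}
```

For example, `report_symlinked_dirs /data/db /data/db/journal` prints nothing if neither directory is a symlink. Any path it does print is one whose target volume the watchdog would not be monitoring.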
What's the best/fastest/safest way to recover deleted files from ext4?
Specs:
The disk is a 1 TB SSHD (hybrid HDD + SSD), and the partition is encrypted with LUKS (version 1).
MongoDB is using WiredTiger as its storage engine.
Also, if I manage only a partial recovery of the files, can I do a partial recovery of MongoDB's collections?
Step 1: File recovery
Fast Recovery of files using extundelete:
sudo umount /path/to/disk &&
sudo extundelete /path/to/disk --restore-directory /path/to/dir -o /restored/path/
/path/to/disk represents the disk path, e.g. /dev/sdd or /dev/mapper/label
/path/to/dir represents the path that you want recovered, relative to the disk's mount point, e.g. if /dev/sdd were mounted at /mnt/label/, the full path would be /mnt/label/path/to/dir and the relative path is /path/to/dir
pros of recovery with extundelete:
it's lightweight
can work if the disk is mounted or encrypted
pretty fast: it reported within seconds whether recovery was possible, and it wrote the recovered files at over 100 MB/s
cons for data recovery in general
no guarantee for success
won't work if new data was written in the deleted sectors (so unmount the disk as soon as possible and make an image of the broken disk before any recovery)
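The advice to image the disk before attempting recovery could be sketched as follows. This is a minimal example assuming GNU `dd` and `cmp` are available; the function name `image_disk` and both arguments are placeholders. For a disk that is producing read errors, a tool built for that purpose, such as GNU ddrescue, is probably a better fit than plain `dd`.

```shell
#!/usr/bin/env bash
# Copy a source device (or file) to an image file, then verify the copy.
# image_disk is a hypothetical helper; usage: image_disk <source> <image>
image_disk() {
  local src="$1" img="$2"
  # Block-by-block copy; status=none suppresses dd's transfer statistics.
  dd if="$src" of="$img" bs=1M status=none || return 1
  # Re-read both sides and compare byte for byte (a second full read of the source).
  cmp -s "$src" "$img"
}
```

For instance, `sudo bash -c '. ./image_disk.sh; image_disk /dev/sdd /mnt/backup/sdd.img'` (both paths and the script filename are illustrative) would leave an image that recovery tools can be pointed at instead of the original disk. The image must be written to a different disk than the one being recovered.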
Step 2: Repair MongoDB if data is missing
Back up before this step; mongod --repair could delete good data.
Untested, but as I understand it, mongod --repair should help repair the database if it is incomplete. Otherwise, you can continue with WiredTiger recovery as described in:
Recovering a WiredTiger collection from a corrupt mongodb installation
I am trying to start mongod.exe, but I get the following error:
C:\MongoDB\Server\30\bin>mongod.exe
2015-12-16T19:12:17.108+0100 I CONTROL
2015-12-16T19:12:17.110+0100 W CONTROL 32-bit servers don't have journaling enabled by default.
Please use --journal if you want durability.
2015-12-16T19:12:17.110+0100 I CONTROL
2015-12-16T19:12:17.120+0100 I CONTROL Hotfix KB2731284 or later update is not installed, will zero-out data files
2015-12-16T19:12:17.132+0100 I STORAGE [initandlisten] **************
2015-12-16T19:12:17.132+0100 I STORAGE [initandlisten] Error: journal files are present in journal directory, yet starting without journaling enabled.
2015-12-16T19:12:17.133+0100 I STORAGE [initandlisten] It is recommended that you start with journaling enabled so that recovery may occur.
2015-12-16T19:12:17.133+0100 I STORAGE [initandlisten] **************
2015-12-16T19:12:17.135+0100 I STORAGE [initandlisten] exception in initAndListen: 13597 can't start without --journal enabled when journal/ files are present, terminating
2015-12-16T19:12:17.135+0100 I CONTROL [initandlisten] dbexit: rc: 100
I also tried to run it with --repair but then I get the same error.
Finally, I tried to delete the mongod.lock file but I still get the error.
How should I fix the unclean shutdown?
The solution to this problem is mongod --repair. This command repairs the MongoDB data files and then shuts the process down. You can find more details in the official documentation.
OK, to clear up some confusion here: journal files are not there to annoy you. They hold data that the server has already received and acknowledged but not yet applied to the data files. The mongod process finished a request after writing the data to the journal, but before applying it to the data files.
This behavior is configured by the chosen write concern.
Bottom line: special measures were taken to make the data in the journal durable, and you should not ignore that.
So you should create a configuration file containing this (among other things, if one already exists):
storage:
  journal:
    enabled: true
Please follow the documentation on running MongoDB on Windows to the letter. Adjust the configuration file with options according to your needs.
If you are absolutely, positively sure that you do not need journaling, you can start MongoDB with the --journal command line option just once, shut the instance down after the journal has been successfully applied, and then remove the journal files. Expect any write with a write concern involving the journal to fail, however.
Note 1: You are using the 32-bit version of MongoDB, which is only suitable for testing. The 32-bit version only supports up to 2 GB of data.
Note 2: MongoDB is VERY well documented. You really should read the manual from top to bottom; it gets you started quickly while providing a lot of information on the internals.
Start a cmd shell as administrator and call start_mongo. This should fix it.
Same error.
It's a permission issue. If you get this error on Windows, perform all operations with administrator privileges.
On Linux run
mongod --repair
but run it with sudo or as root. If you run it as root, change the ownership of the files in the data directory afterwards so that the user mongod normally runs as can still access them. Do this carefully, or another error will appear.
Try removing the lock file, then running with --repair.
Here's what the Mongo Docs say about recovering data / restarting after an unexpected shutdown.
My MongoDB crashed due to an out-of-memory error that occurred when it tried appending to a journal file.
At that point, my mongod.lock file was empty. I restarted mongod without any options, and it accepted connections normally. Then I ran mongo.exe, but was unable to connect to the database. It got stuck at "connecting to test" and never connected successfully.
I ended that process and restarted mongod with the --nojournal option, but that didn't help either.
Now I see that the mongod.lock file is non-empty. Also, all my journal entries have been deleted.
The question is: does the --nojournal option delete existing journal entries? Also, is there a way to recover the journal entries?
Recovering after a crash
First, please read this article:
Recover Data after an Unexpected Shutdown
After a crash, you have two options:
if it is a standalone instance, run mongod with the --repair option;
if the instance is a part of a replica set, wipe all data and either restore from a backup or perform an initial sync from another replica set member.
The --nojournal option
Running mongod --nojournal will not remove journal files. In fact, mongod will not even start if there are journal files present. It will give you the following message and shut down.
Error: journal files are present in journal directory, yet starting without journaling enabled.
It is recommended that you start with journaling enabled so that recovery may occur.
exception in initAndListen: 13597 can't start without --journal enabled when journal/ files are present, terminating
If you then run mongod without the --nojournal option, it will apply all changes saved in journal files and remove the files. Only then can you restart it with --nojournal.
I believe this is what happened in your case. You don't need to attempt to restore your journal files, as they are already applied to your data.
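As a rough illustration of the rule above, a pre-start check could look for leftover files in the journal directory before choosing startup options. This is only a sketch under the assumption that the journal layout is as described in this answer, where leftover files in the journal subdirectory of the dbpath mean there are changes still to be applied. The function name `journal_has_files` is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) if <dbpath>/journal exists and contains at least one entry.
# journal_has_files is a hypothetical helper name, not part of MongoDB.
journal_has_files() {
  local journal_dir="${1%/}/journal"
  [ -d "$journal_dir" ] || return 1
  # Stop at the first entry found; non-empty output means the directory is not empty.
  [ -n "$(find "$journal_dir" -mindepth 1 -maxdepth 1 -print -quit)" ]
}
```

A wrapper script might then do something like `if journal_has_files /data/db; then echo "start once without --nojournal first"; fi`, where /data/db stands in for your own dbpath.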
Assumption: Single MongoDB instance.
I have tested a backup and restore using an EBS snapshot of only the volume storing my data (dbpath) and NOT the /logs or /journal volumes. The restore seems to work fine and the data is available.
Are there any risks or downsides to doing this? In other words, do I lose anything if I don't have a backup snapshot of the /logs and /journal volumes?
Backing up if journal and dbpath are on separate EBS volumes
If your /journal directory is on a different EBS volume from your dbpath, the only way to get a consistent backup would be to use db.fsyncLock() to ensure there are no pending write operations. The fsyncLock() command has the side effect of blocking all writes to your database, so typically you would only want to use this approach if you are backing up from a secondary in a replica set (rather than a sole mongod, as per your assumption in the question description).
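The lock, snapshot, unlock sequence could be wrapped so that the lock is always released, even if the snapshot command fails. This is a minimal sketch with several assumptions: the shell binary is `mongosh` (older installs use `mongo`), it connects to the default local instance without extra options, and `MONGO_SHELL` and `snapshot_with_fsync_lock` are names invented for this example.

```shell
#!/usr/bin/env bash
# Run a snapshot command while writes are blocked by db.fsyncLock().
# MONGO_SHELL is a hypothetical variable naming the shell binary to use.
MONGO_SHELL="${MONGO_SHELL:-mongosh}"

# Usage: snapshot_with_fsync_lock <snapshot command> [args...]
snapshot_with_fsync_lock() {
  # Flush pending writes to disk and block new writes.
  "$MONGO_SHELL" --quiet --eval 'db.fsyncLock()' || return 1
  # Run the caller-supplied snapshot command and remember its exit status.
  local status=0
  "$@" || status=$?
  # Release the lock whether or not the snapshot succeeded.
  "$MONGO_SHELL" --quiet --eval 'db.fsyncUnlock()' || return 1
  return "$status"
}
```

With the AWS CLI, a call might look like `snapshot_with_fsync_lock aws ec2 create-snapshot --volume-id "$VOLUME_ID"`, repeated for each volume involved; `VOLUME_ID` is a placeholder for your own volume.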
Backing up if journal and dbpath are on the same EBS volume
If the journal and dbpath are on the same EBS volume you can get a consistent backup using EBS snapshots.
Do you need to back up the log directory?
Strictly speaking, you do not need to back up the logs. For troubleshooting purposes it can be useful to rotate the logs and keep a few days of recent log files.
I have tested a backup and restore using an EBS snapshot of only the volume storing my data (dbpath) and NOT the /logs or /journal volumes. The restore seems to work fine and the data is available.
This approach will be fine, until it isn't: that fateful day when you need to restore from backup and realise, as you try them one at a time, that your last n backups are unusable, or perhaps encounter unexpected errors days after you assumed a restored database was OK. If you don't back up the journal files, this is effectively the same as running without journaling, and the recommended recovery procedures involve running a repair before restarting. The risk isn't so much about changes that haven't been flushed from the journal, but rather the unlucky timing of a snapshot taken in the middle of a write to the data files (the equivalent of the power going out mid-write), leaving things in an inconsistent state with no recovery information (that is, the journal).
If you're going to take backups, definitely follow the correct procedure to remove unnecessary risk.
For more information see EC2 Backup and Restore in the MongoDB manual.