Mongo backup performance

I'm going to run a scheduled backup of my MongoDB server based on this open-source script.
My data is about 10GB (and counting), and backing it up takes time. My question is: does the backup operation actually block the database, or slow the server down so much that it can't serve the applications using it? Or is the Mongo backup smart enough to run in the background without harming the service?
Does anyone have experience with this?
TU.
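For reference, a scheduled dump of this kind usually boils down to something like the following (the paths and connection details are placeholders, and the linked script's exact flags may differ):

    # nightly dump of all databases to a dated folder, run from cron
    mongodump --host localhost:27017 --out /backup/mongo-$(date +%F)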

Related

Rundeck MariaDB hot backup

The Rundeck backup guide notes that it is mandatory to stop Rundeck to take a full backup when using the data file. That guide doesn't show any safe method to back up a full Rundeck instance (Rundeck server + database) when using MariaDB, PostgreSQL, or any other supported database as a backend.
In a real production scenario, it doesn't seem possible to stop Rundeck on a daily basis.
Can anybody share best practices for taking a hot full backup of a Rundeck installation without stopping Rundeck?
Is there any safe, supported way to achieve a fully consistent backup of Rundeck projects, job definitions, and the database on a daily basis?
In this post, the answer is not clear because the question doesn't describe what kind of backend is used.
The documentation suggests shutting the instance down because some execution could be active, which means a potentially open transaction in the middle of the hot backup process, and therefore potential data corruption in your backup. Shutting down is the safest way to back up your database.
If you want to do a "hot" backup, you can export your projects (with all their content, including jobs) and keys; a sketch follows.
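A rough sketch of such a hot backup, assuming a MariaDB backend and the rd CLI (the database dump step, the project name, and all flags and paths here are assumptions; check your version's docs):

    # dump the Rundeck schema in a single consistent transaction
    mysqldump -u rundeck -p --single-transaction rundeck > /backup/rundeck-db.sql
    # export a project archive (jobs, configs, etc.) with the rd CLI
    rd projects archives export --project myproject --file /backup/myproject.zip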

mongorestore is very slow on AWS DocumentDB

Is there any way to make this operation faster?
I'm trying to restore my DB to AWS DocumentDB, and it will probably take weeks to finish... my overall data is less than 400MB.
The dump is gzipped.
To resolve this, it was suggested to run the command from an EC2 instance rather than a remote host.
This enabled a speedy import.
The likely reason is the number of network round trips involved: each one crosses the internet rather than a local network, which has far lower latency between interactions.
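In practice that looks something like this, run from an EC2 instance in the same VPC as the cluster (the endpoint, credentials, and CA bundle path are placeholders; newer tool versions spell the TLS flags --tls/--tlsCAFile instead):

    mongorestore --gzip \
      --host mycluster.cluster-xxxx.us-east-1.docdb.amazonaws.com:27017 \
      --ssl --sslCAFile rds-combined-ca-bundle.pem \
      --username myuser --password 'mypassword' \
      dump/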

MongoDB database restoration takes too much time

I ran into the following issue. I was tasked with restoring a MongoDB database from a backup, and I am using mongorestore.exe (on Windows) for it. The restoration process takes about 1.5h; the backup file is about 25GB (and contains 25M documents).
I tried restoring both to an AWS DocumentDB cluster (instance type: db.r5.large) and to a locally installed MongoDB (EC2 instance, r5n.large). I got almost the same process time (about 1.5h).
My question: is this a reasonable time for this operation, and how can I optimize/reduce the time it needs?
All advice is much appreciated.
I agree with Ayoub: parallelizing with --numParallelCollections has helped many folks speed up the restore process, for example as shown below. You can additionally speed up the restore by scaling the Amazon DocumentDB instance up to a larger size for the duration of the restore and scaling it back down to an r5.large when the restore is complete. Amazon DocumentDB bills instance costs by the second, which helps minimize costs in these scenarios.
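For example (the numbers are illustrative; tune them to the instance size):

    mongorestore --gzip \
      --numParallelCollections 4 \
      --numInsertionWorkersPerCollection 8 \
      --host mycluster-endpoint:27017 dump/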

MongoDB replication to hard drive

MongoDB's dynamic schema design is driving me towards it as a replacement for MySQL on a production site. But this project runs on only one dedicated server (with two hard drives).
The docs on "MongoDB for production" recommend multiple servers, which makes me wonder whether MongoDB is only suited to large commercial projects.
Anyway... I am wondering whether the live database data can be replicated to the second hard drive for backup & recovery (to recover from data corrupted by a hard stop).
Any thoughts against using MongoDB in a single-server environment are also appreciated. In this project, the biggest database will be less than 7GB.
Thanks
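One hedged sketch of what that could look like: a small replica set whose two members live on the same machine but on different drives (the ports, paths, and hostnames are assumptions, and the processes are backgrounded with & only for brevity). Note this protects against corruption on one drive, not against losing the whole server:

    # primary on the first drive, secondary on the second drive
    mongod --replSet rs0 --port 27017 --dbpath /mnt/disk1/mongo &
    mongod --replSet rs0 --port 27018 --dbpath /mnt/disk2/mongo &
    # initialize the set via the shell
    mongo --port 27017 --eval 'rs.initiate({_id: "rs0", members: [
      {_id: 0, host: "localhost:27017"},
      {_id: 1, host: "localhost:27018"}]})'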

How to scale MongoDB?

I know that MongoDB can scale vertically, but what if I am running out of disk?
I am currently using EC2 with EBS. As you know, an EBS volume has to be allocated at a fixed size.
What if MongoDB grows bigger than the EBS volume? Do I have to create a larger volume and copy the files over?
Or should I start more MongoDB instances, each connected to a different EBS disk? In that case, I could connect to a different instance for each database.
If you're running out of disk, you obviously need to get a bigger disk.
There are several ways to migrate your data; which one is right really depends on the amount of up-time you need. The first steps, of course, involve bundling the machine and creating the new volume.
These tips go from easiest to hardest.
Can you take the database completely off-line for several minutes?
If so, do this (migration by copy; there's a sketch after the list):
Mount the new EBS volume on the server.
Stop your app from connecting to Mongo.
Shut down mongod and wait for everything to be written out (check the logs).
Copy all of the data files (and probably the logs) to the new EBS volume.
While the copy is happening, update your mongod start script (or config file) to point to the new volume.
Start mongod and check the connection.
Restart your app.
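A minimal sketch of the copy itself (the device name and paths are assumptions; use your service manager's stop command if mongod runs as a service):

    # mount the new volume
    sudo mount /dev/xvdf /mnt/new-ebs
    # shut mongod down cleanly, then watch the log until it exits
    sudo mongod --dbpath /var/lib/mongo --shutdown
    # copy the data files across
    sudo rsync -a /var/lib/mongo/ /mnt/new-ebs/mongo/
    # bring mongod back up against the new volume
    sudo mongod --dbpath /mnt/new-ebs/mongo --fork --logpath /var/log/mongod.log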
Can you take the database off-line for just a few minutes?
If so, do this (slaving and switch; there's a sketch after the list):
Start up a new instance and mount the new EBS on that server.
Install/start mongod as a --slave pointing at the current database (you may need to re-start the current one as --master).
The slave will do a fresh synchronization. Once the slave is up-to-date, you'll do a "switch" (next steps).
Turn off writes from the system.
Shut down the original mongod process.
Re-start the "new" mongod as a master instead of the slave.
Re-activate system writes pointing at the new master.
Done correctly, those last three steps can happen in minutes or even seconds.
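A sketch of that setup using the legacy master/slave flags this answer assumes (hosts and paths are placeholders; these flags were removed from modern MongoDB):

    # on the existing server: re-start as a master
    mongod --master --dbpath /var/lib/mongo
    # on the new server, against the new EBS volume: sync as a slave
    mongod --slave --source old-server:27017 --dbpath /mnt/new-ebs/mongo
    # after the sync catches up: stop writes, shut both down,
    # then re-start the new one as a plain master
    mongod --master --dbpath /mnt/new-ebs/mongo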
Can you not afford any down-time?
If so, do this (master-master; there's a sketch after the list):
Start up a new instance and mount the new EBS on that server.
Install/start mongod as both a master and a slave against the current database (you may need to re-start the current one as a master, with minimal down-time).
The new computer should do a fresh synchronization.
Once the new computer is up-to-date, switch the system to point at the new server.
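For completeness, the legacy flag combination implied here, as a sketch only (hosts and paths are placeholders, and as noted just below, this mode was fragile):

    # each instance is both a master and a slave of the other
    mongod --master --slave --source server-b:27017 --dbpath /data/a   # on server A
    mongod --master --slave --source server-a:27017 --dbpath /data/b   # on server B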
I know it seems like this last version is actually the best, but it can be a little dicey (as of this writing). The reason is simply that I've honestly had a lot of issues with "master-master" replication, especially if you don't start with both active.
If you plan on using this method, I highly suggest a smaller practice run first. If something bombs here, Mongo might simply wipe all of your data files, which would take even more down with it.
If you get a good version of this, please post the commands; I'd like to see it in action.
Doesn't the E in EBS stand for "elastic", meaning something like resizing on the fly?
Currently the MongoDB team is working on finishing sharding, which will let you scale horizontally by partitioning data across different servers. Give it a month or two and it will work fine. The developers are quite good at keeping their promises.
http://api.mongodb.org/wiki/current/Sharding%20Introduction.html
http://api.mongodb.org/wiki/current/Sharding%20Limits.html
You could slave the bigger disk off the smaller one until it's caught up,
or
fsync+lock, take a file-system snapshot, and copy it onto the bigger disk.
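The fsync+lock route, sketched (the snapshot command depends on your filesystem or volume manager; LVM and its volume names are assumptions here):

    # flush pending writes and block further writes
    mongo --eval 'db.fsyncLock()'
    # snapshot the volume holding the data files
    sudo lvcreate --snapshot --size 10G --name mongo-snap /dev/vg0/mongo
    # release the lock
    mongo --eval 'db.fsyncUnlock()'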
Well, I am using MongoDB now. I am pretty amazed at the performance it delivers, especially on some simple sorting.
I believe it's a good tool for simple web application logic. My remaining concerns are how to scale and how to back up. I will continue to explore.
The only disadvantage for me is that I don't have any good tools to inspect the data stored inside. For example, I want to put my logging from MySQL into Mongo as well. However, it's pretty difficult for me to view the log; previously, I could use a MySQL query to fetch what I wanted easily.
Anyway, it's a good tool and I will continue to use it.
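For what it's worth, ad-hoc queries much like the MySQL ones are possible from the mongo shell. A hedged example, assuming the log documents live in a collection named logs with level and ts fields:

    # last 10 error entries, newest first
    mongo --eval 'db.logs.find({level: "error"}).sort({ts: -1}).limit(10).forEach(printjson)'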