mongorestore out of memory - mongodb

The MongoDB process gets killed by the out-of-memory killer after importing data from a gz file (~1.75 GB) into MongoDB (running in a Docker container).
It used about 10 GB of RAM. I tried --numParallelCollections and --numInsertionWorkersPerCollection, but it still doesn't work.
mongorestore -u superAdmin -p 'password123321!' \
--numParallelCollections 1 \
--numInsertionWorkersPerCollection=1 \
--authenticationDatabase=admin --drop --db "logs" --gzip logs.bson.gz
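For context, the process that gets OOM-killed is typically the mongod server inside the container rather than mongorestore itself, so one commonly tried mitigation is to cap mongod's WiredTiger cache so that it fits within the container's memory limit. This is only a rough sketch; the memory limit, cache size, and image tag below are illustrative assumptions, not values from the question:
# Give the container an explicit memory limit and keep the WiredTiger cache well below it
docker run -d --name mongo --memory=4g \
mongo:6.0 --wiredTigerCacheSizeGB 1.5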

Related

Restore mongodb dump to different db [duplicate]

In MongoDB, is it possible to dump a database and restore the content to a different database? For example like this:
mongodump --db db1 --out dumpdir
mongorestore --db db2 --dir dumpdir
But it doesn't work. Here's the error message:
building a list of collections to restore from dumpdir dir
don't know what to do with subdirectory "dumpdir/db1", skipping...
done
You need to actually point at the "database name" container directory "within" the output directory from the previous dump:
mongorestore -d db2 dumpdir/db1
And usually just <path> is fine as a positional argument rather than with --dir, which would only be needed when "out of position", i.e. "in the middle of the arguments list".
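For comparison, the equivalent restore with the explicit flag rather than the positional path:
mongorestore -d db2 --dir=dumpdir/db1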
P.S. For an archive backup file (tested with mongorestore v3.4.10):
mongorestore --gzip --archive=${BACKUP_FILE_GZ} --nsFrom "${DB_NAME}.*" --nsTo "${DB_NAME_RESTORE}.*"
mongodump --db=DB_NAME --out=/path-to-dump
mongorestore --nsFrom "DB_NAME.*" --nsTo "NEW_DB_NAME.*" /path-to-dump
In addition to Blakes Seven's answer: if your databases use authentication, I got this to work using the --uri option, which requires a recent MongoDB version (> 3.4.6):
mongodump --uri="mongodb://$sourceUser:$sourcePwd@$sourceHost/$sourceDb" --gzip --archive | mongorestore --uri="mongodb://$targetUser:$targetPwd@$targetHost/$targetDb" --nsFrom="$sourceDb.*" --nsTo="$targetDb.*" --gzip --archive
Thank you, @Blakes Seven!
Adding Docker notes:
Container names are interchangeable with container IDs.
(Assumes authentication is enabled and containers named my_db and new_db.)
dump:
docker exec -it my_db bash -c "mongodump --uri mongodb://db:password@localhost:27017/my_db --archive --gzip | cat > /tmp/backup.gz"
copy to workstation:
docker cp my_db:/tmp/backup.gz c:\backups\backup.gz
copy into the new container (from the backups folder):
docker cp .\backup.gz new_db:/tmp
restore from container tmp folder:
docker exec -it new_db bash -c "mongorestore --uri mongodb://db:password@localhost:27017/new_db --nsFrom 'my_db.*' --nsTo 'new_db.*' --gzip --archive=/tmp/backup.gz"
You can restore a DB under another name. The syntax is:
mongorestore --port 27017 -u="username" -p="password" \
--nsFrom "dbname.*" \
--nsTo "new_dbname.*" \
--authenticationDatabase admin /backup_path
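For example, restoring a mongodump output directory /data/backups (containing a shop/ subdirectory) as a database named shop_staging could look like the following; the names, credentials, and path are purely illustrative:
mongorestore --port 27017 -u "admin" -p "secret" \
--nsFrom "shop.*" \
--nsTo "shop_staging.*" \
--authenticationDatabase admin /data/backups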

How to migrate a MongoDB database between Docker containers?

Migrating databases in MongoDB is a pretty well-understood problem domain, and there is a range of tools available to do so at the host level: everything from mongodump and mongoexport to rsync on the data files. If you're getting very fancy, you can use network mounts like SSHFS and NFS to mitigate disk-space and IOPS constraints.
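To illustrate that last point, here is a rough sketch of dumping straight onto an SSHFS mount; the host, user, and paths are made up for the example:
# Mount a remote backup directory over SSH (assumes sshfs is installed)
mkdir -p /mnt/remote-backups
sshfs backup-user@backup-host:/srv/backups /mnt/remote-backups
# Dump directly onto the network mount so local disk space and IOPS aren't the constraint
mongodump --db my_db --gzip --archive=/mnt/remote-backups/my_db.dump --port 27017
# Unmount when finished
fusermount -u /mnt/remote-backups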
Migrating a Database on a Host
# Using a temporary archive
mongodump --db my_db --gzip --archive=/tmp/my_db.dump --port 27017
mongorestore --db my_db --gzip --archive=/tmp/my_db.dump --port 27018
rm /tmp/my_db.dump
# Or you can stream it...
mongodump --db my_db --port 27017 --archive \
| mongorestore --db my_db --port 27018 --archive
Performing the same migrations in a containerized environment, however, can be somewhat more complicated and the lightweight, purpose-specific nature of containers means that you often don't have the same set of tools available to you.
As an engineer managing containerized infrastructure, I'm interested in what approaches can be used to migrate a database from one container/cluster to another whether for backup, cluster migration or development (data sampling) purposes.
For the purpose of this question, let's assume that the database is NOT a multi-TB cluster spread across multiple hosts and seeing thousands(++) of writes per second (i.e. that you can make a backup and have "enough" data for it to be valuable without needing to worry about replicating oplogs etc).
I've used a couple of approaches to solve this before. The specific approach depends on what I'm doing and what requirements I need to work within.
1. Working with files inside the container
# Dump the old container's DB to an archive file within the container
docker exec $OLD_CONTAINER \
bash -c 'mongodump --db my_db --gzip --archive=/tmp/my_db.dump'
# Copy the archive from the old container to the new one
# (docker cp can't copy directly between two containers, so stream it through a tar pipe)
docker cp $OLD_CONTAINER:/tmp/my_db.dump - | docker cp - $NEW_CONTAINER:/tmp
# Restore the archive in the new container
docker exec $NEW_CONTAINER \
bash -c 'mongorestore --db my_db --gzip --archive=/tmp/my_db.dump'
This approach works quite well and avoids many of the encoding issues you hit when piping data over stdout. However, it doesn't work particularly well when migrating to containers on different hosts (you need to docker cp to a local file and then repeat the process to copy that file to the new host), or when migrating from, say, Docker to Kubernetes.
Migrating to a different Docker cluster
# Dump the old container's DB to an archive file within the container
docker -H old_cluster exec $OLD_CONTAINER \
bash -c 'mongodump --db my_db --gzip --archive=/tmp/my_db.dump'
# Copy the archive from the old container to the new one (via your machine)
docker -H old_cluster cp $OLD_CONTAINER:/tmp/my_db.dump /tmp/my_db.dump
docker -H old_cluster exec $OLD_CONTAINER rm /tmp/my_db.dump
docker -H new_cluster cp /tmp/my_db.dump $NEW_CONTAINER:/tmp/my_db.dump
rm /tmp/my_db.dump
# Restore the archive in the new container
docker -H new_cluster exec $NEW_CONTAINER \
bash -c 'mongorestore --db my_db --gzip --archive=/tmp/my_db.dump'
docker -H new_cluster exec $NEW_CONTAINER rm /tmp/my_db.dump
Downsides
The biggest downside to this approach is the need to store temporary dump files everywhere. In the best case you have a dump file in your old container and another in your new container; in the worst case you also have a third on your local machine (or potentially on multiple machines if you need to scp/rsync it around). These temp files are easily forgotten, wasting space and cluttering your containers' filesystems.
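One small, hedged mitigation (not part of the approach above): once the migration has been verified, sweep the dump locations used in the examples, for instance:
# Clean up the temporary archives in both containers and locally
docker exec $OLD_CONTAINER rm -f /tmp/my_db.dump
docker exec $NEW_CONTAINER rm -f /tmp/my_db.dump
rm -f /tmp/my_db.dump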
2. Copying over stdout
# Copy the database over stdout (base64 encoded)
docker exec $OLD_CONTAINER \
bash -c 'mongodump --db my_db --gzip --archive 2>/dev/null | base64' \
| docker exec -i $NEW_CONTAINER \
bash -c 'base64 --decode | mongorestore --db my_db --gzip --archive'
Copying the archive over stdout and passing it via stdin to the new container allows you to remove the copy step and join the commands into a beautiful little one liner (for some definition of beautiful). It also allows you to potentially mix-and-match hosts and even container schedulers...
Migrating between different Docker clusters
# Copy the database over stdout (base64 encoded)
docker -H old_cluster exec $(docker -H old_cluster ps -q -f 'name=mongo') \
bash -c 'mongodump --db my_db --gzip --archive 2>/dev/null | base64' \
| docker -H new_cluster exec -i $(docker -H new_cluster ps -q -f 'name=mongo') \
bash -c 'base64 --decode | mongorestore --db my_db --gzip --archive'
Migrating from Docker to Kubernetes
# Copy the database over stdout (base64 encoded)
docker exec $(docker ps -q -f 'name=mongo') \
bash -c 'mongodump --db my_db --gzip --archive 2>/dev/null | base64' \
| kubectl exec -i mongodb-0 -- \
bash -c 'base64 --decode | mongorestore --db my_db --gzip --archive'
Downsides
This approach works well in the "success" case, but when the dump fails, the need to suppress the stderr stream (with 2>/dev/null) can make debugging the cause a serious headache.
It is also roughly 33% less network-efficient than the file-based approach, since the data has to be base64-encoded for transport (potentially a big deal for larger databases). As with all streaming modes, there is also no way to inspect the data that was sent after the fact, which can matter if you need to track down a problem.
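Two hedged tweaks (not from the write-up above) that soften both problems: keep mongodump's stderr in a log file inside the source container instead of discarding it, and tee the encoded stream to a local file so there is something to inspect afterwards:
docker exec $OLD_CONTAINER \
bash -c 'mongodump --db my_db --gzip --archive 2>/tmp/mongodump.log | base64' \
| tee /tmp/my_db.dump.b64 \
| docker exec -i $NEW_CONTAINER \
bash -c 'base64 --decode | mongorestore --db my_db --gzip --archive'
The tee copy reintroduces a local temp file, of course, so it is only worth doing while debugging a migration.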

mongodb dump and pipe to other db name

MongoDB version 3.2.12. I have two local databases, "base1" and "base2".
I want to copy all data (all collections) from base1 over to base2, replacing everything there (like when dumping production to a dev environment).
Is there a pipe command (or other simple way) to do this?
I tried
mongodump --archive --db base1 | mongorestore --db base2 --archive
It lists a lot of "writing base1.collectionname to archive on stdout", but nothing gets written to base2.
I also tried
mongodump --db base1 --gzip --archive=/path/to/file.gz
mongorestore --db base2 --gzip --archive=/path/to/file.gz
The dump works; the restore just says "creating intents for archive" and then "done".
I came across the same issue, and after some googling I found this post:
https://stackoverflow.com/a/43810346/3785901
I tried the command mentioned there:
mongodump --host HOST:PORT --db SOURCE_DB --username USERNAME --password PASSWORD --archive | mongorestore --host HOST:PORT --nsFrom 'SOURCE_DB.*' --nsTo 'TARGET_DB.*' --username USERNAME --password PASSWORD --archive --drop
and it works like a charm.
It should work in your case, good luck.
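Adapted to the base1/base2 example from the question (local and unauthenticated; note that --nsFrom/--nsTo require mongorestore 3.4 or newer, so treat this as a sketch rather than something verified on 3.2.12):
mongodump --db base1 --archive \
| mongorestore --nsFrom 'base1.*' --nsTo 'base2.*' --archive --drop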
I use the following commands:
mongodump \
--host ${mongo.host} \
--port ${mongo.port} \
--username ${mongo.backup_restore_user} \
--password ${mongo.backup_restore_password} \
--db ${mongo.db} \
--gzip \
--dumpDbUsersAndRoles \
--archive=${archive}
and
mongorestore \
--keepIndexVersion \
--drop \
--gzip \
--restoreDbUsersAndRoles \
--db ${mongo.db} \
--host ${mongo.host} --port ${mongo.port} \
--username ${mongo.backup_restore_user} \
--password ${mongo.backup_restore_password} \
--archive=${archive}

Mongodump not working on AWS EC2 instance

I tried mongodump on an AWS EC2 instance. There is no error, but the files are not dumped.
[ec2-user ~]$ sudo mongodump --host localhost:27017 --db test --out /var/backups/
connected to: localhost:27017
2017-01-19T01:56:05.608+0000 DATABASE: test to /var/backups/test
How do I take a dump inside AWS EC2? The database is in the data/db folder.
In my mongodb folder, I first made a folder for backups:
sudo mkdir backups
Then I used 777 just in case; later I changed it to 766:
sudo chmod 777 -R backups
sudo mongodump -h ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com --port 27017 --db your_db_name_here -u your_username_here -p your_password_here --out backups/
Of course, change the host as well.
If you want the backup compressed, add --gzip. You can also add --oplog for a point-in-time snapshot, but it only works on full-instance dumps (i.e. without --db):
--oplog --gzip
Hope that helps
First, set the required permissions on the output directory:
sudo chmod 777 -R /var/backups
Then run:
sudo mongodump --port 27017 --db test --out /var/backups/
Taking a MongoDB backup (AWS DocumentDB):
mongodump --host="Documentdb endpoint" --port=27017 -u "username" -p "" -d "dbname" --authenticationDatabase "admin" --gzip --archive > /path/dbname.gz
enter password
Note:
--gzip: compresses the dump output.
/path/dbname.gz: the path where you want the compressed dump written.
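Restoring that archive later might look like this (endpoint, user, and path are the same placeholders as in the dump command above):
mongorestore --host="Documentdb endpoint" --port=27017 -u "username" -p "" --authenticationDatabase "admin" --gzip --archive=/path/dbname.gz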
