Mongorestore through stdin to db with different name - mongodb

I have been trying to find a way to do this but cannot, and I have a feeling there isn't an easy way to do what I want.
I have been testing using mongodump | mongorestore as a way to avoid creating an actual stored set of files, which is very useful on a cloud-based service. So far, I have been testing this by specifying the same db, although it technically isn't necessary, like so...
mongodump -h hostname -d dumpdb --excludeCollection bigcollection --archive | mongorestore --archive -d dumpdb --drop
mongodump -h hostname -d dumpdb -c bigcollection --archive --queryFile onlythese.json | mongorestore --archive -d dumpdb -c bigcollection --drop
These options have worked best for me. When I tried specifying -o - for the single db (with --archive removed), I ran into some issues; since the archive approach worked, I didn't pursue it further.
Since I was restoring to the same db, and because only the collections present in the dump were restored, I realize I could drop the -d and -c flags in both mongorestore commands. But they were easy to include, and they set up the next step...
All I wanted to do was restore the specified db in two steps, to a db of a different name, like so...
mongodump -h hostname -d dumpdb --excludeCollection bigcollection --archive | mongorestore --archive -d restoredb --drop
mongodump -h hostname -d dumpdb -c bigcollection --archive --queryFile onlythese.json | mongorestore --archive -d restoredb -c bigcollection --drop
The dump works fine, but the restore does not. Based on the limited documentation, my assumption is that the dump db and the restore db need to be the same for this to work; if the db specified in mongorestore is not present in the dump, it just won't restore.
I find this to be rather annoying; my other thought was to restore the db as is, and just copy it to the other db.
I have thought about using other tools such as mongo-dump-stream, but considering everything was working swimmingly so far, I was hoping that it would work with the default tools.
On another point, I did dump the archive to a file (like so: > dumpdb.tar) and attempt to restore from there (like so --archive=dumpdb.tar) which is how I confirmed that the db needed to be in the dump.
Any suggestions/comments/hacks would be welcome.

As of version 3.4 of mongorestore, you can accomplish this using the --nsFrom and --nsTo options, which provide a pattern-based way to manipulate the names of your collections and/or dbs between the source and destination.
For example, to dump from a database named dumpdb into a new database named restoredb:
mongodump -h hostname -d dumpdb --archive | mongorestore --archive --nsFrom "dumpdb.*" --nsTo "restoredb.*" --drop
More from the mongodb docs: https://docs.mongodb.com/manual/reference/program/mongorestore/#change-collections-namespaces-during-restore

I haven't really found an answer to this, but based on the MongoDB Jira it looks like this is an open issue and a feature request on their timeline: being able to restore to a different db than the one you dumped.
In the meantime, I came up with a not-so-great but workable solution: the archive file also contains the metadata, and the mongorestore code shows that the db name in that metadata is all that needs to change. So you can replace the name of the db in the binary stream with one of the same length; my code now looks like this:
mongodump -h hostname -d dumpdb --excludeCollection bigcollection --archive | bbe -e "s/$(echo -n dumpdb | xxd -g 0 -u -ps -c 256 | sed -r 's/[A-F0-9]{2}/\\x&/g')\x00/$(echo -n rstrdb | xxd -g 0 -u -ps -c 256 | sed -r 's/[A-F0-9]{2}/\\x&/g')\x00/g" | mongorestore --archive --drop
You'll notice that I used bbe and dropped the -d flag, because this just changes the name of the db in the archived metadata, and it will restore to the new one.
The obvious issue here is that you cannot change the length gracefully: a longer name won't work, and a shorter one requires padding, which probably won't work either. Otherwise, for those looking for a better solution than dumping to a file or running a copyDatabase command (don't do it!), this is a workable approach.
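The nested command substitutions in that bbe expression are dense, so here is a sketch that isolates just the pattern-building step (the echo | xxd | sed pipeline), assuming GNU sed and xxd are installed. Each byte of the db name becomes a \xHH escape for bbe:

```shell
# Build the \xHH-escaped byte pattern for a db name,
# as used inside the bbe expression above
name="dumpdb"
pattern=$(echo -n "$name" | xxd -g 0 -u -ps -c 256 | sed -r 's/[A-F0-9]{2}/\\x&/g')
echo "$pattern"   # \x64\x75\x6D\x70\x64\x62
```

The trailing \x00 appended in the full command anchors the match to the end of the metadata string so only the exact db name is rewritten, and the replacement ("rstrdb") must be the same length for the reason given above.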

Related

How can I replace the following 2 commands to use BSON file?

I use the following command to backup my local db
mongodump -h 127.0.0.1 --port 8001 -d meteor -c products --archive --gzip > dump.gz
Then I use the following command to restore on my server
cat dump.gz | ssh root@66.205.148.23 "cat | docker exec -i mongodb mongorestore --archive --gzip"
I want to do the same but with only one collection. Adding the -c parameter to the above commands does not work when trying to restore. I get a message that states that the -c param can only be used with BSON files.
How can I do the above for only one collection using the -c parameter?
Thanks
Use the --out option for mongodump instead of --archive to write BSON files.
Specifies the directory where mongodump will write BSON files for the dumped databases. By default, mongodump saves output files in a directory named dump in the current working directory.
To send the database dump to standard output, specify "-" instead of a path. Write to standard output if you want to process the output before saving it, such as using gzip to compress the dump. When writing to standard output, mongodump does not write the metadata that it otherwise writes to a .metadata.json file when writing to files directly.
You cannot use the --archive option with the --out option.
This will create a folder named dump containing the BSON files:
mongodump -h 127.0.0.1 --port 8001 -d meteor --gzip --out dump
To restore:
mongorestore -h 127.0.0.1 --port 8001 -d meteor --gzip -c collname dump/meteor/collname.bson.gz

Restore mongodb dump to different db [duplicate]

In MongoDB, is it possible to dump a database and restore the content to a different database? For example like this:
mongodump --db db1 --out dumpdir
mongorestore --db db2 --dir dumpdir
But it doesn't work. Here's the error message:
building a list of collections to restore from dumpdir dir
don't know what to do with subdirectory "dumpdir/db1", skipping...
done
You need to actually point at the "database name" container directory "within" the output directory from the previous dump:
mongorestore -d db2 dumpdir/db1
And usually just <path> is fine as a positional argument, rather than with --dir, which is only needed when the path is "out of position", i.e. in the middle of the argument list.
p.s. For archive backup file (tested with mongorestore v3.4.10)
mongorestore --gzip --archive=${BACKUP_FILE_GZ} --nsFrom "${DB_NAME}.*" --nsTo "${DB_NAME_RESTORE}.*"
mongodump --db=DB_NAME --out=/path-to-dump
mongorestore --nsFrom "DB_NAME.*" --nsTo "NEW_DB_NAME.*" /path-to-dump
In addition to the answer of Blakes Seven: if your databases use authentication, I got this to work using the --uri option, which requires a recent mongo version (> 3.4.6):
mongodump --uri="mongodb://$sourceUser:$sourcePwd@$sourceHost/$sourceDb" --gzip --archive | mongorestore --uri="mongodb://$targetUser:$targetPwd@$targetHost/$targetDb" --nsFrom="$sourceDb.*" --nsTo="$targetDb.*" --gzip --archive
Thank you! @Blakes Seven
Adding Docker notes:
container names are interchangeable with container IDs
(assumes authentication; assumes named containers my_db and new_db)
dump:
docker exec -it my_db bash -c "mongodump --uri mongodb://db:password@localhost:27017/my_db --archive --gzip | cat > /tmp/backup.gz"
copy to workstation:
docker cp my_db:/tmp/backup.gz c:\backups\backup.gz
copy into new container (from backups folder):
docker cp .\backup.gz new_db:/tmp
restore from container tmp folder:
docker exec -it new_db bash -c "mongorestore --uri mongodb://db:password@localhost:27017/new_db --nsFrom 'my_db.*' --nsTo 'new_db.*' --gzip --archive=/tmp/backup.gz"
You can restore DB with another name. The syntax is:
mongorestore --port 27017 -u="username" -p="password" \
  --nsFrom "dbname.*" \
  --nsTo "new_dbname.*" \
  --authenticationDatabase admin /backup_path

Mongorestore from stdin

Has someone ever managed to restore a mongodump from stdin?
I generated the backup file using the referenced command:
mongodump --archive > file
But the reference never explains what format the archive is in, or how to restore it. Some people say you have to inject it from stdin, but it remains a mystery, as there is no evidence that it is actually possible.
As a matter of fact, I did the following tries, without any success:
cat file | mongorestore
mongorestore -vvvvv --archive < file
mongorestore -vvvvv --archive=file
All these commands end up with the same error:
Failed: stream or file does not appear to be a mongodump archive
For information, I managed to restore an archive that was generated the classic way.
Why do I need the --archive to stdout?
My mongodb runs inside a Docker container.
The approach I'm using now creates the backup file inside the container, which I then copy to the host using the docker cp command, so I temporarily use twice the space needed (before the file is removed inside the container).
Unfortunately, the container has no mount from the host, and I cannot (yet) restart it to add one. I was looking for a quick option.
So according to MongoDB official mongodump doc and mongorestore doc:
To output the dump to the standard output stream in order to pipe to another process, run mongodump with the archive option but omit the filename.
To restore from the standard input, run mongorestore with the --archive option but omit the filename.
So actually you don't have to mongodump to a file first and then read from it; just pipe mongodump into mongorestore like this:
mongodump --archive | mongorestore --archive --drop
However, you may run into another problem, as I did: https://stackoverflow.com/a/56550768/3785901.
In my case, I had to use --nsFrom and --nsTo instead of --db, or mongorestore didn't work as expected. The final command I successfully executed to mongodump/mongorestore is:
mongodump --host HOST:PORT --db SOURCE_DB --username USERNAME --password PASSWORD --archive | mongorestore --host HOST:PORT --nsFrom 'SOURCE_DB.*' --nsTo 'TARGET_DB.*' --username USERNAME --password PASSWORD --archive --drop
Good luck.

How to migrate a MongoDB database between Docker containers?

Migrating databases in MongoDB is a pretty well understood problem domain and there are a range of tools available to do so on a host-level. Everything from mongodump and mongoexport to rsync on the data files. If you're getting very fancy, you can use network mounts like SSHFS and NFS to mitigate diskspace and IOPS constraint problems.
Migrating a Database on a Host
# Using a temporary archive
mongodump --db my_db --gzip --archive=/tmp/my_db.dump --port 27017
mongorestore --db my_db --gzip --archive=/tmp/my_db.dump --port 27018
rm /tmp/my_db.dump
# Or you can stream it...
mongodump --db my_db --port 27017 --archive \
| mongorestore --db my_db --port 27018 --archive
Performing the same migrations in a containerized environment, however, can be somewhat more complicated and the lightweight, purpose-specific nature of containers means that you often don't have the same set of tools available to you.
As an engineer managing containerized infrastructure, I'm interested in what approaches can be used to migrate a database from one container/cluster to another whether for backup, cluster migration or development (data sampling) purposes.
For the purpose of this question, let's assume that the database is NOT a multi-TB cluster spread across multiple hosts and seeing thousands(++) of writes per second (i.e. that you can make a backup and have "enough" data for it to be valuable without needing to worry about replicating oplogs etc).
I've used a couple of approaches to solve this before. The specific approach depends on what I'm doing and what requirements I need to work within.
1. Working with files inside the container
# Dump the old container's DB to an archive file within the container
docker exec $OLD_CONTAINER \
bash -c 'mongodump --db my_db --gzip --archive=/tmp/my_db.dump'
# Copy the archive from the old container to the new one
docker cp $OLD_CONTAINER:/tmp/my_db.dump $NEW_CONTAINER:/tmp/my_db.dump
# Restore the archive in the new container
docker exec $NEW_CONTAINER \
bash -c 'mongorestore --db my_db --gzip --archive=/tmp/my_db.dump'
This approach works quite well and avoids many encoding issues encountered when piping data over stdout. However, it doesn't work particularly well when migrating to containers on different hosts (you need to docker cp to a local file and then repeat the process to copy that file to the new host), or when migrating from, say, Docker to Kubernetes.
Migrating to a different Docker cluster
# Dump the old container's DB to an archive file within the container
docker -H old_cluster exec $OLD_CONTAINER \
bash -c 'mongodump --db my_db --gzip --archive=/tmp/my_db.dump'
# Copy the archive from the old container to the new one (via your machine)
docker -H old_cluster cp $OLD_CONTAINER:/tmp/my_db.dump /tmp/my_db.dump
docker -H old_cluster exec $OLD_CONTAINER rm /tmp/my_db.dump
docker -H new_cluster cp /tmp/my_db.dump $NEW_CONTAINER:/tmp/my_db.dump
rm /tmp/my_db.dump
# Restore the archive in the new container
docker -H new_cluster exec $NEW_CONTAINER \
bash -c 'mongorestore --db my_db --gzip --archive=/tmp/my_db.dump'
docker -H new_cluster exec $NEW_CONTAINER rm /tmp/my_db.dump
Downsides
The biggest downside to this approach is the need to store temporary dump files everywhere. In the base case scenario, you would have a dump file in your old container and another in your new container; in the worst case you'd have a 3rd on your local machine (or potentially on multiple machines if you need to scp/rsync it around). These temp files are likely to be forgotten about, wasting unnecessary space and cluttering your container's filesystem.
2. Copying over stdout
# Copy the database over stdout (base64 encoded)
docker exec $OLD_CONTAINER \
bash -c 'mongodump --db my_db --gzip --archive 2>/dev/null | base64' \
| docker exec -i $NEW_CONTAINER \
bash -c 'base64 --decode | mongorestore --db my_db --gzip --archive'
Copying the archive over stdout and passing it via stdin to the new container allows you to remove the copy step and join the commands into a beautiful little one liner (for some definition of beautiful). It also allows you to potentially mix-and-match hosts and even container schedulers...
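The base64 encode/decode pair used in these one-liners round-trips bytes exactly; you can sanity-check the mechanics with plain files, no MongoDB required (assumes GNU coreutils' base64 --decode; BSD base64 uses -D instead):

```shell
# Verify that base64 | base64 --decode preserves the payload byte-for-byte
printf 'hello mongo' > /tmp/payload
base64 < /tmp/payload | base64 --decode > /tmp/payload.out
cmp -s /tmp/payload /tmp/payload.out && echo "round-trip ok"
```

cmp exits 0 when the two files are identical, so the echo only fires on a clean round trip; the same property is what makes it safe to wrap the binary archive stream this way.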
Migrating between different Docker clusters
# Copy the database over stdout (base64 encoded)
docker -H old_cluster exec $(docker -H old_cluster ps -q -f 'name=mongo') \
bash -c 'mongodump --db my_db --gzip --archive 2>/dev/null | base64' \
| docker -H new_cluster exec -i $(docker -H new_cluster ps -q -f 'name=mongo') \
bash -c 'base64 --decode | mongorestore --db my_db --gzip --archive'
Migrating from Docker to Kubernetes
# Copy the database over stdout (base64 encoded)
docker exec $(docker ps -q -f 'name=mongo') \
bash -c 'mongodump --db my_db --gzip --archive 2>/dev/null | base64' \
| kubectl exec -i mongodb-0 -- \
bash -c 'base64 --decode | mongorestore --db my_db --gzip --archive'
Downsides
This approach works well in the "success" case, but when the dump fails, the need to suppress the stderr stream (with 2>/dev/null) can make debugging the cause a serious headache.
It is also 33% less network efficient than the file-based approach, since the data must be base64 encoded for transport (potentially a big issue for larger databases). As with all streaming modes, there is also no way to inspect the data that was sent after the fact, which might matter if you need to track down a problem.
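That base64 overhead (4 output characters for every 3 input bytes) is easy to verify without any MongoDB tooling; a quick check, assuming GNU coreutils:

```shell
# base64 maps every 3 input bytes to 4 output characters (plus line wrapping,
# which tr strips here); 300 bytes in should give 400 characters out.
encoded_len=$(( $(head -c 300 /dev/zero | base64 | tr -d '\n' | wc -c) ))
echo "$encoded_len"   # 400
```

So a 3 GB dump becomes roughly 4 GB on the wire, which is where the "33% less efficient" figure above comes from.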

How to import dumped Mongodb?

Dumped a MongoDB successfully:
$ mongodump -h ourhost.com:portnumber -d db_name01 -u username -p
I need to import or export it to a test server and am struggling with it; please help me figure it out.
I tried some ways:
$ mongoimport -h host.com:port -c dbname -d dbname_test -u username -p
connected to host.
Password: ...
Gives this error:
assertion: 9997 auth failed: { errmsg: "auth fails", ok: 0.0 }
$ mongoimport -h host.com:port -d dbname_test -u username -p
Gives this error:
no collection specified!
How do I specify which collection to use? What should I use for -d: the db I dumped, or the one I want to use for testing? I would like to import the full DB, not only one collection of it.
The counterpart to mongodump is mongorestore (and the counterpart to mongoimport is mongoexport) -- the major difference is in the format of the files created and understood by the tools (dump and restore read and write BSON files; export and import deal with text file formats: JSON, CSV, TSV).
If you've already run mongodump, you should have a directory named dump, with a subdirectory for each database that was dumped, and a file in those directories for each collection. You can then restore this with a command like:
mongorestore -h host.com:port -d dbname_test -u username -p password dump/dbname/
Assuming that you want to put the contents of the database dbname into a new database called dbname_test.
You may have to specify the authentication database
mongoimport -h localhost:27017 --authenticationDatabase admin -u user -p -d database -c collection --type csv --headerline --file awesomedata.csv
For anyone else who might reach this question after all these years (like I did): if you are using
a dump which was created using mongodump,
and restoring from a dump directory,
and using the default port 27017,
then all you have to do is:
mongorestore dump/
Refer to the mongorestore docs for more info. Cheers!
When you do a mongodump, it dumps in a binary format. You need to use mongorestore to "import" this data.
mongoimport is for importing data that was exported using mongoexport.