I am using a command like this to dump data from a remote machine:
mongodump --verbose \
--uri="mongodb://mongousr:somepassword#host.domain.com:27017/somedb?authSource=admin" \
--out="$BACKUP_PATH"
This fails like so:
Failed: error writing data for collection `somedb.someCollection` to disk: error reading collection: EOF
somedb.someCollection is about 40GB. I don't have the ability to increase RAM to this size.
I have seen two explanations. One is that the console output is too verbose and fills the RAM. This seems absurd; it's only a few kilobytes, and it's on the client machine anyway. Rejected (though I am trying it again now with --quiet just to be sure).
The more plausible explanation is that the host fills its RAM with somedb.someCollection data and then fails. The problem is that the 'solution' I've seen proposed is to increase the RAM so that it's bigger than the collection.
Really? That can't be right. What's the point of mongodump with that limitation?
The question: is it possible to mongodump a database with a collection that is larger than my RAM size? How?
mongodump Client:
macOS
mongodump --version
mongodump version: 4.0.3
git version: homebrew
Go version: go1.11.4
os: darwin
arch: amd64
compiler: gc
OpenSSL version: OpenSSL 1.0.2r 26 Feb 2019
Server:
built with docker FROM mongo:
Reports: MongoDB server version: 4.0.8
Simply dump your collection slice by slice:
mongodump --verbose \
--uri="mongodb://mongousr:somepassword@host.domain.com:27017/somedb?authSource=admin" \
--out="$BACKUP_PATH/part1" -q '{_id: {$gte: ObjectId("40ad7bce1a3e827d690385ec")}}'
mongodump --verbose \
--uri="mongodb://mongousr:somepassword@host.domain.com:27017/somedb?authSource=admin" \
--out="$BACKUP_PATH/part2" -q '{_id: {$lt: ObjectId("40ad7bce1a3e827d690385ec")}}'
or partition your dump with a different set of queries on _id or on some other field. The _id shown is just an example. Note that the two slices are written to separate --out directories so the second dump does not overwrite the first.
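If two slices aren't enough, the same idea can be scripted. Below is a minimal bash sketch, assuming you pick your own ObjectId boundaries (the values here are placeholders) and that your mongodump version accepts shell-style ObjectId() in --query; newer database-tools releases want canonical extended JSON such as {"$oid": "..."} instead:
# Placeholder boundaries; replace with real ObjectIds sampled from your data.
BOUNDS=("000000000000000000000000" "40ad7bce1a3e827d690385ec" "ffffffffffffffffffffffff")
for i in $(seq 0 $((${#BOUNDS[@]} - 2))); do
  # Each slice gets its own output directory so it isn't overwritten.
  mongodump --verbose \
    --uri="mongodb://mongousr:somepassword@host.domain.com:27017/somedb?authSource=admin" \
    --out="$BACKUP_PATH/part$i" \
    -q "{_id: {\$gte: ObjectId(\"${BOUNDS[$i]}\"), \$lt: ObjectId(\"${BOUNDS[$((i+1))]}\")}}"
done
Each range is half-open, so the slices cover the collection without overlap (only a document with the all-f ObjectId would be missed, which is practically impossible).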
Stennie's answer really works.
The default value of storage.wiredTiger.engineConfig.cacheSizeGB is max((RAM - 1 GB) / 2, 256 MB). If your MongoDB server is running in a Docker container with default configs and other apps are running on the host machine, memory can fill up while you are dumping a large collection. The same thing can happen if the container's RAM is limited by your configs.
You can use docker run --name some-mongo -d mongo --wiredTigerCacheSizeGB 1.5 (choose the number based on your situation).
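If you run mongod from a config file instead of a command-line flag, the same cap can be set there. A sketch of the relevant mongod.conf section, reusing the example value from above:
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1.5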
Another possibility is to add the --gzip flag to compress the output of mongodump. It helped me back up a DB that hung at 48% without compression. So the syntax would be:
mongodump --uri="mongodb://mongousr:somepassword@host.domain.com:27017/somedb?authSource=admin" --gzip --out="$BACKUP_PATH"
Related
I use TablePlus for my general admin.
Currently I'm using the Docker postgres image at 10.3 for both production and localhost development.
Because TablePlus upgraded its Postgres 10 drivers to 10.5, I can no longer use pg_restore to restore backup files that were dumped with 10.5 --format=custom.
See the image for how I back up using TablePlus, and how it uses the 10.5 driver.
The error message I get is: pg_restore: [archiver] unsupported version (1.14) in file header
What I tried
On localhost, I tried simply changing the Postgres tag in my Dockerfile from 10.3 to 10.5, and it didn't work.
original dockerfile
FROM postgres:10.3
COPY ./maintenance /usr/local/bin/maintenance
RUN chmod +x /usr/local/bin/maintenance/*
RUN mv /usr/local/bin/maintenance/* /usr/local/bin \
&& rmdir /usr/local/bin/maintenance
to
FROM postgres:10.5
COPY ./maintenance /usr/local/bin/maintenance
RUN chmod +x /usr/local/bin/maintenance/*
RUN mv /usr/local/bin/maintenance/* /usr/local/bin \
&& rmdir /usr/local/bin/maintenance
My host system for development is macOS.
I have many existing databases and schemas in my development Docker Postgres, so I am currently stumped as to how to upgrade safely without destroying old data.
Can anyone advise?
Also, I think a long-term fix is to figure out how to keep the data files outside Docker (i.e. on my host system) so that every time I want to upgrade my Postgres Docker image I can do so safely without fear.
I'd like to ask how to switch to such a setup as well.
If I understand you correctly, you want to restore a custom format dump taken with 10.5 into a 10.3 database.
That won't be possible if the archive format has changed between 10.3 and 10.5.
As a workaround, you could use a “plain format” dump (option --format=plain) which does not have an “archive version”. But any problems during restore are yours to deal with, since downgrading PostgreSQL isn't supported.
You should always use the same version for development and production, and you should always use the latest minor release (currently 10.13). Everything else is asking for trouble.
Back up as plain text like this (warning: the file will be huge, around 17x larger than the regular custom format; my typical 90 MB dump is now 1.75 GB):
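If you'd rather do this from the command line than from TablePlus, a roughly equivalent plain-text dump (a sketch with placeholder connection details, run with a 10.5 client) would be:
pg_dump -h localhost -U username -d dbname --format=plain --no-owner -f ~/path/to/dump/in-host-system/2020-07-08-1.dump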
Copy the backup file into the postgres container: docker cp ~/path/to/dump/in-host-system/2020-07-08-1.dump <name_of_postgres_container>:/backups
Go into the bash of your postgres container: docker exec -it <name_of_postgres_container> bash
Inside the bash of the postgres container, run: psql -U username -d dbname < backups/2020-07-08-1.dump
That will work
I’m currently trying to restore a mongodump made with mongodb:3.4-jessie into a newer version, mongodb:4.2.3-bionic.
When I try to execute my command:
sudo docker exec mongo mongorestore --db=mock --gzip /mongorestore/app
It returns me with this error:
2020-05-01T00:01:29.405+0000 the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
2020-05-01T00:01:29.406+0000 Failed: mongorestore target '/home/user1/mongorestore/app' invalid: stat /home/user1/mongorestore/app: no such file or directory
2020-05-01T00:01:29.406+0000 0 document(s) restored successfully. 0 document(s) failed to restore.
The app folder contains BSON files and json.gz files too.
I can’t upgrade the older dump, as it’s the only thing left and really want to use a newer version of mongo.
Thanks a lot!
Your command was blocked by problems before it could attempt to restore data to a newer mongodb release.
You're running mongorestore inside a Docker container, but the input data directory /mongorestore/app does not seem to exist inside the container (unless you mounted it in a previous step not shown here and passed the wrong path to mongorestore). You can use the docker run command's --mount or --volume options to mount host directories into a container, then pass the in-container path to the mongorestore command. See the docker run command docs.
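For example, a minimal sketch, assuming the dump lives at /home/user1/mongorestore on the host (matching the path in the error output) and reusing the container name and image tag from the question:
# Recreate the container with the host directory mounted inside it.
docker run --name mongo -d -v /home/user1/mongorestore:/mongorestore mongo:4.2.3-bionic
# Then point mongorestore at the in-container path.
docker exec mongo mongorestore --db=mock --gzip /mongorestore/app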
mongorestore is warning about this use of the --db option but it's not clear if that's because it can't find the input data directory or if it needs the --nsInclude option instead. See the mongorestore command docs.
You shouldn't need to use sudo with docker exec, and doing so could cause permission problems with the mounted output files. The mongorestore command shouldn't need sudo either, but if I'm wrong about that, write docker exec mongo sudo mongorestore ....
The mongodump and mongorestore docs suggest that the --gzip option expects all the files to be compressed, not just some of them. Maybe it notices each file's .gz filename extension to decide whether to decompress it, but the docs don't say that it supports that case.
I'm betting mongorestore can restore BSON files from an older release. That file format should be very stable.
I ran into the same issue but with Mongo 5 and it worked like so:
mongorestore --host=hostname --port=portnum \
--archive=/path/to/archive.gz --gzip --verbose \
--nsInclude="mydbname.*" \
--convertLegacyIndexes
where mydbname is the name of the db that I used when dumping the collections.
If you use another dbname now, then you need to convert the namespaces using --nsFrom="mydbname.*" --nsTo="newdbname.*", as sketched below.
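Putting that together with the flags above, a renaming restore might look like this (hostname, port, and archive path are placeholders as before):
mongorestore --host=hostname --port=portnum \
  --archive=/path/to/archive.gz --gzip --verbose \
  --nsFrom="mydbname.*" --nsTo="newdbname.*" \
  --convertLegacyIndexes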
All from: https://docs.mongodb.com/database-tools/mongorestore/
I am either misusing mongodump or it has a bug, but I'm not sure which. I typically use mongo connection strings in my applications and scripts, e.g.
mongo mongodb://username:ps@myhostname/dbname (this works)
The mongodump tool supposedly supports URL strings, but every time I try to use it it starts and then does nothing:
mongodump --uri mongodb://username:ps@myhostname/dbname (this runs but then stalls and does nothing, with no CPU usage)
I've tried using -vvvvv and there is no interesting data shown.
If I do the exact same thing using the "old" parameters, it works, but then I'd have to parse URIs and that makes me sad:
mongodump --host myhostname --username username --password ps -d dbname (this works)
1) Am I doing this wrong?
2) If this is a bug, where would I file a ticket?
3) Is there a tool that would parse a mongodb:// URI back into pieces so that I can keep using URIs in my automation stack?
$ mongodump --version
mongodump version: r3.6.8
git version: 6bc9ed599c3fa164703346a22bad17e33fa913e4
Go version: go1.8.5
os: linux
arch: amd64
compiler: gc
OpenSSL version: OpenSSL 1.1.0f 25 May 2017
db.version() in a connected shell also returns 3.6.8
I ran into this same issue, and likewise, was quite sad. However, I'm happy again because I realized you MUST append the following two options to your connection string:
?ssl=true&authSource=admin
Pop those bad boys on your URI and you should be smooth sailing.
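Concretely, with the URI from the question, that becomes the line below. Note the quotes: & is special to the shell, so the whole URI must be quoted (whether you need both options depends on how your server is configured):
mongodump --uri "mongodb://username:ps@myhostname/dbname?ssl=true&authSource=admin"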
I'm on a Windows Server 2016 machine. I have run pg_dump.exe on a 3 GB Postgres 9.4 database using the -Fc format.
When I run pg_restore to a local database (9.6):
pg_restore.exe -O -x -C -v -f c:/myfilename
The command runs for over 24 hours. (Still running)
Similar to this issue: Postgres Restore taking ages (days)
I am using the verbose CLI option, which looks to be spitting out a lot of JSON; I'm assuming that's getting inserted into tables. The task manager has the CPU at 0%, using 0.06 MB of memory. It looks like I should add more jobs next time, but this still seems pretty ridiculous.
I prefer using a linux machine, but this is what the client provided. Any suggestions?
pg_restore.exe -d {db_name} -O -x c:/myfilename
Did the trick.
I got rid of the -C and manually created the database prior to running the command. I also realized that connection options should come before other options:
pg_restore [connection-option...] [option...] [filename]
See the Postgres documentation for more.
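Putting it together, a full invocation with placeholder connection options in front might look like:
pg_restore.exe -h localhost -p 5432 -U postgres -d db_name -O -x -v c:/myfilename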
Basically, I have a problem with using mongodump to back up my MongoDB.
This is the general syntax I use in SSH:
mongodump -d myDatabaseName -o ~/backups/my_backup
This is the resulting message:
Fri Apr 22 20:39:57.304 DATABASE: myDatabaseName to /root/backups/my_backup/myDatabaseName
This simply generates a blank folder with no files in it whatsoever. The actual database is fairly large, so I'm not sure what's going on.
I would also like to add that my mongodump client and my MongoDB version are both the same (version 2.4.9).
Not sure how to go about fixing this. Any help is appreciated.
This is a similar question to Mongodump getting blank folders.
There was no accepted answer as of writing mine. Here is what I did to resolve my issue, and I believe it will help you as well.
The default mongodb-client deb package that ships with Ubuntu is the issue. I removed it and installed the mongodb-org-tools package from mongodb.com: https://docs.mongodb.com/master/tutorial/install-mongodb-on-ubuntu/
They have other install instructions for your specific OS if you are not on Ubuntu https://www.mongodb.com/download-center?jmp=nav#community
Try adding the MongoDB port, as in:
mongodump --port your_number -c the_collection -d the_database
Make sure that you have the exact name of the database. If you spell it wrong, this could happen. To confirm, connect to your Mongo database and type show dbs to see a list of database names. Then make sure that your -d <databasename> parameter matches one of those in the list.
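For example, a quick check from the command line (a sketch, assuming a local server on the default port; myDatabaseName is the name from the question):
# List the databases the server actually knows about.
mongo --eval "db.adminCommand('listDatabases')"
# Then dump using the exact name from that list.
mongodump -d myDatabaseName -o ~/backups/my_backup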