tar gzip mongo dump like MySQL - mongodb

Is there any way to tar gzip mongo dumps like you can do with MySQL dumps?
For example, for mysqldumps, you can write a command as such:
mysqldump -u <username> --password=<password> --all-databases | gzip > all-databases.`date +%F`.gz
Is there an equivalent way to do the same for mongo dumps?
For mongo dumps I run this command:
mongodump --host localhost --out /backup
Is there a way to just pipe that to gzip? I tried, but that didn't work.
Any ideas?

Version 3.2 introduced the --gzip and --archive options:
mongodump --db <yourdb> --gzip --archive=/path/to/archive
Then you can restore with:
mongorestore --gzip --archive=/path/to/archive
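To get a dated file name like the mysqldump example in the question, the options above can be wrapped in a small script. This is only a sketch, assuming mongodump 3.2 or later is on the PATH and that --archive without a value writes to stdout; the function names archive_name and backup_all and the mongo-all prefix are made up for illustration.

```shell
# Build a dated archive file name, e.g. mongo-all.2024-01-31.archive.gz
# $1: optional prefix (hypothetical default: mongo-all)
archive_name() {
  printf '%s.%s.archive.gz\n' "${1:-mongo-all}" "$(date +%F)"
}

# Dump every database as one gzip-compressed archive stream on stdout
# and redirect it into the dated file. Needs a reachable mongod, so it
# is only defined here, not called.
backup_all() {
  mongodump --host localhost --gzip --archive > "$(archive_name)"
}

# Show the file name that backup_all would write to
archive_name
```

Restoring uses the same two flags, as in the mongorestore command above.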

Update (July 2015):
TOOLS-675 is now marked as complete, which will allow for dumping to an archive format in 3.2 and gzip will be one of the options in the 3.2 versions of the mongodump/mongorestore tools. I will update with the relevant docs once they are live for 3.2
Original answer (3.0 and below):
You can do this with a single collection by writing the mongodump output to stdout and piping it to a compression program (gzip, bzip2). However, you only get the data (no index information), and for now you cannot do it for a full database (multiple collections). The relevant feature request for this functionality is SERVER-5190, which you can upvote or watch.
Here is a quick sample run through of what is possible, using bzip2 in this example:
./mongo
MongoDB shell version: 2.6.1
connecting to: test
> db.foo.find()
{ "_id" : ObjectId("53ad8a3eb74b5ae2ff0ec93a"), "a" : 1 }
{ "_id" : ObjectId("53ad8ba445be9c4f7bd018b4"), "a" : 2 }
{ "_id" : ObjectId("53ad8ba645be9c4f7bd018b5"), "a" : 3 }
{ "_id" : ObjectId("53ad8ba845be9c4f7bd018b6"), "a" : 4 }
{ "_id" : ObjectId("53ad8baa45be9c4f7bd018b7"), "a" : 5 }
>
bye
$ ./mongodump -d test -c foo -o - | bzip2 - > foo.bson.bz2
connected to: 127.0.0.1
$ bunzip2 foo.bson.bz2
$ ./bsondump foo.bson
{ "_id" : ObjectId( "53ad8a3eb74b5ae2ff0ec93a" ), "a" : 1 }
{ "_id" : ObjectId( "53ad8ba445be9c4f7bd018b4" ), "a" : 2 }
{ "_id" : ObjectId( "53ad8ba645be9c4f7bd018b5" ), "a" : 3 }
{ "_id" : ObjectId( "53ad8ba845be9c4f7bd018b6" ), "a" : 4 }
{ "_id" : ObjectId( "53ad8baa45be9c4f7bd018b7" ), "a" : 5 }
5 objects found
Compare that with a straight mongodump (you get the same foo.bson but the extra foo.metadata.json describing the indexes is not included above):
$ ./mongodump -d test -c foo -o .
connected to: 127.0.0.1
2014-06-27T16:24:20.802+0100 DATABASE: test to ./test
2014-06-27T16:24:20.802+0100 test.foo to ./test/foo.bson
2014-06-27T16:24:20.802+0100 5 documents
2014-06-27T16:24:20.802+0100 Metadata for test.foo to ./test/foo.metadata.json
$ ./bsondump test/foo.bson
{ "_id" : ObjectId( "53ad8a3eb74b5ae2ff0ec93a" ), "a" : 1 }
{ "_id" : ObjectId( "53ad8ba445be9c4f7bd018b4" ), "a" : 2 }
{ "_id" : ObjectId( "53ad8ba645be9c4f7bd018b5" ), "a" : 3 }
{ "_id" : ObjectId( "53ad8ba845be9c4f7bd018b6" ), "a" : 4 }
{ "_id" : ObjectId( "53ad8baa45be9c4f7bd018b7" ), "a" : 5 }
5 objects found
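For a full database on these older versions, one workaround is to let mongodump write its normal directory (which keeps the metadata.json files) and tar gzip that directory afterwards. A minimal sketch, assuming tar with gzip support is available; the function name tar_dump_dir and the paths in the comments are illustrative only.

```shell
# Pack an existing mongodump output directory into one .tar.gz file.
# $1: dump directory (e.g. /backup), $2: archive file to create.
# Keep the archive outside the dump directory so tar does not try to
# include its own output.
tar_dump_dir() {
  dump_dir=$1
  out=$2
  # -C makes the paths inside the archive relative to the dump directory
  tar -czf "$out" -C "$dump_dir" .
}

# Typical use (requires a running mongod, so it is left commented out):
#   mongodump --host localhost --out /backup
#   tar_dump_dir /backup "/tmp/mongo-$(date +%F).tar.gz"
```

To restore, unpack the archive with tar -xzf and point mongorestore at the extracted directory.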

Export the MongoDB database with:
mongodump --host <host-ip> --port 27017 --db <database> --authenticationDatabase admin --username <username> --password <password> --gzip --archive > dump_`date "+%Y-%m-%d"`.gz
Import with:
mongorestore --host <host-ip> --port 27017 --db <database> --authenticationDatabase admin --username <username> --password <password> --gzip --archive=mongodump.gz

To do the same by passing a URI for your MongoDB replica set cluster:
Dump:
mongodump --uri='mongodb://user:pass@primary_host,secondary_host/<db-name>?replicaSet=<replica-name>&authSource=admin' --gzip --archive > dump_`date "+%Y-%m-%d"`.gz
Restore:
mongorestore --uri='mongodb://user:pass@primary_host,secondary_host/<db-name>?replicaSet=<replica-name>&authSource=admin' --gzip --archive=<dump-file>.gz

Related

mongodb - issue with same file name in fs.files GridFS

I have multiple files in the fs.files collection in MongoDB GridFS with the same name but for different users.
When I use the query below:
db.fs.files.find({"metadata.folder" : { "$exists": false }, "metadata.msgid" : { "$exists": false }}, {"metadata.user":1, "_id":0, "filename":1}).pretty()
I get result like :
{ "filename" : "standard.wav", "metadata" :
{ "user" : "101" }
}
{ "filename" : "standard.wav", "metadata" :
{ "user" : "100" }
}
{ "filename" : "standard.wav", "metadata" :
{ "user" : "104" }
}
The files are different for each user but have the same name.
So when I use the following commands to save the files to the local system for different users, the same file is always stored for all users.
For User 101 :
mongofiles --uri MONGO_DSN -d test -l /home/user/101/standard.wav get standard.wav
For User 100 :
mongofiles --uri MONGO_DSN -d test -l /home/user/100/standard.wav get standard.wav
For User 104 :
mongofiles --uri MONGO_DSN -d test -l /home/user/104/standard.wav get standard.wav
It should store different files for different users.
Thanks in advance.
I solved it by using the get_id command instead of get.
So my commands are now:
For User 101 :
mongofiles --uri MONGO_DSN -d test -l /home/user/101/standard.wav get_id $object101
For User 100 :
mongofiles --uri MONGO_DSN -d test -l /home/user/100/standard.wav get_id $object100
For User 104 :
mongofiles --uri MONGO_DSN -d test -l /home/user/104/standard.wav get_id $object104
Here $object101, $object100, and $object104 are the extended JSON _id values of the objects in GridFS.
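As a rough illustration of what such a variable can hold, assuming the _id values are ObjectIds and a mongofiles version that accepts the {"$oid": ...} form of extended JSON (older releases may expect a different form): the helper name oid_arg and the hex id below are placeholders, not values from this question.

```shell
# Wrap a 24-character hex ObjectId in extended JSON: {"$oid":"<hex>"}
# The single quotes keep the shell from expanding $oid.
oid_arg() {
  printf '{"$oid":"%s"}\n' "$1"
}

# Placeholder id for illustration; look up the real one in fs.files,
# filtering on filename and metadata.user.
object101=$(oid_arg "0123456789abcdef01234567")
echo "$object101"

# Then (requires a reachable server, so left commented out):
#   mongofiles --uri "$MONGO_DSN" -d test -l /home/user/101/standard.wav get_id "$object101"
```

Quoting "$object101" in the final command matters, because the value contains double quotes and braces that the shell should pass through untouched.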
References :
mongofiles: get file by _id in addition to filename
MongoFiles

How can I import bson and json files into MongoDB?

I have the following bson and json files from https://github.com/Apress/def-guide-to-mongodb/tree/master/9781484211830/The%20Definitive%20Guide%20to%20MongoDB
$ ls .
aggregation.bson aggregation.metadata.json mapreduce.bson mapreduce.metadata.json storage.bson text.json
How can I import them into MongoDB?
I tried to import each of them as a collection, but failed:
$ mongorestore -d test -c aggregation
2018-07-18T01:44:25.376-0400 the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
2018-07-18T01:44:25.377-0400 using default 'dump' directory
2018-07-18T01:44:25.377-0400 see mongorestore --help for usage information
2018-07-18T01:44:25.377-0400 Failed: mongorestore target 'dump' invalid: stat dump: no such file or directory
I am not sure if I specify the file aggregation.bson correctly, but the above command is what I learned from a similar example in a book.
Thanks.
UPDATE
In the following, why did the first fail and the second succeed? Which command shall I use?
$ mongoimport -d test -c aggregation --file aggregation.bson
2018-07-18T09:45:44.698-0400 connected to: localhost
2018-07-18T09:45:44.720-0400 Failed: error processing document #1: invalid character 'º' looking for beginning of value
2018-07-18T09:45:44.720-0400 imported 0 documents
$ mongoimport -d test -c aggregation --file aggregation.metadata.json
2018-07-18T09:46:05.058-0400 connected to: localhost
2018-07-18T09:46:05.313-0400 imported 1 document
mongoimport --db dbName --collection collectionName --type json --file fileName.json
Update:
C:\Program Files\MongoDB\Server\4.0\bin>mongorestore -d test -c aggregation aggregation.bson
2018-07-19T10:28:39.963+0300 checking for collection data in aggregation.bson
2018-07-19T10:28:40.099+0300 restoring test.aggregation from aggregation.bson
2018-07-19T10:28:41.113+0300 no indexes to restore
2018-07-19T10:28:41.113+0300 finished restoring test.aggregation (1000 documents)
2018-07-19T10:28:41.113+0300 done
I tried it and it worked fine for me. Do you have the file in your bin folder, or maybe the command you used wasn't complete?
db.aggregation.find().pretty().limit(2)
{
"_id" : ObjectId("51de841747f3a410e3000001"),
"num" : 1,
"color" : "blue",
"transport" : "train",
"fruits" : [
"orange",
"banana",
"kiwi"
],
"vegetables" : [
"corn",
"broccoli",
"potato"
]
}
{
"_id" : ObjectId("51de841747f3a410e3000005"),
"num" : 5,
"color" : "yellow",
"transport" : "plane",
"fruits" : [
"lemon",
"cherry",
"dragonfruit"
],
"vegetables" : [
"mushroom",
"capsicum",
"zucchini"
]
}
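The underlying rule is that .bson files are binary dumps meant for mongorestore, while plain .json files are text meant for mongoimport. The *.metadata.json files describe indexes and options for the matching .bson file, and mongorestore normally picks them up itself, so importing one with mongoimport only stores the metadata as a document. A small sketch of that rule for the file names in the question; the function name import_tool_for is made up for illustration.

```shell
# Print which tool should load a given dump file, based on its name.
import_tool_for() {
  case "$1" in
    # Must come before *.json: metadata is consumed by mongorestore
    # together with the matching .bson file, not imported on its own.
    *.metadata.json) echo "skip" ;;
    *.bson)          echo "mongorestore" ;;
    *.json)          echo "mongoimport" ;;
    *)               echo "unknown" ;;
  esac
}

for f in aggregation.bson aggregation.metadata.json text.json; do
  printf '%s -> %s\n' "$f" "$(import_tool_for "$f")"
done
```

This is why mongoimport failed on aggregation.bson (it tried to parse binary BSON as JSON), while mongorestore with the file name as the last argument worked.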

Mongoexport date range query result in Failure parsing

Trying to run mongoexport and having problems with my query parameter.
mongoexport -d test-copy -c collection -o /home/ubuntu/mongodb-archiving/mongodump/collection.json --query '{"created_at": {\$lte: new Date(1451577599000) } }'
Collection is:
{"created_at" : ISODate("2014-03-07T06:32:19.172Z")}
I can query this just fine in the mongo client.
The result is the following error:
Assertion: 10340:Failure parsing JSON string near: "created_a
You have a \ in your query. Please remove it.
--query '{"created_at": {$lte: new Date(1451577599000)}}'
You should use $date with mongoexport:
mongoexport.exe -h *HOST* -p *PORT* -q "{ 'created_at' : { '$lt' : { '$date' : '2014-03-07T06:32:19.172Z' } } }"
Remove the \$lte and change it to a quoted "$lt" in your query, and mongodump should work fine.
Tested on mongodb 3.0.8
> use appdb
> db.testcoll.find({})
{ "_id" : 1, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 2, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 3, "created_at" : ISODate("2016-09-16T08:46:30.736Z") }
{ "_id" : 4, "created_at" : ISODate("2016-09-16T08:47:12.368Z") }
{ "_id" : 5, "created_at" : ISODate("2016-09-16T08:47:15.562Z") }
> db.testcoll.find({"created_at":{"$lt":new Date("2016-09-16")}})
{ "_id" : 1, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 2, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
> db.testcoll.find({"created_at":{"$lt":new Date(1473984000)}})
// make sure you are using millisecond version of epoch
> db.testcoll.find({"created_at":{"$lt":new Date(1473984000000)}})
{ "_id" : 1, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 2, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
Now the mongodump part :
dp@xyz:~$ mongodump -d appdb -c testcoll --query '{"created_at":{"$lt":new Date(1473984000000)}}'
2016-09-16T14:21:27.695+0530 writing appdb.testcoll to dump/appdb/testcoll.bson
2016-09-16T14:21:27.696+0530 writing appdb.testcoll metadata to dump/appdb/testcoll.metadata.json
2016-09-16T14:21:27.708+0530 done dumping appdb.testcoll (2 documents)
The mongoexport and mongodump tools require a valid JSON object for the --query parameter. From https://docs.mongodb.com/manual/reference/program/mongodump/#cmdoption--query:
--query , -q
Provides a JSON document as a query that optionally limits the documents included in the output of mongodump.
You must enclose the query in single quotes (e.g. ') to ensure that it does not interact with your shell environment.
The command failed because the query parameter you passed to mongoexport is not a valid JSON object: it contains new Date(), which is a JavaScript expression.
The required modification is simply to use the example ISODate() object you provided, e.g.:
mongoexport -d test-copy -c collection -o /home/ubuntu/mongodb-archiving/mongodump/collection.json --query '{"created_at": {$lte: ISODate("2014-03-07T06:32:19.172Z") } }'
You just need to replace the contents of the ISODate() with the date you require.
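If the tool version in use rejects shell helpers such as ISODate() or new Date(), the extended JSON $date form shown in one of the answers above is plain JSON and avoids the problem. A minimal sketch that builds such a query string; the function name date_query is hypothetical, and the timestamp is the UTC equivalent of the 1451577599000 millisecond value from the question.

```shell
# Build a JSON query comparing a date field, using extended JSON $date.
# $1: field name, $2: operator without the $ (lt, lte, gt, gte),
# $3: ISO-8601 timestamp in UTC.
date_query() {
  field=$1
  op=$2
  iso=$3
  # Single-quoted format string, so the shell leaves $ signs alone
  printf '{"%s":{"$%s":{"$date":"%s"}}}\n' "$field" "$op" "$iso"
}

query=$(date_query created_at lte 2015-12-31T15:59:59Z)
echo "$query"

# Then (requires a reachable server, so left commented out):
#   mongoexport -d test-copy -c collection -o collection.json --query "$query"
```

Passing the query as "$query" in double quotes is safe here because the shell does not re-expand the $ signs inside an already expanded variable.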

mongoexport not exporting any records from collection

I have really been puzzled by this one. Below is my database and collections. I can not get mongoexport to dump the collection itunes_itunes_level4_US_uniq into json.
I'm trying this via:
mongoexport -d test-database -c itunes_itunes_level4_US_uniq -o itunes_itunes_level4_US_uniq.json
2016-07-21T19:09:37.507-0500 connected to: localhost
2016-07-21T19:09:37.508-0500 exported 0 records
Same command allows me to export the other collections successfully.
What am I doing wrong?
> show dbs
admin (empty)
local 0.078GB
test 0.078GB
test-database 47.931GB
> show collections
[object Object]
extract
extract_4l
extract_level4
itunes_itunes_level4_US_uniq
itunes_level4_US
rabbit_US_uniq
system.indexes
> use test-database
switched to db test-database
> db.itunes_itunes_level4_US_uniq.stats(1024)
{
"ns" : "test-database.itunes_itunes_level4_US_uniq",
"count" : 986099,
"size" : 5295580,
"avgObjSize" : 5499,
"storageSize" : 6002404,
"numExtents" : 24,
"nindexes" : 1,
"lastExtentSize" : 1818052,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 7,
"indexSizes" : {
"_id_" : 7
},
"ok" : 1
}
mongoexport --db test-database --collection itunes_itunes_level4_US_uniq --out itunes_itunes_level4_US_uniq.json
Export the whole database with the following command and look for your collection in the output:
mongodump -d <database name> -o <directory_backup>

Export array of documents from MongoDB in csv

I'm working on a Java program to migrate from MongoDB to Neo4j.
I have to export some Mongo documents to a csv file.
I have, for example, this document:
"coached_Team" : [
{
"team_id" : "Pal.00",
"in_charge" : {
"from" : {
"day" : 25,
"month" : 9,
"year" : 2013
}
},
"matches" : 75
}
]
I have to export in csv. I read some other questions, for example this and I used that tip to export my document.
To export in csv I use this command:
Z:\path\to\Mongo\3.0\bin>mongoexport --db <database> --collection
<collection> --type=csv --fields coached_Team.0.team_id,coached_Team.0.in_charge.from.day,
coached_Team.0.in_charge.from.month,coached_Team.0.in_charge.from.year,
coached_Team.0.matches --out "C:\path\to\output\file\output.csv
But, it did not work for me: