Can mongorestore handle regex validators? - mongodb

I'm trying to dump and restore a mongo collection that has validators. These validators use regexes which, on the surface, mongorestore does not appear able to import:
First, I create my collection:
use test
db.users.drop()
db.createCollection("users", {validator : {
    name : {
        $type : "string",
        $regex : /^[A-z]*$/
    },
}})
// { "ok" : 1 }
db.getCollectionInfos()
// looks like it should...
db.users.insertOne({"name": "inv#lid"});
// fails
db.users.insertOne({"name": "Valid"});
// succeeds
db.users.find()
// { "_id" : ObjectId("59cd85d84f2803b08e9218ac"), "name" : "Valid" }
Then, I run a dump, which seems fine:
/usr/bin/mongodump --host $MYHOST \
--port $MYPORT \
--db test \
--gzip \
--archive=test.mongodump.gz
Then, a restore, which fails:
/usr/bin/mongorestore --host $MYHOST \
--port $MYPORT \
--gzip \
--archive=test.mongodump.gz
Error:
2017-09-28T23:31:30.626+0000 preparing collections to restore from
2017-09-28T23:31:30.691+0000 reading metadata for test.users from archive 'test.mongodump.gz'
2017-09-28T23:31:30.692+0000 Failed: test.users: error parsing metadata from archive 'test.mongodump.gz': extended json in 'options': expected $regex field to have string value
I've pored over the docs for mongodump and mongorestore, but haven't really gotten anywhere. I have tried --noOptionsRestore; same error.
I'm relatively new to mongo, so I could just be missing something simple...

It turns out that if you have a $regex, you also need $options, even if $options is ''. The error message sort of hints at this, but very poorly. Changing the create statement from:
db.createCollection("users", {validator : {
    name : {
        $type : "string",
        $regex : /^[A-z]*$/
    },
}})
to
db.createCollection("users", {validator : {
    name : {
        $type : "string",
        $regex : /^[A-z]*$/,
        $options: ''
    },
}})
solves the problem.
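With the $options field in place, the same dump/restore cycle should go through cleanly. A minimal sketch to confirm, reusing the host, port, and archive name from above (--drop just lets the restore overwrite the existing collection):
/usr/bin/mongodump --host $MYHOST --port $MYPORT --db test --gzip --archive=test.mongodump.gz
/usr/bin/mongorestore --drop --host $MYHOST --port $MYPORT --gzip --archive=test.mongodump.gz
Afterwards, db.getCollectionInfos({name: "users"}) in the mongo shell should show the validator intact, $options and all.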

Related

mongodb - issue with same file name in fs.files GridFS

I have multiple files in the fs.files collection in MongoDB GridFS with the same name, but for different users.
When I use the query below:
db.fs.files.find({"metadata.folder" : {"$exists" : false}, "metadata.msgid" : {"$exists" : false}}, {"metadata.user" : 1, "_id" : 0, "filename" : 1}).pretty()
I get results like:
{ "filename" : "standard.wav", "metadata" :
{ "user" : "101" }
}
{ "filename" : "standard.wav", "metadata" :
{ "user" : "100" }
}
{ "filename" : "standard.wav", "metadata" :
{ "user" : "104" }
}
The files are different for each user but have the same name.
So when I use the following commands to store the files on the local system for different users, they always store the same file for all users.
For User 101 :
mongofiles --uri MONGO_DSN -d test -l /home/user/101/standard.wav get standard.wav
For User 100 :
mongofiles --uri MONGO_DSN -d test -l /home/user/100/standard.wav get standard.wav
For User 104 :
mongofiles --uri MONGO_DSN -d test -l /home/user/104/standard.wav get standard.wav
It should store different files for different users.
Thanks in advance.
I have solved it by using the get_id command instead of get.
So my commands are now:
For User 101 :
mongofiles --uri MONGO_DSN -d test -l /home/user/101/standard.wav get_id $object101
For User 100 :
mongofiles --uri MONGO_DSN -d test -l /home/user/100/standard.wav get_id $object100
For User 104 :
mongofiles --uri MONGO_DSN -d test -l /home/user/104/standard.wav get_id $object104
Here $object101, $object100, and $object104 are the extended JSON _id values of the objects in GridFS.
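If it helps, here is a hedged sketch of how those _id values can be looked up and then passed to get_id (the MONGO_DSN placeholder and the metadata.user field mirror the ones above; the exact extended JSON form expected by your mongofiles version may differ):
# list each user's _id for the shared filename
mongo MONGO_DSN --quiet --eval 'db.fs.files.find({filename: "standard.wav"}, {"metadata.user": 1}).forEach(printjson)'
# then fetch by id, e.g. for user 101 ($object101 holds that _id as extended JSON, e.g. '{"$oid": "..."}')
mongofiles --uri MONGO_DSN -d test -l /home/user/101/standard.wav get_id "$object101"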
References:
mongofiles: get file by _id in addition to filename
MongoFiles

Mongoexport date range query results in "Failure parsing"

Trying to run mongoexport and having problems with my query parameter.
mongoexport -d test-copy -c collection -o /home/ubuntu/mongodb-archiving/mongodump/collection.json --query '{"created_at": {\$lte: new Date(1451577599000) } }'
Collection is:
{"created_at" : ISODate("2014-03-07T06:32:19.172Z")}
I can query this just fine in the mongo client.
This results in the following error:
Assertion: 10340:Failure parsing JSON string near: "created_a
You have a \ in your query. Please remove it.
--query '{"created_at": {$lte: new Date(1451577599000)}}'
You should use $date with mongoexport:
mongoexport.exe -h *HOST* -p *PORT* -q "{ 'created_at' : { '$lt' : { '$date' : '2014-03-07T06:32:19.172Z' } } }"
Remove the \$lte and change it to a quoted "$lt" in your query, and the mongodump will work fine.
Tested on mongodb 3.0.8
> use appdb
> db.testcoll.find({})
{ "_id" : 1, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 2, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 3, "created_at" : ISODate("2016-09-16T08:46:30.736Z") }
{ "_id" : 4, "created_at" : ISODate("2016-09-16T08:47:12.368Z") }
{ "_id" : 5, "created_at" : ISODate("2016-09-16T08:47:15.562Z") }
> db.testcoll.find({"created_at":{"$lt":new Date("2016-09-16")}})
{ "_id" : 1, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 2, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
> db.testcoll.find({"created_at":{"$lt":new Date(1473984000)}})
// make sure you are using millisecond version of epoch
> db.testcoll.find({"created_at":{"$lt":new Date(1473984000000)}})
{ "_id" : 1, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 2, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
Now the mongodump part:
dp#xyz:~$ mongodump -d appdb -c testcoll --query '{"created_at":{"$lt":new Date(1473984000000)}}'
2016-09-16T14:21:27.695+0530 writing appdb.testcoll to dump/appdb/testcoll.bson
2016-09-16T14:21:27.696+0530 writing appdb.testcoll metadata to dump/appdb/testcoll.metadata.json
2016-09-16T14:21:27.708+0530 done dumping appdb.testcoll (2 documents)
The mongoexport and mongodump tools require a valid JSON object for the --query parameter. From https://docs.mongodb.com/manual/reference/program/mongodump/#cmdoption--query:
--query , -q
Provides a JSON document as a query that optionally limits the documents included in the output of mongodump.
You must enclose the query in single quotes (e.g. ') to ensure that it does not interact with your shell environment.
The command failed because of the query parameter you passed to mongoexport: it is not a valid JSON object, since new Date() is a JavaScript expression.
The required modification is simply to use the ISODate() form from your example, e.g.:
mongoexport -d test-copy -c collection -o /home/ubuntu/mongodb-archiving/mongodump/collection.json --query '{"created_at": {$lte: ISODate("2014-03-07T06:32:19.172Z") } }'
You just need to replace the contents of the ISODate() with the date you require.
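Note that newer versions of the tools (the 4.2+/100.x database tools) no longer accept shell helpers like ISODate() or new Date() in --query and instead expect extended JSON, so the equivalent there would be something like this hedged sketch:
mongoexport -d test-copy -c collection -o /home/ubuntu/mongodb-archiving/mongodump/collection.json --query '{"created_at": {"$lte": {"$date": "2014-03-07T06:32:19.172Z"}}}'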

What role is required to allow MongoDB db.collection.aggregate() usage?

I have a user in admin.system.users:
{
    "_id" : "admin.reports",
    "user" : "reports",
    "db" : "admin",
    "credentials" : {
        "MONGODB-CR" : "hash"
    },
    "roles" : [
        {
            "role" : "read",
            "db" : "mydb"
        }
    ]
}
And I am attempting to execute this from a bash script:
mongo --quiet mongodb.production.internal/admin -ureports -p"password" <<< "
var conn = connect('localhost/mydb');
db.auth('reports', 'password');
db.Collection.aggregate([
    {
        \$group: {
            _id: '\$itemId',
            count: { \$sum: 1 }
        }
    },
    {
        \$sort: {
            count: 1
        }
    }
], {cursor: {}}).forEach(function(line) {
    print(line._id+','+line.count);
});
"
And I am getting this response from the mongodb instance:
not authorized on admin to execute command { aggregate: (amongst other very verbose but pointless output).
What permission do I need to enable the use of the aggregation command for this user? Can it not be done without enabling write access?
The issue was not with permissions (read is indeed sufficient) but with switching to the right database.
I was using connect() to silently switch to the working db, because the use command produces output. That was the wrong approach; the fix is in how you specify the connection on the command line, i.e.:
mongo --quiet mongodb.production.internal/mydb --authenticationDatabase admin -ureports -p"password"
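For completeness, here is a hedged sketch of the full invocation after that change; the connect()/db.auth() lines are no longer needed because the shell connects to mydb directly and authenticates against admin:
mongo --quiet mongodb.production.internal/mydb --authenticationDatabase admin -ureports -p"password" <<< "
db.Collection.aggregate([
    { \$group: { _id: '\$itemId', count: { \$sum: 1 } } },
    { \$sort: { count: 1 } }
], {cursor: {}}).forEach(function(line) {
    print(line._id+','+line.count);
});
"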

tar gzip mongo dump like MySQL

Is there any way to tar/gzip mongo dumps like you can with MySQL dumps?
For example, for mysqldumps, you can write a command as such:
mysqldump -u <username> --password=<password> --all-databases | gzip > all-databases.`date +%F`.gz
Is there an equivalent way to do the same for mongo dumps?
For mongo dumps I run this command:
mongodump --host localhost --out /backup
Is there a way to just pipe that to gzip? I tried, but that didn't work.
Any ideas?
Version 3.2 introduced the --gzip and --archive options:
mongodump --db <yourdb> --gzip --archive=/path/to/archive
Then you can restore with:
mongorestore --gzip --archive=/path/to/archive
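If you specifically want the mysqldump-style pipe, --archive with no filename should write to stdout (and read from stdin on restore), so something like this hedged sketch should also work:
mongodump --db <yourdb> --archive | gzip > yourdb.`date +%F`.archive.gz
gunzip -c yourdb.<date>.archive.gz | mongorestore --archive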
Update (July 2015):
TOOLS-675 is now marked as complete; it allows dumping to an archive format in 3.2, and gzip is one of the options in the 3.2 versions of the mongodump/mongorestore tools. I will update with the relevant docs once they are live for 3.2.
Original answer (3.0 and below):
You can do this for a single collection by writing mongodump to stdout and piping it to a compression program (gzip, bzip2), but you will only get the data (no index information), and you cannot do it for a full database (multiple collections) for now. The relevant feature request for this functionality is SERVER-5190, for upvoting/watching purposes.
Here is a quick sample run through of what is possible, using bzip2 in this example:
./mongo
MongoDB shell version: 2.6.1
connecting to: test
> db.foo.find()
{ "_id" : ObjectId("53ad8a3eb74b5ae2ff0ec93a"), "a" : 1 }
{ "_id" : ObjectId("53ad8ba445be9c4f7bd018b4"), "a" : 2 }
{ "_id" : ObjectId("53ad8ba645be9c4f7bd018b5"), "a" : 3 }
{ "_id" : ObjectId("53ad8ba845be9c4f7bd018b6"), "a" : 4 }
{ "_id" : ObjectId("53ad8baa45be9c4f7bd018b7"), "a" : 5 }
>
bye
$ ./mongodump -d test -c foo -o - | bzip2 - > foo.bson.bz2
connected to: 127.0.0.1
$ bunzip2 foo.bson.bz2
$ ./bsondump foo.bson
{ "_id" : ObjectId( "53ad8a3eb74b5ae2ff0ec93a" ), "a" : 1 }
{ "_id" : ObjectId( "53ad8ba445be9c4f7bd018b4" ), "a" : 2 }
{ "_id" : ObjectId( "53ad8ba645be9c4f7bd018b5" ), "a" : 3 }
{ "_id" : ObjectId( "53ad8ba845be9c4f7bd018b6" ), "a" : 4 }
{ "_id" : ObjectId( "53ad8baa45be9c4f7bd018b7" ), "a" : 5 }
5 objects found
Compare that with a straight mongodump (you get the same foo.bson but the extra foo.metadata.json describing the indexes is not included above):
$ ./mongodump -d test -c foo -o .
connected to: 127.0.0.1
2014-06-27T16:24:20.802+0100 DATABASE: test to ./test
2014-06-27T16:24:20.802+0100 test.foo to ./test/foo.bson
2014-06-27T16:24:20.802+0100 5 documents
2014-06-27T16:24:20.802+0100 Metadata for test.foo to ./test/foo.metadata.json
$ ./bsondump test/foo.bson
{ "_id" : ObjectId( "53ad8a3eb74b5ae2ff0ec93a" ), "a" : 1 }
{ "_id" : ObjectId( "53ad8ba445be9c4f7bd018b4" ), "a" : 2 }
{ "_id" : ObjectId( "53ad8ba645be9c4f7bd018b5" ), "a" : 3 }
{ "_id" : ObjectId( "53ad8ba845be9c4f7bd018b6" ), "a" : 4 }
{ "_id" : ObjectId( "53ad8baa45be9c4f7bd018b7" ), "a" : 5 }
5 objects found
Export MongoDB with:
mongodump --host <host-ip> --port 27017 --db <database> --authenticationDatabase admin --username <username> --password <password> --gzip --archive > dump_`date "+%Y-%m-%d"`.gz
Import with:
mongorestore --host <host-ip> --port 27017 --db <database> --authenticationDatabase admin --username <username> --password <password> --gzip --archive=dump_<date>.gz
If you want to do it by passing a URI for your MongoDB replica set cluster:
Dump:
mongodump --uri='mongodb://user:pass@primary_host,secondary_host/<db-name>?replicaSet=<replica-name>&authSource=admin' --gzip --archive > dump_`date "+%Y-%m-%d"`.gz
Restore:
mongorestore --uri='mongodb://user:pass@primary_host,secondary_host/<db-name>?replicaSet=<replica-name>&authSource=admin' --gzip --archive=<dump-file>.gz

How to Export Mongo Data with UTC Timestamps?

I'm trying to use mongoexport to export a bunch of data in json so I can read it in a different program. I use the command:
mongoexport --jsonArray -h some_ip -d some_db -c some_collection -o mongo_dump.json
Problem is, all of my datetime objects wind up coming out looking like:
"time_created" : { "$date" : 1344000402000 }
"time_created" : { "$date" : 1343999298000 }
Which is mongo's 64-bit millisecond timestamp format. Is there something simple I can specify to just get unix timestamps? Mongo time is useless to me and annoying to convert from.
I don't think there's a flag to change them in the output, unfortunately.
However, since the difference is just an extra three digits at the end, you can just do something like this:
sed -e 's/{ "\$date" : \([0-9]*\)[0-9]\{3\}/{ "\$date" : \1/' mongo_dump.json > unixstyle.json
It converted:
"time_created" : { "$date" : 1344000402000 }
"time_created" : { "$date" : 1343999298000 }
to:
"time_created" : { "$date" : 1344000402 }
"time_created" : { "$date" : 1343999298 }
Edited to fix it to handle any trailing digits, not just zeros.