Mongoexport date range query results in "Failure parsing" - mongodb

I'm trying to run mongoexport and I'm having problems with my query parameter.
mongoexport -d test-copy -c collection -o /home/ubuntu/mongodb-archiving/mongodump/collection.json --query '{"created_at": {\$lte: new Date(1451577599000) } }'
A sample document in the collection:
{"created_at" : ISODate("2014-03-07T06:32:19.172Z")}
I can run this query just fine in the mongo shell.
But mongoexport fails with the following error:
Assertion: 10340:Failure parsing JSON string near: "created_a

You have a \ in your query. Please remove it.
--query '{"created_at": {$lte: new Date(1451577599000)}}'

You should use $date with mongoexport:
mongoexport.exe -h *HOST* -p *PORT* -q "{ 'created_at' : { '$lt' : { '$date' : '2014-03-07T06:32:19.172Z' } } }"
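Applied to the original command, that looks like the following (a sketch; 1451577599000 ms is 2015-12-31T15:59:59Z, so double-check the cutoff for your timezone, and note that older tool versions may only accept the epoch-milliseconds form {"$date": 1451577599000}):
mongoexport -d test-copy -c collection -o /home/ubuntu/mongodb-archiving/mongodump/collection.json --query '{"created_at": {"$lte": {"$date": "2015-12-31T15:59:59.000Z"}}}'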

Remove the \$lte and change it to a quoted "$lt" in your query, and mongodump will work fine.
Tested on MongoDB 3.0.8:
> use appdb
> db.testcoll.find({})
{ "_id" : 1, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 2, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 3, "created_at" : ISODate("2016-09-16T08:46:30.736Z") }
{ "_id" : 4, "created_at" : ISODate("2016-09-16T08:47:12.368Z") }
{ "_id" : 5, "created_at" : ISODate("2016-09-16T08:47:15.562Z") }
> db.testcoll.find({"created_at":{"$lt":new Date("2016-09-16")}})
{ "_id" : 1, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 2, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
> db.testcoll.find({"created_at":{"$lt":new Date(1473984000)}})
// no output: make sure you are using the millisecond version of the epoch
> db.testcoll.find({"created_at":{"$lt":new Date(1473984000000)}})
{ "_id" : 1, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
{ "_id" : 2, "created_at" : ISODate("2016-09-15T08:46:12.272Z") }
Now the mongodump part:
dp#xyz:~$ mongodump -d appdb -c testcoll --query '{"created_at":{"$lt":new Date(1473984000000)}}'
2016-09-16T14:21:27.695+0530 writing appdb.testcoll to dump/appdb/testcoll.bson
2016-09-16T14:21:27.696+0530 writing appdb.testcoll metadata to dump/appdb/testcoll.metadata.json
2016-09-16T14:21:27.708+0530 done dumping appdb.testcoll (2 documents)
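Since the 3.0.8 tools parsed new Date(...) here without complaint, the same quoted-operator query should also work with mongoexport itself (a sketch on the same assumption, not verified on other versions):
mongoexport -d appdb -c testcoll -o testcoll.json --query '{"created_at":{"$lt":new Date(1473984000000)}}'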

The mongoexport and mongodump tools require a valid JSON object for the --query parameter. From https://docs.mongodb.com/manual/reference/program/mongodump/#cmdoption--query:
--query, -q
Provides a JSON document as a query that optionally limits the documents included in the output of mongodump.
You must enclose the query in single quotes (e.g. ') to ensure that it does not interact with your shell environment.
The command failed because the query you passed to mongoexport is not a valid JSON object: new Date() is a JavaScript expression, not JSON.
The fix is simply to use the ISODate() form you already provided, e.g.:
mongoexport -d test-copy -c collection -o /home/ubuntu/mongodb-archiving/mongodump/collection.json --query '{"created_at": {$lte: ISODate("2014-03-07T06:32:19.172Z") } }'
You just need to replace the contents of the ISODate() with the date you require.
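If you need both ends of the date range rather than just an upper bound, add a $gte in the same style (a sketch; the lower bound below is only a placeholder):
mongoexport -d test-copy -c collection -o /home/ubuntu/mongodb-archiving/mongodump/collection.json --query '{"created_at": {$gte: ISODate("2014-01-01T00:00:00Z"), $lte: ISODate("2014-03-07T06:32:19.172Z")}}'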

Related

Can mongorestore handle regex validators?

I'm trying to dump/restore a mongo collection that has validators. These validators contain regexes which, on the surface, mongorestore appears unable to import:
First, I create my collection:
use test
db.users.drop()
db.createCollection("users", {validator : {
    name : {
        $type : "string",
        $regex : /^[A-z]*$/
    }
}
})
// { "ok" : 1 }
db.getCollectionInfos()
// looks like it should...
db.users.insertOne({"name": "inv#lid"});
// fails
db.users.insertOne({"name": "Valid"});
// succeeds
db.users.find()
// { "_id" : ObjectId("59cd85d84f2803b08e9218ac"), "name" : "Valid" }
Then, I run a dump, which seems fine:
/usr/bin/mongodump --host $MYHOST \
--port $MYPORT \
--db test \
--gzip \
--archive=test.mongodump.gz
Then, a restore, which fails:
/usr/bin/mongorestore --host $MYHOST \
--port $MYPORT \
--gzip \
--archive=test.mongodump.gz
Error:
2017-09-28T23:31:30.626+0000 preparing collections to restore from
2017-09-28T23:31:30.691+0000 reading metadata for test.users from archive 'test.mongodump.gz'
2017-09-28T23:31:30.692+0000 Failed: test.users: error parsing metadata from archive 'test.mongodump.gz': extended json in 'options': expected $regex field to have string value
I've pored over the docs for mongodump and mongorestore, but haven't really gotten anywhere. I have tried --noOptionsRestore; same error.
I'm relatively new to mongo, so I could just be missing something simple...
It turns out that, if you have a $regex, you also need $options, even if $options = ''. The error message sort of hints at this, but very poorly. Changing the create statement from:
db.createCollection("users", {validator : {
    name : {
        $type : "string",
        $regex : /^[A-z]*$/
    }
}
})
to
db.createCollection("users", {validator : {
    name : {
        $type : "string",
        $regex : /^[A-z]*$/,
        $options : ''
    }
}
})
solves the problem.
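As an aside, a way to sidestep the issue entirely (a sketch, not from the original answer): express the pattern as a string rather than a regex literal. The string form is exactly what $options pairs with in the extended JSON that mongodump writes, so it serializes unambiguously:
db.createCollection("users", {validator : {
    name : {
        $type : "string",
        $regex : "^[A-z]*$",  // string pattern instead of /^[A-z]*$/
        $options : ""
    }
}
})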

MongoDB export issue

I am trying to export MongoDB output to CSV format, but I'm having trouble.
See the following document in my collection:
db.save.find().pretty();
{
    "_id" : ObjectId("58884b11e1370511b89d8267"),
    "domain" : "google.com",
    "emails" : [
        {
            "email" : "f#google.com",
            "first" : "James",
            "Last" : "fer"
        },
        {
            "email" : "d#gmail.com",
            "first" : "dear",
            "last" : "near"
        }
    ]
}
Exporting the document to CSV:
C:\MongoDB\Server\bin>mongoexport.exe -d Trial -c save -o file.csv --type csv --fields domain,emails
2017-01-25T12:50:54.927+0530 connected to: localhost
2017-01-25T12:50:54.929+0530 exported 1 record
The output file is:
domain,emails
google.com,"[{""email"":""f#google.com"",""first"":""James"",""Last"":""fer""},{""email"":""d#gmail.com"",""first"":""dear"",""last"":""near""}]"
But if I import the same file, the output is different than it was in the actual collection. See the example:
> db.sir.find().pretty()
{
    "_id" : ObjectId("5888529fa26b65ae310d026f"),
    "domain" : "google.com",
    "emails" : "[{\"email\":\"f#google.com\",\"first\":\"James\",\"Last\":\"fer\"},{\"email\":\"d#gmail.com\",\"first\":\"dear\",\"last\":\"near\"}]"
}
I do not want that extra \ in my imported document. Is this avoidable, and if so, what format should the CSV have so that mongoimport reproduces the original structure?
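CSV has no native representation for nested arrays of subdocuments: mongoexport serializes the emails array as a single quoted JSON string, and mongoimport reads that string back verbatim, escapes and all. A lossless round trip needs a format that understands nesting, such as JSON. A minimal sketch, reusing the collection names from the question:
C:\MongoDB\Server\bin>mongoexport.exe -d Trial -c save -o file.json
C:\MongoDB\Server\bin>mongoimport.exe -d Trial -c sir --drop --file file.json
With JSON, the emails field comes back as a true array rather than an escaped string.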

mongo export array elements

I would like to export 3 different CSV files. Here is my document:
{
    "capacities" : [
        {
            "size" : "A",
            "incoming_parcels" : 27,
            "outgoing_parcels" : 0,
            "empty_compartments" : 0
        },
        {
            "size" : "B",
            "incoming_parcels" : 11,
            "outgoing_parcels" : 0,
            "empty_compartments" : 8
        },
        {
            "size" : "C",
            "incoming_parcels" : 2,
            "outgoing_parcels" : 1,
            "empty_compartments" : 7
        }
    ]
}
I would like to get all documents where capacities[1].size is "B" and then get all of that entry's fields, and the same for the other sizes.
Here is my syntax for the export:
mongoexport.exe --db name --collection name --type csv --out sizeB.csv -q "{'capacities.1.size': 'B'}" -f size,incoming_parcels,outgoing_parcels,empty_compartments
I've also tried -f capacities.1.size etc.
One approach you could take is to use the aggregation framework: filter your documents with your query in a $match stage, then write the documents returned by the pipeline to a specified collection using the $out operator. You can then export the data from that output collection. The following outlines the concept:
db.test.aggregate([
    {
        "$match": {
            "capacities.size": "B"
        }
    },
    {
        "$unwind": "$capacities"
    },
    {
        "$match": {
            "capacities.size": "B"
        }
    },
    {
        "$project": {
            "size" : "$capacities.size",
            "incoming_parcels" : "$capacities.incoming_parcels",
            "outgoing_parcels" : "$capacities.outgoing_parcels",
            "empty_compartments" : "$capacities.empty_compartments"
        }
    },
    {
        "$out": "capacities_output"
    }
])
Export to csv:
mongoexport.exe --db name --collection capacities_output --type=csv --out sizeB.csv --fields size,incoming_parcels,outgoing_parcels,empty_compartments
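To produce the three files the question asks for, the pipeline can be parameterized over the sizes so that each size lands in its own output collection (a sketch; the capacities_A/B/C collection names are made up here):
["A", "B", "C"].forEach(function(size) {
    db.test.aggregate([
        { "$match": { "capacities.size": size } },
        { "$unwind": "$capacities" },
        { "$match": { "capacities.size": size } },
        { "$project": {
            "size" : "$capacities.size",
            "incoming_parcels" : "$capacities.incoming_parcels",
            "outgoing_parcels" : "$capacities.outgoing_parcels",
            "empty_compartments" : "$capacities.empty_compartments"
        } },
        { "$out": "capacities_" + size }  // one collection per size
    ])
})
Then run the mongoexport command once per collection (capacities_A, capacities_B, capacities_C).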

Export array of documents from MongoDB in csv

I'm working on a java program to pass from MongoDB to Neo4j.
I have to export some Mongo documents to a CSV file.
I have, for example, this document:
"coached_Team" : [
{
"team_id" : "Pal.00",
"in_charge" : {
"from" : {
"day" : 25,
"month" : 9,
"year" : 2013
}
},
"matches" : 75
}
]
I have to export in CSV. I read some other questions, for example this one, and I used that tip to export my document.
To export in csv I use this command:
Z:\path\to\Mongo\3.0\bin>mongoexport --db <database> --collection
<collection> --type=csv --fields coached_Team.0.team_id,coached_Team.0.in_charge.from.day,
coached_Team.0.in_charge.from.month,coached_Team.0.in_charge.from.year,
coached_Team.0.matches --out "C:\path\to\output\file\output.csv"
But it did not work for me.
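One likely culprit is the line breaks: if the command was typed into cmd.exe with literal newlines, each fragment runs as a separate command. A single-line sketch with the field list unbroken (paths and placeholders are the question's own):
mongoexport --db <database> --collection <collection> --type=csv --fields coached_Team.0.team_id,coached_Team.0.in_charge.from.day,coached_Team.0.in_charge.from.month,coached_Team.0.in_charge.from.year,coached_Team.0.matches --out "C:\path\to\output\file\output.csv"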

error when using mongorestore to replay oplog with binData field

When using mongorestore with the --oplogReplay option to replay oplogs, I found a strange bug: mongorestore cannot correctly apply a $set to a BinData field. You may hit the same error if you do this:
Insert a test document:
db.testData.insert({_id: 10000, data: BinData(0, ""), size: 10})
Update its BinData field:
db.testData.update({_id: 10000}, {$set: {data: BinData(0, "CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=")}})
Update another field:
db.testData.update({_id: 10000}, {$set: {size: 20}})
Check the oplog:
use local
db.oplog.rs.find().sort({$natural: -1})
You may see the following entries:
{ "ts" : Timestamp(1435627154, 1), "h" : NumberLong("-4979206321598144076"), "v" : 2, "op" : "u", "ns" : "test.testData", "o2" : { "_id" : 10000 }, "o" : { "$set" : { "size" : 20 } } }
{ "ts" : Timestamp(1435627144, 1), "h" : NumberLong("2899524097634687825"), "v" : 2, "op" : "u", "ns" : "test.testData", "o2" : { "_id" : 10000 }, "o" : { "$set" : { "data" : BinData(0,"CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=") } } }
{ "ts" : Timestamp(1435627136, 1), "h" : NumberLong("-8486373688715225152"), "v" : 2, "op" : "i", "ns" : "test.testData", "o" : { "_id" : 10000, "data" : BinData(0,""), "size" : 10 } }
Dump these two oplog entries and replay them.
In bash shell:
mongodump --port 27017 -d local -c oplog.rs --query '{"ts" : {$gte: Timestamp(1435627144, 1)}}' -o ./oplogD/
mv ./oplogD/local/oplog.rs.bson ./oplogR/oplog.bson
mongorestore --port 27017 --oplogReplay ./oplogR/
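The --query above uses mongo shell notation (Timestamp(1435627144, 1)). The tools also accept the strict extended JSON spelling of a BSON timestamp, which may be safer across tool versions (a sketch of the same query in that notation, not verified on every release):
mongodump --port 27017 -d local -c oplog.rs --query '{"ts": {"$gte": {"$timestamp": {"t": 1435627144, "i": 1}}}}' -o ./oplogD/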
After this, you will find the data is not as expected. In my case, it changed to this:
{ "_id" : 10000, "data" : BinData(0,"ADRAAAAAPiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA="), "size" : 20 }
The size field is correct, but the data field is not.
The strangest thing is this: if you dump only one oplog entry and replay it, the data is correct.
mongodump --port 27017 -d local -c oplog.rs --query '{"ts" : Timestamp(1435627144, 1)}' -o ./tmpD/
mv ./tmpD/local/oplog.rs.bson ./tmpR/oplog.bson
mongorestore --port 27017 --oplogReplay ./tmpR/
After the oplog is replayed, the data field is correct:
{ "_id" : 10000, "data" : BinData(0,"CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA="), "size" : 10 }
Why does this strange thing happen?
It was fixed in this commit:
https://github.com/mongodb/mongo-tools/commit/ed60bbfae7d2b5239bea69f162f0784e17995e91
You can trace the bug report in JIRA:
https://jira.mongodb.org/browse/TOOLS-807