mongodump 1 single document fails because ObjectId - mongodb

I'm trying to create a mongo dump for 1 single document inside a mongo collection.
When I do this on windows command line:
mongodump /host:x.x.x.x /port:27017 /username:my_user /password:my_pass -d my_db -o C:\fsdump -c "fs.files" -q '{_id: ObjectId("28ad7bkjia3e927d690385ec")}'
I get this error:
positional arguments not allowed: [ObjectId(28ad7bkjia3e927d690385ec)}']
When I change the _id in mongo from ObjectId("28ad7bkjia3e927d690385ec") to the plain string "28ad7bkjia3e927d690385ec", and dump like this:
mongodump /host:x.x.x.x /port:27017 /username:my_user /password:my_pass -d my_db -o C:\fsdump -c "fs.files" -q '{_id: "28ad7bkjia3e927d690385ec"}'
then it works as expected.
So my question is: how can I use mongodump and filter on specific ObjectIds?
Or is there another way to export a subset of the documents in a collection (instead of the entire collection)?

ObjectId("28ad7bkjia3e927d690385ec") is a JavaScript function call to the ObjectId constructor. This is not valid JSON. "28ad7bkjia3e927d690385ec" is also not a valid ObjectId: an ObjectId is written as 24 hexadecimal characters, and this string contains k, j and i.
mongodump uses an extended form of JSON, which has a tag to indicate the field type, so you would specify an ObjectId like this:
-q '{"_id":{"$oid":"5f76b7cc0311bd14f80a3dec"}}'
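As a rough sketch of the same idea from a POSIX shell: the helper below (oid_query is a made-up name for illustration, not part of the MongoDB tools) checks that the id is 24 hex characters and builds the extended JSON filter with printf, so the shell never gets a chance to expand $oid.

```shell
# Hypothetical helper: build an extended JSON filter for one ObjectId.
oid_query() {
  # An ObjectId is written as exactly 24 hexadecimal characters.
  if ! printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{24}$'; then
    echo "not a valid ObjectId: $1" >&2
    return 1
  fi
  # Single quotes keep $oid literal; %s is replaced by the id.
  printf '{"_id":{"$oid":"%s"}}\n' "$1"
}

# Example use (not run here):
#   mongodump -d my_db -c fs.files -o ./fsdump -q "$(oid_query 5f76b7cc0311bd14f80a3dec)"
```

On Windows cmd.exe single quotes are not quoting characters, which is most likely why the original command was split into positional arguments. There the query has to be wrapped in double quotes, with the inner double quotes escaped.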

Related

generate a specific key pair from mongodb from UNIX shell

Using mongodb and am trying to get a specific value from a collection in the db. I am able to get the complete export using
mongoexport --db database --collection name
But the output is a large file and I am trying to get a specific set of key/pair in it.
ex: "Name": "Value"
there are several names and I just need to print all the names in the collection.
What would be the command syntax from a UNIX shell ?
I looked at this but that is from with in the mongo shell.
thanks
To request all fields from collection yourCollection in MyDatabase :
mongo --quiet 127.0.0.1/MyDatabase --eval 'printjson(db.yourCollection.find().toArray());'
To request only the name field from collection yourCollection in MyDatabase:
mongo --quiet 127.0.0.1/MyDatabase --eval 'printjson(db.yourCollection.find({},{"_id":0,"name":1}).toArray());'
You could also have a script to execute and save you the time of manually writing in those commands. Execute something like
mongo localhost:27017/test myfile.js
and put something like this in the JavaScript file (results are not printed automatically in a script, so wrap them in printjson):
printjson(db.name.findOne());
printjson(db.name.find({},{"_id":0,"Name":1}).toArray());
Please see https://docs.mongodb.com/manual/tutorial/write-scripts-for-the-mongo-shell/ and "How to execute mongo commands through shell scripts?"
If your goal is to export documents matching a specific condition from the database, you can pass a query to mongoexport using the -q parameter. For example:
mongoexport -d db -c coll -q '{"Name":"Value"}'
This will export all documents containing the field "Name" having the value "Value".
You can also pass the --quiet parameter to mongoexport if you prefer to have the output without any informative content, such as number of exported documents.
Please see https://docs.mongodb.com/manual/reference/program/mongoexport/ for more information regarding the mongoexport tool.
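If you only need the values printed, one option is to post-process the mongoexport output in the shell. This is a minimal sketch, assuming the export is one JSON document per line and that the field is a plain string called "Name" (taken from the question); extract_names is a hypothetical helper name. It is a naive text match, so values containing escaped quotes will not come out right, and a real JSON tool would be more robust.

```shell
# Hypothetical helper: read one JSON document per line on stdin and
# print the string value of the "Name" field from each line that has one.
extract_names() {
  sed -n 's/.*"Name":[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example use (not run here); --fields limits the export to that field:
#   mongoexport --db database --collection name --fields Name --quiet | extract_names
```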

Inserting JSON document into mongo error

I'm trying to insert the TWDS1E1.json file into mongodb through the command prompt:
db.collections.insert( TWDS1E1.json )
But getting the error:
TWDS1E1.json is not defined.
Mongo is not my thing, what am I doing wrong here?
Open a command prompt in the directory that contains mongoimport.exe and type one of the following commands.
For a file containing JSON documents one after another (the default):
mongoimport -d test -c docs --file example2.json
For a file containing a single JSON array of documents:
mongoimport --jsonArray -d test -c docs --file example2.json
Please see docs for more information
You cannot use the collection.insert() command to insert a file.
insert() is used to insert actual objects, e.g.
db.myCollection.insert({"name":"buzz"});
To bulk load a JSON file, use mongoimport
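If you are not sure which of the two file layouts you have, a small check like the one below can pick the flag for you. It is only a sketch: json_array_flag is a hypothetical helper name, and it simply looks at the first non-whitespace character of the file.

```shell
# Hypothetical helper: print --jsonArray when the file's first
# non-whitespace character is '[', and print nothing otherwise.
json_array_flag() {
  first=$(tr -d '[:space:]' < "$1" | head -c 1)
  if [ "$first" = "[" ]; then
    printf '%s\n' '--jsonArray'
  fi
}

# Example use (not run here); the substitution is left unquoted on purpose
# so that an empty result adds no argument:
#   mongoimport $(json_array_flag example2.json) -d test -c docs --file example2.json
```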

Mongodb export csv in sorted order

I'm trying to export an entire MongoDB collection sorted by some of the fields. I'm led to believe that the following should work:
$ mongoexport --csv -d my_db -c my_collection -f field1.subfield,field2.subfield -o d.csv -q '{$query:{},$orderby:{"field1.subfield":1}}'
Unfortunately, this only exports one record in the collection (there are 18478 records) and the data exported is blank. Leaving the $orderby blank, like so,
$ mongoexport --csv -d my_db -c my_collection -f field1.subfield,field2.subfield -o d.csv -q '{$query:{},$orderby:{}}'
, exports the whole collection the way I want, so clearly the orderby clause is wrong. What am I doing wrong?
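The mongoexport -q option expects a plain filter document; the utility does not expect you to sort data through a $query/$orderby wrapper. Newer versions of mongoexport provide a separate --sort option for ordering the exported documents.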
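If the tool cannot sort for you, a different technique is to sort the CSV after exporting it. The sketch below is not a mongoexport feature: sort_csv_keep_header is a hypothetical helper that keeps the header line in place and sorts the remaining rows by the first column. It assumes no field contains a comma or a line break.

```shell
# Hypothetical helper: copy the CSV header through unchanged, then sort
# the remaining rows by the first comma-separated column.
sort_csv_keep_header() {
  IFS= read -r header || return 0
  printf '%s\n' "$header"
  # LC_ALL=C gives a plain byte-wise ordering that is the same everywhere.
  LC_ALL=C sort -t, -k1,1
}

# Example use (not run here):
#   sort_csv_keep_header < d.csv > d_sorted.csv
```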

Is it possible to mongodump the last "x" records from a collection?

Can you use mongodump to dump the latest "x" documents from a collection? For example, in the mongo shell you can execute:
db.stats.find().sort({$natural:-1}).limit(10);
Is this same capability available to mongodump?
I guess the workaround would be to dump the above documents into a new temporary collection and mongodump the entire temp collection, but would be great to just be able to do this via mongodump.
Thanks in advance,
Michael
mongodump does not fully expose the cursor interfaces.
But you can work around it, using the --query parameter.
First get the total number of documents of the collection
db.collection.count()
Let's say there are 10000 documents and you want the last 1000.
To do so get the id of first document you want to dump.
db.collection.find().sort({_id:1}).skip(10000 - 1000).limit(1)
In this example the id was "50ad7bce1a3e927d690385ec".
Now you can feed mongodump with this information, to dump all documents with a higher or equal id.
$ mongodump -d 'your_database' -c 'your_collection' -q '{_id: {$gte: ObjectId("50ad7bce1a3e927d690385ec")}}'
UPDATE
The new parameters --limit and --skip were added to mongoexport and will probably be available in the next version of the tool: https://github.com/mongodb/mongo/pull/307
Building off of Mic92's answer, to get the most recent 1000 items from a collection:
Find the _id that sits just before the 1000 most recent items (skip(1000) on the newest-first sort returns the 1001st):
db.collection.find({}, {'_id':1}).sort({_id:-1}).skip(1000).limit(1)
It will be something like 50ad7bce1a3e927d690385ec.
Then pass this _id in a query to mongodump:
$ mongodump -d 'your_database' -c 'your_collection' -q '{"_id": {"$gt": {"$oid": "50ad7bce1a3e927d690385ec"}}}'
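The copy-and-paste step can be scripted. This is a minimal sketch, assuming the mongo shell prints the document in the usual { "_id" : ObjectId("...") } form; extract_oid and newer_than_query are hypothetical helper names.

```shell
# Hypothetical helper: read mongo shell output on stdin and print the
# first 24-character hex id found inside ObjectId("...").
extract_oid() {
  sed -n 's/.*ObjectId("\([0-9a-f]\{24\}\)").*/\1/p' | head -n 1
}

# Hypothetical helper: build the extended JSON filter for "newer than this id".
newer_than_query() {
  printf '{"_id":{"$gt":{"$oid":"%s"}}}\n' "$1"
}

# Example use (not run here):
#   id=$(mongo --quiet your_database --eval 'db.your_collection.find({}, {_id:1}).sort({_id:-1}).skip(1000).limit(1)' | extract_oid)
#   mongodump -d your_database -c your_collection -q "$(newer_than_query "$id")"
```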
mongodump supports the --query option. If you can express your condition as a JSON query, you should be able to do just that.
If not, then your trick of running a query to dump the records into a temporary collection and then dumping that will work just fine. In this case, you could automate the dump using a shell script that calls a mongo with a javascript command to do what you want and then calling mongodump.
I was playing with a similar requirement (using mongodump) where I wanted to do sequential backup and restore. I would take dump from last stored timestamp.
I couldn't get through --query '{ TIMESTAMP : { $gte : $stime, $lt : $etime } }'
Some points to note:
1) use single quote instead of double
2) do not escape $ or anything
3) replacing $stime/$etime with real numbers will make the query work
4) problem I had was with getting $stime/$etime resolved before mongodump executes itself
under -x it showed as
+ eval mongodump --query '{TIMESTAMP:{\$gte:$utc_stime,\$lt:$utc_etime}}'
++ mongodump --query '{TIMESTAMP:$gte:1366700243}' '{TIMESTAMP:$lt:1366700253}'
The problem was evident: the query gets split into two separate arguments. The comma inside the inner braces triggers brace expansion when eval re-parses the command.
The solution is tricky and I got it after repeated trials:
escape { and }, i.e. use \{ ... \}. This fixes the problem.
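Another way to sidestep the problem is to avoid eval altogether and build the query with printf, passing it as one quoted argument. A minimal sketch, assuming the timestamps are plain integers as in the trace above; ts_query is a hypothetical helper name.

```shell
# Hypothetical helper: build the TIMESTAMP range filter from two integers.
ts_query() {
  # Refuse anything that is not a plain non-negative integer.
  for n in "$1" "$2"; do
    case $n in
      ''|*[!0-9]*) echo "not a number: $n" >&2; return 1 ;;
    esac
  done
  # Single quotes keep $gte and $lt literal; %s is replaced by the numbers.
  printf '{"TIMESTAMP":{"$gte":%s,"$lt":%s}}\n' "$1" "$2"
}

# Example use (not run here). The quoted substitution is passed as a single
# argument, so no brace expansion or word splitting is applied to it:
#   mongodump --query "$(ts_query "$utc_stime" "$utc_etime")"
```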
Try this:
NUM=10000
doc=selected_doc
taskid=$(mongo 127.0.0.1/selected_db -u username -p password --eval "db.${doc}.find({}, {_id: 1}).sort({_id: -1}).skip($NUM).limit(1)" | grep -E -o '"[0-9a-f]+"')
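# $taskid already includes its double quotes; wrapping it in $oid makes mongodump compare it as an ObjectId, not as a string
mongodump --collection $doc --db selected_db --host 127.0.0.1 -u username -p password -q "{\"_id\": {\"\$gte\": {\"\$oid\": $taskid}}}" --out ${doc}.dump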
_id-based approaches may not work if you use a custom _id for your collection (such as returned by a 3rd party API). In that case, you should depend on a createdAt or equivalent field:
COL="collectionName"
HOW_MANY=10000
DATE_CUTOFF=$(mongo <host, user, pass...> dbname --quiet \
--eval "db.$COL.find({}, { createdAt: 1 }).sort({ createdAt: -1 }).skip($HOW_MANY).limit(1)"\
| grep -E -o '(ISODate\(.*?\))')
echo "Copying $HOW_MANY items after $DATE_CUTOFF..."
mongodump <host, user, pass...> -d dbname -c ${COL}\
-q "{ createdAt: { \$gte: $DATE_CUTOFF} }" --gzip
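Newer mongodump versions expect dates in the query as extended JSON, i.e. {"$date": "..."} rather than ISODate(...). Assuming that is the version you run, the cutoff can be converted with a small filter like this sketch (date_cutoff_query is a hypothetical helper name, and createdAt is the field from the example above):

```shell
# Hypothetical helper: read mongo shell output containing ISODate("...")
# on stdin and print an extended JSON filter on createdAt.
date_cutoff_query() {
  sed -n 's/.*ISODate("\([^"]*\)").*/{"createdAt":{"$gte":{"$date":"\1"}}}/p' | head -n 1
}

# Example use (not run here), with the shell output piped in:
#   query=$(mongo ... --quiet --eval "..." | date_cutoff_query)
#   mongodump ... -q "$query" --gzip
```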
The strategy is simple, but there are some challenges in doing it. I am assuming we use the _id field. A default ObjectId _id starts with a creation timestamp, so it increases over time and is a good indicator for finding recent documents.
1) Find the X'th record in the collection
2) Extract the _id field of the document
3) Use the _id field in mongodump --query
Find the X'th record in the collection
We can achieve that by using --eval with the mongo tool:
1) Sort the documents newest to oldest
2) Limit to X records and reverse the sort
3) Take the first document (limit: 1)
4) Stringify the _id
mongo --host=$mongodb_uri --quiet --eval "db.myCollection.aggregate([{\$sort:{_id:-1}},{\$limit:$MAX_DOCUMENT},{\$sort:{_id:1}},{\$limit:1},{\$project:{_id:{\$toString:\"\$_id\"}}}])"
result={ "_id" : "62440d84c18a957093f6c8a3" }
Extract the _id field of document
We need the exact value of _id, so we extract it with a regular expression:
lastId=$(echo $result | sed -e 's/{ "_id" : "\(.*\)" }/\1/')
which gives lastId=62440d84c18a957093f6c8a3
Use the _id field in mongodump --query
mongodump does not accept ObjectId(...) in the query, so we should use $oid to indicate ObjectId fields.
query="{\"_id\":{\"\$gte\":{\"\$oid\":\"$lastId\"}}}"
Here is the complete bash script
dump()
{
    local lastIdQuery="db.$collection.aggregate([{\$sort:{_id:-1}},{\$limit:$MAX_DOCUMENT},{\$sort:{_id:1}},{\$limit:1},{\$project:{_id:{\$toString:\"\$_id\"}}}])"
    echo "lastIdQuery $lastIdQuery"
    local lastIdResult=$(mongo --host=$mongodb_uri --quiet --eval "$lastIdQuery")
    echo "lastIdResult $lastIdResult"
    local lastId=$(echo $lastIdResult | sed -e 's/{ "_id" : "\(.*\)" }/\1/')
    echo $lastId
    query="{\"_id\":{\"\$gte\":{\"\$oid\":\"$lastId\"}}}"
    echo "query $query"
    mongodump --uri=$mongodb_uri --collection $collection --query="$query" --out=$outFolder
}
mongodb_uri='mongodb://localhost:27017/myDb'
outFolder=./backup
MAX_DOCUMENT=100
collection="users"
dump

Export one object with mongoexport, how to specify _id?

I'm trying to export just one object with mongoexport, filtering by its ID.
I tried:
mongoexport -d "kb_development" -c "articles" -q "{'_id': '4e3ca3bc38c4f10adf000002'}"
and many variations, but it keeps saying
connected to: 127.0.0.1
exported 0 records
(and I'm sure there is such an object in the collection)
In mongo shell I would use ObjectId('4e3ca3bc38c4f10adf000002'), but it does not seem to work in the mongoexport query.
I think you should be able to use ObjectId(...) in the query argument to mongoexport:
mongoexport -d kb_development -c articles -q '{_id: ObjectId("4e3ca3bc38c4f10adf000002")}'
If that does not work, you can use the "strict mode" javascript notation of ObjectIds, as documented here:
mongoexport -d kb_development -c articles -q '{_id: {"$oid": "4e3ca3bc38c4f10adf000002"}}'
(Also note that strict mode JSON is the format produced by mongoexport)
You have to specify the _id field by using the ObjectId type. In your question it was specified as a string.
Code:
mongoexport -h localhost -d my_database -c sample_collection -q '{key:ObjectId("50584580ff0f089602000155")}' -o my_output_file.json
Note: don't forget the quotes around the query.
My MongoDB version: 3.2.4. When I use the mongoexport tool from the command line:
Does not work:
-q '{"_id":ObjectId("5719cd12b1168b9d45136295")}'
-q '{_id: {"$oid": "5719cd12b1168b9d45136295"}}'
Works:
-q "{_id:ObjectId('5719cd12b1168b9d45136295')}"
- Though the mongo docs say:
You must enclose the query in single quotes (e.g. ') to ensure that it
does not interact with your shell environment.
- But single quotes (') did not work for me! Please use double quotes (")!
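The conflicting reports about quotes may simply come from different shells. As an illustration of what a POSIX shell (bash, sh) does with each style, here is a small sketch; the variable names are only for the demo. On Windows cmd.exe the rules are different: single quotes are not quoting characters there at all.

```shell
# Single quotes pass the text through unchanged.
single='{"_id": {"$oid": "5719cd12b1168b9d45136295"}}'

# Inside double quotes the shell expands $oid as a variable. It is set to
# the empty string here to mimic the usual case where it is not defined.
oid=''
double="{\"_id\": {\"$oid\": \"5719cd12b1168b9d45136295\"}}"

# Escaping the dollar sign keeps it literal inside double quotes.
escaped="{\"_id\": {\"\$oid\": \"5719cd12b1168b9d45136295\"}}"

# Print what a program such as mongoexport would actually receive.
printf '%s\n' "$single" "$double" "$escaped"
```

The first and third lines printed are identical and still contain $oid. The second one has lost it, which turns the filter into something the tool cannot interpret as an ObjectId.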
For mongoexport version r4.2.3:
mongoexport -q '{"_id": {"$oid": "4e3ca3bc38c4f10adf000002"}}'
And for a nested field:
mongoexport -q '{"_id": {"$oid": "4e3ca3bc38c4f10adf000002"}}' --fields parentField.childField
You do not have to add ObjectId or $oid as suggested by answers above. As has been mentioned by @Blacksad, just get your single and double quotes right.
mongoexport -d kb_development -c articles -q '{_id:"4e3ca3bc38c4f10adf000002"}'
Many of the answers provided here didn't work for me; the error was with my double quotes. Here is what worked for me:
mongoexport -h localhost -d database_name -c collection_name -q {_id:ObjectId('50584580ff0f089602066633')} -o output_file.json
Remember to use single quotes only for the ObjectId string.