Export populated data from MongoDB to CSV file

I am using MongoDB at mLab. I have multiple collections: one main and several supporting ones, so the main collection contains IDs pointing to documents in the supporting collections. I would like to export the actual data from the main collection to a CSV file, so I need to populate the data first and then export the result.
I see I can export collections individually, but then the data are not populated. I suppose I should use a bash script to do this, but I do not know how.
Could you point me in the right direction or suggest a way to do this?
Thank you!

Using the mongo shell is the better approach in your case. Following the official documentation, here is how to read data from a Mongo collection in a bash script.
A simple example that counts the documents whose updatedAt timestamp is within the last 10 days:
DATE=$(date -d '10 days ago' "+%Y-%m-%dT%H:%M:%S.%3NZ")
counter=$(mongo --quiet dbName --eval 'db.dbCollection.find({"updatedAt":{"$gt":new ISODate("'$DATE'")}}).count()')
echo "$counter"
Or you can fetch the list of documents and iterate over it, populating each one as per your requirements; a minimal sketch follows.
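For example, assuming a main collection named main whose documents hold an authorId reference into a supporting authors collection (all names here, including doc.title, are hypothetical placeholders for your own schema):
mongo --quiet dbName --eval '
db.main.find().forEach(function (doc) {
    // look up the referenced document in the supporting collection
    var author = db.authors.findOne({ _id: doc.authorId });
    // emit one CSV row per main document (no quoting/escaping handled here)
    print([doc._id, doc.title, author ? author.name : ""].join(","));
})' > populated.csv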
For more on mongo shell queries, see the official MongoDB documentation.

Related

How to use mongodb functions with mongoimport?

Let's say I want to insert an object that contains date objects using mongoimport from the commandline.
echo "{\"int_key\": 1, \"date_key\": new Date(\"2022-12-27\")}" | mongoimport --host "192.168.60.10" --db example_db --collection example_collection
will not work, because the object I am trying to insert is not valid JSON. The reason I want to use mongoimport is that there is a large array of objects that I want to persist in one go. If I try to use the mongo command, the argument length for --eval is too long. For example,
mongo --host "192.168.60.10" --eval "db=db.getSiblingDB(\"example_db\");db.getCollection(\"example_collection\").insert([{\"int_key\": 1, \"date_key\": new Date(\"2022-12-27\")}])"
but the array inside insert() has a very large number of objects. Can you suggest any workaround for this? I was thinking I could use mongoimport to read all the objects, collected into an array, through stdin or a file. The options for using a JSON array would not allow the kind of array of objects I insert using insert() in mongo --eval.
You have to use the Extended JSON form:
echo "{\"int_key\": 1, \"date_key\": {\"$date\": \"2022-12-27\"}}"
From a double-quoted shell string, it may require escaping the dollar sign:
echo "{\"int_key\": 1, \"date_key\": {\"\$date\": \"2022-12-27T00:00:00Z\"}}"
For other data types see MongoDB Extended JSON (v2)
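Putting this together, a minimal sketch that streams a large batch of documents to mongoimport via stdin (host, database, and collection taken from the question; the loop bound and generated values are placeholders):
# generate newline-delimited Extended JSON and pipe it straight into mongoimport,
# avoiding any --eval argument-length limit
for i in $(seq 1 100000); do
    echo "{\"int_key\": $i, \"date_key\": {\"\$date\": \"2022-12-27T00:00:00Z\"}}"
done | mongoimport --host "192.168.60.10" --db example_db --collection example_collection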
I use mongoimport in the same way to insert around 6 billion documents per day; it is very fast and reliable.
Depending on how you use it, the issue described in "mongoimport does not import small amount of documents" could also be relevant for you.

How to prevent a specific collection from being deleted

I am on the Windows platform. I have a shell script which deletes the whole database when a cron job calls this file.
The file is delete.sh:
#!/usr/bin/env bash
mongo 127.0.0.1:27021/test --eval "db.dropDatabase()"
Let's say I have a collection named "Doctor". I don't want that collection to be deleted.
Any idea how to achieve this?
Can we use --excludeCollection here?
There is no such flag. Dropping a database means dropping the whole database. If you need at least one collection to remain, you need to keep the database; collections don't exist without a database.
What you can do is drop collections instead. Use db.getCollectionNames and drop them one by one, excluding the ones you want to keep.
E.g. with filter:
mongo 127.0.0.1:27021/test --eval "db.getCollectionNames().filter(c=>!['Doctor'].includes(c)).forEach(c=>db.getCollection(c).drop())"
Or using getCollectionInfos with query filter:
mongo 127.0.0.1:27021/test --eval "db.getCollectionInfos({name:{$nin: ['Doctor']}}).forEach(({name})=>db.getCollection(name).drop())"
You may need to escape the dollar sign in $nin; I can't recall how shell quoting works on Windows.
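In a POSIX shell, for example, the escaped form would look like this (a sketch, untested on Windows):
mongo 127.0.0.1:27021/test --eval "db.getCollectionInfos({name:{\$nin: ['Doctor']}}).forEach(({name})=>db.getCollection(name).drop())"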

Extract date from ObjectId and export it to CSV in MongoDB

I am pretty new to MongoDB. I am trying to export data from a collection to a CSV file, and that works fine, but I have a question: is there a way to export just the date from the ObjectId into a new column? I understand we can get the date from an ObjectId using ObjectId.getTimestamp(). Is there a way to do the same with mongoexport? Below is the command I use to export data:
mongoexport --db MyDB --collection CollectionName --type=csv --fieldFile fieldsList.txt --out Data.csv
You cannot do this with mongoexport, but if the case is generally simple enough then you can really just use the mongo shell.
For instance to just export data from all fields in a collection with a flat structure and append the last field as the timestamp then you can do:
mongo localhost/MyDB --quiet --eval 'db.CollectionName.find().forEach(d => print(Object.keys(d).concat(["#time"]).map(k => (k === "#time") ? d["_id"].getTimestamp().valueOf() : d[k].valueOf() ).join(", ")))' > Data.csv
Showing the script part as pretty:
db.CollectionName.find().forEach(d =>
  print(Object.keys(d).concat(["#time"]).map(k =>
    (k === "#time") ? d["_id"].getTimestamp().valueOf() : d[k].valueOf()
  ).join(", "))
)
Which essentially says that, when iterating all documents in the given collection, we:
Grab a list of all document fields
Append the "special" field of #time to the end of the list
Loop those fields and return an array of values - where the #time gets the timestamp from the ObjectId in _id
Join the result array with commas and print all of it out
If you had a list of fields, then you could simply replace the Object.keys(d) part with an array of field names, such as:
db.CollectionName.find().forEach(d =>
  print(["_id","field1","field2"].concat(["#time"]).map(k =>
    (k === "#time") ? d["_id"].getTimestamp().valueOf() : d[k].valueOf()
  ).join(", "))
)
But really, as long as you provide the database to connect to along with the --quiet and --eval options and the script line, you can simply redirect the output to your destination file from whatever scripting you want.
It does not take every consideration for a CSV into account, but it is a "quick and dirty" solution for most basic cases at a pinch, or at the very least a starting point for expansion without writing a full program listing.
If you really want more than this, then there are drivers for your language of choice as well as a plethora of CSV-writing libraries for every single one of those languages. And it's really not that much harder than the listing here, especially with a library taking all the "quoting" considerations into account.

mongoexport csv not exporting time

I am trying to export a collection to csv which has the following fields:
_id
number
name
price
pollingTime
I can see the pollingTime data when I open the collection in RoboMongo or access the collection through the mongo shell, but when I export it to a CSV, the pollingTime field comes out blank.
Here's my mongoexport command:
mongoexport --db=itemDB --collection=itemprice1 --type=csv --fieldFile=fields.txt --out items.csv
I need to send this data to some non-tech business folks; any idea if I need to make any changes in fields.txt? fields.txt is like this:
_id
number
name
totalPrice
pollingTimme
Apologies - I discovered immediately after posting, while reviewing my own question, that I had misspelled the pollingTime field. I still want to keep the question and answer on Stack Overflow so that others searching for this problem will check their spelling :-)
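For reference, the fix is just to spell the field in fields.txt exactly as it appears in the documents (keeping the other entries as they were):
_id
number
name
totalPrice
pollingTime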

Is there a better way to export a mongodb query to a new collection?

What I want:
I have a master collection of products, I then want to filter them and put them in a separate collection.
db.masterproducts.find({category:"scuba gear"}).copyTo(db.newcollection)
Of course, I realise that copyTo does not exist.
I thought I could do it with MapReduce, as results are created in a new collection using the new 'out' parameter in v1.8; however, this new collection is not a subset of my original collection. Or can it be, if I use MapReduce correctly?
To get around it I am currently doing this:
Step 1:
/usr/local/mongodb/bin/mongodump --db database --collection masterproducts -q '{category:"scuba gear"}'
Step 2:
/usr/local/mongodb/bin/mongorestore -d database -c newcollection --drop dump/database/masterproducts.bson
My 2 step method just seems rather inefficient!
Any help greatly appreciated.
Thanks
Bob
You can iterate through your query result and save each item like this:
db.oldCollection.find(query).forEach(function(x){db.newCollection.save(x);})
You can create a small server-side JavaScript function (with whatever filtering you want added) and execute it using db.eval.
You can use dump/restore in the way you described above
A copy-collection command should be in MongoDB soon (work is done in vote order)! See the JIRA feature request.
You should be able to create a subset with MapReduce (using 'out'). The problem is that MapReduce has a special output format, so your documents are going to be transformed (there is a JIRA ticket to add support for another format, but I cannot find it at the moment). It is also going to be very inefficient :/
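To illustrate the transformation, a minimal identity MapReduce sketch (using the collection names from the question) might look like this; each output document ends up wrapped as { _id: ..., value: { ...original document... } }:
db.masterproducts.mapReduce(
    function () { emit(this._id, this); },          // map: key each document by its _id
    function (key, values) { return values[0]; },   // reduce: never invoked for unique keys
    { query: { category: "scuba gear" }, out: "newcollection" }
)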
Copying a cursor to a collection makes a lot of sense, I suggest creating a ticket for this.
There is also the toArray() method, which can be used:
// create the new collection
db.createCollection("resultCollection")
// now query for type "foo" and insert the results into the new collection
db.resultCollection.insert(db.originalCollection.find({type:'foo'}).toArray())