Query result set export to local machine Mongo DB [duplicate] - mongodb

I am using MongoDB 2.2.2 on a 32-bit Windows 7 machine. I have a complex aggregation query in a .js file. I need to execute this file in the shell and direct the output to a CSV file. I make sure the query returns "flat" JSON (no nested keys), so it is inherently convertible to a neat CSV.
I know about load() and eval(). eval() requires me to paste the whole query into the shell and only allows printjson() inside the script, while I need CSV. The second way, load(), prints the output to the screen, again in JSON format.
Is there a way Mongo can do this conversion from JSON to CSV? (I need the CSV file to prepare charts on the data.) I am thinking:
1. Either mongo has a built-in command for this that I can't find right now.
2. Mongo can't do it for me; I can at most send the json output to a file which I then need to convert to csv myself.
3. Mongo can send the json output to a temporary collection, the contents of which can be easily mongoexported to csv format. But I think only map-reduce queries support output collections. Is that right? I need it for an aggregation query.
Thanks for any help :)

I know this question is old, but I spent an hour trying to export a complex query to CSV and I wanted to share my thoughts. First, I couldn't get any of the JSON-to-CSV converters to work (although this one looked promising). What I ended up doing was manually writing the CSV file in my mongo script.
This is a simple version but essentially what I did:
print("name,id,email");
db.User.find().forEach(function(user) {
    print(user.name + "," + user._id.valueOf() + "," + user.email);
});
Then I just ran the script and redirected its output to a file:
mongo test export.js > out.csv
where test is the name of the database I use.
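One caveat with this hand-rolled print() approach: a value containing a comma, quote, or newline will corrupt the CSV. A small escaping helper (my own sketch, not part of the mongo shell) keeps such rows intact; the same functions can be pasted into a mongo script.

```javascript
// Quote a single value per RFC 4180: wrap it in double quotes when it
// contains a comma, quote, or newline, and double any embedded quotes.
function csvEscape(value) {
    var s = String(value === null || value === undefined ? '' : value);
    if (/[",\n\r]/.test(s)) {
        return '"' + s.replace(/"/g, '""') + '"';
    }
    return s;
}

// Build one CSV line from an array of values.
function csvLine(values) {
    return values.map(csvEscape).join(',');
}

// Inside a mongo script you would replace the sample call with e.g.
//   db.User.find().forEach(function(user) {
//       print(csvLine([user.name, user._id.valueOf(), user.email]));
//   });
var output = typeof print === 'function' ? print : console.log;
output(csvLine(['Smith, John', 'abc123', 'j@example.com']));
// → "Smith, John",abc123,j@example.com
```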

Mongo's built-in export works fine, unless you want to do any data manipulation such as formatting dates, converting data types, etc.
The following command works like a charm.
mongoexport -h localhost -d database -c collection --type=csv
--fields erpNum,orderId,time,status
-q '{"time":{"$gt":1438275600000}, "status":{"$ne" :"Cancelled"}}'
--out report.csv

Extending other answers:
I found #GEverding's answer most flexible. It also works with aggregation:
test_db.js
print("name,email");
db.users.aggregate([
    { $match: {} }
]).forEach(function(user) {
    print(user.name + "," + user.email);
});
Execute the following command to export results:
mongo test_db < ./test_db.js >> ./test_db.csv
Unfortunately, it adds additional text to the CSV file which requires processing the file before we can use it:
MongoDB shell version: 3.2.10
connecting to: test_db
But we can make the mongo shell stop spitting out those comments and print only what we asked for by passing the --quiet flag:
mongo --quiet test_db < ./test_db.js >> ./test_db.csv

Here is what you can try:
print("id,name,startDate");
cursor = db.<collection_name>.find();
while (cursor.hasNext()) {
    jsonObject = cursor.next();
    print(jsonObject._id.valueOf() + "," + jsonObject.name + ",\"" + jsonObject.startDate.toUTCString() + "\"");
}
Save that in a file, say "export.js". Run the following command:
mongo <host>/<dbname> -u <username> -p <password> export.js > out.csv

Have a look at this answer for outputting from the mongo shell to a file.
There is no support for outputting CSV from the mongo shell. You would have to write the JavaScript yourself or use one of the many converters available. Google "convert json to csv", for example.

Just weighing in here with a nice solution I have been using. It is similar to Lucky Soni's solution above in that it supports aggregation, but it doesn't require hard-coding the field names.
cursor = db.<collection_name>.<my_query_with_aggregation>;
headerPrinted = false;
while (cursor.hasNext()) {
    item = cursor.next();
    if (!headerPrinted) {
        print(Object.keys(item).join(','));
        headerPrinted = true;
    }
    line = Object
        .keys(item)
        .map(function(prop) {
            return '"' + item[prop] + '"';
        })
        .join(',');
    print(line);
}
Save this as a .js file, in this case we'll call it example.js and run it with the mongo command line like so:
mongo <database_name> example.js --quiet > example.csv
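One caveat with the script above: it takes the header from the first document only, so documents with missing or extra fields produce misaligned rows. A variant (my own sketch) that first collects the union of keys across all documents avoids this; in a mongo script, docs would come from db.<collection_name>.aggregate(...).toArray().

```javascript
// Convert an array of flat documents to CSV, using the union of all
// keys as the header so documents with differing fields stay aligned.
// (Values are naively quoted; embedded quotes would need doubling.)
function docsToCsv(docs) {
    var keys = [];
    docs.forEach(function(doc) {
        Object.keys(doc).forEach(function(k) {
            if (keys.indexOf(k) === -1) keys.push(k);
        });
    });
    var lines = [keys.join(',')];
    docs.forEach(function(doc) {
        lines.push(keys.map(function(k) {
            return doc[k] === undefined ? '' : '"' + doc[k] + '"';
        }).join(','));
    });
    return lines.join('\n');
}

// Sample data standing in for db.<collection_name>.aggregate(...).toArray()
var sample = [
    { name: 'Ann', age: 30 },
    { name: 'Bob', email: 'bob@example.com' }
];
var out = typeof print === 'function' ? print : console.log;
out(docsToCsv(sample));
```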

I use the following technique. It makes it easy to keep the column names in sync with the content:
var cursor = db.getCollection('Employees.Details').find({});
var header = [];
var rows = [];
var firstRow = true;
cursor.forEach((doc) =>
{
    var cells = [];
    if (firstRow) header.push("employee_number");
    cells.push(doc.EmpNum.valueOf());
    if (firstRow) header.push("name");
    cells.push(doc.FullName.valueOf());
    if (firstRow) header.push("dob");
    cells.push(doc.DateOfBirth.valueOf());
    var row = cells.join(',');
    rows.push(row);
    firstRow = false;
});
print(header.join(','));
print(rows.join('\n'));
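The same keep-in-sync idea can be pushed a little further by declaring each column exactly once as a header/extractor pair (a sketch of my own; the field names are hypothetical):

```javascript
// Each column is declared once: a header name plus a function that
// extracts the cell value from a document.
var columns = [
    ['employee_number', function(doc) { return doc.EmpNum; }],
    ['name',            function(doc) { return doc.FullName; }],
    ['dob',             function(doc) { return doc.DateOfBirth; }]
];

// Render the header row, then one row per document.
function toCsv(columns, docs) {
    var lines = [columns.map(function(c) { return c[0]; }).join(',')];
    docs.forEach(function(doc) {
        lines.push(columns.map(function(c) { return c[1](doc); }).join(','));
    });
    return lines.join('\n');
}

// Sample documents standing in for db.getCollection('Employees.Details').find({})
var docs = [{ EmpNum: 1, FullName: 'Ann', DateOfBirth: '1990-01-01' }];
var out = typeof print === 'function' ? print : console.log;
out(toCsv(columns, docs));
```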

When executing a script against a remote server, Mongo will add its own logging output, which we might want to omit from our file.
The --quiet option only disables connection-related logs, not all mongo logs, so in such cases we might need to filter out unneeded lines manually. A Windows-based example:
mongo dbname --username userName --password password --host replicaset/ip:port --quiet printDataToCsv.js | findstr /v "NETWORK" > data.csv
This pipes the script output through findstr to filter out any lines that contain the string NETWORK. More information on findstr: https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/findstr
A Linux version of this would use grep -v instead:
mongo dbname --username userName --password password --host replicaset/ip:port --quiet printDataToCsv.js | grep -v "NETWORK" > data.csv

Related

MongoDb - Export database to js script (similar to rockmongo export)

Is there a way from the command line that I can dump a MongoDb database to a javascript file that can be interpreted by the mongo shell? I am looking for a way to do exactly what the RockMongo Export function does, but I need to be able to call it from a command line script. I've looked everywhere for something that does this but all I can seem to find is mongoexport and mongodump which don't seem to do what I want, as these just create JSON files.
The reason I need to do this is because codeception's MongoDb module requires a file in this format to restore the database after each test. I want to write a script to automate this process so that I don't have to constantly go through RockMongo and generate the dump.
Thanks in advance!
In case anyone else happens to find this, I finally found a solution that works for my scenario. I had to take Markus' suggestion and kind of roll my own solution, but I discovered a mongodb command called bsondump that made things much easier.
So in my script I first use mongodump to create a BSON file of my collection
mongodump --db mydb --collection mycollection --out - > mycollection.bson
I then use bsondump to convert that into JSON that can be used in Shell Mode
bsondump mycollection.bson > mycollection.json
Finally, I'm using PHP so in my PHP script I loop through that json file and wrap each line in an insert statement.
$lines = file('mycollection.json');
$inserts = [];
foreach ($lines as $line)
{
    $inserts[] = 'db.getCollection("mycollection").insert(' . trim($line) . ');' . PHP_EOL;
}
file_put_contents('output.js', $inserts);
I'm guessing there is probably a better way to do this, but so far this seems to be working nicely for me. Thanks for steering me in the right direction Markus!
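If you would rather stay in one language, the same wrapping step can be done in Node instead of PHP. A sketch under the same assumptions (mycollection.json from bsondump contains one document per line):

```javascript
// Wrap each JSON line (as produced by bsondump) in an insert statement,
// producing a script the mongo shell can replay.
function buildInsertScript(jsonLines, collection) {
    return jsonLines
        .filter(function(line) { return line.trim() !== ''; })
        .map(function(line) {
            return 'db.getCollection("' + collection + '").insert(' + line.trim() + ');';
        })
        .join('\n') + '\n';
}

// Usage with Node:
//   var fs = require('fs');
//   var lines = fs.readFileSync('mycollection.json', 'utf8').split('\n');
//   fs.writeFileSync('output.js', buildInsertScript(lines, 'mycollection'));
```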

Method to insert CSV into mongo

Is there any method to insert a CSV file into MongoDB other than using the mongoimport tool? I need to perform bulk insertions in MongoDB. I referred to some sites and found that there are some issues in using the mongoimport tool for importing large sets of data. Please enlighten me on how to insert CSV into MongoDB from an application directly. I need to know if there are any methods or wrappers in C++ or Java for inserting CSV into MongoDB. Thanks in advance
The MongoDB shell is able to evaluate JavaScript, so you could write your own parser in JavaScript, load your program into the shell, and start it. If you don't like JavaScript, there are a lot of drivers for other programming languages with which you could load your file and use the driver to insert your data into the database.
You can use the bulk option available with Mongo 2.6:
read the file
iterate and save to some variable
bulk.execute()
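As a concrete sketch of those steps (my own, assuming a simple CSV with a header line and no quoted fields): the parsing part is plain JavaScript, and the insert part uses the Bulk API mentioned above.

```javascript
// Parse a simple CSV (header line, comma-separated, no quoted fields)
// into an array of documents, one per data row.
function parseCsv(text) {
    var lines = text.trim().split('\n');
    var headers = lines[0].split(',').map(function(h) { return h.trim(); });
    return lines.slice(1).map(function(line) {
        var cells = line.split(',');
        var doc = {};
        headers.forEach(function(h, i) {
            doc[h] = cells[i] === undefined ? '' : cells[i].trim();
        });
        return doc;
    });
}

// Inside the mongo shell (2.6+) the documents can then go through the
// Bulk API, e.g.:
//   var bulk = db.check.initializeUnorderedBulkOp();
//   parseCsv(cat('test.csv')).forEach(function(doc) { bulk.insert(doc); });
//   bulk.execute();
```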
mongoimport -c 'check' --db 'test' --type csv --file test.csv --headerline
use mongoimport --help for more help
(edit : -h option is for host)
test.csv
name, age, sex
John, 23, M

Printing Mongo query output to a file while in the mongo shell

2 days old with Mongo and I have a SQL background, so bear with me. As with MySQL, it is very convenient to be in the MySQL command line and output the results of a query to a file on the machine. I am trying to understand how I can do the same with Mongo while being in the shell.
I can easily get the output of a query I want by being outside of the shell and executing the following command:
mongo localhost:27017/dbname --eval "printjson(db.collectionName.findOne())" > sample.json
The above way is fine, but it requires me to exit the mongo shell or open a new terminal tab to execute this command. It would be very convenient if I could simply do this while still being inside the shell.
P.S: the Question is an offshoot of a question I posted on SO
AFAIK, there is no interactive option for output to a file; there is a previous SO question related to this: Printing mongodb shell output to File
However, you can log the whole shell session if you invoke the shell with the tee command:
$ mongo | tee file.txt
MongoDB shell version: 2.4.2
connecting to: test
> printjson({this: 'is a test'})
{ "this" : "is a test" }
> printjson({this: 'is another test'})
{ "this" : "is another test" }
> exit
bye
Then you'll get a file with this content:
MongoDB shell version: 2.4.2
connecting to: test
> printjson({this: 'is a test'})
{ "this" : "is a test" }
> printjson({this: 'is another test'})
{ "this" : "is another test" }
> exit
bye
To remove all the commands and keep only the json output, you can use a command similar to:
tail -n +3 file.txt | egrep -v "^>|^bye" > output.json
Then you'll get:
{ "this" : "is a test" }
{ "this" : "is another test" }
We can do it this way -
mongo db_name --quiet --eval 'DBQuery.shellBatchSize = 2000; db.users.find({}).limit(2000).toArray()' > users.json
The shellBatchSize argument determines how many rows the mongo client is allowed to print. Its default value is 20.
If you invoke the shell with script-file, db address, and --quiet arguments, you can redirect the output (made with print() for example) to a file:
mongo localhost/mydatabase --quiet myScriptFile.js > output
There are ways to do this without having to quit the CLI and pipe mongo output to a non-tty.
To save the output from a query with result x we can do the following to directly store the json output to /tmp/x.json:
> EDITOR="cat > /tmp/x.json"
> x = db.MyCollection.find(...).toArray()
> edit x
>
Note that the output isn't strictly JSON, but rather the dialect that Mongo uses.
In the new MongoDB shell 5.0+, mongosh, the Node.js fs module is integrated, so you can simply do the following in the new mongosh shell:
fs.writeFileSync('output.json', JSON.stringify(db.collectionName.findOne()))
This also avoids problems such as ObjectId(...) being included in the tojson output, which is not valid JSON.
The above works as the docs describe:
The MongoDB Shell, mongosh, is a fully functional JavaScript and Node.js 14.x REPL environment for interacting with MongoDB deployments. You can use the MongoDB Shell to test queries and operations directly with your database.
The old mongo shell is already marked as legacy, so use mongosh if possible.
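The fs.writeFileSync approach extends from a single document to whole result sets. A sketch (my own, assuming mongosh; toJsonLines is a hypothetical helper, not a built-in) that writes one document per line in JSON Lines format:

```javascript
// Serialize an array of documents as JSON Lines (one JSON object per
// line), a format many tools can ingest directly.
function toJsonLines(docs) {
    return docs.map(function(doc) { return JSON.stringify(doc); }).join('\n') + '\n';
}

// In mongosh:
//   fs.writeFileSync('output.jsonl', toJsonLines(db.collectionName.find().toArray()));
```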
It may be useful to you to simply increase the number of results that get displayed
In the mongo shell > DBQuery.shellBatchSize = 3000
and then you can select all the results out of the terminal in one go and paste into a text file.
It is what I am going to do :)
(from : https://stackoverflow.com/a/3705615/1290746)
Combining several conditions:
write mongo query in JS file and send it from terminal
switch/define a database programmatically
output all found records
cut initial output lines
save the output into JSON file
myScriptFile.js
// Switch current database to "mydatabase"
db = db.getSiblingDB('mydatabase');
// The mark for cutting initial output off
print("CUT_TO_HERE");
// Main output
// "toArray()" method allows to get all records
printjson( db.getCollection('jobs').find().toArray() );
Sending the query from terminal
The -z option of sed allows treating the output as a single multi-line string:
$> mongo localhost --quiet myScriptFile.js | sed -z 's/^.*CUT_TO_HERE\n//' > output.json

Mongodb find() result to file

How can I save the output from db.xxx.find() to a flat file? A quick search on Stack Overflow shows that most of the answers suggest using
mongo 127.0.0.1/db --eval "var c = db.collection.find(); while(c.hasNext()) {printjson(c.next())}" >> test.txt
or
mongo 127.0.0.1/db script.js >> test.txt
But I need to do the authentication and other stuffs before doing find(). And I need to do it inside the mongo console. Any suggestions?

MongoDB : Show databases like MySQL

My MongoDB has more than 100 databases in it.
Whenever I use the show dbs command, my screen is filled with all the database names, and that makes it hard to find a particular database.
How do I display only those databases which contain a particular substring, the way we can in MySQL with ( show databases like '%SUBSTR%' )?
We do not have an option like that. But your problem can be resolved by outputting the result into a txt file and opening it later:
$ mongo | tee outnew.txt
In the mongo shell you can then give the show dbs command and exit.
mongo> show dbs;
mongo> exit
Then using gedit or excel access the outnew.txt file.
Hope it helped.
Another option is:
> db.getMongo().getDBNames().forEach(
      function(databaseName) {
          if (databaseName.match(/SUBSTR/i))
              print(databaseName);
      }
  );
> var showdbs = function(pattern) {
      db.getMongo().getDBNames().forEach(
          function(databaseName) {
              if (databaseName.match(new RegExp(pattern, 'i')))
                  print(databaseName);
          });
  };
> showdbs('SUBSTR'); // ALL: showdbs();
If you are on a *nix OS you could run the following command.
mongo --eval "db.adminCommand('listDatabases')['databases']" | grep "SUBSTR"
Note: You need to be admin to run this command.