Method to insert CSV into mongo

Is there any method to insert a CSV file into MongoDB other than the mongoimport tool? I need to perform bulk insertions in MongoDB. I have read on some sites that there are issues with using mongoimport for importing large data sets. How can I insert a CSV into MongoDB directly from an application? Are there any methods or wrappers in C++ or Java for inserting CSV data into MongoDB? Thanks in advance.

The MongoDB shell is able to evaluate JavaScript, so you could write your own parser in JavaScript, load your script into the shell, and run it. If you don't like JavaScript, there are drivers for many other programming languages with which you could load your file and insert your data into the database.
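For example, a minimal sketch of such a shell-side loader, assuming a comma-separated file with a header line and no quoted fields (the path and the check collection are just placeholders; cat() is the shell's built-in file reader):
// read the whole file and split it into lines
var lines = cat("/path/to/test.csv").replace(/\r/g, "").split("\n");
var fields = lines[0].split(",");    // header line gives the field names
for (var i = 1; i < lines.length; i++) {
    if (lines[i] === "") continue;   // skip trailing blank line
    var values = lines[i].split(",");
    var doc = {};
    for (var j = 0; j < fields.length; j++) {
        doc[fields[j].trim()] = values[j].trim();
    }
    db.check.insert(doc);            // note: every value is inserted as a string
}
Run it with something like: mongo test insert_csv.js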

You can use the bulk operations API available since MongoDB 2.6:
read the file,
iterate and collect the documents,
then call bulk.execute() (see the sketch below).
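The parsing loop is the same as in the sketch above, but the inserts are queued and sent in one batch (a rough sketch; initializeUnorderedBulkOp() is the 2.6+ shell bulk API, and the path and collection are placeholders):
var lines = cat("/path/to/test.csv").replace(/\r/g, "").split("\n");
var fields = lines[0].split(",");
var bulk = db.check.initializeUnorderedBulkOp();   // 2.6+ shell bulk API
for (var i = 1; i < lines.length; i++) {
    if (lines[i] === "") continue;
    var values = lines[i].split(",");
    var doc = {};
    for (var j = 0; j < fields.length; j++) {
        doc[fields[j].trim()] = values[j].trim();
    }
    bulk.insert(doc);    // queue rather than insert immediately
}
bulk.execute();          // one round trip for the whole batch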
mongoimport -c 'check' --db 'test' --type csv --file test.csv --headerline
Use mongoimport --help for more options.
(Note: the -h option is for host, not help.)
test.csv
name,age,sex
John,23,M

Related

How to export only a few collections, not all, from a mongo database?

I need to export a few collections from a mongo database, not all of them.
I know how to export from the command line with mongoexport, but I need only a few collections.
How do I do it?
You can export a few collections from a database. First, find the list of collections you want to export.
From the mongo shell you can list all collections in the current database with db.getCollectionNames(), and pick the ones you want to export from that list.
Suppose you want to export two collections, "movies" and "books", from the test database.
For Windows, a batch file like this will do the export. E.g.,:
mongo_exp.bat:
for %%c in (movies books) do mongoexport --db=test --collection=%%c --out=%%c.json
When you run this batch file from the Windows command-prompt it will export two collections to two JSON files: movies.json and books.json.
For UNIX-like OSes, a bash script like this will do the export. For example, mongo_exp.sh:
#!/bin/bash
declare -a colls=("movies" "books")
for coll in "${colls[@]}"
do
    mongoexport --db=test --collection=$coll --out=$coll.json
done
In a UNIX environment, make sure the script has execute permission before you run it.

How should I go about inserting a full .lic or .licx file into a mongodb collection?

I want to test MongoDB as a possible alternative to my file-system setup. I have 3 folders: two hold JSON data (so no problem there), but one holds .lic and .licx files. I simply want to store and retrieve these files easily from a MongoDB collection in a database. I'm testing on the command line... How would I insert a .licx file into a collection that is in a database?
I've tried a command-line import:
--db license-server --collection licenses --type BSON --file C:\Users\<myname>\Desktop\<projectname>\private\licenses\<filename>.licx
but I'm getting "error validating settings: unknown type bson" as an error for the command-line command.
I've read a bit about GridFS, but found no clear example of how to use it.
I expect the .licx file to be inserted into the collection with an id so I can retrieve it later.
To store a file that's bigger than the 16MB document limit, or one with an extension like .licx, run the command
mongofiles -d license-server put <filename(includingfullpath)>.licx
This will store the file in the fs.files and fs.chunks collections within your database.
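To verify the upload from the mongo shell, you can list the stored files' metadata (a quick check, assuming the default fs bucket prefix):
use license-server
db.fs.files.find({}, { filename: 1, length: 1, uploadDate: 1 })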
To retrieve the file on the command line use
mongofiles -d license-server get <filename(includingfullpath)>.licx
Additional Documentation can be found here:
https://docs.mongodb.com/manual/reference/program/mongofiles/#bin.mongofiles

Query result set export to local machine Mongo DB [duplicate]

I am using MongoDB 2.2.2 on a 32-bit Windows 7 machine. I have a complex aggregation query in a .js file. I need to execute this file in the shell and direct the output to a CSV file. I make sure the query returns "flat" JSON (no nested keys), so it is inherently convertible to a neat CSV.
I know about load() and eval(). eval() requires me to paste the whole query into the shell and allows only printjson() inside the script, while I need CSV. The second way, load(), prints the output on the screen, again in JSON format.
Is there a way Mongo can do this conversion from JSON to CSV? (I need the CSV file to prepare charts from the data.) I am thinking:
1. Either Mongo has a built-in command for this that I can't find right now.
2. Mongo can't do it for me; at most I can send the JSON output to a file, which I then need to convert to CSV myself.
3. Mongo can send the JSON output to a temporary collection, the contents of which can easily be mongoexported to CSV format. But I think only map-reduce queries support output collections. Is that right? I need it for an aggregation query.
Thanks for any help :)
I know this question is old, but I spent an hour trying to export a complex query to CSV and wanted to share my thoughts. First, I couldn't get any of the JSON-to-CSV converters to work (although this one looked promising). What I ended up doing was manually writing the CSV file in my mongo script.
This is a simple version but essentially what I did:
print("name,id,email");
db.User.find().forEach(function(user){
print(user.name+","+user._id.valueOf()+","+user.email);
});
Then I just redirected the script's stdout to a file:
mongo test export.js > out.csv
where test is the name of the database I use.
Mongo's built-in export works fine, unless you want to do any data manipulation like formatting dates, converting data types, etc.
The following command works like a charm:
mongoexport -h localhost -d database -c collection --type=csv \
    --fields erpNum,orderId,time,status \
    -q '{"time":{"$gt":1438275600000}, "status":{"$ne":"Cancelled"}}' \
    --out report.csv
Extending other answers:
I found #GEverding's answer the most flexible. It also works with aggregation:
test_db.js
print("name,email");
db.users.aggregate([
{ $match: {} }
]).forEach(function(user) {
print(user.name+","+user.email);
}
});
Execute the following command to export results:
mongo test_db < ./test_db.js >> ./test_db.csv
Unfortunately, it adds additional text to the CSV file which requires processing the file before we can use it:
MongoDB shell version: 3.2.10
connecting to: test_db
But we can make the mongo shell stop printing those messages, and output only what we asked for, by passing the --quiet flag:
mongo --quiet test_db < ./test_db.js >> ./test_db.csv
Here is what you can try:
print("id,name,startDate")
cursor = db.<collection_name>.find();
while (cursor.hasNext()) {
jsonObject = cursor.next();
print(jsonObject._id.valueOf() + "," + jsonObject.name + ",\"" + jsonObject.stateDate.toUTCString() +"\"")
}
Save that in a file, say "export.js". Run the following command:
mongo <host>/<dbname> -u <username> -p <password> export.js > out.csv
Have a look at this for outputting from the mongo shell to a file.
There is no support for outputting CSV from the mongo shell. You would have to write the JavaScript yourself or use one of the many converters available. Google "convert json to csv", for example.
Just weighing in here with a nice solution I have been using. It is similar to Lucky Soni's solution above in that it supports aggregation, but doesn't require hard-coding the field names.
cursor = db.<collection_name>.<my_query_with_aggregation>;
headerPrinted = false;
while (cursor.hasNext()) {
    item = cursor.next();
    if (!headerPrinted) {
        print(Object.keys(item).join(','));  // header row from the first document's keys
        headerPrinted = true;
    }
    line = Object
        .keys(item)
        .map(function(prop) {
            return '"' + item[prop] + '"';   // quote each value
        })
        .join(',');
    print(line);
}
Save this as a .js file (in this case we'll call it example.js) and run it with the mongo command line like so:
mongo <database_name> example.js --quiet > example.csv
I use the following technique. It makes it easy to keep the column names in sync with the content:
var cursor = db.getCollection('Employees.Details').find({})
var header = []
var rows = []
var firstRow = true
cursor.forEach((doc) =>
{
    var cells = []
    if (firstRow) header.push("employee_number")
    cells.push(doc.EmpNum.valueOf())
    if (firstRow) header.push("name")
    cells.push(doc.FullName.valueOf())
    if (firstRow) header.push("dob")
    cells.push(doc.DateOfBirth.valueOf())
    var row = cells.join(',')
    rows.push(row)
    firstRow = false
})
print(header.join(','))
print(rows.join('\n'))
When executing a script against a remote server, Mongo will add its own logging output, which we might want to omit from our file.
The --quiet option only disables connection-related logs, not all mongo logs. In such cases we might need to filter out the unneeded lines manually. A Windows-based example:
mongo dbname --username userName --password password --host replicaset/ip:port --quiet printDataToCsv.js | findstr /v "NETWORK" > data.csv
This pipes the script output through findstr to filter out any lines which have the string NETWORK in them. More information on findstr: https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/findstr
A Linux version of this would use grep -v in place of findstr /v, for example:
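The same command as above, with grep doing the filtering (a sketch; dbname, the credentials, and printDataToCsv.js are the same placeholders as in the Windows example):
mongo dbname --username userName --password password --host replicaset/ip:port --quiet printDataToCsv.js | grep -v "NETWORK" > data.csv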

MongoDb - Export database to js script (similar to rockmongo export)

Is there a way, from the command line, to dump a MongoDB database to a JavaScript file that can be interpreted by the mongo shell? I am looking for a way to do exactly what the RockMongo Export function does, but I need to be able to call it from a command-line script. I've looked everywhere for something that does this, but all I can seem to find are mongoexport and mongodump, which don't seem to do what I want, as these just create JSON files.
The reason I need this is because Codeception's MongoDb module requires a file in this format to restore the database after each test. I want to write a script to automate this process so that I don't have to constantly go through RockMongo and generate the dump.
Thanks in advance!
In case anyone else happens to find this: I finally found a solution that works for my scenario. I had to take Markus' suggestion and kind of roll my own solution, but I discovered a MongoDB tool called bsondump that made things much easier.
So in my script I first use mongodump to create a BSON file of my collection
mongodump --db mydb --collection mycollection --out - > mycollection.bson
I then use bsondump to convert that into JSON that can be used in Shell Mode
bsondump mycollection.bson > mycollection.json
Finally, since I'm using PHP, in my PHP script I loop through that JSON file and wrap each line in an insert statement:
$lines = file('mycollection.json');
$inserts = [];
foreach ($lines as $line) {
    $inserts[] = 'db.getCollection("mycollection").insert(' . trim($line) . ');' . PHP_EOL;
}
file_put_contents('output.js', $inserts);
I'm guessing there is probably a better way to do this, but so far this seems to be working nicely for me. Thanks for steering me in the right direction Markus!

Should we create indexes for the fields which are part of mongoexport command?

I am working on an existing Java J2EE application which uses MongoDB very extensively.
The application has some .sh scripts (bash files) which run daily, and whose responsibility is to execute a mongoexport command as shown below:
mongoexport --csv -o /tmp/people.csv -d school -c people -f firstName,lastName,telephone,email
My question is: do I need to create indexes on the collection named people for the fields firstName, lastName, telephone, and email?
Will this add any advantage in MongoDB, or is creating indexes on these fields not necessary at all?
So please let me know:
Should we create indexes for the fields which are part of the mongoexport command?
The mongoexport will run in O(N) time (with or without an index) because you're exporting all the records in the collection (i.e. it requires a full collection scan). As mentioned in a comment, indexes are only needed to speed up searching, sorting, and possibly aggregations.
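For illustration, an index would only start to pay off if the export filtered on an indexed field. A hypothetical variation of the script's command (the Smith filter is made up for the example; use ensureIndex instead of createIndex on shells older than 3.0):
// in the mongo shell: index the field used in the filter below
db.people.createIndex({ lastName: 1 })
mongoexport --csv -o /tmp/smiths.csv -d school -c people -f firstName,lastName,telephone,email -q '{"lastName": "Smith"}'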