I am importing data into MongoDB with this command:
mongoimport -d dataBase --collection ip2location --type csv --file "/home/oodles/git/csv/IP2LOCATION-LITE-DB11.CSV" --fields _id,ipFrom,ipTo,countryCode,countryName,regionName,cityName,latitude,longitude,zipCode,timeZone
The import is successful, but the problem is that running
db.ip2location.find().pretty()
gives
"_id" : ObjectId("52be7f25c80e0735273985bf"), ///here is requirement need "_id" : NumberLong(1)
"ipFrom" : NumberLong(16777216),
"ipTo" : NumberLong(16777471),
"countryCode" : "AU",
"countryName" : "AUSTRALIA",
"regionName" : "QUEENSLAND",
"cityName" : "SOUTH BRISBANE",
"latitude" : -27.48333,
"longitude" : 153.01667,
"zipCode" : 4101,
"timeZone" : "+10:00"
The first line is "_id" : ObjectId("52be7f25c80e0735273985bf"),
but I need it like this: "_id" : NumberLong(1)
CSV data sample:
"16777216","16777471","AU","AUSTRALIA","QUEENSLAND","SOUTH BRISBANE","-27.483330","153.016670","4101","+10:00"
The sample line doesn't match the list of --fields provided; it has one field fewer. The first field should be the _id you want to use (1, per your example).
Corrected line:
1, "16777216","16777471","AU","AUSTRALIA","QUEENSLAND","SOUTH BRISBANE","-27.483330","153.016670","4101","+10:00"
I tested and this works as expected in MongoDB 2.0.4.
I can't reproduce your result of the _id being added automatically when the first field value is missing; when I tried with MongoDB 2.0.4, it assigned values to fields in the order listed, so that _id became 16777216, ipFrom became 16777471, and so on. I suspect you may have been viewing a document inserted in an earlier mongoimport run where your --fields list did not include _id.
You should also be aware that mongoimport only inserts data (it does not do updates or upserts). If there is already a document with the given _id, then mongoimport will ignore that line of your CSV.
An easier way to keep the fields and CSV data in sync is to put the list of fields on the first line of your CSV and then use mongoimport --headerline ... instead of --fields.
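A minimal sketch of that layout, assuming the field list from your command becomes the first line of the CSV file:

_id,ipFrom,ipTo,countryCode,countryName,regionName,cityName,latitude,longitude,zipCode,timeZone
"1","16777216","16777471","AU","AUSTRALIA","QUEENSLAND","SOUTH BRISBANE","-27.483330","153.016670","4101","+10:00"

mongoimport -d dataBase --collection ip2location --type csv --headerline --file "/home/oodles/git/csv/IP2LOCATION-LITE-DB11.CSV"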
Related
I am trying to export MongoDB output to CSV format, but I am having trouble.
See the following document in my collection:
db.save.find().pretty();
{
"_id" : ObjectId("58884b11e1370511b89d8267"),
"domain" : "google.com",
"emails" : [
{
"email" : "f#google.com",
"first" : "James",
"Last" : "fer"
},
{
"email" : "d#gmail.com",
"first" : "dear",
"last" : "near"
}
]
}
Exporting the document to CSV:
C:\MongoDB\Server\bin>mongoexport.exe -d Trial -c save -o file.csv --type csv --fields domain,emails
2017-01-25T12:50:54.927+0530 connected to: localhost
2017-01-25T12:50:54.929+0530 exported 1 record
The output file is:
domain,emails
google.com,"[{""email"":""f#google.com"",""first"":""James"",""Last"":""fer""},{""email"":""d#gmail.com"",""first"":""dear"",""last"":""near""}]"
But if I import the same file, the output is different from what was in the actual collection. See the example:
> db.sir.find().pretty()
{
"_id" : ObjectId("5888529fa26b65ae310d026f"),
"domain" : "google.com",
"emails" : "[{\"email\":\"f#google.com\",\"first\":\"James\",\"Last\":\"fer\"},{\"email\":\"d#gmail.com\",\"first\":\"dear\",\"last\":\"near\"}]"
}
I do not want those extra \ characters in my imported document. Is this avoidable, and if so, what format of CSV should I provide for the import?
This is not the expected format, so please let me know how to produce the proper one.
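For reference, one way around this (a sketch, not from the original thread): mongoimport's CSV mode reads every field as a flat value, so the emails array can only come back as a string. A JSON export/import round trip preserves the array structure (assuming the same Trial database):

C:\MongoDB\Server\bin>mongoexport.exe -d Trial -c save -o file.json
C:\MongoDB\Server\bin>mongoimport.exe -d Trial -c sir --file file.json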
My doc:
db.org.insert({
"id" : 28,
"organisation" : "Mickey Mouse company",
"country" : "US",
"contactpersons" : [{
"title" : "",
"typecontact" : "D",
"mobilenumber" : "757784854",
"firstname" : "Mickey",
"lastname" : "Mouse",
"emailaddress" : "mickey#mouse.com"
},
{
"title" : "",
"typecontact" : "E",
"mobilenumber" : "757784854",
"firstname" : "Donald",
"lastname" : "Duck",
"emailaddress" : "donald#duck.com"
}],
"modifieddate" : "2013-11-21T16:04:49+0100"
});
My query:
mongoexport --host localhost --db sample --collection org --type csv --fields country,contactpersons.0.firstname,contactpersons.0.emailaddress --out D:\info_docs\org.csv
With this query, I'm able to get only the values from the first element of contactpersons, but I'm trying to export the values of the second element as well.
How can I resolve this issue? Can anyone please help me out with this?
You're getting exactly the first document in contactpersons because you are only exporting the first element of the array (contactpersons.0.firstname). mongoexport can't export several or all elements of an array, so you need to unwind the array and save the result in another collection. You can do this with the aggregation framework.
First, do an $unwind of contactpersons, then $project the fields you want to use (in your example, country and contactpersons), and finally save the output in a new collection with $out.
db.org.aggregate([
    // one output document per element of the contactpersons array
    {$unwind: '$contactpersons'},
    // keep only the fields to export, renaming them for clarity
    {$project: {_id: 0, org_id: '$id', contacts: '$contactpersons', country: 1}},
    // write the results to a new collection
    {$out: 'aggregate_org'}
])
Now you can do a mongoexport of contacts (which is the result of the $unwind of contactpersons) and country.
mongoexport --host localhost --db sample --collection aggregate_org --type=csv --fields country,contacts.firstname,contacts.emailaddress --out D:\info_docs\org.csv
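With the sample document above, the exported CSV should then look roughly like this:

country,contacts.firstname,contacts.emailaddress
US,Mickey,mickey#mouse.com
US,Donald,donald#duck.com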
I have some data exported from MongoDB. Now I'm trying to import it into another database:
mongoimport --db local --collection CollectionTest1 --type json --file mytest.json
I kept receiving the error:
Failed: Error processing document #2 invalid character ',' looking for beginning of value
So I ran it through jslint and, sure enough, it isn't valid JSON, because it has Mongo types in it like ISODate(...). Looking through the help file, I don't see any way to handle this. Surely it can import its own exports, right?
Any idea how to import such a file:
{
"_id" : "AGE",
"CreatedAt" : ISODate("2016-06-14T18:15:32.600Z"),
"CreatedBy" : "Guest User",
"UpdatedAt" : ISODate("2016-06-14T18:15:32.600Z"),
"UpdatedBy" : "Guest User",
"Controlled" : false,
"DraftSuffix" : "",
...
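One thing worth checking (an assumption, since the thread does not confirm it): mongoimport accepts MongoDB Extended JSON rather than shell syntax, so the ISODate(...) values above would need to be rewritten in $date form before import, e.g.:

{ "_id" : "AGE", "CreatedAt" : { "$date" : "2016-06-14T18:15:32.600Z" }, "CreatedBy" : "Guest User", ... }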
I have my json_file.json like this:
[
{
"project": "project_1",
"coord1": 2,
"coord2": 10,
"status": "yes",
"priority": 7
},
{
"project": "project_2",
"coord1": 2,
"coord2": 10,
"status": "yes",
"priority": 7
},
{
"project": "project_3",
"coord1": 2,
"coord2": 10,
"status": "yes",
"priority": 7
}
]
When I run the following command to import this into mongodb:
mongoimport --db my_db --collection my_collection --file json_file.json
I get the following error:
Failed: error unmarshaling bytes on document #0: JSON decoder out of sync - data changing underfoot?
If I add the --jsonArray flag to the command, the import succeeds:
imported 3 documents
but I get three separate documents instead of one document with the JSON structure shown in the original file.
How can I import JSON into MongoDB while keeping the original structure of the file shown above?
The mongoimport tool has an option:
--jsonArray treat input source as a JSON array
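Applied to the command from the question, that becomes:

mongoimport --db my_db --collection my_collection --file json_file.json --jsonArray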
Alternatively, it is possible to import from a file containing the same data format as the result of the db.collection.find() command. Here is an example from the university.mongodb.com courseware, some content from grades.json:
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb577" }, "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb578" }, "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb579" }, "student_id" : 0, "type" : "homework", "score" : 14.8504576811645 }
As you can see, no array is used, and there are no comma delimiters between documents either.
I discovered recently that this complies with the JSON Lines text format, the same one used by the apache.spark.sql.DataFrameReader.json() method.
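A file in this format imports directly, with no --jsonArray flag (the database and collection names below are placeholders):

mongoimport --db students --collection grades --file grades.json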
Side note:
$ python -m json.tool --sort-keys --json-lines < data.jsonl
can also handle this format.
Perhaps the following reference from the MongoDB project blog could help you gain insight on how arrays work in Mongo:
https://blog.mlab.com/2013/04/thinking-about-arrays-in-mongodb/
I would frame your import otherwise, and either:
a) import the three different objects separately into the collection as you say, using the --jsonArray flag; or
b) encapsulate the complete array within a single object, for example in this way:
{
"mydata":
[
{
"project": "project_1",
...
"priority": 7
}
]
}
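Importing that wrapped file then yields a single document holding the whole array, using the command from the question without --jsonArray:

mongoimport --db my_db --collection my_collection --file json_file.json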
HTH.
I faced the opposite problem today; my conclusion would be:
If you wish to insert an array of JSON objects at once, where each array entry is treated as a separate database entry, you have two options of syntax:
An array of objects with valid comma positions, in which case the --jsonArray flag is obligatory:
[
{obj1},
{obj2},
{obj3}
]
A file with technically invalid JSON formatting, i.e. missing commas between JSON object instances, imported without the --jsonArray flag:
{obj1}
{obj2}
{obj3}
If you wish to insert only an array (i.e. an array as the top-level citizen of your database), I think it's not possible and not valid, because MongoDB by definition supports documents as top-level objects, which are mapped to JSON objects. In other words, you must wrap your array in a JSON object, as Alan Ward pointed out.
Error:
$ ./mongoimport --db bookings --collection user --file user.json
2021-06-12T18:52:13.256+0530 connected to: localhost
2021-06-12T18:52:13.261+0530 Failed: error unmarshaling bytes on document #0: JSON decoder out of sync - data changing underfoot?
2021-06-12T18:52:13.261+0530 imported 0 documents
Solution: when your JSON data contains an array of objects, you need to use --jsonArray during the import, with a command like the one below:
$ ./mongoimport --db bookings --collection user --file user.json --jsonArray
2021-06-12T18:53:44.164+0530 connected to: localhost
2021-06-12T18:53:44.532+0530 imported 414 documents
Scope:
I have a collection in MongoDB that I want to export to a .csv file. I have done this before, with a similar database, using MongoExport.exe and executing it right on the server (a Windows machine hosting the MongoDB database).
Problem:
Once I run the following command
mongoexport.exe --fieldFile fields.txt --db AppleStore --collection AppleStoreApps --out applestore.csv --csv --port 21766
I start getting the following error message:
Invalid BSON object type for CSV output:10
It works in some cases, but it seems like the majority of records get this error.
More Information:
This is an example of a JSON object in MongoDB that should be exported:
{
"_id" : ObjectId("545c05ea74671a1d1c572da9"),
"url" : "https://itunes.apple.com/us/app/dc-eventos/id782560424?mt=8",
"name" : "DC Eventos",
"developerName" : "FERNANDO COSTA",
"developerUrl" : "https://itunes.apple.com/us/artist/fernando-costa/id729986271",
"price" : 0,
"isFree" : true,
"thumbnailUrl" : "http://a4.mzstatic.com/us/r30/Purple6/v4/ee/a2/5e/eea25e3f-8f12-9dce-c86f-37e5e3d9a8dc/icon350x350.jpeg",
"compatibility" : "Requires iOS 5.0 or later. Compatible with iPhone, iPad, and iPod touch. This app is optimized for iPhone 5.",
"category" : "Business",
"updateDate" : ISODate("2014-03-22T03:00:00.000Z"),
"version" : "1.82.82.542",
"size" : "16.3 MB",
"languages" : [
"English"
],
"minimumAge" : 4,
"ageRatingReasons" : [],
"rating" : {
"starsRatingCurrentVersion" : 0,
"starsVersionAllVersions" : 0,
"ratingsCurrentVersion" : 0,
"ratingsAllVersions" : 0
},
"topInAppPurchases" : null
}
mongoexport is likely choking on the empty array -- "ageRatingReasons" : [] -- and on the null value in "topInAppPurchases". Examine the records one by one and check for a pattern.
CSV cannot 'do' arrays and objects, hence the need for JSON and XML. Try exporting JSON and then converting it with one of the many JSON-to-CSV converters that handle complex or custom flattening of objects, such as [] to 0 or skipped commas (val,,val), whatever is needed. The JSON-to-CSV converter must also permit turning off validation, simply because ObjectId("545c05ea74671a1d1c572da9") is invalid JSON.
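Alternatively, staying within the tools shown earlier in this thread, you can project only scalar fields into a helper collection and export that. A sketch (the exportable_apps name and the field list are illustrative choices, not from the original thread):

db.AppleStoreApps.aggregate([
    // keep only flat, CSV-friendly fields; arrays and sub-documents are excluded
    {$project: {_id: 0, url: 1, name: 1, developerName: 1, price: 1, isFree: 1, category: 1, version: 1, size: 1}},
    // write the results to a helper collection
    {$out: 'exportable_apps'}
])

mongoexport.exe --fields url,name,developerName,price,isFree,category,version,size --db AppleStore --collection exportable_apps --out applestore.csv --csv --port 21766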