I have my json_file.json like this:
[
{
"project": "project_1",
"coord1": 2,
"coord2": 10,
"status": "yes",
"priority": 7
},
{
"project": "project_2",
"coord1": 2,
"coord2": 10,
"status": "yes",
"priority": 7
},
{
"project": "project_3",
"coord1": 2,
"coord2": 10,
"status": "yes",
"priority": 7
}
]
When I run the following command to import this into MongoDB:
mongoimport --db my_db --collection my_collection --file json_file.json
I get the following error:
Failed: error unmarshaling bytes on document #0: JSON decoder out of sync - data changing underfoot?
If I add the --jsonArray flag to the command, the import reports:
imported 3 documents
instead of one document in the JSON format shown in the original file.
How can I import JSON into MongoDB while keeping the original format of the file shown above?
The mongoimport tool has an option:
--jsonArray treat input source as a JSON array
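With that flag, the file from the question imports as-is; a minimal invocation, using the database and collection names from the question:
mongoimport --db my_db --collection my_collection --file json_file.json --jsonArray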
Alternatively, it is possible to import from a file containing the same data format as the result of a db.collection.find() command. Here is an example from the university.mongodb.com courseware, some content from grades.json:
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb577" }, "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb578" }, "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb579" }, "student_id" : 0, "type" : "homework", "score" : 14.8504576811645 }
As you can see, no array is used, and there are no comma delimiters between documents either.
I recently discovered that this complies with the JSON Lines text format, like the one used in the apache.spark.sql.DataFrameReader.json() method.
Side note:
$ python -m json.tool --sort-keys --json-lines < data.jsonl
can also handle this format.
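If you want to keep the array file but still get one document per element, you can convert it to JSON Lines first. A minimal sketch, assuming jq is installed (json_file.jsonl is just an illustrative name):
$ jq -c '.[]' json_file.json > json_file.jsonl
$ mongoimport --db my_db --collection my_collection --file json_file.jsonl
Here jq -c '.[]' emits each array element compactly on its own line, which is exactly the find()-style format shown above.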
Perhaps the following post from the mLab blog could help you gain insight into how arrays work in MongoDB:
https://blog.mlab.com/2013/04/thinking-about-arrays-in-mongodb/
I would frame your import differently, and either:
a) import the three objects separately into the collection, as you say, using the --jsonArray flag; or
b) encapsulate the complete array within a single object, for example in this way:
{
"mydata":
[
{
"project": "project_1",
...
"priority": 7
}
]
}
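With the array wrapped like that, a plain mongoimport (no --jsonArray) should load everything as a single document; a sketch, assuming the wrapped JSON is saved as wrapped.json (an illustrative file name):
mongoimport --db my_db --collection my_collection --file wrapped.json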
HTH.
I faced the opposite problem today; my conclusion would be:
If you wish to insert an array of JSON objects at once, where each array entry is to be treated as a separate database entry, you have two syntax options:
An array of objects with valid comma placement, with the --jsonArray flag obligatory:
[
{obj1},
{obj2},
{obj3}
]
A file with technically invalid JSON formatting (i.e., no commas between the JSON object instances), without the --jsonArray flag:
{obj1}
{obj2}
{obj3}
If you wish to insert only an array (i.e., an array as the top-level citizen of your database), I think it's not possible and not valid, because MongoDB by definition supports documents as its top-level objects, which are mapped to JSON objects afterwards. In other words, you must wrap your array in a JSON object, as ALAN WARD pointed out.
Error:
$ ./mongoimport --db bookings --collection user --file user.json
2021-06-12T18:52:13.256+0530 connected to: localhost
2021-06-12T18:52:13.261+0530 Failed: error unmarshaling bytes on document #0: JSON decoder out of sync - data changing underfoot?
2021-06-12T18:52:13.261+0530 imported 0 documents
Solution: When your JSON data contains an array of objects, you need to pass --jsonArray during the import, with a command like the one below:
$ ./mongoimport --db bookings --collection user --file user.json --jsonArray
2021-06-12T18:53:44.164+0530 connected to: localhost
2021-06-12T18:53:44.532+0530 imported 414 documents
Related
I have the following BSON and JSON files from https://github.com/Apress/def-guide-to-mongodb/tree/master/9781484211830/The%20Definitive%20Guide%20to%20MongoDB
$ ls .
aggregation.bson aggregation.metadata.json mapreduce.bson mapreduce.metadata.json storage.bson text.json
How can I import them into MongoDB?
I tried to import each of them as a collection, but failed:
$ mongorestore -d test -c aggregation
2018-07-18T01:44:25.376-0400 the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
2018-07-18T01:44:25.377-0400 using default 'dump' directory
2018-07-18T01:44:25.377-0400 see mongorestore --help for usage information
2018-07-18T01:44:25.377-0400 Failed: mongorestore target 'dump' invalid: stat dump: no such file or directory
I am not sure if I specified the file aggregation.bson correctly, but the above command is what I learned from a similar example in a book.
Thanks.
UPDATE
In the following, why did the first command fail and the second succeed? Which one should I use?
$ mongoimport -d test -c aggregation --file aggregation.bson
2018-07-18T09:45:44.698-0400 connected to: localhost
2018-07-18T09:45:44.720-0400 Failed: error processing document #1: invalid character 'º' looking for beginning of value
2018-07-18T09:45:44.720-0400 imported 0 documents
$ mongoimport -d test -c aggregation --file aggregation.metadata.json
2018-07-18T09:46:05.058-0400 connected to: localhost
2018-07-18T09:46:05.313-0400 imported 1 document
mongoimport --db dbName --collection collectionName --type json --file fileName.json
Update:
C:\Program Files\MongoDB\Server\4.0\bin>mongorestore -d test -c aggregation aggregation.bson
2018-07-19T10:28:39.963+0300 checking for collection data in aggregation.bson
2018-07-19T10:28:40.099+0300 restoring test.aggregation from aggregation.bson
2018-07-19T10:28:41.113+0300 no indexes to restore
2018-07-19T10:28:41.113+0300 finished restoring test.aggregation (1000 documents)
2018-07-19T10:28:41.113+0300 done
So I tried it and it worked fine for me. Do you have the file in your bin folder, or maybe the command you used wasn't complete? The restored data:
db.aggregation.find().pretty().limit(2)
{
"_id" : ObjectId("51de841747f3a410e3000001"),
"num" : 1,
"color" : "blue",
"transport" : "train",
"fruits" : [
"orange",
"banana",
"kiwi"
],
"vegetables" : [
"corn",
"broccoli",
"potato"
]
}
{
"_id" : ObjectId("51de841747f3a410e3000005"),
"num" : 5,
"color" : "yellow",
"transport" : "plane",
"fruits" : [
"lemon",
"cherry",
"dragonfruit"
],
"vegetables" : [
"mushroom",
"capsicum",
"zucchini"
]
}
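For completeness, a sketch covering the remaining files from the question's directory listing (my extrapolation of the same command): mongorestore automatically picks up <name>.metadata.json, which holds indexes and collection options, when it sits next to <name>.bson, and text.json is plain JSON, so it goes through mongoimport instead (add --jsonArray if it turns out to hold a single array):
mongorestore -d test -c mapreduce mapreduce.bson
mongorestore -d test -c storage storage.bson
mongoimport -d test -c text --file text.json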
I am having problems exporting subdocuments stored in MongoDB to a .csv file.
My data: a mongo collection that contains a unique user ID and scores from personality quizzes.
I would like a CSV that has three columns: user_id, name, raw_score. To add a further layer of complexity, within the 'scales' subdocument some users will have more than two entries (some quizzes produced more than 2 personality scores).
An example of my data, minus the fields I am not interested in:
"assessment":{
"user_id" : "5839b1a654842f35617ad100",
"submissions" : {
"results" : {
"scales" : [
{
"scale" : {
"name" : "Security",
"code" : "SEC",
"multiplier" : 1
},
"raw_score" : 2
},
{
"scale" : {
"name" : "Power",
"code" : "POW",
"multiplier" : -1
},
"raw_score" : 3
}
]
}
}
}
}
I have tried using mongoexport but this produces a CSV that only has a user_id column.
rekuss$ mongoexport -d production_hoganx_app -c assessments --type=csv -o app_personality.csv -f user_id,results.scales.scale.name,results.scales.raw_score
Any ideas where I am going wrong?
Please let me know if you need any more information.
Many thanks
You should try removing the '=' sign from --type; that is, use --type csv.
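That fixes the flag syntax, but mongoexport alone still can't flatten an array of subdocuments into one CSV row per score. A common workaround (my suggestion, not part of the answer above) is to flatten with an aggregation $unwind into a temporary collection and export that. A rough sketch, where the collection name, the field paths, and the assessments_flat name are inferred from the sample document and may need adjusting:
mongo production_hoganx_app --quiet --eval '
db.assessments.aggregate([
  { $unwind: "$submissions.results.scales" },
  { $project: {
      _id: 0,
      user_id: 1,
      name: "$submissions.results.scales.scale.name",
      raw_score: "$submissions.results.scales.raw_score"
  } },
  { $out: "assessments_flat" }
])'
mongoexport -d production_hoganx_app -c assessments_flat --type csv -o app_personality.csv -f user_id,name,raw_score
The $out stage writes the flattened documents to assessments_flat, which then exports cleanly because every requested field is a scalar.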
I'm working on a Java program to move data from MongoDB to Neo4j.
I have to export some Mongo documents to a CSV file.
I have, for example, this document:
"coached_Team" : [
{
"team_id" : "Pal.00",
"in_charge" : {
"from" : {
"day" : 25,
"month" : 9,
"year" : 2013
}
},
"matches" : 75
}
]
I read some other questions, for example this one, and used that tip to export my document.
To export to CSV I use this command:
Z:\path\to\Mongo\3.0\bin>mongoexport --db <database> --collection
<collection> --type=csv --fields coached_Team.0.team_id,coached_Team.0.in_charge.from.day,
coached_Team.0.in_charge.from.month,coached_Team.0.in_charge.from.year,
coached_Team.0.matches --out "C:\path\to\output\file\output.csv
But it did not work for me.
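Two mechanical problems stand out in the command as pasted: the --fields list is broken across physical lines, and the --out path is missing its closing quote. A cleaned-up single-line version (same placeholders and paths as above) would be:
mongoexport --db <database> --collection <collection> --type=csv --fields coached_Team.0.team_id,coached_Team.0.in_charge.from.day,coached_Team.0.in_charge.from.month,coached_Team.0.in_charge.from.year,coached_Team.0.matches --out "C:\path\to\output\file\output.csv"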
Scope:
I have a collection on MongoDB that I want to export to a .csv file. I have done this already, with a similar database, using the MongoExport.exe, executing it right on the server (windows machine, hosting the MongoDB database).
Problem:
Once I run the following command
mongoexport.exe --fieldFile fields.txt --db AppleStore --collection AppleStoreApps --out applestore.csv --csv --port 21766
I start getting the following error message:
Invalid BSON object type for CSV output:10
It works in some cases, but it seems like the majority of records get this error.
More Information:
This is an example of a JSON object in MongoDB that should be exported:
{
"_id" : ObjectId("545c05ea74671a1d1c572da9"),
"url" : "https://itunes.apple.com/us/app/dc-eventos/id782560424?mt=8",
"name" : "DC Eventos",
"developerName" : "FERNANDO COSTA",
"developerUrl" : "https://itunes.apple.com/us/artist/fernando-costa/id729986271",
"price" : 0,
"isFree" : true,
"thumbnailUrl" : "http://a4.mzstatic.com/us/r30/Purple6/v4/ee/a2/5e/eea25e3f-8f12-9dce-c86f-37e5e3d9a8dc/icon350x350.jpeg",
"compatibility" : "Requires iOS 5.0 or later. Compatible with iPhone, iPad, and iPod touch. This app is optimized for iPhone 5.",
"category" : "Business",
"updateDate" : ISODate("2014-03-22T03:00:00.000Z"),
"version" : "1.82.82.542",
"size" : "16.3 MB",
"languages" : [
"English"
],
"minimumAge" : 4,
"ageRatingReasons" : [],
"rating" : {
"starsRatingCurrentVersion" : 0,
"starsVersionAllVersions" : 0,
"ratingsCurrentVersion" : 0,
"ratingsAllVersions" : 0
},
"topInAppPurchases" : null
}
mongoexport is likely choking on the empty array -- "ageRatingReasons" : [] -- and on null values. Examine the records one by one and check for a pattern.
CSV cannot represent arrays and objects, hence the need for JSON and XML. Try exporting JSON and then converting it with one of the many JSON-to-CSV converters that handle complex or custom flattening of objects, such as turning [] into 0 or emitting skipped commas (val,,val), whatever is needed. The JSON-to-CSV converter must also permit turning off validation, simply because ObjectId("545c05ea74671a1d1c572da9") is not valid JSON.
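As a concrete illustration of that route, a minimal sketch; the selected fields are arbitrary examples, jq is assumed to be installed, and jq's @csv only accepts scalar values, so array and object fields must be skipped or flattened explicitly:
mongoexport --db AppleStore --collection AppleStoreApps --port 21766 --out applestore.json
jq -r '[.name, .developerName, .price, .category] | @csv' applestore.json > applestore.csv
mongoexport writes one JSON document per line by default, which jq then processes line by line.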
I am importing data into MongoDB with this command:
mongoimport -d dataBase --collection ip2location --type csv --file "/home/oodles/git/csv/IP2LOCATION-LITE-DB11.CSV" --fields _id,ipFrom,ipTo,countryCode,countryName,regionName,cityName,latitude,longitude,zipCode,timeZone
The import is successful, but the problem is that when I run:
db.ip2location.find().pretty()
I get:
"_id" : ObjectId("52be7f25c80e0735273985bf"), ///here is requirement need "_id" : NumberLong(1)
"ipFrom" : NumberLong(16777216),
"ipTo" : NumberLong(16777471),
"countryCode" : "AU",
"countryName" : "AUSTRALIA",
"regionName" : "QUEENSLAND",
"cityName" : "SOUTH BRISBANE",
"latitude" : -27.48333,
"longitude" : 153.01667,
"zipCode" : 4101,
"timeZone" : "+10:00"
The first line is "_id" : ObjectId("52be7f25c80e0735273985bf"), but I need it like this: "_id" : NumberLong(1).
CSV data sample:
"16777216","16777471","AU","AUSTRALIA","QUEENSLAND","SOUTH BRISBANE","-27.483330","153.016670","4101","+10:00"
The sample line doesn't match the list of --fields provided; it has one field too few. The first field should be the _id you want to use (or 1, per your example).
Corrected line:
1, "16777216","16777471","AU","AUSTRALIA","QUEENSLAND","SOUTH BRISBANE","-27.483330","153.016670","4101","+10:00"
I tested and this works as expected in MongoDB 2.0.4.
I can't reproduce your results of having the _id added automatically with the first field value missing; when I tried with MongoDB 2.0.4 it assigned values to fields in the order listed so that the _id became 16777216, ipFrom was 16777471, etc. I suspect you may have been viewing a document inserted in an earlier mongoimport run where your --fields list did not include _id.
You should also be aware that mongoimport only inserts data (it does not do updates or upserts). If there is already a document with the given _id, then mongoimport will ignore that line of your CSV.
An easier way to keep the fields and CSV data in sync is to have the list of fields as the first line in your CSV and then use mongoimport --headerline ... instead of --fields.
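A sketch of that --headerline variant, assuming you prepend a header row naming the fields to the CSV file from the question:
$ head -2 IP2LOCATION-LITE-DB11.CSV
_id,ipFrom,ipTo,countryCode,countryName,regionName,cityName,latitude,longitude,zipCode,timeZone
1,"16777216","16777471","AU","AUSTRALIA","QUEENSLAND","SOUTH BRISBANE","-27.483330","153.016670","4101","+10:00"
$ mongoimport -d dataBase --collection ip2location --type csv --headerline --file IP2LOCATION-LITE-DB11.CSV
With --headerline, mongoimport reads the field names from that first line instead of a --fields list, so the two can never drift out of sync.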