I have a big json file (~300GB) which is composed of many dicts, and I am trying to import this file into MongoDB. The method that I tried was mongoimport, using this:
mongoimport --db <DB_NAME> --collection <COLLECTION_NAME> --file <FILE_DIRECTORY> --jsonArray --batchSize 1
but after some insertions it shows an error like this: Failed: error processing document #89602: unexpected EOF. I have no idea why this happens.
Any other methods to make it work?
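One alternative, if the top-level structure really is a single JSON array (which --jsonArray assumes), is to skip --jsonArray and stream the array elements as newline-delimited JSON into mongoimport instead, for example with jq. This is only a sketch with the same placeholder names; note that a plain jq -c '.[]' loads the whole input into memory, so for a ~300GB file you would likely need jq's streaming parser:
jq -cn --stream 'fromstream(1|truncate_stream(inputs))' <FILE_DIRECTORY> | mongoimport --db <DB_NAME> --collection <COLLECTION_NAME>
This also makes failures easier to isolate, since each document arrives as its own line rather than as part of one huge array.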
Related
I see there are quite a few people with issues exporting and importing a collection/database.
When I try to export a collection via:
mongodump --db database1 --collection collection1
This is perfectly fine. No errors.
But when I try to import the collection that I just exported:
mongoimport --db database1 --collection collection1 --file collection1.bson -vvvvv
I get:
Failed: error processing document #1: invalid character '\u008f' looking for beginning of value
I've tried to export/import different collections and I get:
Failed: error processing document #1: invalid character '¡' looking for beginning of value
Is there a simple way to fix this other than going through a binary-encoded JSON file looking for ¡ and \u008f? Why would mongo allow it to be exported, yet complain when trying to import it?
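For what it's worth, mongodump writes BSON, while mongoimport only understands JSON/CSV/TSV, which is why the importer trips over those binary bytes. The usual counterpart to mongodump is mongorestore; a rough sketch with the same names as above (the path is an assumption, since mongodump normally writes into a dump/ directory):
mongorestore --db database1 --collection collection1 dump/database1/collection1.bson
If you specifically need JSON on disk, mongoexport/mongoimport are the matching pair instead of mongodump/mongoimport.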
The following works for us, importing the single file our_file.json into our mongodb collection:
mongoimport --uri "mongodb+srv://<username>:<password>@our-cluster.dwxnd.gcp.mongodb.net/dbname" --collection our_coll_name --drop --file /tmp/our_file.json
The following does not work, as we cannot point to a directory our_directory:
mongoimport --uri "mongodb+srv://<username>:<password>@our-cluster.dwxnd.gcp.mongodb.net/dbname" --collection our_coll_name --drop --file /tmp/our_directory
We predictably get the error Failed: error processing document #1: read /tmp/our_directory: is a directory
Is it possible to import all of the JSONs in our_directory into our collection using a single bash command? See the speed test in my answer below - is it possible to parallelize, or use multi-threading, so that the mongoimport of the 103 files outperforms the mongoimport of the single file?
It looks like cat /tmp/our_directory/*.json | mongoimport --uri "mongodb+srv://<username>:<password>@our-cluster.dwxnd.gcp.mongodb.net/dbname" --collection our_coll_name --drop is working. And the import seems to be happening at a decent speed...
Edit: Speed Test (locally on Mac with these specs)
It took 11 minutes to mongoimport a total of 103 files with a combined size of ~1GB into our MongoDB collection. We tested the mongoimport speed with a single 1GB file as well (rather than 103), and it took roughly 11 minutes as well.
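If the bottleneck is client-side parsing rather than the server, one thing to try is running several mongoimport processes in parallel, e.g. with xargs -P. This is just a sketch assuming the same URI and collection as above; note that --drop is deliberately left out, since each parallel process would otherwise drop the collection:
ls /tmp/our_directory/*.json | xargs -P 4 -I {} mongoimport --uri "mongodb+srv://<username>:<password>@our-cluster.dwxnd.gcp.mongodb.net/dbname" --collection our_coll_name --file {}
mongoimport also has a --numInsertionWorkers option that parallelizes the inserts within a single process, which may be worth testing on its own before adding multiple processes.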
I am using mongo 3.4
I want to import a JSON file (a JSON array) into mongod using a bash script, and I want to import the documents only if they don't already exist. I tried --upsert but it does not work.
Is there any easy way to do it? Thanks
mongoimport --db dbName --collection collectionName --file fileName.json --jsonArray --upsert
mongoimport -d dbName -c collectionName jsonFile.json -vvvvv
Even though the output of mongoimport says that n objects were imported, the existing documents with the same data have not been overwritten.
If you use --upsert, it will update the existing documents.
I found a similar discussion here.
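One possible explanation: as far as I know, --upsert matches documents by _id, so if the incoming JSON has no _id (or a freshly generated one), every run just inserts new documents instead of replacing existing ones. mongoimport also has --upsertFields for matching on other fields; a sketch, where someUniqueField is a placeholder for whatever field identifies your documents:
mongoimport --db dbName --collection collectionName --file fileName.json --jsonArray --upsertFields someUniqueField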
When I try to import my json data file into my local instance of mongodb, I get an error. The code that I am using is shown below.
> mongoimport --db cities --collection zips --type json --file C:/MongoDB/data/zips.json
This is the error that I get.
2014-11-29T20:27:33.803-0800 SyntaxError: Unexpected identifier
What seems to be the problem here?
I just found out that mongoimport is used from the terminal/command line (cmd), and NOT within the mongo shell.
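In other words, the SyntaxError comes from the mongo shell trying to parse the command as JavaScript. The same command run from a regular Windows command prompt (the C:\> prompt, not the mongo > prompt) should work fine:
C:\> mongoimport --db cities --collection zips --type json --file C:/MongoDB/data/zips.json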
I am trying to import a large JSON data set file into mongodb using mongoimport.
mongoimport --db test --collection sam1 --file 1234.json --jsonArray
error:
2014-07-02T15:57:16.406+0530 error: object to insert too large
2014-07-02T15:57:16.406+0530 tried to import 1 objects
Please try adding this option: --batchSize 1
Like:
mongoimport --db test --collection sam1 --file 1234.json --batchSize 1
The data will be parsed and stored into the database in batches.
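If --batchSize 1 still fails, the problem may be a single oversized document rather than an oversized batch, since MongoDB caps individual documents at 16MB. A rough way to check, assuming 1234.json is a top-level array as --jsonArray implies (the byte count is approximate):
jq -c '.[]' 1234.json | awk '{ if (length > max) max = length } END { print "largest document is ~" max " bytes" }'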