mongoimport 2.6 vs 2.4 - mongodb

I am exporting data from one database and importing it into another. The export runs on a machine with mongo 2.6, but the import runs on a VM using mongo 2.4. Both mongod instances are running 2.4.
I keep getting this error:
Wed Jun 4 13:13:32.604 check 0 0
Wed Jun 4 13:13:32.604 imported 0 objects
Wed Jun 4 13:13:32.604 ERROR: encountered 1 error(s)
failed: [192.168.140.30] => (item=collection) => {"changed": true, "cmd": "mongoimport -u username -p password -d db -c collection --drop --jsonArray /tmp/collection.json ", "delta": "0:00:00.026383", "end": "2014-06-04 13:13:33.091774", "item": "collection", "rc": 255, "start": "2014-06-04 13:13:33.065391"}
stdout: connected to: 127.0.0.1
Wed Jun 4 13:13:33.089 dropping: <db.collection>
Wed Jun 4 13:13:33.089 exception:BSON representation of supplied JSON array is too large: code FailedToParse: FailedToParse: Date expecting integer milliseconds: offset:171
And the exported date format looks like this:
{ "date" : { "$date" : "2014-06-02T06:39:28.869-0700" }
I have verified that using mongoimport on the same machine as the mongoexport works fine, so I assume there is a compatibility issue between mongoimport/mongoexport from 2.4 to 2.6. Due to firewall restrictions I need to use the two different machines for moving the data around.
Does anybody have any good workarounds for this problem? As far as I can tell, there is no option to export in the old format. I also cannot tell from the release notes what is causing the compatibility error.

I ended up using mongodump and mongorestore instead of mongoimport and mongoexport. There was no compatibility issue when using BSON documents instead of JSON.
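For anyone who has to stay with JSON: the error above ("Date expecting integer milliseconds") suggests that the 2.4 importer wants $date as an integer number of milliseconds since the epoch, while the export shown uses an ISO-8601 string. Here is a minimal Python sketch that rewrites such values before import. I have not run its output through a real 2.4 mongoimport. The function names are made up for illustration, and it assumes every date string has fractional seconds and a numeric offset like the sample line in the question.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_millis(text):
    """Convert e.g. '2014-06-02T06:39:28.869-0700' to epoch milliseconds."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f%z")
    # Dividing one timedelta by another with // gives an exact integer.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def convert_dates(value):
    """Recursively replace {"$date": "<ISO string>"} with {"$date": <millis>}."""
    if isinstance(value, dict):
        if set(value) == {"$date"} and isinstance(value["$date"], str):
            return {"$date": iso_to_millis(value["$date"])}
        return {key: convert_dates(item) for key, item in value.items()}
    if isinstance(value, list):
        return [convert_dates(item) for item in value]
    # Strings, numbers, booleans and None pass through unchanged.
    return value
```

To use it, load the exported file with json.load, pass the result through convert_dates, and write it back with json.dump before running mongoimport on the 2.4 side.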

Related

mongoimport upsert creates new documents

When I try to execute a mongoimport with upsertFields like so:
> mongoimport --db upsert-test --collection data --type tsv --headerline --file upsert-data.tsv --upsertFields MyCustomUpsertField -vvv
2018-10-10T15:08:39.358+0200 using upsert fields: [MyCustomUpsertField]
2018-10-10T15:08:39.424+0200 using 8 decoding workers
2018-10-10T15:08:39.424+0200 using 1 insert workers
2018-10-10T15:08:39.425+0200 will listen for SIGTERM, SIGINT, and SIGKILL
2018-10-10T15:08:39.425+0200 filesize: 61 bytes
2018-10-10T15:08:39.426+0200 using fields: "MyCustomUpsertField","SomeData"
2018-10-10T15:08:39.431+0200 connected to: localhost
2018-10-10T15:08:39.431+0200 ns: upsert-test.data
2018-10-10T15:08:39.431+0200 connected to node type: standalone
2018-10-10T15:08:39.432+0200 standalone server: setting write concern w to 1
2018-10-10T15:08:39.432+0200 using write concern: w='1', j=false, fsync=false, wtimeout=0
2018-10-10T15:08:39.432+0200 standalone server: setting write concern w to 1
2018-10-10T15:08:39.432+0200 using write concern: w='1', j=false, fsync=false, wtimeout=0
2018-10-10T15:08:39.433+0200 got line: ["Upsert-ID-1" "SomeData1"]
2018-10-10T15:08:39.433+0200 imported 1 document
When I then execute the same command again, the result is two documents with exactly the same data.
Adding the (supposedly obsolete) --mode upsert flag changes nothing. New documents are always created.
I was under the impression that upsertFields would search for already existing documents with MyCustomUpsertField == "Upsert-ID-1" and update those documents instead of creating new ones?
Env Info
> mongo --version
MongoDB shell version v4.0.0
git version: 3b07af3d4f471ae89e8186d33bbb1d5259597d51
allocator: tcmalloc
modules: none
build environment:
distmod: 2008plus-ssl
distarch: x86_64
target_arch: x86_64
> mongoimport --version
mongoimport version: r4.0.0
git version: 3b07af3d4f471ae89e8186d33bbb1d5259597d51
Go version: go1.8.5
os: windows
arch: amd64
compiler: gc
OpenSSL version: OpenSSL 1.0.2o-fips 27 Mar 2018
What am I doing wrong?
You have an issue related to quotes in your TSV header which is similar to this one: https://jira.mongodb.org/browse/TOOLS-61
When you look at your screenshot above you'll notice that your field names are not MyCustomUpsertField but "MyCustomUpsertField" - with the quotes included.
So what you want to do is either remove the quotes from your file (I would strongly suggest that because it looks funky on a JSON level and I feel this is going to cause issues somewhere) or find a way to use quotes in the command line, kind of like this:
mongoimport --db upsert-test --collection data --type tsv --headerline --file upsert-data.tsv --upsertFields "MyCustomUpsertField" -vvv
Mind you, I haven't tried the above and would guess it won't behave as expected.
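If you go with removing the quotes from the file, here is a rough Python sketch of one way to do it. It only touches the header line and leaves the data rows alone. The function names are hypothetical, and it assumes each field name is wrapped in at most one pair of double quotes.

```python
def strip_header_quotes(header_line):
    """Remove one pair of surrounding double quotes from each tab-separated field name."""
    newline = "\n" if header_line.endswith("\n") else ""
    names = header_line.rstrip("\r\n").split("\t")
    cleaned = [
        name[1:-1] if len(name) >= 2 and name[0] == '"' and name[-1] == '"' else name
        for name in names
    ]
    # Note: a trailing "\r\n" on the header is normalised to "\n".
    return "\t".join(cleaned) + newline


def rewrite_tsv_header(src_path, dst_path):
    """Copy a TSV file, cleaning only its first (header) line."""
    with open(src_path, encoding="utf-8", newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(strip_header_quotes(src.readline()))
        for line in src:
            dst.write(line)
```

After that, --upsertFields MyCustomUpsertField should at least refer to a field name that actually exists in the imported documents.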

MongoDB: mongoimport loses connection when importing big files

I have some trouble importing a JSON file to a local MongoDB instance. The JSON was generated using mongoexport and looks like this. No arrays, no hardcore nesting:
{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"mail@mail.com","type":"answer"}
{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"mail@mail.com","type":"answer"}
If I import a 9MB file with ~300 rows, there is no problem:
[stekhn latest]$ mongoimport -d mietscraping -c mails mails-small.json
2015-11-02T10:03:11.353+0100 connected to: localhost
2015-11-02T10:03:11.372+0100 imported 240 documents
But if I try to import a 32MB file with ~1300 rows, the import fails:
[stekhn latest]$ mongoimport -d mietscraping -c mails mails.json
2015-11-02T10:05:25.228+0100 connected to: localhost
2015-11-02T10:05:25.735+0100 error inserting documents: lost connection to server
2015-11-02T10:05:25.735+0100 Failed: lost connection to server
2015-11-02T10:05:25.735+0100 imported 0 documents
Here is the log:
2015-11-02T11:53:04.146+0100 I NETWORK [initandlisten] connection accepted from 127.0.0.1:45237 #21 (6 connections now open)
2015-11-02T11:53:04.532+0100 I - [conn21] Assertion: 10334:BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"
2015-11-02T11:53:04.536+0100 I NETWORK [conn21] AssertionException handling request, closing client connection: 10334 BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"
I've heard about the 16MB limit for BSON documents before, but since no row in my JSON file is bigger than 16MB, this shouldn't be a problem, right? When I do the exact same (32MB) import on my local computer, everything works fine.
Any ideas what could cause this weird behaviour?
I guess the problem is related to performance. Either way, you can work around it like this:
You can use the mongoimport option -j. Start with 4 and increase it if that does not work (4, 8, 16), depending on the number of cores your CPU has.
mongoimport --help
-j, --numInsertionWorkers= number of insert operations to run
concurrently (defaults to 1)
mongoimport -d mietscraping -c mails -j 4 < mails.json
Or you can split the file and import all the pieces.
I hope this helps.
Looking a little further, this is a bug in some versions:
https://jira.mongodb.org/browse/TOOLS-939
Here is another solution: you can change the batch size. The default is 10000; reduce the value and test:
mongoimport -d mietscraping -c mails < mails.json --batchSize 1
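The "split the file" suggestion above can be sketched in Python roughly like this. It groups the lines of a one-document-per-line export into chunks that each stay under a byte budget, so every chunk can be written to its own file and imported separately. The function name and the budget are my own choices for illustration, not anything mongoimport defines.

```python
def split_lines_by_size(lines, max_bytes):
    """Group lines into chunks whose total UTF-8 size stays within max_bytes.

    A single line larger than max_bytes still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for line in lines:
        line_bytes = len(line.encode("utf-8"))
        # Start a new chunk if adding this line would exceed the budget.
        if current and size + line_bytes > max_bytes:
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += line_bytes
    if current:
        chunks.append(current)
    return chunks
```

Picking a budget well below the 16MB shown in the server log keeps each piece small, but whether that avoids the error depends on how your mongoimport version batches inserts.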
This is quite old, but I struggled with the same issue.
If you want to import big files, especially remotely with Compass or from a program, just add
&wtimeoutMS=0
to your connection string. This removes the timeout on write operations.

mongorestore failing because of DocTooLargeForCapped error

I'm trying to restore a collection like so:
$ mongorestore --verbose --db MY_DB --collection MY_COLLECTION /path/to/MY_COLLECTION.bson --port 1234 --noOptionsRestore
Here's the error output (timestamps removed):
using write concern: w='majority', j=false, fsync=false, wtimeout=0
checking for collection data in /path/to/MY_COLLECTION.bson
found metadata for collection at /path/to/MY_COLLECTION.metadata.json
reading metadata file from /path/to/MY_COLLECTION.metadata.json
skipping options restoration
restoring MY_DB.MY_COLLECTION from file /path/to/MY_COLLECTION.bson
file /path/to/MY_COLLECTION.bson is 241330 bytes
error: write to oplog failed: DocTooLargeForCapped document doesn't fit in capped collection. size: 116 storageSize:1206976512 @ 28575
error: write to oplog failed: DocTooLargeForCapped document doesn't fit in capped collection. size: 116 storageSize:1206976512 @ 28575
restoring indexes for collection MY_DB.MY_COLLECTION from metadata
Failed: restore error: MY_DB.MY_COLLECTION: error creating indexes for MY_DB.MY_COLLECTION: createIndex error: exception: write to oplog failed: DocTooLargeForCapped document doesn't fit in capped collection. size: 116 storageSize:1206976512 @ 28575
The result of the restore is a database and collection with correct names but no documents.
OS: Ubuntu 14.04 running on Azure VM.
I just solved my own problem. See answer below.
The problem seemed to be that I was using mongod on the replica set PRIMARY member.
Once I commented out the following line in /etc/mongod.conf, it worked without problems:
replSet=REPL_SET_NAME --> #replSet=REPL_SET_NAME
I assume passing the correct replica set name to the mongorestore command (like in this question) could also work, but haven't tried that yet.

MongoDB still shows empty collections after restoring from dump

After mongodump, I did mongorestore which seemed to work fine
heathers-air:db heathercohen$ mongorestore -v -host localhost:27017
2015-02-06T11:22:40.027-0800 creating new connection to:localhost:27017
2015-02-06T11:22:40.028-0800 [ConnectBG] BackgroundJob starting: ConnectBG
2015-02-06T11:22:40.028-0800 connected to server localhost:27017 (127.0.0.1)
2015-02-06T11:22:40.028-0800 connected connection!
connected to: localhost:27017
2015-02-06T11:22:40.030-0800 dump/langs.bson
2015-02-06T11:22:40.030-0800 going into namespace [dump.langs]
Restoring to dump.langs without dropping. Restored data will be inserted without raising errors; check your server log
file dump/langs.bson empty, skipping
2015-02-06T11:22:40.030-0800 Creating index: { key: { _id: 1 }, name: "_id_", ns: "dump.langs" }
2015-02-06T11:22:40.031-0800 dump/tweets.bson
2015-02-06T11:22:40.031-0800 going into namespace [dump.tweets]
Restoring to dump.tweets without dropping. Restored data will be inserted without raising errors; check your server log
file size: 4877899
30597 objects found
2015-02-06T11:22:41.883-0800 Creating index: { key: { _id: 1 }, name: "_id_", ns: "dump.tweets" }
When I try to access the data though, it's still empty and the way it looked before restore:
> show dbs
admin (empty)
dump 0.078GB
local 0.078GB
tweets (empty)
twitter (empty)
It says it found 30597 objects, where did they go?
They went into the dump database, and then into the collections dump.tweets and dump.langs. The fact that the files are contained in the folder dump means that mongorestore thinks they should be restored to the database dump (it is inferred from the path). The verbose output even explicitly states that the data is being placed into dump.langs and dump.tweets specifically.
If you specify the database you wish to restore to (with -d) and restore the specific files you will be able to restore the documents to the database you desire. Or, you can simply have a look in the dump database by running:
use dump;
db.tweets.find();
db.langs.find();
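To make the "inferred from the path" part concrete, here is a small Python sketch that mimics what the verbose output above shows: the parent folder becomes the database name and the file name (minus .bson) becomes the collection name. This only illustrates the observed behaviour; it is not mongorestore's actual code, and the function name is invented.

```python
from pathlib import PurePosixPath


def infer_namespace(bson_path):
    """Guess 'database.collection' from a dump path such as 'dump/tweets.bson'."""
    path = PurePosixPath(bson_path)
    # Parent folder -> database, file name without '.bson' -> collection.
    return f"{path.parent.name}.{path.stem}"
```

So dump/tweets.bson maps to dump.tweets, which matches the "going into namespace [dump.tweets]" line in the log.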

Mongodump and mongorestore; field not found

I'm trying to dump a database from another server (this works fine), then restore it on a new server (this does not work fine).
I first run:
mongodump --host -d
This creates a folder dump/db which contains all of the bson documents.
Then in the dump folder, I'm running:
mongorestore -d dbname db
This works and iterates through the files, but I get this error on dbname.system.users:
Wed May 23 02:08:05 { key: { _id: 1 }, ns: "dbname.system.users", name: "_id_" }
Error creating index dbname.system.usersassertion: 13111 field not found, expected type 16
Any ideas how to resolve this?
If they really are different versions, use the --noIndexRestore option and create all the indexes afterwards.
Any chance the source and destination are different versions?
In any case, to get around this, restore the collections individually using the -c flag to the target DB and then build the indexes afterward. The system collection is the one used for indexes, so it is fairly easy to recreate. Try it last, once everything else has been restored, and if it still fails you can always just recreate the relevant indexes.
The issue could also be caused by this bug in older versions of Mongo (in my case it was 2.0.8):
https://jira.mongodb.org/browse/SERVER-7181
Basically, you get 13111 field not found, expected type 16 error when it should actually be prompting you to enter your authentication details.
An example of how I fixed it:
root@precise64:/# mongorestore /backups/demand/ondemand.05-24-2013T114223/
connected to: 127.0.0.1
[REDACTED]
Fri May 24 11:48:15 going into namespace [test.system.indexes]
Fri May 24 11:48:15 { key: { _id: 1 }, ns: "test.system.users", name: "_id_" }
Error creating index test.system.usersassertion: 13111 field not found, expected type 16
# Error when not giving username and password
root@precise64:/# mongorestore -u fakeuser -p fakepassword /backups/demand/ondemand.05-24-2013T114223/
connected to: 127.0.0.1
[REDACTED]
Fri May 24 11:57:11 /backups/demand/ondemand.05-24-2013T114223/test/system.users.bson
Fri May 24 11:57:11 going into namespace [test.system.users]
1 objects found
# Works fine when giving username and password! :)
Hope that helps anyone whose issue doesn't get fixed by the previous 2 replies!
This can also happen if you are trying to mongorestore into MongoDB 2.6+ and the dump you are trying to restore contains a system.users collection in any database other than admin. In MongoDB 2.2 and 2.4, system.users collections could occur in any database. The auth schema migration associated with MongoDB 2.6 moved all users into the system.users collection in the admin database, but left behind the system.users collections in the other databases (MongoDB 2.6 just ignores these). This seems to cause this assertion when importing into MongoDB 2.6.
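To check whether a dump is affected by this, something like the following Python sketch can list the leftover system.users.bson files that sit outside the admin database. It assumes the usual dump layout where each database has its own folder. The function name is made up, and what to do with the files it finds (skip them, move them aside) is up to you.

```python
from pathlib import PurePosixPath


def legacy_user_files(bson_paths):
    """Return the system.users.bson paths whose parent folder (the database) is not 'admin'."""
    return [
        p for p in bson_paths
        if PurePosixPath(p).name == "system.users.bson"
        and PurePosixPath(p).parent.name != "admin"
    ]
```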