mongo.input.query with $date not filtering input to hadoop - mongodb

I have a sharded input collection that I want to filter before sending it to my Hadoop cluster for MapReduce computations.
I pass this parameter to my $ hadoop jar command:
mongo.input.query='{_id.uuid:"device-964693"}'
and it works: nothing that fails to satisfy this query makes it into the MapReduce output.
This, however, does not work:
mongo.input.query='{_id.day:{\\$lt:{\\$date:1388620740000}}}'
No data is produced as output.
1388620740000 represents the date Wed Jan 01 2014 23:59:00 GMT+0000 (GMT).
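(Checking in the mongo shell: new Date(1388620740000) prints ISODate("2014-01-01T23:59:00Z"), which matches.)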
The setup uses Hadoop 2.2, MongoDB 2.4.9, and mongo-hadoop connector version 2.2-1.2.0.
There are no error messages, just a standard Hadoop success message.
Is my syntax incorrect, or what did I miss?
Could you point me to some debugging tools/methods for this?

Debugging methods:
In the mongo shell, to confirm whether the filtered query ever reaches the server:
// print any in-progress operation on the target namespace that filters on _id.day
db.currentOp(true).inprog.forEach(
  function(d) {
    if (d.ns == "test.collection" && d.query.query["_id.day"])
      printjson(d);
  })
A terminal-friendly syntax that does work:
$ hadoop jar... ...mongo.input.query='{"_id.day":{"$lt":{"$date":"2014-01-19T23:00:00Z"}}}'
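If you'd rather keep the epoch form, quoting the keys and dropping the doubled backslashes should be enough, assuming the connector accepts legacy extended JSON with integer milliseconds (an untested variant of the command above):
$ hadoop jar... ...mongo.input.query='{"_id.day":{"$lt":{"$date":1388620740000}}}'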

Related

Mongo shell : can't update log level (setLogLevel is not a function)

I am trying to increase my mongo log level without success:
MongoDB shell version: 2.6.10
connecting to: test
> use TreeDB
switched to db TreeDB
> db.setLogLevel(5,"query")
2018-01-23T12:11:28.221+0100 TypeError: Property 'setLogLevel' of object TreeDB is not a function
How do I use this function correctly?
> db.setLogLevel
TreeDB.setLogLevel
db.help() does not list this function.
The MongoDB docs have referenced it since 3.0: https://docs.mongodb.com/manual/reference/method/db.setLogLevel/
I am using Ubuntu 16.04, mongo shell 2.6.10, and MongoDB server 3.6.2 (via Docker).
As explained in the comments, I had to install at least version 3.0 of the mongo shell: db.setLogLevel() was added in 3.0, so a 2.6 shell lacks the helper even when the server is newer.
This can be done by following https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/
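With a 3.0+ shell the helper then works as documented (component names per the linked reference):
> db.setLogLevel(5, "query")   // maximum verbosity for the query component
> db.setLogLevel(-1, "query")  // -1 restores the inherited default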

How to read from a replicaset mongo by mongodb-erlang

1. {ok, P} = mongoc:connect({rs, <<"dev_mongodb">>, ["dev_mongodb001:27017", "dev_mongodb002:27017"]}, [{name, mongopool}, {register, mongotopology}, {rp_mode, primary}, {rp_tags, [{tag,1}]}], [{login, <<"root">>}, {password, <<"mongoadmin">>}, {database, <<"admin">>}]).
2. {ok, Pool} = mc_topology:get_pool(P, []).
3. mongoc:find(Pool, {<<"DoctorLBS">>, <<"mongoMessage">>}, #{<<"type">> => <<"5">>}).
I used the latest version from GitHub, and got an error at step 3.
It seems my selector is not valid; is there any example of how to use mongodb-erlang?
My MongoDB version is 3.2.6; the auth type is SCRAM-SHA-1.
mongoc:find(Pool, <<"mongoMessage">>, #{<<"type">> => <<"5">>}).
I tried this in both rs and single mode and still got the error.
Is there any other simple way to connect and read?
I just need to read some data once from MongoDB when my Erlang program starts; no other actions.
Today's version of the driver does not support the {Db, Collection} tuple form, due to the new query API introduced in MongoDB 2.6.
You should connect to the DoctorLBS database instead, and then use
mongoc:find(Pool, <<"mongoMessage">>, #{<<"type">> => <<"5">>}).

Migration from TokuMX 1.5 to Percona Server for MongoDB 3.11

Migrating data from TokuMX to Percona Server for MongoDB
Step 1:
This guide describes how to upgrade an existing Percona TokuMX instance to Percona Server for MongoDB. The following JavaScript files are required to perform the upgrade:
• allDbStats.js
• tokumx_dump_indexes.js
• psmdb_restore_indexes.js
You can download those files from GitHub.
Step 2:
Run the allDbStats.js script to record the database state before migration:
$ mongo ./allDbStats.js > ~/allDbStats.before.out
Step 3:
Perform a dump of the database:
$ mongodump --out /your/dump/path
Step 4:
Perform a dump of the indexes:
$ ./tokumx_dump_indexes.js > /your/dump/path/tokumxIndexes.json
Step 5:
Restore the collections without indexes using the --noIndexRestore switch:
$ mongorestore --noIndexRestore /your/dump/path
Step 6:
Restore the indexes (this may take a while). This step removes the clustering options from the collections before inserting.
$ ./psmdb_restore_indexes.js --eval "data='/your/dump/path/tokumxIndexes.json' "
Step 7:
Run the allDbStats.js script to record the database state after migration:
$ mongo ./allDbStats.js > ~/allDbStats.after.out
This is the guide I found for migrating from TokuMX to Percona Server for MongoDB. At step 6, when I try to restore the indexes, I get the error below:
/mnt/tokumx-bkup/tokumxIndexes.json
2016-06-29T05:28:20.028+0000 E QUERY SyntaxError: Unexpected identifier
at /tmp/tokumx2_to_psmdb3_migration-master/psmdb_restore_indexes.js:78:1 at /mnt/tokumx-bkup/tokumxIndexes.json
2016-06-29T05:28:20.028+0000 E QUERY Error: error loading js file: /mnt/tokumx-bkup/tokumxIndexes.json
at /tmp/tokumx2_to_psmdb3_migration-master/psmdb_restore_indexes.js:78:1 at /tmp/tokumx2_to_psmdb3_migration-master/psmdb_restore_indexes.js:78
failed to load: /tmp/tokumx2_to_psmdb3_migration-master/psmdb_restore_indexes.js
Any help is welcome.
Thanks.
Check the tokumxIndexes.json file. When tokumx_dump_indexes.js is run, the mongo shell parameter --quiet must be used, or the resulting JSON will contain the shell preamble at the beginning.
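For example, assuming the script is executed through the mongo shell, the dump step can be rerun with the preamble suppressed (paths as in the guide above):
$ mongo --quiet ./tokumx_dump_indexes.js > /your/dump/path/tokumxIndexes.json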
Then check the file using a JSON validator such as http://jsonlint.com/
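A local check works too, assuming Python is available and the file contains a single JSON document:
$ python -m json.tool /your/dump/path/tokumxIndexes.json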
If the preamble is present, delete these two lines from the tokumxIndexes.json file:
"MongoDB shell version: 3.0.11-1.6
connecting to: 127.0.0.1:27017/test"
and run the restore script again:
$ ./psmdb_restore_indexes.js --eval "data='/your/dump/path/tokumxIndexes.json' "
The script will then start the index build process.

mongoimport 2.6 vs 2.4

I am exporting data from one database and importing it into another. When I export the data, it's on a machine with mongo 2.6; when I import the data, it's on a VM using mongo 2.4. Both mongod instances are running 2.4.
I keep getting this error:
Wed Jun 4 13:13:32.604 check 0 0
Wed Jun 4 13:13:32.604 imported 0 objects
Wed Jun 4 13:13:32.604 ERROR: encountered 1 error(s)
failed: [192.168.140.30] => (item=collection) => {"changed": true, "cmd": "mongoimport -u username -p password -d db -c collection --drop --jsonArray /tmp/collection.json ", "delta": "0:00:00.026383", "end": "2014-06-04 13:13:33.091774", "item": "collection", "rc": 255, "start": "2014-06-04 13:13:33.065391"}
stdout: connected to: 127.0.0.1
Wed Jun 4 13:13:33.089 dropping: <db.collection>
Wed Jun 4 13:13:33.089 exception:BSON representation of supplied JSON array is too large: code FailedToParse: FailedToParse: Date expecting integer milliseconds: offset:171
The exported date format looks like:
{ "date" : { "$date" : "2014-06-02T06:39:28.869-0700" } }
I have verified that running mongoimport on the same machine as the mongoexport works fine, so I assume there is a compatibility issue between mongoimport/mongoexport across 2.4 and 2.6. Due to firewall restrictions, I need to use the two different machines for moving the data around.
Does anybody have a good workaround for this problem? I have not seen an option to export in the old format, as far as I can tell, and I cannot tell from the release notes what causes the incompatibility.
I ended up using mongodump and mongorestore instead of mongoimport and mongoexport. There was no compatibility issue using BSON documents instead of JSON.
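A minimal sketch of that workaround, with placeholder host and path names (the real ones were not given):
$ mongodump --host source.example.com -d db -c collection --out /tmp/dump
$ mongorestore --host target.example.com -d db --drop /tmp/dump/db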

Mongodump and mongorestore; field not found

I'm trying to dump a database from another server (this works fine), then restore it on a new server (this does not work fine).
I first run:
mongodump --host -d
This creates a folder dump/db which contains all of the bson documents.
Then in the dump folder, I'm running:
mongorestore -d dbname db
This works and iterates through the files, but I get this error on dbname.system.users:
Wed May 23 02:08:05 { key: { _id: 1 }, ns: "dbname.system.users", name: "_id_" }
Error creating index dbname.system.usersassertion: 13111 field not found, expected type 16
Any ideas how to resolve this?
If they really are different versions, use the --noIndexRestore option and create all the indexes after that.
Any chance the source and destination are different versions?
In any case, to get around this, restore the collections individually using the -c flag to the target DB, and then build the indexes afterward. The system collection is the one used for indexes, so it is fairly easy to recreate: try it last, once everything else has been restored, and if it still fails you can always just recreate the relevant indexes.
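A sketch of the per-collection approach (database, collection, and path names are placeholders):
$ mongorestore -d dbname -c mycollection dump/dbname/mycollection.bson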
The issue could also be caused by this bug in older versions of MongoDB (in my case it was 2.0.8):
https://jira.mongodb.org/browse/SERVER-7181
Basically, you get the 13111 field not found, expected type 16 error when it should actually be prompting you to enter your authentication details.
An example of how I fixed it:
root@precise64:/# mongorestore /backups/demand/ondemand.05-24-2013T114223/
connected to: 127.0.0.1
[REDACTED]
Fri May 24 11:48:15 going into namespace [test.system.indexes]
Fri May 24 11:48:15 { key: { _id: 1 }, ns: "test.system.users", name: "_id_" }
Error creating index test.system.usersassertion: 13111 field not found, expected type 16
# Error when not giving username and password
root@precise64:/# mongorestore -u fakeuser -p fakepassword /backups/demand/ondemand.05-24-2013T114223/
connected to: 127.0.0.1
[REDACTED]
Fri May 24 11:57:11 /backups/demand/ondemand.05-24-2013T114223/test/system.users.bson
Fri May 24 11:57:11 going into namespace [test.system.users]
1 objects found
# Works fine when giving username and password! :)
Hope that helps anyone whose issue isn't fixed by the previous two replies!
This can also happen if you are trying to mongorestore into MongoDB 2.6+ and the dump you are trying to restore contains a system.users collection in any database other than admin. In MongoDB 2.2 and 2.4, system.users collections could occur in any database. The auth schema migration associated with MongoDB 2.6 moved all users into the system.users collection in the admin database, but left behind the system.users collections in the other databases (MongoDB 2.6 just ignores these). This seems to cause this assertion when restoring into MongoDB 2.6.
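One workaround in that case is to drop the stale per-database system.users dumps before restoring; a sketch, assuming a standard dump directory layout (the path is a placeholder):
$ find /your/dump/path -name 'system.users.bson' -not -path '*/admin/*' -delete
$ mongorestore /your/dump/path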