How to do custom mapping using mongo connector with elasticsearch - mongodb

I want to connect MongoDB and Elasticsearch, and I used mongo-connector to do so. I followed the setup instructions at the link below:
http://vi3k6i5.blogspot.in/2014/12/using-elastic-search-with-mongodb.html
The connection works, but by default mongo-connector created indices in Elasticsearch for every MongoDB database.
I want to create only one index, for one database, and insert only selected fields of the documents. For example, in the mongo shell:
use hotels
db.restaurants.insert(
{
"address" : {
"street" : "2 Avenue",
"zipcode" : "10075",
"building" : "1480",
"coord" : [ -73.9557413, 40.7720266 ],
},
"borough" : "Manhattan",
"cuisine" : "Italian",
"grades" : [
{
"date" : ISODate("2014-10-01T00:00:00Z"),
"grade" : "A",
"score" : 11
},
{
"date" : ISODate("2014-01-16T00:00:00Z"),
"grade" : "B",
"score" : 17
}
],
"name" : "Vella",
"restaurant_id" : "41704620"
}
)
This creates the database hotels and the collection restaurants. Now I want to create an index and put only the address field into Elasticsearch for that index.
Below are the steps I tried, which do not work.
First I started mongo-connector like this:
Imomadmins-MacBook-Pro:~ jayant$ mongo-connector -m localhost:27017 -t localhost:9200 -d elastic_doc_manager --oplog-ts oplogstatus.txt
Logging to mongo-connector.log.
Then, from a new shell tab, I ran:
curl -XPUT 'http://localhost:9200/hotels.restaurants/'
curl -XPUT "http://localhost:9200/hotels.restaurants/string/_mapping" -d '{
"string": {
"properties" : {
"address" : {"type" : "string"}
}
}
}'
But only the index, named hotels.restaurants, is created in Elasticsearch. I can't see any documents in it.
Please suggest how to get documents into hotels.restaurants.

I found an answer to my question: when starting mongo-connector, we can specify the collection (namespace) and the list of fields we are interested in. See the command below:
$ mongo-connector -m localhost:27017 -t localhost:9200 -d elastic_doc_manager --oplog-ts oplogstatus.txt --namespace-set hotels.restaurants --fields address,grades,name
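To make the effect of a field list concrete, here is a minimal sketch of keeping only selected top-level fields of a document. It is an illustration only, not mongo-connector's own code; the helper name pickFields and the shortened sample document are assumptions made for this example.

```javascript
// Illustration only: keep the listed top-level fields of a document.
// pickFields is a hypothetical helper, not part of mongo-connector.
function pickFields(doc, fields) {
  const out = {};
  for (const name of fields) {
    // Copy a field only if the document actually has it.
    if (Object.prototype.hasOwnProperty.call(doc, name)) {
      out[name] = doc[name];
    }
  }
  return out;
}

// Shortened version of the restaurant document from the question.
const restaurant = {
  address: { street: "2 Avenue", zipcode: "10075", building: "1480" },
  borough: "Manhattan",
  cuisine: "Italian",
  name: "Vella",
  restaurant_id: "41704620",
};

console.log(pickFields(restaurant, ["address", "name"]));
```

Fields that are not in the list (borough, cuisine, restaurant_id here) are simply left out of the result.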

Related

Connect to Mongo Atlas Secondary

On the free tier of MongoDB Atlas, a cluster has three member servers. How can I connect to a secondary host from the mongo shell? Their example only shows how to connect to the primary.
"members" : [
{
"_id" : 0,
"name" : "***-shard-00-00-***.mongodb.net:27017",
....
},
{
"_id" : 1,
"name" : "***-shard-00-01-***.mongodb.net:27017",
.....
},
{
"_id" : 2,
"name" : "***-shard-00-02-***.mongodb.net:27017",
.....
}
]
You need to use the --ssl flag and specify authSource.
Try:
mongo "mongodb://***-shard-00-02-***.mongodb.net:27017/?authSource=admin" --ssl
at the very minimum. Of course, you can add options for the username, password, database to connect to, and so on:
mongo "mongodb://<username>:<password>@***-shard-00-02-***.mongodb.net:27017/<database>?authSource=admin" --ssl
I hope this helps.
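One detail worth watching when putting credentials into a connection string: a username or password containing reserved characters such as @, : or / has to be percent-encoded. The following is a minimal sketch of assembling such a URI; the function name buildMongoUri, the host example-host.mongodb.net and the credentials are placeholders invented for this example.

```javascript
// Sketch: assemble a mongodb:// URI with percent-encoded credentials.
// buildMongoUri and all sample values below are hypothetical.
function buildMongoUri({ username, password, host, database = "", authSource = "admin" }) {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `mongodb://${user}:${pass}@${host}/${database}?authSource=${encodeURIComponent(authSource)}`;
}

const uri = buildMongoUri({
  username: "app_user",
  password: "p@ss:word/1", // contains characters that must be encoded
  host: "example-host.mongodb.net:27017",
  database: "test",
});

console.log(uri);
// mongodb://app_user:p%40ss%3Aword%2F1@example-host.mongodb.net:27017/test?authSource=admin
```

The resulting string can be passed to the shell in quotes, together with --ssl as in the commands above.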

MongoDB export issue

I am trying to export MongoDB output to CSV format, but I am having trouble.
Consider the following document in my collection:
db.save.find().pretty();
{
"_id" : ObjectId("58884b11e1370511b89d8267"),
"domain" : "google.com",
"emails" : [
{
"email" : "f#google.com",
"first" : "James",
"Last" : "fer"
},
{
"email" : "d#gmail.com",
"first" : "dear",
"last" : "near"
}
]
}
Exporting the document to csv
C:\MongoDB\Server\bin>mongoexport.exe -d Trial -c save -o file.csv --type csv --fields domain,emails
2017-01-25T12:50:54.927+0530 connected to: localhost
2017-01-25T12:50:54.929+0530 exported 1 record
The output file is:
domain,emails
google.com,"[{""email"":""f#google.com"",""first"":""James"",""Last"":""fer""},{""email"":""d#gmail.com"",""first"":""dear"",""last"":""near""}]"
But if I import the same file, the result is different than it was in the original collection. See the example:
> db.sir.find().pretty()
{
"_id" : ObjectId("5888529fa26b65ae310d026f"),
"domain" : "google.com",
"emails" : "[{\"email\":\"f#google.com\",\"first\":\"James\",\"Last\":\"fer\"},{\"email\":\"d#gmail.com\",\"first\":\"dear\",\"last\":\"near\"}]"
}
I do not want the extra \ in my imported document. Please tell me whether this is avoidable and, if so, what CSV format should be given for import.
This is not the expected format, so please let me know how I can produce the proper one.
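For what it is worth, the backslashes in the shell output are only how the shell displays quotes inside a string: after the CSV round trip, emails is a single string holding JSON text rather than an array. A minimal sketch of turning such a value back into an array, assuming the field always contains valid JSON (the helper name restoreArrayField is made up for this example):

```javascript
// Document as it looks after the CSV import: emails is one long string.
const imported = {
  domain: "google.com",
  emails:
    '[{"email":"f#google.com","first":"James","Last":"fer"},' +
    '{"email":"d#gmail.com","first":"dear","last":"near"}]',
};

// Hypothetical helper: parse a JSON-text field back into a real value.
// Assumes the string is valid JSON; JSON.parse throws otherwise.
function restoreArrayField(doc, field) {
  if (typeof doc[field] !== "string") {
    return doc; // already structured, nothing to do
  }
  return { ...doc, [field]: JSON.parse(doc[field]) };
}

const restored = restoreArrayField(imported, "emails");
console.log(Array.isArray(restored.emails), restored.emails.length);
```

CSV has no notion of nested arrays, so exporting and importing in JSON (the default output type of mongoexport) is the simpler way to keep the emails array intact.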

How to output the result to a file in MongoDB

I want to list all databases in MongoDB and write the output to a text file, but it did not work.
mongo 127.0.0.1/test -eval 'var c= show databases;' >>db_list.txt
The error message is:
MongoDB shell version: 2.6.12
connecting to: 127.0.0.1/test
2016-12-06T12:12:32.456-0700 SyntaxError: Unexpected identifier
Does anyone know how to make this work? I appreciate any help.
To list databases directly from the command line with --eval, the following should help:
mongo test --eval "printjson(db.adminCommand('listDatabases'))"
MongoDB shell version: 3.2.10
connecting to: test
{
"databases" : [
{
"name" : "local",
"sizeOnDisk" : 73728,
"empty" : false
},
{
"name" : "m034",
"sizeOnDisk" : 11911168,
"empty" : false
},
{
"name" : "test",
"sizeOnDisk" : 536576,
"empty" : false
}
],
"totalSize" : 12521472,
"ok" : 1
}
This lists all the collection names in a particular database:
mongo test --eval "printjson(db.getCollectionNames())"
MongoDB shell version: 3.2.10
connecting to: test
[
"aaa",
"areamodel",
"email",
"hex",
"key",
"mel",
"multi",
"ques",
"rich"
]
Instead of --eval, you can also put the query in a script file and redirect the output:
mongo db_name query.js > out.json
Here query.js contains any query, for example:
printjson( db.adminCommand('listDatabases') )
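If only the database names are wanted in the output file, the listDatabases result can be reduced to a plain list. The sketch below runs on a hard-coded copy of the result shown earlier so that it is self-contained; inside the mongo shell the object would come from db.adminCommand('listDatabases') and print() would replace console.log. The helper name databaseNames is an assumption for this example.

```javascript
// Copy of the listDatabases result shown above, hard-coded for illustration.
const listDatabasesResult = {
  databases: [
    { name: "local", sizeOnDisk: 73728, empty: false },
    { name: "m034", sizeOnDisk: 11911168, empty: false },
    { name: "test", sizeOnDisk: 536576, empty: false },
  ],
  totalSize: 12521472,
  ok: 1,
};

// Hypothetical helper: extract just the database names.
function databaseNames(result) {
  return result.databases.map((database) => database.name);
}

// One name per line, ready to be redirected into a text file.
console.log(databaseNames(listDatabasesResult).join("\n"));
```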

error when using mongorestore to replay oplog with binData field

When using mongorestore with the --oplogReplay option to replay oplog entries, I found a strange error: mongorestore cannot handle a $set operation on a binData field. You may hit the same error if you do the following.
Insert a test document:
db.testData.insert({_id: 10000, data: BinData(0, ""), size: 10})
Update its binData field:
db.testData.update({_id: 10000}, {$set: {data: BinData(0, "CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=")}})
Update another field:
db.testData.update({_id: 10000}, {$set: {size: 20}})
Check the oplog:
use local
db.oplog.rs.find().sort({$natural: -1})
You may see the following output:
{ "ts" : Timestamp(1435627154, 1), "h" : NumberLong("-4979206321598144076"), "v" : 2, "op" : "u", "ns" : "test.testData", "o2" : { "_id" : 10000 }, "o" : { "$set" : { "size" : 20 } } }
{ "ts" : Timestamp(1435627144, 1), "h" : NumberLong("2899524097634687825"), "v" : 2, "op" : "u", "ns" : "test.testData", "o2" : { "_id" : 10000 }, "o" : { "$set" : { "data" : BinData(0,"CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=") } } }
{ "ts" : Timestamp(1435627136, 1), "h" : NumberLong("-8486373688715225152"), "v" : 2, "op" : "i", "ns" : "test.testData", "o" : { "_id" : 10000, "data" : BinData(0,""), "size" : 10 } }
Dump the two most recent oplog entries and replay them.
In a bash shell:
mongodump --port 27017 -d local -c oplog.rs --query '{"ts" : {$gte: Timestamp(1435627144, 1)}}' -o ./oplogD/
mv ./oplogD/local/oplog.rs.bson ./oplogR/oplog.bson
mongorestore --port 27017 --oplogReplay ./oplogR/
After this, the data is not as expected. In my case, the document changed to this:
{ "_id" : 10000, "data" : BinData(0,"ADRAAAAAPiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA="), "size" : 20 }
The size field is correct, but the data field is not.
Strangest of all, if you dump only one oplog entry and replay it, the data is correct:
mongodump --port 27017 -d local -c oplog.rs --query '{"ts" : Timestamp(1435627144, 1)}' -o ./tmpD/
mv ./tmpD/local/oplog.rs.bson ./tmpR/oplog.bson
mongorestore --port 27017 --oplogReplay ./tmpR/
After the oplog entry is replayed, the 'data' field is correct:
{ "_id" : 10000, "data" : BinData(0,"CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA="), "size" : 10 }
Why does this strange thing happen?
It was fixed in this commit:
https://github.com/mongodb/mongo-tools/commit/ed60bbfae7d2b5239bea69f162f0784e17995e91
The bug report is tracked in JIRA:
https://jira.mongodb.org/browse/TOOLS-807

MongoDB: get data from two tables using the same id

I have two tables, history and jobs.
My history table contains:
> db.history.find()
{ "id" : "21", "browser" : "FF", "os" : "Windows", "datetime" : "2013-11-26 17:04:21", "_id" : ObjectId("5294873d6b441e2c16000002") }
db.jobs.find()
{ "_id" : ObjectId("5289c147db9ed2b022f95a36"), "id" : "21", "launch" : "ertret", "names" : "234", "script" : "art-pagination" }
From these two tables I need to get browser, launch, script and os using the common id 21.
How is this possible?
You can do it with the following two queries; it is not possible with a single find() query. Note that id is stored as a string in both tables, so the value in the query must be quoted:
> db.history.find({'id':'21'}, {'browser':1, 'os':1})
> db.jobs.find({'id':'21'}, {'launch':1,'script':1 })
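Once both result sets are in hand, they can be combined on the client by the shared id. A minimal sketch in plain JavaScript, using hard-coded copies of the two documents from the question; the function name joinById is an assumption for this example, and in the shell the arrays would come from calling toArray() on each find().

```javascript
// Hard-coded stand-ins for the results of the two find() queries.
const historyDocs = [
  { id: "21", browser: "FF", os: "Windows", datetime: "2013-11-26 17:04:21" },
];
const jobDocs = [
  { id: "21", launch: "ertret", names: "234", script: "art-pagination" },
];

// Hypothetical helper: merge history and jobs documents sharing the same id.
function joinById(history, jobs) {
  // Index jobs by id so each history document needs only one lookup.
  const jobsById = new Map(jobs.map((job) => [job.id, job]));
  const joined = [];
  for (const entry of history) {
    const job = jobsById.get(entry.id);
    if (job) {
      joined.push({
        id: entry.id,
        browser: entry.browser,
        os: entry.os,
        launch: job.launch,
        script: job.script,
      });
    }
  }
  return joined;
}

console.log(joinById(historyDocs, jobDocs));
```

History documents without a matching job are skipped, which corresponds to an inner join.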