com.mongodb.MongoException: not talking to master and retries used up - mongodb

My search is not working now. I guess it's because my river was not configured for the replica set:
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "mongo",
    "host": "local",
    "port": "40000",
    "collection": "users"
  },
  "index": {
    "name": "api",
    "type": "users"
  }
}'
Is there any way to declare a replica set properly so that Elasticsearch can find the master, the way the PHP driver does:
$m = new Mongo(
    "mongodb://localhost:40000,localhost:41000",
    array("replicaSet" => true)
);
so that Elasticsearch can automatically fail over to another member.

I solved this simply by updating to the latest version of the client driver.
The previous (minor) version had trouble connecting to the latest MongoDB server.
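As an aside, the MongoDB river also accepts a servers array (as in the examples further down this page) instead of a single host/port pair, which is the usual way to point it at both replica-set members. A minimal sketch, assuming the two members from the question on ports 40000 and 41000 and otherwise the same settings:
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "mongo",
    "servers": [
      { "host": "localhost", "port": 40000 },
      { "host": "localhost", "port": 41000 }
    ],
    "collection": "users"
  },
  "index": {
    "name": "api",
    "type": "users"
  }
}'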

Related

MongoSource from Kafka Connect creates a weird _data key

I'm using Kafka Connect (MongoSourceConnector) with the following configuration:
curl -X PUT http://localhost:8083/connectors/mongo-source2/config -H "Content-Type: application/json" -d '{
  "name": "mongo-source2",
  "tasks.max": 1,
  "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "value.converter": "org.apache.kafka.connect.storage.StringConverter",
  "connection.uri": "mongodb://xxx:xxx@localhost:27017/mydb",
  "database": "mydb",
  "collection": "claimmappingrules.66667777-8888-9999-0000-666677770000",
  "pipeline": "[{\"$addFields\": {\"something\":\"xxxx\"} }]",
  "transforms": "dropTopicPrefix",
  "transforms.dropTopicPrefix.type": "org.apache.kafka.connect.transforms.RegexRouter",
  "transforms.dropTopicPrefix.regex": ".*",
  "transforms.dropTopicPrefix.replacement": "my-topic"
}'
For some reason, when I consume messages, I'm getting a weird key:
"_id": {
"_data": "825DFD2A53000000012B022C0100296E5A1004060C0FB7484A4990A7363EF5F662CF8D465A5F6964005A1003F9974744D06AFB498EF8D78370B0CD440004"
}
I have no idea where it came from. My Mongo document's _id is a UUID; when consuming messages, I expected to see the documentKey field as my consumer key.
Here is a message example of what the connector published into kafka:
{
  "_id": {
    "_data": "825DFD2A53000000012B022C0100296E5A1004060C0FB7484A4990A7363EF5F662CF8D465A5F6964005A1003F9974744D06AFB498EF8D78370B0CD440004"
  },
  "operationType": "replace",
  "clusterTime": {
    "$timestamp": {
      "t": 1576872531,
      "i": 1
    }
  },
  "fullDocument": {
    "_id": {
      "$binary": "+ZdHRNBq+0mO+NeDcLDNRA==",
      "$type": "03"
    },
    ...
  },
  "ns": {
    "db": "security",
    "coll": "users"
  },
  "documentKey": {
    "_id": {
      "$binary": "+ZdHRNBq+0mO+NeDcLDNRA==",
      "$type": "03"
    }
  }
}
The documentation on key/value schemas for the Kafka Connect configuration is really limited out there. I know it is too late to reply, but I have lately been facing the same issue and found the solution by trial and error.
I added these two settings to my MongoDB Kafka Connect configuration:
"output.format.key": "schema",
"output.schema.key": "{\"name\":\"sampleId\",\"type\":\"record\",\"namespace\":\"com.mongoexchange.avro\",\"fields\":[{\"name\":\"documentKey._id\",\"type\":\"string\"}]}",
But even after this, I still don't know whether using the change stream's resume_token as the key for Kafka partition assignment has any significance in terms of performance, or what happens when the resume_token expires after a long period of inactivity.
P.S. The final version of my Kafka Connect configuration for MongoDB as a source is this:
{
  "tasks.max": 1,
  "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "value.converter": "org.apache.kafka.connect.storage.StringConverter",
  "connection.uri": "mongodb://example-mongodb-0:27017,example-mongodb-1:27017,example-mongodb-2:27017/?replicaSet=replicaSet",
  "database": "exampleDB",
  "collection": "exampleCollection",
  "output.format.key": "schema",
  "output.schema.key": "{\"name\":\"ClassroomId\",\"type\":\"record\",\"namespace\":\"com.mongoexchange.avro\",\"fields\":[{\"name\":\"documentKey._id\",\"type\":\"string\"}]}",
  "change.stream.full.document": "updateLookup",
  "copy.existing": "true",
  "topic.prefix": "mongodb"
}
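For reference, a config body like this can be pushed with the same REST call used in the question; a sketch, assuming Kafka Connect listens on localhost:8083, the connector is named mongo-source2, and the JSON above is saved to a local file called mongo-source.json (all three are assumptions):
# Apply (or update) the connector configuration from the file
curl -X PUT http://localhost:8083/connectors/mongo-source2/config \
  -H "Content-Type: application/json" \
  -d @mongo-source.json

# Check that the connector and its task are RUNNING
curl http://localhost:8083/connectors/mongo-source2/status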

River Plugin Not_analyzed option for Elasticsearch

I'm using Elasticsearch 1.1.1, the River Plugin, and MongoDB 2.4.
I have a field called cidr that is being analyzed. I need to set it to not_analyzed so that I can use it with Kibana correctly. Below is the river definition I used, but now I'm going to reindex (delete it and create a new one).
What's the proper way to create the new index so that the values in the "cidr" field are not analyzed? Thank you.
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "collective_name",
    "collection": "ips"
  },
  "index": {
    "name": "mongoindex"
  }
}'
I see, it's working now. The mapping should be created BEFORE creating the river:
curl -XPUT "localhost:9200/mongoindex" -d '
{
"mappings": {
"mongodb" : {
"properties": {
"cidr": {"type":"string", "index" : "not_analyzed"}
}
}
}
}'
This is it. :)
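If you want to double-check that cidr really ends up not_analyzed before the river starts writing documents, the mapping can be read back (assuming the same index name, mongoindex):
curl -XGET "localhost:9200/mongoindex/_mapping?pretty"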

Updates in MongoDB not getting pushed to Elasticsearch

I am using Elasticsearch and MongoDB. I set up the river a week ago. Today, when searching records, I noticed that the Elasticsearch index is not getting updated or modified, even though records are added to MongoDB daily.
I am using the following lines to create the river:
curl -XPUT "localhost:9200/_river/myindex/_meta" -d '
{
"type": "mongodb",
"mongodb": {
"host": "localhost",
"port": "27017",
"db": "qna",
"collection": "collection"
},
"index": {
"name": "myindex",
"type": "index_type"
}
}'
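When a river silently stops picking up changes, a first step is to read back its definition and status to confirm it is still running; a sketch, assuming the river name myindex from above (the _status document name follows the usual river-framework convention, so treat it as an assumption):
# Read back the river definition
curl -XGET "localhost:9200/_river/myindex/_meta?pretty"

# Check the status document written by the river framework
curl -XGET "localhost:9200/_river/myindex/_status?pretty"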

Not all documents are indexed with ElasticSearch and MongoDB

I am using ElasticSearch 1.1.0 (I was running 1.2.0 but had issues with an ElasticSearch plugin) and MongoDB 2.6.1. I've installed them following an online tutorial. When I create the river using
curl -XPUT "localhost:9200/_river/tenders/_meta" -d '{
"type": "mongodb",
"mongodb": {
"servers": [
{ "host": "127.0.0.1", "port": 27017 }
],
"options": { "secondary_read_preference": true },
"db": "tenderdb",
"collection": "tenders"
},
"index": {
"name": "tendersidx",
"type": "page"
}
}'
Indexing of the collection starts fine, but only part of it gets indexed. E.g. the collection currently has 5184 records, while only 1060 are indexed.
Avish's comment did the trick. He wrote: "ElasticSearch rivers only monitor changes in the other data store; your river should only track documents added to the collection after the river has been set up."
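Since the river only sees operations logged after it was set up, one workaround is to rewrite the existing documents so they reappear in the oplog and get picked up; a rough sketch in the mongo shell, assuming the tenderdb/tenders names from the question (this touches every document, so test it on a copy first):
use tenderdb
db.tenders.find().forEach(function (doc) {
  // re-saving the document generates a fresh oplog entry the river can index
  db.tenders.save(doc);
});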

Index mongoDB with ElasticSearch

I already have MongoDB and have installed Elasticsearch with the MongoDB river. So I set up my river:
$ curl -X PUT localhost:9200/_river/database_test/_meta -d '{
  "type": "mongodb",
  "mongodb": {
    "servers": [
      {
        "host": "127.0.0.1",
        "port": 27017
      }
    ],
    "options": {
      "secondary_read_preference": true
    },
    "db": "database_test",
    "collection": "event"
  },
  "index": {
    "name": "database_test",
    "type": "event"
  }
}'
I simply want to get events that have country:Canada so I try:
$ curl -XGET 'http://localhost:9200/database_test/_search?q=country:Canada'
And I get:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
I've been searching the web and read that I should first index my collection with Elasticsearch (lost the link). Should I index my MongoDB? What should I do to get results from an existing MongoDB collection?
The MongoDB river relies on MongoDB's operations log (oplog) to index documents, so it is a requirement that you run your Mongo database as a replica set. I assume that this is what you're missing, so when you create the river, the initial import sees nothing to index. I am also assuming that you're on Linux and comfortable with the shell CLI tools. Follow these steps:
Make sure that the mapper-attachments Elasticsearch plugin is also installed.
Make a backup of your database with mongodump.
Edit mongodb.conf (usually at /etc/mongodb.conf, but this varies with how you installed it) and add the line:
replSet = rs0
"rs0" is the name of the replicaset, it can be whatever you like.
restart your mongo and then log in its console. Type:
rs.initiate()
rs.slaveOk()
The prompt will change to rs0:PRIMARY>
Now create your river just as you did in the question and restore your database with mongorestore. Elasticsearch should index your documents.
I recommend using this plugin: http://mobz.github.io/elasticsearch-head/ to navigate your indexes and rivers and make sure your data got indexed.
If that doesn't work, please post which versions of the mongodb-river plugin, Elasticsearch, and MongoDB you are using.
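To tie those steps together, here is a rough sketch of the shell session; the backup path and restart command are assumptions that depend on your distribution, and the final _count call is just one way to confirm that the country:Canada documents actually reached the index:
# 1. Back up the database
mongodump --db database_test --out /tmp/backup

# 2. Add "replSet = rs0" to /etc/mongodb.conf, then restart MongoDB
#    (the exact service command varies by distro)
sudo service mongodb restart

# 3. In the mongo shell: run rs.initiate() and rs.slaveOk();
#    the prompt should change to rs0:PRIMARY>

# 4. Re-create the river as in the question, then restore the data
mongorestore --db database_test /tmp/backup/database_test

# 5. Verify that documents were indexed
curl -XGET 'http://localhost:9200/database_test/_count?q=country:Canada&pretty'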