My search is not working now. I guess it is because my index was not configured for a replica set:
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "mongo",
    "host": "local",
    "port": "40000",
    "collection": "users"
  },
  "index": {
    "name": "api",
    "type": "users"
  }
}'
Is there any way to declare a replica set properly so that Elasticsearch can find the master, the way the PHP driver does:
$m = new Mongo(
  "mongodb://localhost:40000,localhost:41000",
  array("replicaSet" => true)
);
so that Elasticsearch can automatically fail over to another member?
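Something like the following is what I have in mind (just a sketch; I don't know whether my version of the river plugin actually accepts a "servers" list of replica set members like this):

# Sketch: list both replica set members so the river can find the primary
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "servers": [
      { "host": "localhost", "port": 40000 },
      { "host": "localhost", "port": 41000 }
    ],
    "db": "mongo",
    "collection": "users"
  },
  "index": {
    "name": "api",
    "type": "users"
  }
}'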
I solved this simply by updating to the latest version of the client driver.
The previous (minor) version had trouble connecting to the latest mongo server.
I'm using KafkaConnect - MongoSource with the following configuration:
curl -X PUT http://localhost:8083/connectors/mongo-source2/config -H "Content-Type: application/json" -d '{
  "name": "mongo-source2",
  "tasks.max": 1,
  "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "value.converter": "org.apache.kafka.connect.storage.StringConverter",
  "connection.uri": "mongodb://xxx:xxx@localhost:27017/mydb",
  "database": "mydb",
  "collection": "claimmappingrules.66667777-8888-9999-0000-666677770000",
  "pipeline": "[{\"$addFields\": {\"something\":\"xxxx\"} }]",
  "transforms": "dropTopicPrefix",
  "transforms.dropTopicPrefix.type": "org.apache.kafka.connect.transforms.RegexRouter",
  "transforms.dropTopicPrefix.regex": ".*",
  "transforms.dropTopicPrefix.replacement": "my-topic"
}'
For some reason, when I consume messages, I'm getting a weird key:
"_id": {
"_data": "825DFD2A53000000012B022C0100296E5A1004060C0FB7484A4990A7363EF5F662CF8D465A5F6964005A1003F9974744D06AFB498EF8D78370B0CD440004"
}
I have no idea where it came from. My Mongo document's _id is a UUID; when consuming messages, I expected to see the documentKey field as my consumer key.
Here is a message example of what the connector published into kafka:
{
  "_id": {
    "_data": "825DFD2A53000000012B022C0100296E5A1004060C0FB7484A4990A7363EF5F662CF8D465A5F6964005A1003F9974744D06AFB498EF8D78370B0CD440004"
  },
  "operationType": "replace",
  "clusterTime": {
    "$timestamp": {
      "t": 1576872531,
      "i": 1
    }
  },
  "fullDocument": {
    "_id": {
      "$binary": "+ZdHRNBq+0mO+NeDcLDNRA==",
      "$type": "03"
    },
    ...
  },
  "ns": {
    "db": "security",
    "coll": "users"
  },
  "documentKey": {
    "_id": {
      "$binary": "+ZdHRNBq+0mO+NeDcLDNRA==",
      "$type": "03"
    }
  }
}
Documentation on schema configuration for Kafka Connect is really limited out there. I know it is late to reply, but I have lately been facing the same issue and found a solution by trial and error.
I added these two settings to my MongoDB Kafka Connect configuration:
"output.format.key": "schema",
"output.schema.key": "{\"name\":\"sampleId\",\"type\":\"record\",\"namespace\":\"com.mongoexchange.avro\",\"fields\":[{\"name\":\"documentKey._id\",\"type\":\"string\"}]}",
Still, even after this, I don't know whether using the change stream's resume_token as the Kafka key had any significance for partition assignment or performance, or what happens when the resume_token expires after a long period of inactivity.
P.S. The final version of my Kafka Connect configuration for MongoDB as a source is this:
{
  "tasks.max": 1,
  "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "value.converter": "org.apache.kafka.connect.storage.StringConverter",
  "connection.uri": "mongodb://example-mongodb-0:27017,example-mongodb-1:27017,example-mongodb-2:27017/?replicaSet=replicaSet",
  "database": "exampleDB",
  "collection": "exampleCollection",
  "output.format.key": "schema",
  "output.schema.key": "{\"name\":\"ClassroomId\",\"type\":\"record\",\"namespace\":\"com.mongoexchange.avro\",\"fields\":[{\"name\":\"documentKey._id\",\"type\":\"string\"}]}",
  "change.stream.full.document": "updateLookup",
  "copy.existing": "true",
  "topic.prefix": "mongodb"
}
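To see what keys actually land in Kafka after this change, a console consumer that prints keys is handy. A rough sketch, assuming the default topic naming of <topic.prefix>.<database>.<collection> and a broker on localhost:9092 (adjust both to your environment):

# Print each record's key next to its value (topic name and broker address are assumptions)
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic mongodb.exampleDB.exampleCollection \
  --from-beginning \
  --property print.key=true \
  --property key.separator=' | '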
I'm using Elasticsearch 1.1.1, River Plugin and MongoDB 2.4
I have a field called cidr that is being analyzed. I need to set it to not_analyzed so that it works correctly with Kibana. Below is the setup I used, but now I am going to reindex (delete the index and write a new one).
What's the proper way to write a new index so that the values in the "cidr" field are not analyzed? Thank you.
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "collective_name",
    "collection": "ips"
  },
  "index": {
    "name": "mongoindex"
  }
}'
I see. It's working now. The mapping should be created BEFORE setting up the river.
curl -XPUT "localhost:9200/mongoindex" -d '
{
"mappings": {
"mongodb" : {
"properties": {
"cidr": {"type":"string", "index" : "not_analyzed"}
}
}
}
}'
This is it. :)
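To double-check that the field is no longer analyzed, you can read the mapping back (assuming the index name used above):

# "cidr" should appear with "index": "not_analyzed"
curl -XGET 'localhost:9200/mongoindex/_mapping?pretty'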
I am using Elasticsearch and MongoDB, and I set up the river a week ago. Today, while searching records, I noticed that the Elasticsearch index is not getting updated or modified, even though records are added to MongoDB daily.
I am using the following lines to create the river:
curl -XPUT "localhost:9200/_river/myindex/_meta" -d '
{
"type": "mongodb",
"mongodb": {
"host": "localhost",
"port": "27017",
"db": "qna",
"collection": "collection"
},
"index": {
"name": "myindex",
"type": "index_type"
}
}'
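One quick way to confirm the index has fallen behind is to compare document counts on both sides; a rough sketch using the database, collection, and index names above:

# Documents indexed in Elasticsearch
curl -XGET 'localhost:9200/myindex/_count?pretty'

# Documents currently in MongoDB
mongo qna --eval 'db.collection.count()'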
I am using ElasticSearch 1.1.0 (I was running 1.2.0 but had issues with an ElasticSearch plugin) and MongoDB 2.6.1, installed by following a tutorial. When I create the river using
curl -XPUT "localhost:9200/_river/tenders/_meta" -d '{
"type": "mongodb",
"mongodb": {
"servers": [
{ "host": "127.0.0.1", "port": 27017 }
],
"options": { "secondary_read_preference": true },
"db": "tenderdb",
"collection": "tenders"
},
"index": {
"name": "tendersidx",
"type": "page"
}
}'
Indexing of the collection starts fine, but only part of it gets indexed. E.g. the collection currently has 5184 records, while only 1060 are indexed.
Avish's comment did the trick; he wrote: "ElasticSearch rivers only monitor changes in the other data store; your river should only track documents added to the collection after the river has been set up."
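If the documents that existed before the river was set up are needed too, one workaround consistent with that comment (and with the replica set answer further down) is to re-feed the data through the oplog once the river is in place, for example with mongodump/mongorestore. A rough sketch, assuming the database and collection names from the question and a throwaway dump directory:

# Dump the collection, then restore it with --drop so every document is
# re-inserted, flows through the oplog, and can be picked up by the river.
mongodump --db tenderdb --collection tenders --out /tmp/tenderdump
mongorestore --db tenderdb --collection tenders --drop /tmp/tenderdump/tenderdb/tenders.bson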
I already have MongoDB, and I have installed Elasticsearch with Mongoriver. So I set up my river:
$ curl -X PUT localhost:9200/_river/database_test/_meta -d '{
  "type": "mongodb",
  "mongodb": {
    "servers": [
      { "host": "127.0.0.1", "port": 27017 }
    ],
    "options": {
      "secondary_read_preference": true
    },
    "db": "database_test",
    "collection": "event"
  },
  "index": {
    "name": "database_test",
    "type": "event"
  }
}'
I simply want to get events that have country:Canada so I try:
$ curl -XGET 'http://localhost:9200/database_test/_search?q=country:Canada'
And I get:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
I am searching the web and I read that I should first index my collection with Elasticsearch (I lost the link). Should I index my MongoDB collection? What should I do to get results from an existing MongoDB collection?
The MongoDB river relies on MongoDB's oplog (operations log) to index documents, so it is a requirement that your Mongo database runs as a replica set. I assume that it doesn't, so when you create the river, the initial import sees nothing to index. I am also assuming that you're on Linux and have a handle on the shell CLI tools, so try this:
Follow these steps:
Make sure that the mapper-attachments Elasticsearch plugin is also installed
Make a backup of your database with mongodump
Edit mongodb.conf (usually /etc/mongodb.conf, but the location varies depending on how you installed it) and add the line:
replSet = rs0
"rs0" is the name of the replicaset, it can be whatever you like.
restart your mongo and then log in its console. Type:
rs.initiate()
rs.slaveOk()
The prompt will change to rs0:PRIMARY>
Now create your river just as you did in the question and restore your database with mongorestore. Elasticsearch should index your documents.
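Put together, the whole procedure might look roughly like this; the dump path, the service name, and the replica set name are only examples and may differ on your system:

# 1. Back up the database
mongodump --db database_test --out /tmp/backup

# 2. Add "replSet = rs0" to /etc/mongodb.conf, then restart mongod
sudo service mongodb restart

# 3. Initialise the replica set
mongo --eval 'rs.initiate()'

# 4. Recreate the river as in the question, then restore the data so it
#    flows through the oplog and gets indexed
mongorestore --db database_test /tmp/backup/database_test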
I recommend using this plugin: http://mobz.github.io/elasticsearch-head/ to navigate your indexes and rivers and make sure your data got indexed.
If that doesn't work, please post which versions of the mongodb-river plugin, Elasticsearch, and MongoDB you are using.
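If you prefer a quick command-line check over the head plugin, listing the indices with their document counts also works:

# Shows each index along with its document count and size
curl 'localhost:9200/_cat/indices?v'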