I need to keep data in ElasticSearch in sync with the data I have and maintain in MongoDB.
Currently I have a batch job that finds all the changed data and updates it in ElasticSearch using Spring Batch and Spring Data Elasticsearch.
This works, but I'm looking for a solution where every change is directly mirrored in ElasticSearch.
Give this a go: mongo-connector.
Also have a read through this: 5 Ways to Sync Data from MongoDB to Elasticsearch.
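If you try mongo-connector, a minimal invocation looks something like this (MongoDB has to run as a replica set so the connector can tail the oplog; the hosts and doc manager here are just placeholders for your setup):

mongo-connector -m localhost:27017 -t localhost:9200 -d elastic2_doc_manager

The -d doc manager is what turns oplog entries into Elasticsearch writes, so pick the one that matches your Elasticsearch version.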
We are writing a scheduler to back up data from one collection to another collection in MongoDB using Spring Boot.
The data can be 500K to 1 million docs.
Once the copy is complete, we delete the data from the old collection. Currently we are using Spring Data pagination to get chunks of data, save them to the new collection, and then delete them.
Is this approach fine, or is there a better approach you would suggest?
Since you are using Spring Data with pagination for this task, every document is pulled into and processed by your application, which is a lot of overhead for a plain copy.
Instead, you can trigger a set of system commands (mongoexport, then mongoimport, and finally delete the data from the source collection) from your scheduler.
For example, with Spring Batch's SystemCommandTasklet:
import org.springframework.batch.core.step.tasklet.SystemCommandTasklet;

SystemCommandTasklet tasklet = new SystemCommandTasklet();
tasklet.setCommand("");                      // put the mongoexport/mongoimport/delete command to run here
tasklet.setWorkingDirectory("/home/merlin");
tasklet.setTimeout(60000);                   // the tasklet requires a positive timeout before it can run
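If you wire that into a Spring Batch job, the step definition could look roughly like this (a sketch assuming Spring Batch 4.x with Java config; the bean and step names are made up):

@Bean
public Step exportStep(StepBuilderFactory stepBuilderFactory, SystemCommandTasklet exportTasklet) {
    // Run the external command as a single tasklet step inside the scheduled job
    return stepBuilderFactory.get("exportStep")
            .tasklet(exportTasklet)
            .build();
}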
I am trying to load a huge amount of data from MongoDB; the data size is in the millions of documents. So it makes sense to pull this data using appropriate indexes and also to query MongoDB in parallel. That's why, for the batch reads, I am using Mongo Spark.
How do I use the appropriate index while querying MongoDB with the Mongo Spark connector via the withPipeline feature?
Also, I was exploring com.mongodb.reactivestreams.client.MongoCollection. If possible, can someone throw some light on this?
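For reference, this is roughly the kind of read I have in mind (a sketch using the connector's Java RDD API; the status field and its value are made up):

import static java.util.Collections.singletonList;
import com.mongodb.spark.MongoSpark;
import com.mongodb.spark.rdd.api.java.JavaMongoRDD;
import org.bson.Document;

// jsc is an already-configured JavaSparkContext with spark.mongodb.input.uri set
JavaMongoRDD<Document> rdd = MongoSpark.load(jsc);
// The $match stage is sent to MongoDB as part of the aggregation pipeline,
// so an existing index on "status" can be used server-side before data reaches Spark.
JavaMongoRDD<Document> filtered = rdd.withPipeline(
        singletonList(Document.parse("{ $match: { status: \"ACTIVE\" } }")));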
I have data in a MongoDB database and now I want to visualize real-time data in Kibana. If data changes in MongoDB, it should be reflected in Kibana. How can I implement this? Please guide me.
You could use Monstache. It is a sync daemon written in Go that continuously indexes your MongoDB collections into Elasticsearch.
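A minimal Monstache configuration could look roughly like this (a sketch, assuming MongoDB runs as a replica set so change streams are available; the URLs and namespaces are placeholders):

mongo-url = "mongodb://localhost:27017"
elasticsearch-urls = ["http://localhost:9200"]
direct-read-namespaces = ["mydb.mycollection"]
change-stream-namespaces = ["mydb.mycollection"]

direct-read-namespaces does an initial bulk copy of the existing documents, while change-stream-namespaces keeps Elasticsearch updated as documents change, which is what Kibana then picks up.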
I have a Scala microservice that serves as a database API, and the database I am using is MongoDB.
I want to add Elasticsearch containing all the data that my MongoDB has, and I need to keep it in sync whenever MongoDB is updated. How can I achieve that?
What would be the best approach to do this? Are there plugins or something else that can help me with this task?
Look at the 5 Different Ways to Synchronize Data from MongoDB to Elasticsearch. Personally, I did it with Logstash, where I simply filtered one collection and dumped it to ES every 24 hours. The use case is key in determining which strategy/tool to use.
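A pipeline along those lines could look roughly like this (a sketch assuming the community logstash-input-mongodb plugin; its setting names may differ between versions, and the hosts, database, and collection are placeholders):

input {
  mongodb {
    uri => "mongodb://localhost:27017/mydb"
    placeholder_db_dir => "/opt/logstash-mongodb"
    placeholder_db_name => "logstash_sqlite.db"
    collection => "mycollection"
    batch_size => 5000
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mycollection"
  }
}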
I am working on a project where we have millions of entries stored in a MongoDB database, and I want to index all of this data using SOLR.
After extensive searching, I came to know that there are no proper "Data Import Handlers" for MongoDB.
Can anyone tell me what the proper approaches are for indexing MongoDB data with SOLR?
I want to use all the features of SOLR and want it to be scalable in real time. I saw one or two approaches in different posts but am not sure how they would work in real time.
Many Thanks
10gen introduced mongo-connector. You can integrate MongoDB with Solr using this tool.
Blog post : Introducing Mongo Connector
Github page : mongo-connector
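For the Solr case, a rough invocation would be something like this (MongoDB has to run as a replica set so mongo-connector can tail the oplog; the hosts and Solr URL are placeholders for your setup):

mongo-connector -m localhost:27017 -t http://localhost:8983/solr -d solr_doc_manager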
I have created a plugin to allow you to load data from MongoDB using the Solr data import handler.
Check it out at:
https://github.com/james75/SolrMongoImporter
I wrote a response to a similar question, except it was how to import data from MySQL into SOLR. The example code is in PHP, but should give you a general idea. All you would need to do is set up an iterator to step through your MongoDB assets, extract the data to SOLR datatypes, and then save it to your SOLR index.
If you want it to be real-time, you could add some custom code to the save mechanism (assuming this can be done with MongoDB), and save directly to the SOLR index, then run a commit script to commit data every 15 minutes (via cron).
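A minimal Java sketch of that iterate-and-index loop, assuming SolrJ and the MongoDB Java driver (the database, collection, core, and field names here are made up):

import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.bson.Document;

public class MongoToSolr {
    public static void main(String[] args) throws Exception {
        MongoCollection<Document> products = MongoClients.create("mongodb://localhost:27017")
                .getDatabase("shop").getCollection("products");           // hypothetical db/collection
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/products").build();

        for (Document d : products.find()) {                              // iterate over every document
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", d.getObjectId("_id").toHexString());       // map Mongo fields to Solr fields
            doc.addField("title", d.getString("title"));
            solr.add(doc);
        }
        solr.commit();                                                     // make the new documents searchable
        solr.close();
    }
}

In a real job you would batch the solr.add() calls and commit periodically (or via cron, as suggested above) rather than once at the end.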