I'm looking for the right way to use ElasticSearch with MongoDB. I want to save several pieces of information in MongoDB. Additionally, I want to save a larger text with ElasticSearch to support complex full-text search.
My problem at the moment is:
I'm not sure what the best solution is for this. Most solutions I found to synchronize MongoDB with ElasticSearch use a "river", which is deprecated!
What is the best way to combine these two technologies?
Is it even the best way to save it in MongoDB and ElasticSearch?
I found multiple articles explaining that ElasticSearch alone is not safe enough and that you have to use another DBMS alongside it.
Also under robustness on the mongoDB website I found this:
Unfortunately, Elasticsearch (and the components it's made of) does not currently handle OutOfMemory-errors very well.
[source]
So saving the data redundantly is probably the best way.
Thanks in advance!
Hei,
We are also working with both Elasticsearch and MongoDB. We started with a river, and after having a lot of issues with it we got rid of it before it was deprecated. The way we do it is: when saving data to Mongo we create a message in a queue, which notifies the search storage to do the insert/delete operation with the given data.
So basically we keep them in sync manually, and there will always be a delay between Mongo and Elasticsearch. The good part is that if Elasticsearch fails, we have implemented an endpoint which reimports the data from Mongo to ES. Also, the structure inside ES is different from the one in Mongo. Before, this was a lot more complicated to do with the river; we even had our own custom implementation.
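The flow described above can be sketched roughly as follows. This is only an illustration, not the poster's actual implementation: the stores and the queue are in-memory stand-ins, and every name in it (saveDocument, processQueue, reindexAll, toSearchDoc) is hypothetical.

```javascript
// Hypothetical in-memory stand-ins; these are NOT real MongoDB or
// Elasticsearch clients, just Maps and an array playing their roles.
const primaryStore = new Map(); // plays the role of MongoDB
const searchIndex = new Map();  // plays the role of Elasticsearch
const queue = [];               // plays the role of the message queue

// Save to the primary store, then enqueue a message for the search side.
function saveDocument(doc) {
  primaryStore.set(doc.id, doc);
  queue.push({ op: "upsert", id: doc.id });
}

function deleteDocument(id) {
  primaryStore.delete(id);
  queue.push({ op: "delete", id: id });
}

// The search document may have a different shape than the stored one.
function toSearchDoc(doc) {
  return { id: doc.id, text: doc.title + " " + doc.body };
}

// Consumer: drains the queue and applies each operation to the index.
function processQueue() {
  while (queue.length > 0) {
    const msg = queue.shift();
    if (msg.op === "delete") {
      searchIndex.delete(msg.id);
    } else {
      const doc = primaryStore.get(msg.id);
      // The document may have been deleted since the message was queued.
      if (doc) searchIndex.set(doc.id, toSearchDoc(doc));
    }
  }
}

// Recovery path: rebuild the whole index from the primary store.
function reindexAll() {
  searchIndex.clear();
  for (const doc of primaryStore.values()) {
    searchIndex.set(doc.id, toSearchDoc(doc));
  }
}

saveDocument({ id: 1, title: "Hello", body: "world" });
saveDocument({ id: 2, title: "Foo", body: "bar" });
deleteDocument(2);
processQueue(); // until this runs, the index lags behind the primary store
console.log(Array.from(searchIndex.values()));
```

Because the index is always derived from the primary store, losing it is recoverable: reindexAll plays the part of the reimport endpoint mentioned above.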
Hope my answer helps at least a bit.
I'm trying to use mongoDB with Morphia but still I have a problem with deleting documents. Is there any additional plugin or wrapper which works with Mongo and provides something like transactions in DBMS?
No, there are no (multi document) transactions. There are two possible solutions:
You can restructure your data into a single document instead of spreading it over multiple tables. Thus MongoDB's single document transactions (if you call them that) are enough for you. You can solve many problems with embedded entities or arrays. You might want to start a question related to "schema" design, if you're unsure how to approach this.
Your problem absolutely needs transactions across multiple documents / tables. Then MongoDB is simply not the right tool and you should use a relational database.
Don't fight the tool, pick the right one...
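As a rough illustration of the first option, here is a sketch of folding child records into their parent document, so that saving or deleting the parent is a single-document operation. The orders/orderItems data and the embedItems helper are hypothetical and work on plain objects only; they are not part of Morphia or any MongoDB driver.

```javascript
// Hypothetical data that was spread over two collections.
const orders = [{ _id: 1, customer: "Ann" }, { _id: 2, customer: "Bob" }];
const orderItems = [
  { orderId: 1, sku: "A", qty: 2 },
  { orderId: 1, sku: "B", qty: 1 },
  { orderId: 2, sku: "C", qty: 5 },
];

// Fold each child row into an "items" array on its parent document.
function embedItems(parents, children) {
  return parents.map((parent) => Object.assign({}, parent, {
    items: children
      .filter((child) => child.orderId === parent._id)
      .map((child) => ({ sku: child.sku, qty: child.qty })),
  }));
}

const embeddedOrders = embedItems(orders, orderItems);
console.log(JSON.stringify(embeddedOrders[0]));
```

With this shape, removing an order removes its items with it, so there is no second delete that could fail halfway.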
I have a Mongo database that I did not create or architect, is there a good way to introspect the db or print out what the structure is to start to get a handle on what types of data are being stored, how the data types are nested, etc?
Just query the database by running the following commands in the mongo shell:
use mydb //this switches to the database you want to query
show collections //this command will list all collections in the database
db.collectionName.find().pretty() //this will show all documents in the collection in a readable format; do the same for each collection in the database
You should then be able to examine the document structure.
There is actually a tool to help you out here called Variety:
http://blog.mongodb.org/post/21923016898/meet-variety-a-schema-analyzer-for-mongodb
You can view the Github repo for it here: https://github.com/variety/variety
I should probably warn you that:
It uses MapReduce (MR) to accomplish its tasks
It uses certain other queries that could bring a production set-up to a near halt in terms of performance.
As such I recommend you run this on a development server or a hidden node of a replica or something.
Depending on the size and depth of your documents it may take a very long time to understand the rough structure of your database through this but it will eventually give one.
This will print each field name and the type of its value
var schematodo = db.collection_name.findOne()
for (var key in schematodo) { print(key, typeof schematodo[key]); }
I would recommend limiting the result set rather than issuing an unrestricted find command.
use mydb
db.collectionName.find().limit(10)
var z = db.collectionName.find().limit(10)
Object.keys(z[0])
Object.keys(z[1])
This will help you begin to understand your database structure, or lack thereof.
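Object.keys only shows top-level keys of one document at a time. A minimal sketch that walks nested fields across a whole sample and collects the value types seen per field could look like this. The inferSchema and describeType helpers are hypothetical, and the sample array is made-up data; in the shell the input could presumably come from something like db.collectionName.find().limit(10).toArray(). Shell-specific types such as ObjectId would need their own case in describeType.

```javascript
// Classify a value a little more finely than typeof does.
function describeType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (value instanceof Date) return "date";
  return typeof value;
}

// For each dotted field path, record the distinct value types seen
// across all sampled documents. Arrays are not descended into.
function inferSchema(docs) {
  const schema = {};
  function walk(obj, prefix) {
    for (const key of Object.keys(obj)) {
      const path = prefix ? prefix + "." + key : key;
      const type = describeType(obj[key]);
      if (!schema[path]) schema[path] = [];
      if (!schema[path].includes(type)) schema[path].push(type);
      if (type === "object") walk(obj[key], path);
    }
  }
  docs.forEach((doc) => walk(doc, ""));
  return schema;
}

// Hypothetical sample documents
const sample = [
  { _id: 1, name: "a", address: { city: "Oslo", zip: 150 } },
  { _id: 2, name: null, tags: ["x"], address: { city: "Bergen" } },
];
console.log(inferSchema(sample));
```

A field that lists more than one type, or that is missing from some documents, is a hint that the collection is not uniformly structured.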
This is an open-source tool that I, along with my friend, have created - https://pypi.python.org/pypi/mongoschema/
It is a Python library with a pretty simple usage. You can try it out (even contribute).
One option is to use Mongoeye. It is an open-source tool similar to Variety.
The difference is that Mongoeye is a stand-alone program (Mongo Shell is not required) and has more features (histograms, most frequent values, etc.).
https://github.com/mongoeye/mongoeye
A few days ago I found the GUI client MongoDB Compass, which has some nice visualizations. See the product overview. It comes directly from the MongoDB people, and according to their docs:
MongoDB Compass is designed to allow users to easily analyze and understand the contents of their data collections within MongoDB...
You may have been asking about the validation schema. Here is how to get it:
How to retrieve MongoDb collection validator rules?
Use MongoDB Compass, which takes a random sample of 1000 documents to infer the schema, as explained here. It could miss something, but it's the only rational option if your database is several GBs.
Visualisation
The schema then can be exported as JSON
Documentation
You can use MongoDB's mongodump tool. Running it creates a dump folder in the directory from which you executed mongodump. That folder contains one subfolder per database in MongoDB, and each subfolder holds a .bson data file and a .metadata.json file for every collection.
This method is the best I know of, as you can also make out the schema of empty collections.
Is it possible to run MongoDB commands, like a query to grab additional data or an update, from within MongoDB's MapReduce command, either in the Map or the Reduce function?
Is this completely ludicrous to do anyways? Currently I have some documents that refer to separate collections using the MongoDB DBReference command.
Thanks for the help!
Is it possible to run MongoDB commands... from within MongoDB's MapReduce command.
In theory, this is possible. In practice there are lots of problems with this.
Problem #1: exponential work. M/R is already pretty intense and poorly logged. Adding queries can easily make M/R run out of control.
Problem #2: context. Imagine that you're running a sharded M/R and you are querying into an unsharded collection. Does the current context even have that connection?
You're basically trying to implement JOIN logic and MongoDB has no joins. Instead, you may need to build the final data in a couple of phases by running a few loops on a few sets of data.
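A minimal sketch of that multi-phase approach, done on the application side rather than inside Map or Reduce: first index one data set by its key, then loop over the other and attach the referenced document. The indexBy and attachReferences helpers and the authors/posts data are hypothetical; in practice each array would come from its own query.

```javascript
// Phase 1: index one data set by its key field.
function indexBy(docs, keyField) {
  const byKey = new Map();
  for (const doc of docs) byKey.set(doc[keyField], doc);
  return byKey;
}

// Phase 2: loop over the other set and attach the referenced document
// (or null when the reference points at nothing).
function attachReferences(docs, refField, lookup, targetField) {
  return docs.map((doc) => Object.assign({}, doc, {
    [targetField]: lookup.get(doc[refField]) || null,
  }));
}

// Hypothetical sample data standing in for two query results.
const authors = [{ _id: "a1", name: "Ann" }, { _id: "a2", name: "Bob" }];
const posts = [
  { _id: "p1", authorId: "a1", title: "First" },
  { _id: "p2", authorId: "a3", title: "Orphan" },
];

const joined = attachReferences(posts, "authorId", indexBy(authors, "_id"), "author");
console.log(joined);
```

Building the lookup map once keeps the work linear in the size of both sets, instead of issuing one query per document.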
The Node wiki lists a few different Mongo drivers for Node. What are the pros and cons of each one?
Right now I want to efficiently tail a Mongo capped collection from Node, but I suspect I will end up using Mongo from Node quite heavily, and if Stack Overflow can save me from having to switch to a different driver later, that'd be great.
In general I have no particular interest in object relational mappers; I mainly want to make clean and efficient insert, update and find calls asynchronously.
It's hard to say which is the best.
My current favorites from the api-sugar point of view are:
Mongoose as an ODM
Mongoskin, which basically replaces the driver's callback-based API with one based on promises (when/then)