What is the easiest way to do a full text search with Mongoose?
Mongoose is an "ORM" (more precisely, an ODM) for MongoDB. MongoDB has some docs on full text search. However, MongoDB is not designed to be used for FTS, and big deployments typically use other tools like Solr or Sphinx.
If you're just trying to query with a regex, MongoDB supports that. The syntax should be similar in Mongoose.
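As a rough illustration of the regex route, here is a minimal sketch in Python that only builds the filter documents; the same shapes should work in Mongoose since they are plain MongoDB query syntax. The field name "title" and both helper names are made up for the example. It assumes Python's re.escape output is acceptable to MongoDB's regex engine, which holds for typical input.

```python
import re


def build_prefix_filter(field: str, user_input: str) -> dict:
    """Filter matching values of `field` that start with `user_input` (case-sensitive)."""
    # Escape regex metacharacters so the user's text is matched literally.
    return {field: {"$regex": "^" + re.escape(user_input)}}


def build_contains_filter(field: str, user_input: str) -> dict:
    """Case-insensitive "contains" filter; expect this form to scan rather than use an index well."""
    return {field: {"$regex": re.escape(user_input), "$options": "i"}}


# "title" is a hypothetical field name.
print(build_prefix_filter("title", "mongo"))
print(build_contains_filter("title", "c++"))
```

Either dict can be passed as the filter argument of a find() call.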
MongoDB 2.4 contains experimental full text search capabilities: http://docs.mongodb.org/manual/release-notes/2.4/#text-indexes
There are a few mongoose keyword plugins for smaller scale stuff as well as an elasticsearch plugin. http://plugins.mongoosejs.com is a great place to discover more.
You might want to check out Elasticsearch and mongoosastic. Take a look:
http://www.elasticsearch.org/
https://github.com/jamescarr/mongoosastic
Hope this helps
A semi-recent development for those who come looking: both MongoDB and Mongoose now support text searches. See:
What's new in Mongoose 3.8.9
Example usage of text search
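For readers who cannot follow the links, here is a minimal sketch of what a text search looks like at the MongoDB query level. It is written in Python and only builds the documents, so it runs without a server. The field names "title" and "body" and the helper names are assumptions for illustration, not taken from the linked posts.

```python
def text_index_spec(*fields: str) -> list:
    """Key specification for a single text index covering the given fields."""
    return [(field, "text") for field in fields]


def text_search_query(terms: str) -> tuple:
    """Return (filter, projection, sort) for a relevance-ranked $text search."""
    score = {"$meta": "textScore"}
    query_filter = {"$text": {"$search": terms}}
    projection = {"score": score}   # expose the relevance score on each result
    sort = [("score", score)]       # most relevant documents first
    return query_filter, projection, sort


# "title" and "body" are hypothetical field names.
print(text_index_spec("title", "body"))
print(text_search_query("coffee shop"))
```

With a driver, the index spec would go to the collection's create-index call and the three values to find() and sort(). A $text query needs a text index on the collection to exist first.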
Hope everyone is doing great.
I had a bit of a "weird" question regarding doing non-exact/related searches with MongoDB.
I'm building a web application with a sort of "search engine" search bar if you will (I.e.: people input stuff and the results are documents related to that search instead of exact results), and I'm having a difficult time deciding the best approach.
Recently I discovered MongoDB's full text search and it's been amazing so far in terms of what I want to achieve. However, as my search functionality gets more complex (adding things like sorting, pagination, etc.) I notice a lack of documentation on best practices compared to using find() queries. I know there are aggregation pipeline stages for that kind of functionality, but I have found proper examples to be kind of lacking.
Taking that into consideration, I've started to consider changing my approach to using find() queries, but I can't seem to find examples of people using them for non-exact/related matches in the way full text search can. How would you even do that with find()? Would you use a more elaborate regex or something similar? Is it even worth trying?
I would love to hear your anecdotes, especially about keeping the app performant as your search features became more complex. Do you swear by full text search? Or have you achieved search engine-like search using good old find()? If so, how?
Thank you everyone!
As far as I know, MongoDB full text search comes in two types:
MongoDB Atlas Search
On-premises text search
To learn more about performing text search, see the reference docs below.
REFERENCE: https://www.mongodb.com/docs/manual/core/link-text-indexes/
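Since the question asks about sorting and pagination, here is a minimal sketch of an aggregation pipeline for on-premises text search, written in Python as a plain list of stages. It assumes a text index already exists on the collection. The function name is made up for the example, and using "_id" as a tie-breaker is my own choice.

```python
def paginated_text_search(terms: str, page: int, page_size: int = 10) -> list:
    """Build an aggregation pipeline: text match, relevance sort, then skip/limit paging."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return [
        # A $match stage containing $text has to be the first stage of the pipeline.
        {"$match": {"$text": {"$search": terms}}},
        # Sort by relevance; "_id" is an assumed tie-breaker to keep page order stable.
        {"$sort": {"score": {"$meta": "textScore"}, "_id": 1}},
        {"$skip": (page - 1) * page_size},
        {"$limit": page_size},
    ]


print(paginated_text_search("coffee", page=3, page_size=20))
```

The list can be handed to the driver's aggregate() call. Skip-based paging slows down on deep pages, so treat this as a starting point rather than a best practice.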
We are planning to store millions of documents in MongoDB and full text search is very much required. I read Elasticsearch and Solr are the best available solutions for full text search.
Is Elasticsearch mature enough to be used for MongoDB full text search? We will also be sharding the collections. Does Elasticsearch work with sharded collections?
What are the advantages and disadvantages of using Elasticsearch or Solr?
Is MongoDB capable of doing full text search?
There are some search capabilities in MongoDB but it is not as feature-rich as search engines.
http://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mongo
We use Mongo with Solr to make content searchable. We prefer Solr because:
It is easy to configure and customize
It has a large community (this is really helpful if you are working with open source tools)
Since we haven't worked with ES, I can't say much about it. You can find some discussions about Solr vs ES at the links below.
Solr vs ES 1
Solr vs ES 2
Solr vs ES 3
I have professional experience with both Solr/MySQL and ElasticSearch/MongoDB.
If you are going to query your search engine heavily and you already shard your MongoDB (that is, if you want to shard your search engine too), you should use ElasticSearch, unless what you want to do can't be done with it. And you should use it even if you are not going to shard.
ElasticSearch is a new project on top of Lucene that brings a sharding mechanism, from someone who is used to distributed environments and search (Shay Banon made Compass and worked for GigaSpaces, the data grid vendor).
ElasticSearch is as easy to shard as MongoDB. I think it is even simpler, and the defaults work great for most cases.
I don't like Solr so much:
The query language is not structured at all (but that is also the case for plugins and Lucene, and I think you can use this unstructured query language with ES too)
I don't think there is a proper Solr client. The Solr Java client sucks, and I hear PHP guys complaining too, while the ElasticSearch Java client is very nice, much more typesafe, and offers async support (nice if you use Netty, for example). With Solr, you will do a LOT of string concatenation.
Less easy to scale
It is not such a new project, and I felt the technical debt it has. ElasticSearch was born from Compass, so I guess all the technical debt has been dropped in favor of a fresh new approach.
Concerning data importing, I have experience with both Solr DataImportHandler and ElasticSearch rivers (CouchDB and MongoDB). What I can tell you is:
Solr lets you do more things, but in a very unstructured XML way, and the documentation doesn't do much to help you understand what is really happening once you are past the hello world and try to use some advanced features.
The ElasticSearch approach is simpler and also more limited, but it has out-of-the-box support for some technologies, while DataImportHandler seems friendlier to complex SQL
With my Solr project I had to index some documents manually, but that was mostly because it was impossible to denormalize the needed data into a document (the Solr project uses MySQL).
There is also a new MongoDB connector for both Solr and ElasticSearch which I need to test asap :)
http://blog.mongodb.org/post/29127828146/introducing-mongo-connector
So in the end, I'll definitely choose ElasticSearch, because:
It now has a great community
Many people I know with experience with Solr like ElasticSearch
The client side is safer and structured, and provides async with Java Futures
Both can probably import data from MongoDB easily with the new connector
As far as I know, it lets you do almost everything Solr does (in my experience, but I'm not a search engine expert)
It adds sharding out of the box
It adds percolation, which can help to build realtime scalable applications (but you'll probably need an additional messaging technology)
The source code I read has nearly no technical debt compared to Solr (at least on the client side), and it seems easy to create plugins.
In terms of MongoDB natively, no it doesn't have full text search support. You can see that it is a popular feature request:
https://jira.mongodb.org/browse/SERVER-380
From what I know of the ES river plugin for MongoDB, it tails the oplog for its functionality. A sharded setup would have multiple oplogs, and there would be no easy way to alter that code to connect via a mongos.
Similarly for Solr, the examples I have seen usually involve similar behavior to the ES plugin. Some more solid info here:
http://blog.knuthaugen.no/2010/04/cooking-with-mongodb-and-solr.html
I don't have any experience using either one, but others have made comparisons before. Take a look here:
Solr vs. ElasticSearch
ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage?
MongoDB can't do efficient full text search. You can do wildcard searches on fields, but I don't think these use indexes efficiently.
I would recommend using the river functionality of ElasticSearch to automatically push the documents from MongoDB to ElasticSearch.
elasticsearch-river-mongodb is a MongoDB-to-ElasticSearch river: it monitors the MongoDB oplog, so when a document changes in MongoDB, ElasticSearch automatically updates its index.
This minimises the problem of keeping the two datastores in sync, as ElasticSearch is just monitoring Mongo's replication log.
Mongo is not at all good for full text search.
Obviously you need to index your fields for fast searching, and indexing fields containing BIG data (very long strings) will fail in Mongo. It has a limit of 1k for index entries; if your content is more than 1k, it will be ignored by the index and will not show up in your search results. If you are trying to perform a full text search over your articles, Mongo is not at all a good choice.
As of MongoDB 2.4.6, there IS now full-text search in MongoDB, and it is more feature-rich than in previous versions. Its capabilities are described at http://docs.mongodb.org/manual/core/text-search/.
Worth mentioning:
tokenizes and stems the search term(s) during both the index creation and the text command execution.
assigns a score to each document that contains the search term in the indexed fields. The score determines the relevance of a document to a given search query.
However, in this answer (from September 2013) https://stackoverflow.com/a/18631775/1920149 you can see that Mongo still warns against using this functionality in production. It is still in the beta stage.
Full text search has been usable in production environments with MongoDB since version 2.6, by creating a text index on the required fields.
Text indexes in MongoDB
I am looking for a very fast autocomplete solution for displaying results in mobile apps. I am using Sphinx as my full text index, but I wonder whether Sphinx is the best solution for autocomplete search, because after the index is searched I still need to query MySQL for the results. Is there a better and faster solution?
Well, you can use string attributes to store the actual text.
Then you don't need to go back to the database at all; you can just query Sphinx. Sphinx stores attributes in memory, so this doesn't noticeably slow down the actual Sphinx search query.
Sphinx works well for autocomplete in my experience.
If you are running sphinx 2.0.2 or greater:
index_exact_words = 1
Sphinx supports wildcard searching. Have a look at the parameter "enable_star". If you set it to 1 and restart sphinx, you should be able to search using wildcards.
Check it out in the Sphinx docs.
To find matches where any word contains "micro", the search term needs to be "*micro*".
We are trying to develop a strategy for using Elasticsearch for full-text searching on our MongoDB instance. It would appear that every key that we want to use as a filter must be included in the Elasticsearch index. Potentially we could want to use every key in Mongo as a filter, e.g. full-text search on description, filtered by date and telephone number. Does anyone have any real-world experience of adding full-text search to Mongo that they can share?
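To make the scenario concrete, here is a minimal sketch in Python of how that combination is usually expressed in the Elasticsearch query DSL: a bool query with the full-text part under "must" and the filters under "filter". The field names come from the description above. The function name, the date format, and the idea that telephone is mapped as an exact-match field are assumptions.

```python
from typing import Optional


def search_with_filters(text: str, date_from: str, date_to: str,
                        telephone: Optional[str] = None) -> dict:
    """Build an Elasticsearch request body: full-text match plus non-scoring filters."""
    # Filters restrict the result set without affecting relevance scores.
    filters = [{"range": {"date": {"gte": date_from, "lte": date_to}}}]
    if telephone is not None:
        # Assumes "telephone" is indexed as an exact-match (keyword-style) field.
        filters.append({"term": {"telephone": telephone}})
    return {
        "query": {
            "bool": {
                "must": [{"match": {"description": text}}],
                "filter": filters,
            }
        }
    }


print(search_with_filters("leaky roof", "2013-01-01", "2013-12-31", "5550100"))
```

Every field that appears under "filter" does have to be present in the index, which matches the observation above.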
Maybe we can just use elasticsearch as a db?
I do not see any reason to use ElasticSearch in conjunction with MongoDB; just use ElasticSearch as separate document storage for the documents that have to be searched. And yes, you can even use it as your whole database. Of course it depends on your domain model and other factors.
If you don't need stemming, fuzzy search, or complicated wildcard search, you can do the search with MongoDB. When a new document is inserted, split it into lowercase words and add them to an array, "words" for example. Later you can run search requests against this array with a regex. Note that you can't use the i (ignore case) option in this regex, and you can only search with a LIKE%-style (prefix) wildcard or with no wildcard at all; otherwise the search will not use the MongoDB index.
One more option: you can try to find an ElasticSearch river for MongoDB
Another option - is to use Lucene if you are using Java. Probably you will be able to extend Directory class, in such a way, that Lucene will store index in MongoDb instead of file system or RAM. I have not made any research in this area, but I think it is possible
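The "words" array approach described above can be sketched roughly like this in Python. The helper names are made up, the tokenizer is a simple \w+ split that I picked for the example, and it assumes you create an ordinary index on the "words" field yourself.

```python
import re

_WORD = re.compile(r"\w+")


def extract_words(text: str) -> list:
    """Lowercased, de-duplicated words in first-seen order."""
    seen = {}
    for word in _WORD.findall(text.lower()):
        seen.setdefault(word, None)
    return list(seen)


def with_words(document: dict, source_field: str) -> dict:
    """Copy of the document with a "words" array derived from one text field."""
    return {**document, "words": extract_words(document[source_field])}


def prefix_query(term: str) -> dict:
    """Anchored, case-sensitive prefix regex, so an index on "words" can be used."""
    return {"words": {"$regex": "^" + re.escape(term.lower())}}


doc = with_words({"description": "Fast Mongo search, fast results"}, "description")
print(doc["words"])         # ['fast', 'mongo', 'search', 'results']
print(prefix_query("Mon"))  # {'words': {'$regex': '^mon'}}
```

Lowercasing at both insert time and query time stands in for the ignore-case option, which is why the regex can stay case-sensitive and anchored.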
I experimented with full text search in MongoDB by splitting the words in the string like @Umar suggested. Honestly though, it's a database and not a search engine, so I would use Mongo for persistent storage and ElasticSearch for the search engine part of it. As a matter of fact, I would stick with something like PostgreSQL for persistent storage and then push the data you want to search out to the search engine. http://gdal.org/ogr/drv_elasticsearch.html is a driver that will allow you to quickly export your data from an RDBMS to ElasticSearch. The data does not have to be geospatial in order to use GDAL, as long as there is a way to connect to the input source.
Adam
I am using MongoDB with MongoMapper to store all my products. Each product belongs to multiple categories that have many levels, i.e. category, sub category etc.
Each product has many search fields that are embedded documents in product.
All this is working and I now want to add search to the app.
The search system needs text search: multiple, dynamic, faceted search including min/max range search.
I have been looking into the sunspot gem but am having difficulty setting it up on dev, let alone trying to run it in production! I have also looked at Sphinx.
But I am wondering whether using just MongoMapper / MongoDB will be quick enough and the best way, as it's quite a complex search system?
Any help / suggestions / experiences / tutorials and examples on this would be most appreciated.
Thanks a lot,
Rick
I've been involved with a very large Sphinx-powered search and I think it's awful. It is very difficult to configure if you want anything beyond a very simple full-text search. Solr/Lucene, on the other hand, is incredibly flexible and was unbelievably easier to set up and get running.
I am now using Solr in conjunction with MongoDB to power full text search with all the extra goodies, like facets, etc. Depending on how you configure Solr, you may not even need to hit your MongoDB for data. Or you may tell Solr to index fields but not to store them, and instead just store the ObjectIds that correspond to data inside MongoDB.
If your search truly is a complex search system, I very strongly recommend that you do not use MongoDB for search and go with Solr. One big reason is that MongoDB doesn't have a full text feature; instead, it has regular expression matches. The regex matches work wonderfully but will only use indexes in certain cases.