Is there some search engine SDK or API that I can use for local search? - gwt

I have many documents located on my disk and I want to build a search engine to search through them.
I know Google Desktop Search or Bing Desktop Search could do that, but I want to know whether there is an SDK or API I can use so that I can customize the behavior.
What I want to achieve is that I can provide a document and the local search engine will return all the documents similar to it.

In general, Lucene and Solr can help with search-related needs in Java (I am guessing you are using Java based on the GWT tag).
However, I don't know how to do a search by example with these tools. I think you would have to extract the relevant information from the document and construct a search based on it.
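One way to read "extract the relevant information" is to pick the most distinctive terms of the example document and combine them into a query. The sketch below does that in plain Java with a simple TF-IDF score. The class name, the tokenizer, the scoring formula, and the "OR" query string are all illustrative assumptions, not Lucene or Solr API. Lucene also ships a MoreLikeThis class built around a similar idea, which is probably worth checking before rolling your own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene or Solr.
final class QueryByExample {

    // Lower-case the text and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Return up to maxTerms terms of the example document, ranked by TF-IDF
    // against the corpus (ties are broken alphabetically).
    static List<String> topTerms(String example, List<String> corpus, int maxTerms) {
        // Document frequency: in how many corpus documents does each term occur?
        Map<String, Integer> df = new HashMap<>();
        for (String doc : corpus) {
            for (String t : new HashSet<>(tokenize(doc))) {
                df.merge(t, 1, Integer::sum);
            }
        }
        // Term frequency inside the example document.
        Map<String, Integer> tf = new HashMap<>();
        for (String t : tokenize(example)) {
            tf.merge(t, 1, Integer::sum);
        }
        // Smoothed TF-IDF score per term (assumed formula, chosen for simplicity).
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            int d = df.getOrDefault(e.getKey(), 0);
            double idf = Math.log((corpus.size() + 1.0) / (d + 1.0)) + 1.0;
            score.put(e.getKey(), e.getValue() * idf);
        }
        List<String> terms = new ArrayList<>(score.keySet());
        terms.sort((a, b) -> {
            int c = Double.compare(score.get(b), score.get(a));
            return c != 0 ? c : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }

    // Join the selected terms into a simple OR query string.
    static String toQuery(List<String> terms) {
        return String.join(" OR ", terms);
    }

    public static void main(String[] args) {
        List<String> corpus = Arrays.asList(
                "the cat sat on the mat",
                "the dog sat on the log",
                "the lucene index stores the terms");
        System.out.println(toQuery(topTerms("the lucene index", corpus, 2)));
    }
}
```

The resulting query string could then be handed to whatever search engine indexes the documents; common words such as "the" score low because they appear in every document.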

Related

How to implement full text search in NoSQL database?

Has anyone tried to deploy a NoSQL database with a full text search feature?
I have read a lot of topics here on Stack Overflow and on other sites, but they were all from 2011 and 2012, and I think there have been a lot of updates since then.
I have a project that requires a full text search feature, and I am trying to pick the right NoSQL database.
I am also considering Elasticsearch and Solr to enable this feature.
Is the MongoDB full text search feature working well, or does it have performance and scalability issues?
Thanks,
Elasticsearch works pretty well. You can use analyzers to stem your text, and you can store your data in JSON format. It has a "match_phrase" query that lets you run a full text search on the field you want to search. Take a look at that. You can find the documentation here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html
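For reference, a "match_phrase" search is sent as a JSON body to an index's _search endpoint. The sketch below only builds that body as a string in plain Java. The class and method names and the field name "title" are hypothetical, and a real application would more likely use an Elasticsearch client and a JSON library than hand-built strings.

```java
// Hypothetical helper for illustration; it does not talk to Elasticsearch.
final class MatchPhraseBody {

    // Minimal JSON string escaping: quotes, backslashes, and control characters.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build the request body for a match_phrase query on a single field.
    static String build(String field, String phrase) {
        return "{\"query\":{\"match_phrase\":{\""
                + escape(field) + "\":\"" + escape(phrase) + "\"}}}";
    }

    public static void main(String[] args) {
        // "title" is an assumed field name.
        System.out.println(build("title", "full text search"));
    }
}
```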
The MongoDB full text search feature works well from version 2.6 onward. However, it relies on a text index, and you can only add one such index per collection. Depending on your exact requirements, this may be too limiting. Check the MongoDB documentation for more details. If you need more flexible full text search functionality, Elasticsearch will be a better option.

Google custom search engine and partial matching

I plugged the Google Custom Search Engine into my MediaWiki site, and it seems to work fine. However, how do I also make it return results using partial matching? For example, when I searched for 'loft', it returned only the pages containing the whole word 'loft', but I was also looking for pages containing 'loft' as a substring of other words, like 'createloft', 'deleteloft', 'loftstudy', etc.
Google doesn't provide such advanced search features. If you need things like per-namespace search, substring matching, regex search, etc., use CirrusSearch, which is based on Elasticsearch.
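To make the distinction in the question concrete, here is a small plain-Java sketch contrasting whole-word matching with substring matching. The class and method names are hypothetical, and this says nothing about how Google or CirrusSearch implement matching internally; it only shows why 'loft' misses 'createloft' under whole-word rules.

```java
import java.util.Locale;

// Hypothetical demo class for illustration only.
final class PartialMatchDemo {

    // Whole-word matching: the term must equal one of the text's tokens.
    static boolean wholeWordMatch(String text, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.equals(needle)) {
                return true;
            }
        }
        return false;
    }

    // Substring matching: the term may appear anywhere, including inside a word.
    static boolean substringMatch(String text, String term) {
        return text.toLowerCase(Locale.ROOT).contains(term.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String page = "Functions: createloft, deleteloft, loftstudy";
        System.out.println("whole word: " + wholeWordMatch(page, "loft"));
        System.out.println("substring:  " + substringMatch(page, "loft"));
    }
}
```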

Google custom search engine: controlling search results

My question is simple: how do I make a certain page findable by a specific keyword?
The CSE is working fine; it just doesn't manage to find everything it is supposed to.
Google Custom Search works like Google Search, so manipulating results may not be possible. However, check out the synonyms tab at google.com/cse.
Say your users search for MBA: you can configure it to show results for Master of Business Administration.

How to search Feeds?

I want to create an iPhone project that searches feeds, as in Google Reader: if we search for some word in the "Add a subscription" tab, it displays all feeds related to that word so we can add feeds easily. Any idea how this can be done?
Thanks in advance,
Google Reader, presumably, relies on Google's own vast collection of known feeds and its search engine to locate relevant ones. My guess is you'd probably need to do the same (maybe use their FeedBurner API) unless you plan to create and maintain a service to collect, categorize, and offer up feeds to searches.
I have used http://code.google.com/apis/ajaxfeeds/documentation/reference.html#findFeeds and it worked for me.

Searching in Lucene .Net

I have used Lucene.Net for indexing, with StandardAnalyzer at indexing time. Now I want to search for, say, 'attach', and the document contains 'attached'. How do I get a successful hit for the word 'attach'? Please help me as soon as possible.
Reducing word forms to their roots is called "stemming" in search engine software. Look at the first answer to "How to enable stemming when searching using lucene.net?" for a few options for stemmers using Lucene.Net.
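As a rough illustration of why stemming makes 'attach' match 'attached', here is a toy suffix-stripping stemmer, written in Java for illustration rather than against the Lucene.Net API. The class name and the suffix list are assumptions made up for this sketch; real stemmers such as the Porter algorithm handle far more cases. The key point is that the same reduction is applied to both the indexed terms and the query terms.

```java
import java.util.Locale;

// Toy stemmer for illustration only; not the Porter algorithm and not Lucene.Net code.
final class ToyStemmer {

    // Assumed suffix list, checked in this order.
    private static final String[] SUFFIXES = {"ing", "ed", "es", "s"};

    // Strip the first matching suffix, keeping at least three characters of stem.
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // A query term hits a document term when both reduce to the same stem.
    static boolean matches(String queryTerm, String documentTerm) {
        return stem(queryTerm).equals(stem(documentTerm));
    }

    public static void main(String[] args) {
        System.out.println(stem("attached"));
        System.out.println(matches("attach", "attached"));
    }
}
```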