Using Apache Lucene Solr for Full Text Searches - zend-framework

We are planning to develop a web application whose main feature will be searching user-supplied text (full-text search). We are planning to use a PHP framework together with Apache Lucene/Solr. How does it differ from Hibernate Search? Which is the better option? What should be kept in mind regarding the database if most queries against it will be full-text searches? Which PHP framework works best with Apache Lucene? Zend provides an additional component for working with Lucene; similarly, Symfony has a plugin for using Lucene. This question is very general, so any suggestions regarding the design of the web application and its optimization would be very valuable.

Apache Solr runs in a Java web container (Tomcat, Jetty...) and doesn't use a DB; it stores its data directly on the filesystem in index files. Your PHP application communicates with the Solr server (running on the same machine on a different port, or on a different machine) via HTTP, exchanging XML or JSON. You need to implement this kind of communication in your PHP app. There are several libraries for connecting PHP to a Solr server, such as Solarium.
The Zend component you are referring to is probably Zend_Search_Lucene, a PHP implementation of Apache Lucene (so you skip the Java part), but with it you can't take advantage of Solr's features.
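To make the HTTP exchange described above concrete, here is a minimal sketch in Python (the same idea applies from PHP). It only builds a request URL for Solr's standard /select handler and parses a JSON response of the usual shape. The host, port, core name "articles", field names and the sample response are all assumptions made up for illustration, and the exact URL layout depends on your Solr version and setup.

```python
import json
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10):
    """Build a URL for Solr's /select search handler, asking for JSON output."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/{core}/select?{urlencode(params)}"


def extract_docs(response_text):
    """Pull the hit count and the matching documents out of a Solr JSON response."""
    body = json.loads(response_text)["response"]
    return body["numFound"], body["docs"]


# Hypothetical host, port and core name; adjust to your installation.
url = build_select_url("http://localhost:8983/solr", "articles", "title:lucene")

# Hand-written sample of the response shape, so the sketch runs without a server.
sample = (
    '{"responseHeader": {"status": 0}, '
    '"response": {"numFound": 1, "start": 0, '
    '"docs": [{"id": "42", "title": "Intro to Lucene"}]}}'
)
found, docs = extract_docs(sample)
```

In a real application you would send `url` with any HTTP client and pass the response body to `extract_docs`; a client library such as Solarium wraps these steps for you.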

Solr has plenty of features to work with: full-text search, hit highlighting, faceted search, dynamic clustering, caching, range queries and geospatial search. We run our CMS website on top of Solr, and we are able to handle every type of search request within Solr.
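Several of these features are switched on through plain request parameters. The sketch below, in Python, builds a query string that combines a range filter, faceting and highlighting using Solr's `fq`, `facet`, `facet.field`, `hl` and `hl.fl` parameters. The field names (`content`, `price`, `category`) are hypothetical and would have to exist in your own schema.

```python
from urllib.parse import urlencode


def build_feature_params(query, range_filter, facet_field, highlight_field):
    """Encode a Solr query that also asks for a range filter, facets and highlighting."""
    params = [
        ("q", query),
        ("fq", range_filter),           # filter query, here used for a range
        ("facet", "true"),              # turn faceting on
        ("facet.field", facet_field),   # count hits per value of this field
        ("hl", "true"),                 # turn hit highlighting on
        ("hl.fl", highlight_field),     # field(s) to generate snippets from
        ("wt", "json"),
    ]
    return urlencode(params)


# Hypothetical field names, for illustration only.
qs = build_feature_params("content:search", "price:[10 TO 100]", "category", "content")
```

The resulting string is appended to the /select URL of your Solr server; facet counts and highlighted snippets come back in separate sections of the response next to the matching documents.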

Related

Recommendations for a Full-stack Framework for REST?

I am looking for a robust REST framework to eliminate all the boilerplate code that comes with starting up a new REST-only web service (for mobile clients). Is there a framework that already has this built in, where I could, for example, simply build the domain models and run with it? I would like to see:
Authentication & User Model
Logging
Basic CRUD
Permissions (for model access)
Scalability
It seems every web service at a minimum needs the above capabilities. Somebody, somewhere must have written a good re-usable framework with the above capabilities. Any ideas? I would prefer Node.js, Java or even hosting with a PaaS service provider that offers these features.
Spring 3 MVC provides a very nice and simple annotation based framework for REST.
See http://blog.springsource.org/2009/03/08/rest-in-spring-3-mvc/. It can be deployed on any Java web server, such as Jetty or Tomcat.
A framework like XAP provides a combined solution of Spring and Jetty plus it's built for dynamic scaling.
See http://www.gigaspaces.com/xap.
Lastly, if you want to easily onboard this solution onto any cloud, CloudifySource provides an open source project that includes XAP capabilities and PaaS.
See http://www.cloudifysource.org
I use Symfony 1.4 for this. It is a PHP framework. It generates most of what you need for free. The database work is also quite easy, as Symfony uses ORM libraries (you can choose, but I can recommend Doctrine: http://www.doctrine-project.org/).
For example, generating the whole backend (admin) site is a matter of running one command. They have a great e-book for free. More info here: http://www.symfony-project.org/.
There is also Symfony 2.x (http://symfony.com/), which has a lot of new features (e.g. the new Doctrine 2.0). With the bundle (plugin) https://github.com/FriendsOfSymfony/FOSRestBundle in particular, building a RESTful service is quite easy.

how to integrate zend framework with cakephp 2.0 for the search optimization?

Is it possible to integrate the zend framework with cakephp 2.0?
I want to have search features in the cakephp 2.0 web-application using zend framework.
Define what your requirements for a search are. This question is too generic.
There is a search plugin for CakePHP by the way: CakeDC Search
A plugin to create search indexes.
Maybe it's going to do what you want.
Your best solution for actual searching is to use Apache Solr. It's a search engine that can be used with any framework through its web service interface.
http://lucene.apache.org/solr/
It is based on the same library that Zend Framework Search uses, called "Lucene", except that Apache Solr exposes the Lucene functions as web services. Very nice and easy to use.

Restlet + mongoDB + Freemarker

We are building a web-based application in Java that should be accessible from any device, so we zeroed in on Restlet for our REST-based web service needs.
For the UI we are thinking of Freemarker together with Twitter Bootstrap, and the database will be MongoDB, with Guice for dependency injection.
Since I am new to most of this technology stack, do you think this is a fair choice for the long run? Also, as the database mapper framework we decided to use Jongo, since it seems lightweight. Kundera is an option but it has lots of dependencies. What do you experts say?
"Kundera is an option but it has lots of dependency." I'm not sure what you mean by this statement. Could you please explain it in more detail?
Please take a look at https://github.com/impetus-opensource/Kundera/wiki/Kundera-Mongo-performance for performance using Kundera!
It really depends on your needs
REST Framework :
IMHO you should test at least these 3 JAX-RS frameworks: RestEasy / Jersey / Restlet, and choose one according to your needs.
JAX-RS Frameworks
https://stackoverflow.com/questions/1710199/which-is-the-best-java-rest-api-restlet-or-jersey
UI :
I've worked with Jersey + Freemarker through a framework called Webengine from Nuxeo, it was ok.
Nevertheless, you should consider a rich client approach based on Javascript/CSS/HTML (see Backbone.js, Ember.js)
Pros :
With such an approach you could expose JSON REST services using a JAX-RS framework (instead of Freemarker/HTML services).
These services can be consumed by a web application and/or native mobile apps (iOS, Android).
Cons:
Your team must have advanced JavaScript skills (this blog can help)
Database :
What kind of data do you need to store ?
MongoDB is document-oriented and flexible enough to cover lots of needs
As you said, Jongo is a lightweight API (500 lines of code + 1 dependency) over mongo-java-driver.
It allows you to query MongoDB as if you were in MongoShell (ie. with plain json/bson queries) and map your object using jackson.
This question is a good example: Mongo DB query in java
Relying on Restlet Framework for your RESTful web API/service backend sounds like a good choice for a multi-device application. FreeMarker is very powerful and flexible, so you should be in good company there as well.
I don't know too much about the other pieces of your stack.

Lucene indexing service

We want to use Lucene in our J2EE web application. We want to create a separate Lucene service (which will be deployed on a separate JBoss server) for Lucene-related functionality (like index writing and searching documents).
We will call the Lucene service from our J2EE application for Lucene-related functionality.
What is the best way to communicate between the two applications? RMI, HTTP, a web service, or something else?
Please share your thoughts.
Don't reinvent the wheel. use Solr.

Is Google Web Toolkit fine for developing a database-based web application?

Is Google Web Toolkit fine for developing a database-based web application, or do you have any other suggestion?
Thanks to answerers!
For a heavy Database based web application, nothing beats Grails. Check out this tutorial by IBM. It will show you the power of Grails and how easy it is to develop database based web applications in minutes. I love GWT and smartGwt, but will go for them over pure grails only if there is a lot of non-database based front end (client side) logic.
If you do not have a programming language of choice (Grails is groovy based, which is based on Java), you could even look at Ruby on Rails which was the inspiration for Grails itself.
Alternately, you can add both grails and gwt in the mix by using this gwt grails plugin so that you have a powerful database integration, as well as a powerful front end developer. (I haven't used this though)
Sure, but you will need to create your own RPC service to get records from server to client and to deliver modified records back the server. But it isn't difficult at all.
Alternatively you could also use SmartGWT, which is an extension for GWT with more widgets, etc. They have data-bound objects, but in the free version you would need to create your own data sources. If you decide to buy a license, they seem to have database integration out of the box.
An additional point to consider with SmartGWT is that it has a relatively big download size: about 3MB uncompressed and almost 1MB compressed (the HTTP server should compress it; compression is part of the HTTP standard and is transparent to the client). So if it is going to be a service on the public internet, it might take quite long to load (often exceeding the magic 8 seconds).
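As a rough back-of-the-envelope check of those numbers, here is a small Python sketch. The 1 Mbit/s link speed is purely an assumption for illustration (it is not from the answer above), and the calculation ignores latency and protocol overhead.

```python
def transfer_seconds(size_megabytes, bandwidth_megabits_per_s):
    """Time to move a payload over a link, ignoring latency and protocol overhead."""
    return size_megabytes * 8 / bandwidth_megabits_per_s


# Assumed link speed of 1 Mbit/s; real connections vary widely.
compressed = transfer_seconds(1, 1)    # about 1MB compressed
uncompressed = transfer_seconds(3, 1)  # about 3MB uncompressed
```

Under that assumption the compressed download alone takes about 8 seconds and the uncompressed one about 24, which shows why enabling HTTP compression matters here; on faster links the times shrink proportionally.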
I worked with GWT (Google Web Toolkit) for 1.5 years and learned that it's a perfectly good platform for developing web applications backed by a database, provided you have people with the right skill sets on the project and a basic design developed according to the project's requirements.