Clustering/load balancing strategies using JBoss - jboss

I've studied clustering and load balancing in JBoss, and two options stood out: mod_cluster and HAProxy.
I've already read the documentation for both and need to decide which one to choose.
What would be your choice and why?
Regards,
Rafael Roque


Connection between programs over the network

I want to explore the full range of tools that connect programs over the network.
To clarify the question, I've divided it into subquestions:
Why were particular groups of programs (or specific tools, frameworks, and approaches, along with the programming languages they can be used from) popular in each period? (I expect a description of the problems that were solved, a description of the tools, why those tools were considered the best solution to those problems at the time, and why some tools lost popularity.)
What is the full history of software communication over the network? (Popularity of tools and approaches, decade by decade.)
What are the modern solutions to this problem?
I can distinguish only two significant approaches:
RPC, RMI and their implementations. (I saw this, but it is about a concrete problem and the specific tools that solve it; I want to see where this problem fits in the whole picture of connecting programs over the network. I have heard of these implementations: ONC RPC, XML-RPC, CORBA, DCOM, gRPC. Which are active now? Which are reasonable to use? Which are preferable, and why? I don't want opinion-based answers, so I will accept answers like "technology A is better than technology B for problem X because ..." only if they are backed by reliable research, statistics or facts.) I heard that RPC and RMI were popular 10 years ago. Are they still?
Web services: REST, SOAP.
Am I missing something? Maybe there are technologies that solve the problem in a completely new way? Maybe there are technologies that can be treated as a replacement for RPC (RMI) and Web Services? Can we replace RPC (RMI) with REST for any task? Can we replace RPC (RMI) with REST only for modern tasks? Should I classify technologies not as RPC and Web Services, but in some other way?
As a partial answer, I can share my experience with RabbitMQ.
As explained here, it can be used in several different ways:
RPC, by implementing a "callback" queue
One-to-one and one-to-many routing strategies to propagate your events through your whole infrastructure and target the right destination.
It can persist messages to avoid losing data when a crash occurs, and plugins extend what it can do (e.g. the x-delayed plugin).
This technology, written in Erlang, is powerful and well worth trying for communication between programs.
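To illustrate the "callback" queue idea mentioned above, here is a minimal in-process sketch. It does not use RabbitMQ or any client library: Python's standard queue.Queue stands in for the broker queues, and the names request_queue, server_loop, rpc_call and the "square a number" procedure are all hypothetical. The point is only the pattern: the caller attaches a private reply queue and a correlation id to each request, and matches the reply against that id.

```python
import queue
import threading
import uuid

# In-process stand-in for a broker queue (assumption: not the RabbitMQ API).
request_queue = queue.Queue()


def server_loop():
    """Consume requests and send each reply to the caller's callback queue."""
    while True:
        message = request_queue.get()
        if message is None:  # sentinel value: stop the worker
            break
        result = message["body"] ** 2  # the "remote procedure": square a number
        message["reply_to"].put(
            {"correlation_id": message["correlation_id"], "body": result}
        )


def rpc_call(n, timeout=2.0):
    """Send a request and block until the reply with a matching id arrives."""
    callback_queue = queue.Queue()  # private reply queue for this caller
    correlation_id = str(uuid.uuid4())
    request_queue.put(
        {"reply_to": callback_queue, "correlation_id": correlation_id, "body": n}
    )
    while True:
        # Raises queue.Empty if no reply arrives within the timeout.
        reply = callback_queue.get(timeout=timeout)
        if reply["correlation_id"] == correlation_id:
            return reply["body"]


worker = threading.Thread(target=server_loop, daemon=True)
worker.start()
answer = rpc_call(7)
request_queue.put(None)  # shut the worker down
worker.join()
```

With a real broker the two sides would run in separate processes and the reply queue would be named in the message properties, but the request/reply matching works the same way.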
To your question "Am I missing something": yes.
Very popular communication patterns are the so-called event-driven or message-driven protocols. These protocols are often used in distributed systems such as web applications, microservices and IoT environments. The communication is completely asynchronous and allows building scalable and loosely coupled systems.
There are many different frameworks and methods for event-driven systems, such as WebSockets, WebHooks, Pub-Sub, and messaging libraries and protocols such as ActiveMQ, OpenMQ, RabbitMQ, ZeroMQ and MQTT.
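As a rough illustration of the publish/subscribe idea, here is a minimal in-process sketch. The EventBus class and the topic and handler names are hypothetical and do not belong to any of the libraries listed above. It is also synchronous for simplicity, whereas a real broker delivers messages asynchronously and across processes. What it does show is the loose coupling: the publisher only knows a topic name, not who is listening.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish/subscribe hub (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable to be invoked for every event on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Deliver the payload to every subscriber; return how many got it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = EventBus()
received = []
bus.subscribe("orders", lambda p: received.append(("billing", p)))
bus.subscribe("orders", lambda p: received.append(("shipping", p)))
delivered = bus.publish("orders", {"id": 1})
```

Adding a third consumer means one more subscribe call; the publishing code does not change.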
Hope this info helps for your research.

Apache Stanbol scalability and real-world applications

I'm starting a project with requirements such as NLP, storage of semantic data, content management, etc., and Apache Stanbol seems like a nice fit. I'm not sure it's ready, though, so I'm trying to make an appropriate assessment before starting to work with it, as there are a few things that worry me:
Stanbol seems a bit young and immature (newest version 0.12). Has anybody used it in a commercial project/application/setup (I failed to find this information online)? What is the scale of those projects?
How horizontally scalable is Stanbol? What are its cloud/clustering capabilities? As far as I know it relies on Apache Jena for storage, and Jena storage isn't horizontally scalable which would make Stanbol unable to scale horizontally as well. I might be wrong about this, but this is my current understanding, please correct me if I'm wrong. Maybe Jena can be swapped with something else to be used as RDF storage provider and I'm not aware of it?
Learning resources for Stanbol seem a little scarce. Does anyone know of a place/book/whatever where I can get more understanding about Stanbol under the hood (other than the official Stanbol website and the IKS website)?
Are there any good alternatives? I know there are nice alternatives regarding NLP (e.g. GATE, UIMA), but they lack CMS capabilities.
Thanks.
To your question:
1) I've been working on a project involving Stanbol (version 0.10). It's still in the pre-production stage. For CMS, we evaluated Jackrabbit and Alfresco. Alfresco (CMIS) was found to be a better choice in our case. What I like about Stanbol is the enhancement chains and the set of Enhancement Engines that come by default. This is a small to mid-size project.
3) I found this book (Instant Apache Stanbol, Packt Publishing) very practical and useful in my work, especially the sections on Entityhubs and Enhancement Engines.
A viable option is to use Redlink, which offers content analysis and linked data services in the cloud using Apache Stanbol and Apache Marmotta in the back end.
The Redlink team has worked on IKS and Apache Stanbol, so getting in contact with them can be a good starting point when deciding whether to use these technologies in production environments.

Anyone using HyperDex in production?

I just noticed that the relatively new open source NoSQL database "HyperDex" has no mentions in questions on S.O. yet. Is anyone using it? How does it compare with other NoSQL engines?
We're working with some folks at LinkedIn to use HyperDex to power some of their custom analytics applications. We're also in discussion with a couple startups to build applications on top of HyperDex.
Our mailing list is the HyperDex discuss list and has been relatively active as of late. Many of these folks are using HyperDex and helping us to improve it.
Finally, HyperDex holds its own against other popular NoSQL engines. The HyperDex performance benchmarks show that HyperDex offers both higher throughput and lower latency than other popular systems. The HyperDex tutorials show just how easy it is to deploy a cluster. Start with the QuickStart and work your way up to deploying a truly fault tolerant cluster.
Emin gave a talk at our company today. It sounds like an interesting project, but I think you would encounter some gotchas if you deployed it in production, such as load balancing and choosing an optimal subspace scheme.
You can find the performance comparison on their site. The paper is also worth reading.

How to set up a production server environment for a Scala Lift web application?

I am going to need to set up a production server to host some Scala Lift web services and applications, but I've never dealt with JavaEE/servlet technologies. Could you point to a Scala/Lift-specific HOWTO on setting up a production server or, if you don't know of such a publication, explain it in a more or less simple way?
Lift runs on any regular servlet container, so there's nothing particularly Lift-specific you need to do when building your environment. That being said, chapter 15 of Lift in Action should help you out with the more general case of taking a Lift application to deployment.
Hope that helps.
Not sure if this is what you are asking, but you can set up a server with Debian Lenny to serve a Lift application using this reference in the Lift wiki, with a Jetty container and a PostgreSQL database. The setup usually varies depending on the requirements of your application (which database, etc.), so you will eventually need to provide more information on what you need to set up for a given environment. Apart from this, to reinforce what Tim Perrett said, chapter 15 of his book is really good at detailing which servlet container to choose, as well as deployment techniques, tools and options.

Is there a proven set of recommendations for JBoss clustering?

Is there a proven set of recommendations for JBoss clustering, such as:
What is the recommended minimum number of physical machines for a JBoss cluster?
How do we determine the RAM requirement for Spring + Hibernate based apps running on a JBoss server instance?
What is the minimum number of CPUs that should be available to each application server instance?
Is it better to have more physical boxes with fewer application server instances per box, or fewer physical boxes with more application server instances per box?
Is there a recommended way to do weight-based load balancing on JBoss?
What is the right proxy plug-in for load balancing, and the right algorithm to use?
I'd appreciate your suggestions.
Seeing words like "ideal" and "better way" make me wonder if you can define and quantify either one. It's doubtful for your special case; it's even more difficult for a general case.
Your choices are restricted to the techniques that JBoss clustering makes available to you (e.g. round robin, least busy, etc.). The hard work of choosing and tuning to create an optimum solution is up to you. The answer is likely to depend on the details of your situation. No one knows those, except perhaps you.
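To make the idea of weight-based round robin a bit more concrete, here is a minimal sketch of one common way to do it, often called smooth weighted round robin. This is not JBoss configuration and not how any particular proxy plug-in implements it. The function name, the node names and the weights are hypothetical, and the weights are assumed to be positive integers you have chosen yourself (for example, from relative machine capacity).

```python
from collections import Counter
from itertools import islice


def weighted_round_robin(weights):
    """Yield node names forever, in proportion to their integer weights."""
    current = {node: 0 for node in weights}
    total = sum(weights.values())
    while True:
        # Every node earns credit equal to its weight on each round.
        for node in current:
            current[node] += weights[node]
        # The node with the most credit handles the next request...
        chosen = max(current, key=current.get)
        # ...and pays back the total, so heavier nodes win more often.
        current[chosen] -= total
        yield chosen


# Hypothetical cluster: node1 should receive three times node2's traffic.
picks = list(islice(weighted_round_robin({"node1": 3, "node2": 1}), 8))
counts = Counter(picks)
```

Over any whole number of cycles each node is chosen in proportion to its weight. Picking sensible weights is still the hard, situation-specific part.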
P.S. - That's what I'd call a run-on sentence. I'd refactor that question into several if I were you. Add some punctuation.