Is there a proven set of recommendations for JBoss clustering, such as:
What is the recommended minimum number of physical machines for a JBoss cluster?
How do we determine the RAM requirement for Spring+Hibernate based apps running on a JBoss server instance?
What is the minimum number of CPUs that should be available for each application server instance?
Is it better to have more physical boxes with fewer application server instances per box, or fewer physical boxes with more application server instances per box?
Is there a recommended way of doing weight-based load balancing, if it is available on JBoss?
What is the right proxy plug-in for load balancing, and the right algorithm to use?
I'd appreciate your suggestions.
Seeing words like "ideal" and "better way" makes me wonder if you can define and quantify either one. It's doubtful for your special case; it's even more difficult for a general case.
Your choices are restricted to the techniques that JBoss clustering makes available to you (e.g. round robin, least busy, etc.). The hard work of choosing and tuning to create an optimum solution is up to you. The answer is likely to depend on the details of your situation. No one knows those, except perhaps you.
P.S. - That's what I'd call a run-on sentence. I'd refactor that question into several if I were you. Add some punctuation.
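To make the "weight-based" option from the question concrete, here is a minimal sketch of smooth weighted round robin in Python. It is not JBoss's API or any proxy plug-in's actual implementation; the `Node` class, the `pick` function, and the node names are made up for illustration. The idea is only that a node with weight 3 should receive roughly three times as many requests as a node with weight 1.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical backend node; not a JBoss or proxy plug-in type."""
    name: str
    weight: int       # relative share of traffic this node should receive
    current: int = 0  # running score used by the selection algorithm


def pick(nodes):
    """Return the name of the next node using smooth weighted round robin."""
    total = sum(n.weight for n in nodes)
    # Every node earns its weight on each round.
    for n in nodes:
        n.current += n.weight
    # The node with the highest running score is chosen...
    best = max(nodes, key=lambda n: n.current)
    # ...and pays back the total weight, so it is not chosen every time.
    best.current -= total
    return best.name


if __name__ == "__main__":
    cluster = [Node("node-a", 3), Node("node-b", 1)]
    print([pick(cluster) for _ in range(8)])
```

Whether fixed weights are the right choice at all depends on your hardware and workload, which is the point made above.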
Why is networking not in the core code? I understand there are many different needs and environments but it seems like an opportunity to get many people contributing to a common project.
edit: This is for Office Hours, they asked for questions to be posted on SO. Maybe a new tag is appropriate.
https://github.com/kubernetes/community/blob/master/events/office-hours.md
Because there are literally dozens of ways to configure networking, depending on so many factors that it is hard or even impossible to say that one of them is the one. Plus, many people find that networking is one of the hardest parts of bootstrapping a cluster, and you really need it to be tailored to your needs. There is no "one size fits all" in this scope. It's just not possible. There have been multiple ways to network since the 1960s: competing systems, different ideas. Some people survive with static routing, others need dynamic routing such as RIP/OSPF, and another scale calls for BGP. CNI goes as far as it makes sense: it provides a common interface that is pluggable with different implementations. That is how it is, how it should be, and how it most likely will be for a long time :)
In our firm we run complex simulations using our own software developed in .NET. These simulations are well-suited to parallel computation and we currently make much use of the various multi-threading features native to .NET. Even so, simulations often take hours or days.
We'd like to explore the potential of distributing computation over our local network of high-performance (24 core) workstations to access more CPU power. However we have no experience in this area.
Searching on Google reveals a few MPI-based options such as Pure MPI, MPI.NET, plus some commercial software such as Frontier.
Which solution should we consider for something that is ideally well-suited to a .NET environment and is relatively easy to set up?
Thanks!
Multithreading != grid computing, so you will need to rewrite some parts of your application regardless of what you choose in the end.
I don't know your network infrastructure, but it sounded to me like you want to use normal desktop workstations to run the distributed code. I wouldn't use MPI for that. MPI was developed for clusters and supercomputers where the network supports high bandwidth and low latency. Those aren't the properties of a traditional office network (unless I misunderstood something).
The next thing you have to deal with is the fact that users shouldn't turn off their machines while computations are being performed on them. No grid computing platform (including MPI) deals with this kind of issue, as such platforms usually run on server hardware which has few failures and runs 24/7.
I don't think there is a simple and inexpensive solution to this. You could have a service running on each machine which could execute code from DLLs with predefined parameters and send back responses. Those assemblies could be downloadable from some Windows share. But you want really huge pieces of work to be distributed like this. You would get almost no improvement if the application runs for only a minute or less.
In the end you'd also need a service to find out which of those services are online, some kind of in-memory DB where every service could write its IP address and the fact that it's online, so that the clients would know to whom they can distribute the work. This could be done using RavenDB (as you said you are working with .NET), Redis, or an application which was actually written for this kind of problem, ZooKeeper.
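The "which services are online" part can be sketched as a registry with heartbeats. This is only an in-process illustration in Python (not .NET, and not the RavenDB, Redis or ZooKeeper API); the `WorkerRegistry` class, its method names, the addresses and the 30 second timeout are all assumptions. In a real setup the dictionary would live in one of the stores mentioned above.

```python
import time


class WorkerRegistry:
    """Hypothetical in-memory registry of workers that report heartbeats."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds   # how long a worker counts as online
        self._clock = clock       # injectable so the logic can be tested
        self._last_seen = {}      # address -> time of last heartbeat

    def heartbeat(self, address):
        """Called periodically by each worker service to say it is alive."""
        self._last_seen[address] = self._clock()

    def online(self):
        """Return addresses whose last heartbeat is within the TTL."""
        now = self._clock()
        return sorted(
            addr for addr, seen in self._last_seen.items()
            if now - seen <= self._ttl
        )


if __name__ == "__main__":
    registry = WorkerRegistry(ttl_seconds=30.0)
    registry.heartbeat("192.168.0.11")
    registry.heartbeat("192.168.0.12")
    print(registry.online())
```

A machine that a user switches off simply stops sending heartbeats and drops out of `online()` once the timeout passes; work already sent to it still has to be re-queued by the client, which this sketch does not cover.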
Thank you very much in advance.
First of all, I conceive of scalability as the ability to design a system that does not change when the demand for its services, whatever they are, increases considerably. Do you need more hardware (vertically or horizontally)? Fine, add it at your leisure, because the system is prepared and has been designed to cope with it.
My question is simple to ask but presumably very complex to answer. I would like to know what you look for in a framework to make sure it will scale accordingly, both in number of hits and in number of sessions running simultaneously.
This question is not about a technology or a particular framework at all; it is more of a theoretical question.
I know that this depends very much on having a good database design and proper hardware behind it, with replication, etc. Let's assume that all of this exists. What criteria must my framework still meet?
Provide a memcache?
Ability to run across multiple machines (at the web server level) and use many replicated databases? But what is in the software that makes that possible?
etc...
Please, let's not tie the answers to any particular programming language or technology.
Thanks again,
D.
I think scalability depends most of all on the use case. If you expect huge amounts of data, focus on the database; if it's about traffic, focus on the server; if it's about adding new features, focus on your data model and the framework you are using.
Comparing a microblogging service like Twitter to a university website or a web service like Google Docs, you will find quite different requirements.
First of all, the common notion of scalability is the ability of software to improve in throughput or capacity when more hardware resources are added (CPUs, memory, bandwidth, etc.).
Software that does not improve with increased resources is not scalable.
Leaving the definitions aside, I think your question is about evaluating frameworks that you are planning to introduce into your implementation and that may affect your software's ability to scale.
IMHO the most important factor to evaluate when introducing a framework is whether there is hidden serialization in it (serialization that in effect transfers to, and affects, your software).
So if you introduce a framework that introduces serialization into your application, that can affect your ability to scale.
How to evaluate?
Careful source code inspection (if open source)
Are there any performance guarantees offered by those that build the framework?
Do measurements yourself to see how introducing this framework affects your performance and replace if not satisfied
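One way to see why hidden serialization matters so much is Amdahl's law, which the answer above does not name but which describes the same effect: if a fraction of the work can only run on one worker at a time, adding hardware stops helping fairly quickly. A minimal Python sketch follows; the function name and the sample fractions are illustrative assumptions, not measurements of any framework.

```python
def amdahl_speedup(serial_fraction, workers):
    """Upper bound on speedup when `serial_fraction` of the work is serialized.

    serial_fraction: share of the work (0.0 to 1.0) that cannot run in parallel.
    workers: number of CPUs or machines applied to the parallel part.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


if __name__ == "__main__":
    # Hypothetical serialized shares, e.g. time spent behind a global lock.
    for fraction in (0.0, 0.05, 0.25):
        for workers in (2, 8, 64):
            print(fraction, workers, round(amdahl_speedup(fraction, workers), 2))
```

With even a modest serialized share the curve flattens, which is why measuring the framework under load, as suggested in the last point, is more reliable than any guarantee on paper.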
I am unsure whether this question belongs here, so please feel free to migrate it if it doesn't.
My question is this: is it possible to combine many different PC units to work as one?
Take for example, buying 3 different HP desktop PCs. Then link the hardware so that they act as one PC.
If so, please point me to some resources I can use.
Thanks for your time.
Note
I am not referring to linking them over a network, but rather, making the actual hardware work together.
I am not sure this is possible, so I suspect none of my Google search terms are related to the issue.
You should realize that linking them over a network does not rule out their ability to work together to complete a task. Most supercomputers and clusters today are interconnected via a network (albeit a very high speed one like InfiniBand). The key is to have software that can understand that it's operating in a distributed environment (e.g. MPI libraries). You might also take a look at OpenMP or Hadoop. It really depends on what you want to do with it.
You cannot link several computers together to behave like one. For that you need special hardware which lets you extend the number of CPUs working together (like a Cray).
If you are talking about writing an application that will be processed by those computers, you may be referring to MPI.
You can use Open MPI to do that; most languages nowadays have MPI libraries.
You can find more detailed information in the Wikipedia article on parallel computing.
Good day,
Our school, a small high school in semi-rural New Zealand, is currently looking into online homework solutions. Being one of the IT guys, I have been asked to look into some of the options. We have checked around and there are no robust solutions that cover what we are looking for. So, we are considering development of our own system, either on our own or in collaboration with some other schools.
Before I put significant time into any one option, I thought I should ask for some expert advice.
Please keep in mind that one of our major obstacles is that around 20% of our students are on dial-up because broadband is not available in their area.
We are also not limited to the technologies listed; they are just the ones that we have been looking into up to this point.
With that in mind, here goes.
1. Is there a way to pre-determine the bandwidth needed for these technologies?
2. If bandwidth continued to be too limiting, could the final solution stand alone so we could distribute it to students on CD or USB stick?
3. What are some pros/cons of each for use with databases, specifically MySQL or PostgreSQL? (After all, we do need to keep track of lots of data.)
4. What are some pros/cons of each of these for RIA development?
I appreciate everyone sharing their time and expertise on the matter.
Cheers,
Ben
1) If you write a full-AJAX application, such as in GWT, the bandwidth will be:
a) the size of the application's JavaScript, images, etc. You can assume that everything is loaded when the user logs in (the cache for images may seem big, but it is easily overloaded)
b) the size of the communication. In GWT it depends only on you! There is no magic full-frame reloading; only what YOU want to send is sent
2) I don't get your point. Standalone applications can be distributed that way; applications that use databases generally can't
3) PostgreSQL has high compatibility with Oracle: the same transaction + SELECT FOR UPDATE behaviour, and PL/pgSQL is heavily inspired by PL/SQL (easy to rewrite stored procedures).
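For the "can we pre-determine the bandwidth" part, a back-of-the-envelope estimate is usually enough: measure the size of the initial download and of a typical request, then divide by the line speed. The Python sketch below does that arithmetic. The payload sizes are made-up placeholders, and 56 kbit/s is the nominal dial-up rate; real dial-up throughput is typically lower, so treat the results as a best case.

```python
def transfer_seconds(payload_bytes, line_kbit_per_s):
    """Best-case time to move `payload_bytes` over a line of the given speed."""
    if line_kbit_per_s <= 0:
        raise ValueError("line speed must be positive")
    bits = payload_bytes * 8
    return bits / (line_kbit_per_s * 1000)


if __name__ == "__main__":
    DIAL_UP_KBIT = 56  # nominal rate; an assumption, not a measurement

    # Hypothetical sizes: replace with what your browser's network tab reports.
    initial_download = 500_000   # compiled JavaScript + images on first login
    one_request = 2_000          # a single answer submission and its response

    print("first load:", round(transfer_seconds(initial_download, DIAL_UP_KBIT), 1), "s")
    print("per request:", round(transfer_seconds(one_request, DIAL_UP_KBIT), 2), "s")
```

The first number is the one that hurts on dial-up; once the application is cached, only the per-request traffic from point b) matters.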
I personally suggest MySQL for a school project for its simplicity. PostgreSQL is powerful but a bit complicated to configure, and its visual tool for optimizing queries is not good.
Without considering the bandwidth, I definitely suggest ZK since, again, it is much easier to learn, to develop with and to maintain (and also much more powerful). The bandwidth consumption and latency of GWT really depend on how much effort you want to invest and how familiar your people are with distributed computing, while the network traffic is basically the state of the UI (not data), which is reasonably small. In short, you could get the best network bandwidth and latency if you optimize GWT as far as possible, while with ZK there is less to worry about but, if you want to improve it, you have to use jQuery (i.e., work in JavaScript).
Thanks lechlukasz, I appreciate your comments and insight.
I will clarify my point about stand alone applications. We have a number of students, as high as 20%, who do not have access to broadband due to their geographic location. We are considering, as part of the design, how we may be able to distribute a stand alone version.
For instance, if we were to abstract all the database calls using a separate class in GWT, we could recompile a stand alone version that didn't make the database calls. The database would likely only be for tracking results and reporting.
In reality, we would likely implement the front end product first with references to empty methods for storing the results in a database and implement those methods at a later time.
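The shape of that idea, sketched in Python rather than GWT/Java to keep it short: the front end talks to a small storage interface, the stand-alone build plugs in an implementation that does nothing, and the online build plugs in one that records results. All class and function names here are hypothetical, and the in-memory store merely stands in for the real database code that would be written later.

```python
from abc import ABC, abstractmethod


class ResultStore(ABC):
    """Hypothetical interface the front end uses for all result storage."""

    @abstractmethod
    def save_result(self, student_id, task_id, score):
        ...


class NoOpResultStore(ResultStore):
    """Stand-alone (CD/USB) build: accepts results and discards them."""

    def save_result(self, student_id, task_id, score):
        pass


class InMemoryResultStore(ResultStore):
    """Placeholder for the later database-backed implementation."""

    def __init__(self):
        self.rows = []

    def save_result(self, student_id, task_id, score):
        self.rows.append((student_id, task_id, score))


def submit_answer(store, student_id, task_id, answer, expected):
    """Front-end logic: mark the answer, then hand the score to the store."""
    score = 1 if answer == expected else 0
    store.save_result(student_id, task_id, score)
    return score


if __name__ == "__main__":
    offline = NoOpResultStore()
    print(submit_answer(offline, "student-1", "task-1", "4", "4"))
```

The front-end code is identical in both builds; only the object passed in as `store` changes.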
For the record, we have started to code up some test cases using GWT/SmartGWT and are pleased with the results. Although we cannot comment on the other technologies considered because we didn't try them to the same extent, we are pleased with the results to this point of the project.
Cheers,
Ben