Restarting a multi-tier server architecture - JBoss

My project has 4 servers: 2 on one layer and 2 on another. I use a context switch to load balance each layer so that requests are shared between the two servers on it. 2 servers sit in the presentation tier and the other 2 sit in the application tier (or, as we call it, the business tier). The presentation tier has a dependency on the application tier. Now, the question I have is: if one of the servers in the application tier fails to start but the other three servers start up correctly, can you just restart the one application server that failed, or do you have to restart all 4 servers? We are using JBoss on these servers, if that helps with the question. If more info is needed, please ask.

I did some tests, and to reiterate what was mentioned in a comment by alpha: yes, you can restart just the one application-tier server without having to restart any of the other servers. I did notice that when you restart an application-tier server, most transactions tend to hit the other application-tier server that wasn't restarted (I don't know whether that is a configuration thing or a JBoss thing). I don't know why this happens, but it isn't a problem: the transactions that do reach the restarted server, though few, work fine, and after some time the balance returns to 50-50.

Related

Does it make sense to use Service Fabric on a single machine?

Service Fabric looks great, but right now I do not have enough demand to rent 5 machines (I believe that is the minimum number of nodes for a cluster).
I was thinking of installing the Service Fabric SDK on a single Azure Virtual Machine.
I know that I will not get the main benefits of a Service Fabric application, reliability and scalability, but I would be developing on a framework where I can easily rent more machines and scale out later if necessary, without changing anything.
Right now I have 15 microservices and I plan to add 10 more. At present I am using IIS, and deployment and maintenance are not very fast. It seems that Service Fabric could solve that, plus it would be easily scalable.
Does it make sense to use Service Fabric on a single machine, or is it better to stay on IIS?
Technically it is possible, though it doesn't make much sense. A one-node cluster runs with a special configuration, so scaling that cluster out is not supported. You can use a single-node cluster for testing and then create another one for production use.

LiveRebel Update Strategy

I am trying to use LiveRebel in my production environment. After most parts were configured, I tried to perform an update of my application from, let's say, version 1.1 to 1.3.
Does this mean that LiveRebel requires two server installations on 2 physical IP addresses? Can I have two servers on 2 virtual IP addresses?
Rolling restarts use request routing to achieve zero downtime for users. Sessions are first drained by waiting for old sessions to expire and routing new ones to an identical application on another server. When all sessions are drained, the application is updated while the other server handles the requests.
So, as you can see, for zero downtime you need an additional server to handle the requests while the application is updated. A full restart doesn't have that requirement, but it results in downtime for users.
As for the question about IPs: as long as the two (virtual) server machines can see each other, it doesn't really make much difference.

How to make a RESTful service truly highly available with a hardware load balancer

When we have a cluster of machines behind a load balancer (LB), the hardware load balancer generally holds persistent connections.
Now, when we need to deploy an update to all machines (a rolling update), the way to do it is to take one machine out of rotation and watch until no more requests are sent to that server via the LB. When the app reaches the no-request state, we update it manually.
With 70-80 servers in the picture this becomes very painful.
Does someone have a better way of doing it?
70-80 servers is a very horizontally scaled implementation... good job! "Better" is a very relative term; hopefully one of these suggestions counts as "better".
Implement an intelligent health check for the application, with the ability to adjust the health check while the application is running. What we do is make the health check start failing while the application is still running fine. This allows the load balancer to automatically take the system out of rotation. Our stop scripts query the load balancer to make sure the server is out of rotation, and then shut down normally, which allows the existing connections to drain.
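Here is a minimal sketch of what such an adjustable health check could look like, assuming a Python/Flask app; the endpoint, port, and flag-file path are illustrative, not from the original answer:

    # Health check the load balancer polls. Creating the drain flag file makes
    # the check fail while the app itself keeps serving, so the LB pulls the
    # node out of rotation and in-flight connections can drain before shutdown.
    import os
    from flask import Flask

    app = Flask(__name__)
    DRAIN_FLAG = '/tmp/drain'  # assumed path; `touch /tmp/drain` starts draining

    @app.route('/health')
    def health():
        if os.path.exists(DRAIN_FLAG):
            return 'draining', 503  # LB sees repeated failures, removes the node
        return 'ok', 200

    if __name__ == '__main__':
        app.run(port=8080)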
Batch multiple groups of systems together. I am assuming that you have 70 servers to handle peak load; this means you should be able to restart several at a time. A standard way to do this is to implement a simple token-granting service with a maximum of 10 tokens, and have your shutdown scripts check out a token before continuing.
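A token-granting service can be very small. This is a stdlib-only Python sketch under assumed names and ports (nothing here comes from the original answer); shutdown scripts GET /acquire before restarting and GET /release once the server is back in rotation:

    import threading
    from http.server import BaseHTTPRequestHandler, HTTPServer

    MAX_TOKENS = 10  # cap on how many servers may restart at once
    tokens = MAX_TOKENS
    lock = threading.Lock()

    class TokenHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            global tokens
            with lock:
                if self.path == '/acquire' and tokens > 0:
                    tokens -= 1                      # grant a restart slot
                    code, body = 200, b'granted'
                elif self.path == '/release' and tokens < MAX_TOKENS:
                    tokens += 1                      # slot returned
                    code, body = 200, b'released'
                else:
                    code, body = 409, b'denied'      # caller should retry later
            self.send_response(code)
            self.end_headers()
            self.wfile.write(body)

    if __name__ == '__main__':
        HTTPServer(('', 8000), TokenHandler).serve_forever()

A shutdown script simply loops on /acquire until it gets a 200, performs its restart, then calls /release.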
Another way to do this is with blue/green deploys. That means you have an entire second server farm; once the second farm is updated, you switch the load balancer to point at the new farm.
This is an alternative to option 3. Install both versions of the app on the same servers and then have an internal proxy service (like haproxy) switch connections between the versions of the app that are deployed. For example:
haproxy listening on 8080
app version 0.1 listening on 9001
app version 0.2 listening on 9002
Once you are happy with the deploy of app version 0.2, switch haproxy to send traffic to 9002. When you release version 0.3, switch load balancing back to 9001, and so on.
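If haproxy is run with an admin-level stats socket, the switch itself can be scripted. A Python sketch, assuming a backend named app with servers v1 and v2 and a socket at /var/run/haproxy.sock (all of those names are illustrative):

    import socket

    HAPROXY_SOCKET = '/var/run/haproxy.sock'  # needs `stats socket ... level admin`

    def haproxy_cmd(command):
        # Send one command to haproxy's admin socket and return the reply.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(HAPROXY_SOCKET)
        sock.sendall((command + '\n').encode('ascii'))
        reply = sock.recv(4096).decode('ascii')
        sock.close()
        return reply

    def cut_over(new_server, old_server, backend='app'):
        # Start sending traffic to the new version, then stop the old one.
        haproxy_cmd('enable server %s/%s' % (backend, new_server))
        haproxy_cmd('disable server %s/%s' % (backend, old_server))

    # After deploying version 0.2 on port 9002:
    # cut_over('v2', 'v1')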

Questions Concerning Using Celery with Multiple Load-Balanced Django Application Servers

I'm interested in using Celery for an app I'm working on. It all seems pretty straightforward, but I'm a little confused about what I need to do if I have multiple load-balanced application servers. All of the documentation assumes that the broker will be on the same server as the application. Currently, all of my application servers sit behind an Amazon ELB, and tasks need to be able to come from any one of them.
This is what I assume I need to do:
Run a broker server on a separate instance
Configure each application instance to connect to that broker server
Each application instance will also be a Celery worker (running celeryd)?
My only beef with that is: what happens if my broker instance dies? Can I run 2 broker instances somehow so I'm safe if one goes under?
Any tips or information on what to do in a setup like mine would be greatly appreciated. I'm sure I'm missing something or not understanding something.
For future reference, for those who do prefer to stick with RabbitMQ...
You can create a RabbitMQ cluster from 2 or more instances. Add those instances to your ELB and point your celeryd workers at the ELB. Just make sure you connect the right ports and you should be all set. Don't forget to allow your RabbitMQ machines to talk among themselves to run the cluster. This works very well for me in production.
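On the Celery side this boils down to a one-line configuration change. A sketch, with a made-up ELB hostname and credentials (the setting name matches the celeryd-era configuration style):

    # celeryconfig.py
    # Every web node and every celeryd worker points at the ELB in front of
    # the RabbitMQ cluster rather than at any single broker machine.
    BROKER_URL = 'amqp://myuser:mypassword@rabbit-elb.example.com:5672/myvhost'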
One exception here: if you need to schedule tasks, you need a celerybeat process. For some reason I wasn't able to connect celerybeat to the ELB and had to connect it to one of the instances directly. I opened an issue about it and it is supposed to be resolved (I haven't tested it yet). Keep in mind that only one celerybeat instance may run at a time, so that's already a single point of failure.
You are correct in all points.
How to make the broker reliable: set up a clustered RabbitMQ installation, as described here:
http://www.rabbitmq.com/clustering.html
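Celery can also fail over between the cluster nodes itself if you list them in the broker URL, separated by semicolons. A sketch with assumed hostnames and credentials:

    # celeryconfig.py
    # If the first broker becomes unreachable, Celery retries against the next.
    BROKER_URL = ('amqp://myuser:mypassword@rabbit1.example.com:5672/myvhost;'
                  'amqp://myuser:mypassword@rabbit2.example.com:5672/myvhost')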
Celery beat also doesn't have to be a single point of failure if you run it on every worker node with:
https://github.com/ybrs/single-beat

Does using AppFabric as a session state server for ASP.NET make it highly available?

I have a 4-server ASP.NET farm. I want to use AppFabric as my session state server, but I'm not sure if it will do what I want it to do. Some questions...
1: If some of the nodes crash, is any of the session data lost?
2: Does each server have a copy of the session data in case of failure?
The documentation states that you need to be running Windows Server 2008 Enterprise Edition or above for the "High Availability" features of AppFabric. I am running Windows Server 2008 Standard.
3: Does that mean I need Enterprise Edition to keep my session data safe if some of the nodes fail, or does AppFabric automatically keep the session data copied on all machines in case of failure?
I haven't played much with the session state bits yet, so this is based on AppFabric generally.
If you're not on Enterprise Edition, you can't use high availability :-( Essentially, in a non-HA scenario, each cache is 'tied' to a single node in your cluster, so the answer to your question is: it depends on which node crashes. If it's the one that's got the cache on it, then yes, you're boned.
If, however, you are in an HA environment, any cache that is created with the Secondaries option switched on has two copies of the cache spread across the nodes, so that if one goes down, the other copy picks up the load (and another secondary copy is created on another node).
There's quite a good conceptual explanation of HA for AppFabric here.