Is sticky sessions across multiple Data Centres with GTM possible? - session-state

I'm a software developer with limited knowledge of F5 GTMs.
We are building web applications that will run in Edge within a company network.
We have a set-up of two data centres & a GTM (F5) across the two. We use Kubernetes inside the DCs with Istio. We are building web applications that hold state in the DCs.
I'm told the GTM will allocate an IP for the data centre it decides to direct to with a TTL.
Once the TTL expires the browser will call the GTM again to get a new IP which the GTM could allocate to the other DC.
We were hoping to use sticky sessions & hold state in application memory but the GTM & it's TTL seem to make this unreliable. What I don't understand is I've worked at other places with F5 GTMs, active active deployments and sticky sessions. Is anyone aware of any way to get this to work with the GTM set-up we have? We could invest more and externalize the state but would prefer not to.

Related

How to provide multiple services through a cloud gateway?

Assume I'm working on a multiplayer online game. Each group of players may start an instance of the game to play. Take League Of Legends as an example.
At any moment of time, there are many game matches being served at the same time. My question is about the architecture of this case. Here are my suggestions:
Assume we have a cloud with a gateway. Any game instance requires a game server behind this gateway to serve the game. For different clients outside the cloud to access different game servers in the cloud, the gateway may differentiate between connections according to ports. It is like we have one machine with many processes each of them listening on a different port.
Is this the best we can get?
Is there another way for the gateway to differentiate connections and forward them to different game instances?
Notice that these are socket connections NOT HTTP requests to an API gateway.
EDIT 1: This question is not about Load Balancing
The keyword is ports. Will each match be served on a different port? or is there another way to serve multiple services on the same host (host = IP)?
Elaboration: I'm using client-server model for each match instance. So multiple clients may connect to the same match server to participate in the same match. Each match need to be server by a match server.
The limitation in mind is: For one host (=IP) to serve multiple services it need to provide them on different ports. Match 1 on port 1234. So clients participating in match 1 will connect to and communicate with the match server on port 1234.
EDIT 2: Scalability is the target
My match server does not calculate and maintain the world of many matches. It maintains the world of one match. This is why each match need another instance of the match server. It is not scalable to have all clients communicating about different matches to connect to one process and to be processed by one process.
My idea is to serve the world of each match by different process. This will require each process to be listening on a different port.
Example: Any client will start a TCP connection with a server listening on port A. Is there is a way to serve multiple MatchServers on the same port A (so that more simultaneous MatchServers won't result in more ports)?
Is there a better scalable way to serve the different worlds of multiple matches?
Short answer: you probably shouldn't use proxy-gateway to handle user connections unless you are absolutely sure there's no other way - you are severely limiting your scaling ability.
Long answer:
What you've described is just a load balancing problem. You can find plenty of solutions based on given restrictions via google.
For League Of Legends it can be quite simple: using some health-check find server with lowest amount of load and stick (kinda like sticky sessions) current game to this server - until the game is finished any computations for particular game are made there. You could use any kind of caching mechanism to store game - server relation for subsequent requests on gateway side.
Another, a bit more complicated example could be data storage for statistics for particular game - it's usually solved via sharding which is a usual consequence of distributed computing. It could be solved this way: use some kind of hashing function (for example, modulo) with game ID as parameter to calculate server number. For example 18283 mod 15 = 13 for game ID = 18283 and 15 available shards - so 13th server should store/serve this data.
Main problem here would be "rebalancing" - adding/remove a shard from cluster, for example.
Those are just two examples, you can google more of them using appropriate keywords. Just keep in mind that all of this is just a subset of problems of distributed computing.

Does this make sense for Orleans or SF and if so guidance please

We’re working to take our software to Azure cloud and looking at Orleans and Service Fabric (SF) as potential frameworks. We need to:
Populate our analysis engines with lots of data (e.g., 100MB to 2GB) per engine instance.
Maintain that state, and if an engine instance goes idle for say 20 minutes or more, we’d like to unload it (i.e., and not pay for the engine instance resource).
Each engine instance will support one to several end users with a specific data set.
Each engine instance can be highly interactive generating lots of plot data near realtime. We’re maintaining state as we don’t want to pay the price to populate engine instance for each engine interaction.
An engine instance action can take a few seconds, a few minutes, to even tens of minutes. We’ll want some feedback.
Users may access an engine instance every few seconds (e.g., to steer the engine towards a result based on feedback) and will want live plot data.
Each user will want to talk to a specific engine instance.
As a user expresses interest in running a simulation (i.e., standing up an engine instance), ideally we want him to choose small/medium/large computing resource to run his engine instance (i.e., based on the problem he’s trying to solve he may want more or less computing/memory power).
We’re considering Orleans and SF but we’re having difficulty specifying architecture based on above requirements. We’ve considered:
Trying to think about an SF partition, or an Orleans silo as an ‘engine instance’ described above.
Leveraging both Orleans and SF notion of fault tolerance through replication.
Leveraging local (i.e., to partition or silo) storage to store results and maintain state (i.e., for long periods or until idle for 20 minutes).
We’ve not understood how to:
Limit a silo or a partition to a single engine instance so that we can control resourcing of the engine instance.
Keep a user’s engine instance data separate from another users engine instance data.
Direct a request from a user (e.g., through a web API) to a particular engine instance.
Does this make sense for Orleans, does it make more sense for SF? Any pointers on how to implement the above would be helpful.
When you say SF I assume you mean SF Actors right?
You can use them the way you want, but in both cases does not look as the right solution for your problem, because:
Actors are single threaded, if you plan to share the same instance with multiple clients, each one would have to wait for the previous one to finish before it start processing anything. If you need to monitor the status of a running actor, you would have to make the actor publish the updates to external subscribers.
Actor state is isolated, so you can't access the state of other actors, the way to do it is provide a method to return it, but if the actor is running a command you have to wait the completion, unless you make a separate state service to hold the processed data.
You can't limit the resources required for a actor, in service fabric you specify the resources needed for a service, but you can't do it for actors, and you can't limit the resources they use, when they hit the limit, service fabric will try to balance the resources for your, but nothing prevent the process to consume more memory than requested.
Both actor services communicates using the ask approach, so they will "block" the caller waiting for an answer, it is asynchronous but you still have to keep the caller 'waiting'. (block and wait is because there is not an idea of fire and forget like Akka that uses the Tell approach, where it delivery the message and forget.)
Based on some of your requirements, I think a containers would be a better approach. Because:
You can limit the resource consumption for each container
The data is isolated inside the container and not visible to others
But on containers you have to manage the replication and partitioning by yourself, so in this case I would recommend the best of both worlds:
Create SF services to host the shared data sets between the the users
SF Service+Actor to only store the results of users simulations.
Containers to run the simulations and send updates to actors
This is just an example, it all will depend on your requirements, architecture and how data will be isolated from each other.

Server Architecture for hosting Java PLAY application in the cloud

This is rather a set of questions than one very specific question. In the last couple weeks/days I puzzled together information regarding how to properly host a JAVA PLAY application "in the cloud", as lots of this information is scattered over different services, I felt like gathering up all these small pieces to one, because lots of things are important to be seen in full context. However, I moved my considerations to the bottom of the question, as they are mainly my opinions and subjective findings, which I don't want to be held responsible for. If I got something wrong, please don't hesitate to point that out.
Hosting Java PLAY + MySQL on AWS for world wide accessibility
Our Scenario: we have a quite straight forward application written within the Java PLAY framework (https://www.playframework.com/), working on iOS and Android as well as with a backend-system (for administration, content management and API), storing data in a MySQL DB. While most of the users' interactions with the server is quick and easy (login, sync some data) there are also some more data-intensive tasks (download some <100mb data zips to the mobile phone, upload a couple of mb to the server). Therefore we were looking for a solution to properly provide users far away from our servers with reasonable response times. The obvious next step was hosting in the cloud.
Hosting setup within AWS:
Horizontal scaling: for the start, only 1 EC2 instance with our app will be running in eu-1a. We will need to evaluate how much resources one instance actually requires, if more instances are needed and if more instances would actually benefit to quicker response times.
Horizontal scaling across regions: once the app generates heavy user load from another region, the whole EC2 instance should be duplicated and put to another region, running a db read replica (see Setting up a globally available web app on amazon web services and https://aws.amazon.com/de/blogs/aws/cross-region-read-replicas-for-amazon-rds-for-mysql/ ).
Vertical scaling of EC2 instances: in recent tests of the old hosting setup, the database proved to be the bottleneck rather than the play app and its server's hardware specifications. Therefore it is not yet fully clear how much vertical scaling would affect response times. If a t2.micro instance serves as good as a m3.xlarge instance, of course we would rather climb our way up from the bottom here.
Vertical scaling of RDS: we will need to estimate how much traffic hits the DB server and what CPU/RAM/etc will be required. Probably we will work our way up here aswell.
Global Redirection: done using Amazon Route 53 (?). A user from Tokio should be redirected to the EC2 instance running in Asia; a user from Rome to the EC2 instance in Europe. This does not only affect API calls within the app, but also content delivery (in both directions).
Open Questions regarding the setup
Is this setup conclusive? Am I missing crucial components?
Regarding global redirection: is Amazon Route 53 the right tool? How does it differ from CloudFront (which strikes me to be purely for content / media distribution?).
How do I define correct data/api endpoints for my app? Of course I don't want to define the database endpoint of a db read replica during app deployment. Will this also happen during the AR53 (question 2) setup? Same goes for API calls, of course the app should direct it's calls to https://myurl.com/api and from there it should be redirected. Is this realistic?
I would highly appreciate all kinds of thoughts (!), also regarding the background info written below. If you can point me to further reading to solve my questions on my own, I am also very thankful - there is simply a huge load of information regarding this, but this makes it hard to narrow the answers down. I do have knowledge in hosting/servers, but I am pretty sure there are true experts out there waiting to slap me with knowledge. :)
Background-Information
Current Hosting Setup: a load balancer distributes the traffic on 2 root linux servers, both of them running the PLAY app, one of them also holding the MySQL installation.
The current hosting setup has 3 big flaws:
No vertical scalability: the hosting company would take money for each scaling step. Currently the servers are running idle, but if the app booms, we could run short on capacity quickly. Running idle is still paid as if permanently under full load. This is expensive!
No deployment support: currently, we connect through SSH, manually deploy the correct folders to the file system, recompile on the server, set privileges, apply database evolutions; do the same for the second server (with different db connection parameters). What could possibly go wrong. ;)
No worldwide availability: to set up another server in another region of the world would mean a huge effort. To have a synchronized replica of our DB can be done, but once again deploying would mean downtime, room for errors and therefore time and money.
Hosting Options for Java PLAY:
There are lot of different blog posts about this. In short:
AWS: Amazon Web Services is one of the first places you start looking. Here you get everything that's possible, at a flexible price. You set yourself up an EC2 instance, a MySQL RDS and you're good to go - all of this in the free tier, so you can experiment, play around, test your stuff.
Microsoft Azure: similar to AWS regarding pricing and possibilities. However, I did not dive into setting up and deploying our application for test purposes.
Heroku: super easy deployment from within PLAY, scalable servers. However (on the first glance?) lacks possibility to supply remote regions with high speed content.
Jelastic: even easier deployment from within PLAY / IntelliJ IDEA. You push your app image to jelastic, jelastic distributes it further to their infrastructure providers.
RedHat OpenShift (https://www.openshift.com/): sounds promising, yet not as complete as AWS.
Lots of choices and possible setups/prices. Especially after finding out about deployment using boxfuse (https://cloudcaptain.sh/) I made my choice for AWS, as it offers absolutely all we need from 1 source. Boxfuse has low monthly costs but is perfectly integrated into AWS. Scaling is supported as well as the 3 common environments (dev/test/prod). Support is outstanding.
The setup looks good. I would however make one change: your large up- & downloads. As mobile speeds may not be ideal, have your app serve long-running requests is something you should avoid as this will needlessly tie up server threads. Instead consider having users upload and download straight from S3 using presigned URLs. You can then later add CloudFront to the mix when it makes financial sense to do so.
R53 will work just fine for picking the best server(s) for each end user.
For EC2 consider having an ELB + Auto-Scaling Group setup. Even just for a single instance you get the benefit of permanent health monitoring and auto-respawns. If you expect more load you can then auto-scale based on your expected bottleneck (cpu, network i/o). This will give you a more autonomous and robust setup than manually having to scale up and down based on your own monitoring analysis (even though the scaling part is very easy if you stick with immutable infrastructure & blue/green deployments like what Boxfuse offers).
Your focus on vertical server scaling might not serve you well on AWS. I would start thinking about horizontal scaling of app servers behind an Elastic Load Balancer, and possibly look into Elastic Beanstalk.
I'm not sure you can setup a read replica in another region via RDS, you might have to set that up via MySQL servers running on standard EC2 instances. And even if you can, that's going to be some expensive and high-latency data transfer.
If file uploads and downloads are all you are worried about, you just need to put CloudFront (Amazon's CDN service) in front of your application, and allow it to handle file uploads and downloads via its global edge servers. You could even do this without moving your entire application into AWS. I would recommend reading this blog post as a start.

Redirect users to the nearest server based on their location without changing url

This is my case:
I have 6 servers across US and Europe. All servers are on a load balancer. When you visit the website (www.example.com) its pointing on the load balancer IP address and from their you are redirect to one of the servers. Currently, if you visit the website from Germany for example, you are transfered randomly in one of the server. You could transfer to the Germany server or the server in San Fransisco.
I am looking for a way to redirect users to the nearest server based on their location but without changing url. So I am NOT looking of having many url's such as www.example.com, www.example.co.uk, www.example.dk etc
I am looking for something like a CDN where you retrieve your files from the nearest server (?) so I can get rid of the load balancer because if it crashes, the website does not respond (?)
For example:
If you are from uk, redirect to IP 53.235.xx.xxx
If you are from west us, redirect to IP ....
if you are from south europe, redirect to IP ... etc
DNSMadeeasy offers a feature similar to this but they are charging a 600 dollars upfront price and for a startup that doesnt know if that feature will work as expected or there is no trial version we cannot afford: http://www.dnsmadeeasy.com/enterprise-dns/global-traffic-director/
What is another way of doing this?
Also another question on the current setup. Even with 6 servers all connected to the load balancer, if the load balancer has lag issues, it takes everything with it, right? or if by any change it goes down, the website does not respond. So what is the best way to eliminate that downtime so that if one server IP address does not respond, move to the next (as a load balancer would do but load balancers can have issues themselves)
Would help to know what type of application servers you're talking about; i.e. J2EE (like JBoss/Tomcat), IIS, etc?
You can use a hardware or software load balancer with Sticky IP and define ranges of IPs to stick to different application servers. Each country's ISPs should have it's own block of IPs.
There's a list at the website below.
http://www.nirsoft.net/countryip/
Here's also a really, really good article on load balancing in general, with many high availability / persistence issues addressed. That should answer your second question on the single point of failure at your load balancer; there's many different techniques to provide both high availability and load distribution. Alot depends on what kind of application your run and whether you require persistent sessions or not. Load balancing by sticky IP, if persistence isn't required and you're LB does health checks properly, can provide high availability with easy failover. The downside is that load isn't evenly distributed, but it seems you're looking for distribution based on proximity, not on load.
http://1wt.eu/articles/2006_lb/index.html

Scala + Akka: How to develop a Multi-Machine Highly Available Cluster

We're developing a server system in Scala + Akka for a game that will serve clients in Android, iPhone, and Second Life. There are parts of this server that need to be highly available, running on multiple machines. If one of those servers dies (of, say, hardware failure), the system needs to keep running. I think I want the clients to have a list of machines they will try to connect with, similar to how Cassandra works.
The multi-node examples I've seen so far with Akka seem to me to be centered around the idea of scalability, rather than high availability (at least with regard to hardware). The multi-node examples seem to always have a single point of failure. For example there are load balancers, but if I need to reboot one of the machines that have load balancers, my system will suffer some downtime.
Are there any examples that show this type of hardware fault tolerance for Akka? Or, do you have any thoughts on good ways to make this happen?
So far, the best answer I've been able to come up with is to study the Erlang OTP docs, meditate on them, and try to figure out how to put my system together using the building blocks available in Akka.
But if there are resources, examples, or ideas on how to share state between multiple machines in a way that if one of them goes down things keep running, I'd sure appreciate them, because I'm concerned I might be re-inventing the wheel here. Maybe there is a multi-node STM container that automatically keeps the shared state in sync across multiple nodes? Or maybe this is so easy to make that the documentation doesn't bother showing examples of how to do it, or perhaps I haven't been thorough enough in my research and experimentation yet. Any thoughts or ideas will be appreciated.
HA and load management is a very important aspect of scalability and is available as a part of the AkkaSource commercial offering.
If you're listing multiple potential hosts in your clients already, then those can effectively become load balancers.
You could offer a host suggestion service and recommends to the client which machine they should connect to (based on current load, or whatever), then the client can pin to that until the connection fails.
If the host suggestion service is not there, then the client can simply pick a random host from it internal list, trying them until it connects.
Ideally on first time start up, the client will connect to the host suggestion service and not only get directed to an appropriate host, but a list of other potential hosts as well. This list can routinely be updated every time the client connects.
If the host suggestion service is down on the clients first attempt (unlikely, but...) then you can pre-deploy a list of hosts in the client install so it can start immediately randomly selecting hosts from the very beginning if it has too.
Make sure that your list of hosts is actual host names, and not IPs, that give you more flexibility long term (i.e. you'll "always have" host1.example.com, host2.example.com... etc. even if you move infrastructure and change IPs).
You could take a look how RedDwarf and it's fork DimDwarf are built. They are both horizontally scalable crash-only game app servers and DimDwarf is partly written in Scala (new messaging functionality). Their approach and architecture should match your needs quite well :)
2 cents..
"how to share state between multiple machines in a way that if one of them goes down things keep running"
Don't share state between machines, instead partition state across machines. I don't know your domain so I don't know if this will work. But essentially if you assign certain aggregates ( in DDD terms ) to certain nodes, you can keep those aggregates in memory ( actor, agent, etc ) when they are being used. In order to do this you will need to use something like zookeeper to coordinate which nodes handle which aggregates. In the event of failure you can bring the aggregate up on a different node.
Further more, if you use an event sourcing model to build your aggregates, it becomes almost trivial to have real-time copies ( slaves ) of your aggregate on other nodes by those nodes listening for events and maintaining their own copies.
By using Akka, we get remoting between nodes almost for free. This means that which ever node handles a request that might need to interact with an Aggregate/Entity on another nodes can do so with RemoteActors.
What I have outlined here is very general but gives an approach to distributed fault-tolerance with Akka and ZooKeeper. It may or may not help. I hope it does.
All the best,
Andy