I need to serve application users based on their geo-location.
One possibility I can think of is to install the application on multiple k8s clusters hosted in different regions and then load-balance the traffic based on the geo-location of the users.
While exploring this idea, I came across several articles on "Kubernetes Cluster Federation" (e.g. https://kubernetes.io/blog/2016/10/globally-distributed-services-kubernetes-cluster-federation/). But it seems this functionality has been retired, as mentioned in https://github.com/kubernetes-retired/federation.
Does someone know:
If there is any alternative for "Kubernetes Cluster Federation"?
Is there any other solution/s to address the need of serving users based on their geo-location?
If we leave the application part, is there any way to store the data in same geo-location?
Thanks!
https://github.com/kubernetes-sigs/kubefed is a successor to "Kubernetes Cluster Federation", though I am not sure what its current state is. If you want to deploy a global load balancer, I suggest having a look at https://www.k8gb.io/.
...k8s clusters hosted in different region and then load-balance the traffic based on geo-location of the users
If you determine the user's location simply by network location, you can use a DNS service with geolocation routing, such as Route 53, to direct users to the nearest service. In this context k8s federation is not required.
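To make the idea concrete, here is a minimal sketch of the kind of lookup a geolocation-routing DNS service performs: try the most specific match first (country), then a broader one (continent), then a default. The endpoint names and the `resolve_endpoint` helper are hypothetical and only illustrate the decision; a managed service such as Route 53 does this for you from record configuration.

```python
from typing import Optional

# Hypothetical regional endpoints; replace with your own ingress addresses.
COUNTRY_ENDPOINTS = {
    "DE": "eu-central.example.com",
    "GB": "eu-west.example.com",
}
CONTINENT_ENDPOINTS = {
    "EU": "eu-west.example.com",
    "NA": "us-east.example.com",
}
DEFAULT_ENDPOINT = "us-east.example.com"


def resolve_endpoint(country: Optional[str], continent: Optional[str]) -> str:
    """Pick the most specific match: country, then continent, then default."""
    if country and country.upper() in COUNTRY_ENDPOINTS:
        return COUNTRY_ENDPOINTS[country.upper()]
    if continent and continent.upper() in CONTINENT_ENDPOINTS:
        return CONTINENT_ENDPOINTS[continent.upper()]
    # Clients whose location is unknown or unmapped still get an answer.
    return DEFAULT_ENDPOINT
```

The default entry matters: without it, clients whose location cannot be determined would get no answer at all.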
If we leave the application part, is there any way to store the data in same geo-location?
Apart from global-scale database solutions such as Aurora or Spanner, your application can point to a centralized database that resides in one of the regions, if the increased latency is acceptable.
Related
I am new to Kubernetes. I am planning to build/deploy an application to EKS. This will likely be deployed on azure and gcp as well.
I want to separate data plane and control plane in my application deployed in EKS.
Is there any way EKS/Kubernetes allows us to accomplish this?
Or should we go for two EKS clusters, one for the data plane and another for the control plane?
Here is the problem (copied from the answer below)
I have an application, built using the microservice architecture
(meaning you will have it split into components that will communicate
with each other).
I want to deploy this application on a public cloud (EKS, GCP, AWS).
I want a separation of the APPLICATION control plane (decision making
components like authentication APIs, internal service routing) from
the APPLICATION data plane (the serving of your application data to
the clients through the egress).
My Understanding
What I understand from your description is:
You have an application, built using the microservice architecture (meaning you will have it split into components that will communicate with each other).
You want to deploy this application on a public cloud (EKS, GCP, AWS).
You want a separation of the APPLICATION control plane (decision making components like authentication APIs, internal service routing) from the APPLICATION data plane (the serving of your application data to the clients through the egress).
If those 3 points above are true, then the answer is yes, every cloud platform has these capabilities. You will need to architect your application in a manner that your control microservices are isolated from your data microservices. There are many ways in which you can do it (most/all of them are available in all public clouds).
Some Design Ideas
Here is a conceptual description:
By using authentication and authorization mechanisms to ensure authorized communication between control and data plane applications. Think Kubernetes ServiceAccounts.
By using network policies to restrict unauthorized traffic flow between microservices. You will require a networking overlay that supports these. Think Calico CNI.
By using separate namespaces for your application services as necessary for better isolation.
At one level below, in your cloud, you can limit the traffic using security groups. Or even gateways.
You can use different instance types that match your various workload types. This not only ensures optimal performance but also separates failure domains. For example, if a dedicated database instance crashes, the event streaming service will still be running.
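As a rough illustration of the network-policy and namespace ideas above, the sketch below builds a NetworkPolicy manifest that admits ingress to all pods in one namespace only from pods in another namespace. The namespace names (`control-plane`, `data-plane`), the policy name, and the helper function are assumptions made for this example. It relies on the `kubernetes.io/metadata.name` label that recent Kubernetes versions set on every namespace, and it only has an effect if your CNI enforces network policies.

```python
import json


def allow_only_from_namespace(policy_name: str, target_ns: str, source_ns: str) -> dict:
    """Build a NetworkPolicy that admits ingress to every pod in target_ns
    only from pods running in source_ns."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": policy_name, "namespace": target_ns},
        "spec": {
            "podSelector": {},  # empty selector = all pods in target_ns
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {
                                    # Label set automatically on namespaces
                                    # by recent Kubernetes versions.
                                    "kubernetes.io/metadata.name": source_ns
                                }
                            }
                        }
                    ]
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names: only the data plane may call control-plane services.
    policy = allow_only_from_namespace(
        "control-from-data-only", "control-plane", "data-plane"
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`. A real setup would likely need additional rules, for example for monitoring or for traffic coming from the ingress controller.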
Some Notes
Also understand that in a public cloud solution (even EKS), cluster-to-cluster traffic is more expensive than traffic inside a single cluster. This will be a very important cost factor, and you should consider using a single cluster for your application (k8s clusters can typically scale to thousands of nodes).
I hope this somewhat answers your question. There are a lot of decisions you need to make but in short, yes, it is possible to do this separation, and your entire application will have to be designed in this way.
A great open-source observability control plane for your apps is Odigos. The installation is super easy, and within a few minutes you can get traces, metrics and logs. You get auto-instrumentation for all languages (including Go) as well as a manager for your OpenTelemetry collectors.
Check it out: https://github.com/keyval-dev/odigos
I read this article about the API Gateway pattern. I realize that API Gateways typically serve as reverse proxies, but this forces a bottleneck situation. If all requests to an application's public services go through a single gateway, or even a single load balancer across multiple replicas of a gateway (perhaps a hardware load balancer which can handle large amounts of bandwidth more easily than an API gateway), then that single access point is the bottleneck.
I also understand that it is a wide bottleneck, as it simply has to deliver messages in proxy, as the gateways and load balancers themselves are not responsible for any processing or querying. However, imagining a very large application with many users, one would require extremely powerful hardware to not notice the massive bandwidth traveling over the gateway or load balancer, given that every request to every microservice exposed by the gateway travels through that single access point.
If the API gateway instead simply redirected the client to publicly exposed microservices (sort of like a custom DNS lookup), the hardware requirements would be much lower. This is because the messages traveling to and from the API Gateway would be very small, the requests consisting only of a microservice name, and the responses consisting only of the associated public IP address.
I recognize that this pattern would involve greater latency due to increased external requests. It would also be more difficult to secure, as every microservice is publicly exposed, rather than providing authentication at a single entrypoint. However, it would allow for bandwidth to be distributed much more evenly, and provide a much wider bottleneck, thus making the application much more scalable. Is this a valid strategy?
A DNS/public IP based approach is not good from a lot of perspectives:
Higher attack surface area, as you have too many exposed points and each needs to be protected
The number of public IPs needed is higher
You may need more DNS settings, with subdomains or domains for these APIs
A lot of times your APIs will run on a root path, but you may want to expose them on a folder path such as example.com/service1, which requires you to use some gateway anyway
Handling SSL certificates for these public exposures
Securing a focused set of nodes is manageable, whereas securing every publicly exposed service becomes a very difficult task
While theoretically it is possible to redirect clients directly to the nodes, there are a few pitfalls.
Security, certificate and DNS management have been covered by @Tarun
Issues with High Availability
DNS resolvers cache domain-to-IP mappings fairly aggressively because these seldom change. If we use DNS to expose multiple instances of a service publicly, and one of the servers goes down, or we are doing a deployment, resolvers will continue routing requests to the nodes that are down for a fair amount of time. We have no control over external resolvers and their caching policies.
Using reverse proxies, we avoid hitting those nodes based on health checks.
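A minimal sketch of what a reverse proxy does differently: it keeps its own view of which backends are healthy and rotates only over those, so a failed node stops receiving traffic as soon as it is marked down instead of after some DNS cache expires. The `HealthAwarePool` class and the backend names are hypothetical; real proxies (HAProxy, NGINX, cloud load balancers) update the healthy set from periodic health checks.

```python
from itertools import count


class HealthAwarePool:
    """Round-robin over backends, skipping those currently marked unhealthy."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._counter = count()

    def mark_down(self, backend):
        # Called when a health check fails.
        self._healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a health check succeeds again.
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        # Preserve the configured order, but only among healthy backends.
        candidates = [b for b in self._backends if b in self._healthy]
        if not candidates:
            raise RuntimeError("no healthy backends")
        return candidates[next(self._counter) % len(candidates)]
```

For example, after `pool.mark_down("node-b")` no later call to `pool.pick()` returns `node-b` until `pool.mark_up("node-b")` is called.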
Google has this cool tool, kubemci (a command line tool to configure L7 load balancers using multiple Kubernetes clusters), with which you can basically have an HA multi-region Kubernetes setup. Which is kind of cool.
But let's say we have a basic architecture like this:
Front end is implemented as an SPA and uses a JSON API to talk to the backend
Backend is a set of microservices which use PostgreSQL as a DB storage engine.
So I can create two Kubernetes Clusters on GKE, put both backend and frontend on them (e.g. let's say in London and Belgium) and all looks fine.
Until we think about the database. PostgreSQL is single-master only, so it must be placed in just one of the regions. And if the backend in the London region starts to talk to PostgreSQL in the Belgium region, performance will be really poor, considering the 6ms+ latency between those regions.
So that whole HA setup kind of doesn't make any sense? Or am I missing something? One option to slightly mitigate the issue would be to have a read-only replica in the "slave" region and direct read-only queries there (is that even possible with PostgreSQL?)
This is a classic architecture scenario that has no easy solution. Making data available in multiple regions is a challenging problem that major companies spend a lot of time and money to solve.
PostgreSQL does not natively support multi-master writes. Your idea of a replica located in the other region, with logic in your app to read and write to the correct database, would work. This will give you fast local reads, but slower writes in one region. It also means more complicated code in your app and more work to handle failover of the master. Bandwidth and costs can also be problems with heavy updates.
Use 3rd-party solutions for multi-master Postgres (like Postgres-BDR by 2ndQuadrant) to offload the work to the database layer. This can get expensive, and your application still has to manage data conflicts from two regions overwriting the same data at the same time.
Choose another database that supports multi-regional replication with multi-master writes. Cassandra (or ScyllaDB) is a good choice, or hosted options like Google Spanner, Azure CosmosDB, AWS DynamoDB Global Tables, and others. An interesting option is CockroachDB which supports the PostgreSQL protocol but is a scalable relational database and supports multiple regions.
If none of these options work, you'll have to create your own replication system. Some companies do this with an event-sourced / CQRS architecture where every write is a message sent to a central log, then applied in every location. This is more work but provides the most flexibility. At this point you're also basically building your own database replication system.
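For the first option above (a read replica in the other region plus routing logic in the app), the routing decision could look roughly like this. It is a deliberately naive sketch: `choose_dsn` and the DSN strings are hypothetical, statements are classified by prefix only, and replication lag is ignored, so a read issued right after a write may see stale data on the replica.

```python
READ_PREFIXES = ("select", "show")


def choose_dsn(sql: str, primary_dsn: str, local_replica_dsn: str,
               in_transaction: bool = False) -> str:
    """Send plain reads to the local replica; everything else to the primary."""
    statement = sql.lstrip().lower()
    is_read = statement.startswith(READ_PREFIXES)
    # SELECT ... FOR UPDATE takes row locks, so it must run on the primary.
    if is_read and "for update" in statement:
        is_read = False
    # Inside a transaction, keep every statement on one connection (the primary)
    # so reads observe the transaction's own writes.
    if is_read and not in_transaction:
        return local_replica_dsn
    return primary_dsn
```

A backend in London would call `choose_dsn(sql, primary_dsn=BELGIUM_PRIMARY, local_replica_dsn=LONDON_REPLICA)` (both names are placeholders) and open its connection with whichever driver it already uses. Anything that is not clearly a read, including CTEs that write, falls through to the primary.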
If you have multi cluster ingress set up on two clusters in different regions, then the multi cluster ingress will only send traffic to the closest region to the user.
If the closest region is down, this is when traffic will be routed to the cluster in the other region.
So using the example you have provided, if traffic is being sent to the backend and the user is closer to London, then traffic from this user will always be sent to London as long as that region is up and running.
As for latency, you will have to accept it in this case, as you cannot create a read replica within another region.
The benefit of this functionality (multi-cluster ingress) is that if one region goes down, then you have another region to route the traffic to.
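The routing behaviour described above (closest region first, the other region only when the closest one is down) can be sketched as follows. This is an illustration only: the coordinates are approximate, the `route` helper is hypothetical, and a real multi-cluster ingress decides proximity from the network edge where the request arrives rather than from great-circle distance.

```python
from math import asin, cos, radians, sin, sqrt

# Approximate (latitude, longitude) of the two example regions.
REGIONS = {"london": (51.5, -0.1), "belgium": (50.4, 3.8)}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def route(user_location, healthy_regions):
    """Return the nearest region among those currently healthy."""
    candidates = {name: loc for name, loc in REGIONS.items() if name in healthy_regions}
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=lambda name: haversine_km(user_location, candidates[name]))
```

With both regions healthy, a user near Manchester, `route((53.5, -2.2), {"london", "belgium"})`, is sent to London; if London is removed from the healthy set, the same call falls back to Belgium.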
I am using HAProxy on two different nodes, running on separate, geographically scattered machines
Load-balancer-one having dns = http1.example.com
Load-balancer-two having dns = http2.example.com
The service is listening on DNS main site with original hostname --haproxy
My question is how to maintain a static URL, i.e. it must not show the back-end servers' domains or IPs; I want to show only the original hostname.
The simplest method is to setup a round robin DNS entry that returns the IP addresses of both servers.
However, you likely want to use a GSLB (global server load balancing) solution that can remove failed load balancers from responses based on a health check. If you are in multiple data centers, some GSLB solutions can route users to the most performant location for them.
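The core of what a GSLB adds over plain round-robin DNS is small: probe each load balancer and answer only with the ones that respond. Here is a rough sketch under stated assumptions: `tcp_probe` and `healthy_answers` are hypothetical helpers, the check is a bare TCP connect on port 80, and the fallback of returning all records when every probe fails is a design choice for this example, not something a particular product is known to do.

```python
import socket


def tcp_probe(ip: str, port: int = 80, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_answers(records, probe=tcp_probe):
    """Return the load-balancer IPs that pass the probe.

    If every probe fails, fall back to the full list so that the name
    still resolves (the probes themselves may be what is broken).
    """
    alive = [ip for ip in records if probe(ip)]
    return alive or list(records)
```

The `probe` parameter is injectable so the selection logic can be exercised without network access, for example `healthy_answers(ips, probe=lambda ip: ip != "192.0.2.2")`.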
F5 and NetScaler have hardware GSLB solutions. Dyn, Akamai, UltraDNS and others offer GSLB as a service. AWS Route 53 has a weighted round robin solution, and it also supports health checks as well as latency-based and geolocation-based routing.
This is my case:
I have 6 servers across the US and Europe. All servers are behind a load balancer. When you visit the website (www.example.com), it points to the load balancer's IP address, and from there you are redirected to one of the servers. Currently, if you visit the website from Germany, for example, you are transferred randomly to one of the servers. You could be sent to the server in Germany or the server in San Francisco.
I am looking for a way to redirect users to the nearest server based on their location but without changing the URL. So I am NOT looking to have many URLs such as www.example.com, www.example.co.uk, www.example.dk etc.
I am looking for something like a CDN where you retrieve your files from the nearest server (?) so I can get rid of the load balancer because if it crashes, the website does not respond (?)
For example:
If you are from the UK, redirect to IP 53.235.xx.xxx
If you are from the western US, redirect to IP ....
If you are from southern Europe, redirect to IP ... etc
DNS Made Easy offers a feature similar to this, but they charge 600 dollars upfront and there is no trial version; as a startup that doesn't know whether the feature will work as expected, we cannot afford it: http://www.dnsmadeeasy.com/enterprise-dns/global-traffic-director/
What is another way of doing this?
Also, another question on the current setup. Even with 6 servers all connected to the load balancer, if the load balancer has lag issues, it takes everything down with it, right? Or if by any chance it goes down, the website does not respond. So what is the best way to eliminate that downtime, so that if one server IP address does not respond, traffic moves to the next (as a load balancer would do, but load balancers can have issues themselves)?
Would help to know what type of application servers you're talking about; i.e. J2EE (like JBoss/Tomcat), IIS, etc?
You can use a hardware or software load balancer with Sticky IP and define ranges of IPs to stick to different application servers. Each country's ISPs should have their own blocks of IPs.
There's a list at the website below.
http://www.nirsoft.net/countryip/
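To show what "ranges of IPs stuck to different application servers" means in practice, here is a small sketch using Python's standard `ipaddress` module. The server names and the `server_for` helper are made up, and the CIDR blocks are documentation-only ranges standing in for real per-country blocks such as those in the list above.

```python
import ipaddress

# Hypothetical mapping of client address blocks to application servers.
RANGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "server-de"),
    (ipaddress.ip_network("198.51.100.0/24"), "server-uk"),
]
DEFAULT_SERVER = "server-us"


def server_for(client_ip: str) -> str:
    """Return the server assigned to the block containing client_ip."""
    addr = ipaddress.ip_address(client_ip)
    for network, server in RANGES:
        if addr in network:
            return server
    # Clients outside every configured block go to a default server.
    return DEFAULT_SERVER
```

In a real load balancer this mapping would normally be expressed in its own configuration (ACLs or maps) rather than in application code, and the country blocks would need regular updates since IP allocations change.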
Here's also a really, really good article on load balancing in general, with many high availability / persistence issues addressed. That should answer your second question on the single point of failure at your load balancer; there are many different techniques to provide both high availability and load distribution. A lot depends on what kind of application you run and whether you require persistent sessions or not. Load balancing by sticky IP, if persistence isn't required and your LB does health checks properly, can provide high availability with easy failover. The downside is that load isn't evenly distributed, but it seems you're looking for distribution based on proximity, not on load.
http://1wt.eu/articles/2006_lb/index.html