How to achieve high availability and load balancing in a Kubernetes cluster

I am creating a Kubernetes cluster (1 master and 2 worker nodes) which will host Netflix Eureka. Microservices will be created for applications; they will register themselves on the Eureka server and look up other microservices to communicate with from Eureka's service registry. If any node fails, how can we achieve high availability in this setup? Also, there should be load balancing so that requests get directed to the other nodes in the cluster.
Can anybody let me know a solution for this ?

If any node fails, how can we achieve high availability in this setup?
Creating a Pod directly is not a recommended approach. Let's say the node on which the Pod is running crashes; the Pod is then not rescheduled and the service is not accessible.
For HA (High Availability), higher level abstractions like Deployments should be used. A Deployment will create a ReplicaSet which will have multiple Pods associated with it. So, if a node on which the Pod is running crashes then the ReplicaSet will automatically reschedule the Pod on a healthy node and you will get HA.
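As a minimal sketch (the name, labels and image below are placeholders, not anything from the question), such a Deployment could look like this:

```yaml
# Hypothetical Deployment; name, labels and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eureka-server
spec:
  replicas: 2                  # two Pods, so a surviving node can still serve traffic
  selector:
    matchLabels:
      app: eureka-server
  template:
    metadata:
      labels:
        app: eureka-server
    spec:
      containers:
      - name: eureka-server
        image: example/eureka-server:latest   # placeholder image
        ports:
        - containerPort: 8761                 # default Eureka port
```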
Also, there should be load balancing so that requests get directed to the other nodes in the cluster.
Create a Service of type LoadBalancer for the Deployment, and incoming requests will automatically be directed to the Pods on the different nodes. In this case a cloud load balancer is created automatically, and there is a charge associated with it.
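A minimal sketch of such a Service, assuming the Pods carry the (placeholder) label app: eureka-server from the Deployment above:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: eureka-server
spec:
  type: LoadBalancer          # the cloud provider provisions (and bills for) a load balancer
  selector:
    app: eureka-server        # matches the Pod labels of the Deployment
  ports:
  - port: 80                  # port exposed by the load balancer
    targetPort: 8761          # port the container listens on
```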
If you don't want to use a load balancer, another approach, which is a bit more complicated but also more powerful, is to use an Ingress. This will also load balance requests across multiple nodes.
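A hedged sketch of an Ingress (the hostname is a placeholder, and it assumes an ingress controller such as ingress-nginx is already installed and a cluster recent enough for networking.k8s.io/v1):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: eureka-ingress
spec:
  rules:
  - host: eureka.example.com            # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: eureka-server         # the Service defined above
            port:
              number: 80
```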
Here is a nice article explaining the difference between a Load Balancer and Ingress.
All the above queries are addressed directly or indirectly in the K8S documentation here.

Related

K8s: how to deploy multiple Django services inside the same node

I'm new to DevOps work and am having a tough time figuring out what the whole final architecture should look like. My project currently runs on a single Kubernetes cluster with a single node and a single pod, in the very common Nginx reverse proxy + uWSGI Django app setup. I have to implement a scaling architecture. My understanding is that I should use an Ingress Controller behind a LoadBalancer (I'm hosted at OVH, which provides a built-in LoadBalancer). The Ingress Controller will then distribute the traffic to my pods.
Question 1: if my Django app listens on port 8000, setting the ReplicaSet to 2 replicas does not work because the port is already taken. This makes me believe I'm only supposed to have one pod per node, but some information says otherwise. How can I run multiple replicas on the same node?
Question 2: let's say I deploy 9 more nodes. Should all my 10 nodes be behind 1 Ingress Controller (and 1 Load Balancer), or should each node have its own Ingress Controller?
Question 3: if I have only one Ingress Controller, the Load Balancer does not really "balance" any load; its sole purpose is to expose my service to the Internet. Is that normal?
Question 4: what happens when the Ingress Controller is overloaded? Do I duplicate everything and then the Load Balancer distributes the requests across the 2 controllers?
This and this are good starting points, but they still do not answer my questions directly.
Every pod has its own network namespace, so two replicas (i.e. two pods) can both listen on the same port, unless you've enabled host networking mode, which should not be used here.
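To make that concrete, a minimal sketch (the image name is a placeholder): both replicas declare containerPort 8000, and because each Pod gets its own IP there is no port conflict on the node:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: django-app
spec:
  replicas: 2                  # both Pods listen on 8000 in their own network namespaces
  selector:
    matchLabels:
      app: django-app
  template:
    metadata:
      labels:
        app: django-app
    spec:
      containers:
      - name: django
        image: example/django-app:latest   # placeholder image
        ports:
        - containerPort: 8000
```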
Not directly; the ingress controller can be a lot of things. If you're using a self-hosted one (I see the ingress-nginx tag, so I'll assume you are), then each controller replica is an independent copy of the proxy setup. You would want at least 2 for redundancy, but unless you need to break up your traffic because those two can't keep up with it (that would have to be a truly huge request volume), that's probably all you need.
Yes, that's fine on the K8s side, though as mentioned if you have multiple nodes available you probably want at least two ingress controller replicas in case one node dies unexpectedly.
The edge LoadBalancer round-robins requests between all the nginx proxy instances, so if you need more capacity you would spawn more replicas (assuming you have spare CPU on the cluster; if not, add more nodes first, then more replicas).
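As a hedged sketch, assuming a standard self-hosted ingress-nginx install (the Deployment and namespace names vary with how it was installed), raising the controller's replica count is just an ordinary Deployment change, for example via kubectl scale or a small merge patch like this:

```yaml
# Hypothetical patch file (scale-ingress.yaml); adjust names to your install.
# Apply with: kubectl -n ingress-nginx patch deployment ingress-nginx-controller --patch-file scale-ingress.yaml
# (or simply: kubectl -n ingress-nginx scale deployment ingress-nginx-controller --replicas=2)
spec:
  replicas: 2
```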

How can I scale the Google Compute Engine instances that host my Kubernetes cluster?

Description:
Ideally (i.e. in a non-Kubernetes scenario where my Compute Engine instances host my application) a load balancer would distribute the load across multiple replicated Compute Engine instances. But in my case I am just using my Compute Engine instance as a worker node, and it has some pods deployed on it.
Question 1:
What would happen if my worker node (a Google Compute Engine instance) starts receiving a lot of traffic?
Question 2:
What would be the best (or at least a better) way to scale my current solution so that it can handle more load, with that load efficiently distributed?
In Kubernetes you deploy applications as pods. You can deploy multiple replicas of a pod, and Kubernetes will schedule them onto multiple worker node VMs based on the resource requirements of the pods and the available capacity on the nodes. This provides resiliency and availability for applications. Once your workload increases, you can scale the Kubernetes cluster horizontally by adding more worker nodes.
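As a hedged sketch (names, image and numbers are placeholders), declaring resource requests is what lets the scheduler spread replicas onto nodes with spare capacity:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web-app
        image: example/web-app:latest    # placeholder image
        resources:
          requests:
            cpu: 250m        # the scheduler only places the Pod on a node with this much spare CPU
            memory: 256Mi
```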
You can use an Ingress or an L7 load balancer to load balance user traffic onto the pods across different nodes. Even without those, Kubernetes provides L4 load balancing via the kube-proxy component.
Kubernetes scales to 5000 nodes. There are also documented best practices for large clusters.

How does the DNS service work in Kubernetes?

I am new to Kubernetes, and I'm trying to understand how I can apply it to my use-case scenario.
I managed to install a 3-node cluster on VMs within the same network. Despite searching through K8S concepts and reading related articles, I still couldn't find an answer to my question below. Please let me know if you have knowledge on this:
I've noticed that the internal DNS service of K8S applies to the pods, and this way services can find each other with hostnames instead of IPs.
Is this applicable to communication between pods on different nodes, or only between services inside a single node? (In other words, is there a DNS service at the node level in K8S, or is it only about pods?)
The reason for this question is the scenario that I have in mind:
I need to deploy a microservice application (written in Java) with K8S. I made Docker images of each service in my application and it's working locally. Currently, these services are connected via pre-defined IP addresses.
Is there a way to run each of these services on a separate K8S node and use its DNS service to connect them without pre-defining IPs?
A service serves as an internal endpoint and (depending on the configuration) load balancer for one or several pods behind it. All communication is typically done between services, not between pods. Pods run on nodes; services don't really run anything, they just route traffic to the appropriate pods.
A service is a cluster-wide configuration that does not depend on a node, thus you can use a service name in the whole cluster, completely independent from where a pod is located.
So yes, your use case of running pods on different nodes and communicating via service names is a typical setup.
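As a hedged illustration (the service and namespace names are placeholders), a Service like the one below is reachable from any Pod in the cluster under its DNS name, e.g. user-service.default.svc.cluster.local (or just user-service from within the same namespace), so the Java services can be configured with service names instead of fixed IPs:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: user-service          # resolvable cluster-wide as user-service.default.svc.cluster.local
  namespace: default
spec:
  selector:
    app: user-service         # matches the labels on the Pods backing this service
  ports:
  - port: 8080                # port clients connect to
    targetPort: 8080          # port the container listens on
```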

Kubernetes Load balancing and Proxy

I am quite new to Kubernetes and I have a few questions regarding REST API request proxying and load balancing.
I have one master and two worker nodes, with some of the Services on one worker node and a few on the other worker node.
At the beginning I had just one worker node and I accessed my pods using the worker node IP and the service NodePort. After adding another worker node to the cluster, Kubernetes "redistributed" my pods across both worker nodes.
Now, I can again access my pods using both worker node IPs and service NodePorts. This is a bit confusing to me: how can I reach the REST APIs of pods that are not on the worker node whose IP address I am using?
Also, since I have 2 worker nodes now, how should load balancing be done properly across both worker nodes? I know that I can set the serviceType to LoadBalancer for a Service, but is that enough?
Thank you for your answers!
how can I reach the REST APIs of pods that are not on the worker node whose IP address I am using?
It is better to think of exposing your services to the outside world rather than pods, and consequently to stop thinking in terms of the IP addresses of the nodes the pods run on. The answer to this question depends on your setup. Many configurations are possible depending on the actual complexity and speed/availability requirements, but the basic setup boils down to this:
If you are running in a supported cloud environment, then setting up a load-balanced ingress will expose it to the outside world without much fuss.
If, however, you are running on bare metal, then you have to make your own ingress (a simple nginx or Apache proxy pod would suffice) and point its upstream to your service name (or FQDN if it lives in another namespace). This exposes all pods within the service to the outside world, regardless of which nodes they actually run on, and leaves the load balancing to the Kubernetes service.
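A rough sketch of such a hand-rolled proxy (all names are placeholders, not anything from the question): an nginx Deployment whose config simply points upstream at the Service's in-cluster DNS name, so the Pod-level balancing is still done by the Service:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: edge-proxy-config
data:
  default.conf: |
    server {
      listen 80;
      location / {
        # placeholder Service DNS name; kube-proxy balances across its Pods
        proxy_pass http://my-api.default.svc.cluster.local:8080;
      }
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: edge-proxy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: edge-proxy
  template:
    metadata:
      labels:
        app: edge-proxy
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80
        volumeMounts:
        - name: conf
          mountPath: /etc/nginx/conf.d
      volumes:
      - name: conf
        configMap:
          name: edge-proxy-config
```

You would still need to expose the edge-proxy Deployment itself (e.g. via a NodePort) to get traffic into the cluster.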
how should load balancing be done properly across both worker nodes?
This is a bit more complex. With a uniform distribution of your pods across the nodes, you can make do with an external load balancer that is oblivious to the pod distribution. For us, leaving load balancing to the Kubernetes service proved to be more accurate, since more often than not you end up with two pods running on the same node (if the number of pods is larger than the number of nodes), in which case an external load balancer cannot balance uniformly but the Kubernetes service layer can.

Where do services live in Kubernetes?

I am learning Kubernetes and currently deep diving into high availability. While I understand that I can set up a highly available control plane (API server, controllers, scheduler) with local (or remote) etcds, as well as a highly available set of minions (through Kubernetes itself), I am still not sure where services are located in this concept.
If they live in the control plane: Good I can set them up to be highly available.
If they live on a certain node: Ok, but what happens if the node goes down or becomes unavailable in any other way?
As I understand it, services are needed to expose my pods to the internet as well as for load balancing. So without an HA service, I risk that my application won't be reachable (even though it might be super highly available in every other aspect of the system).
A Kubernetes Service is another REST object in the k8s cluster. There are the following types of services, and each of them serves a different purpose in the cluster.
ClusterIP
NodePort
LoadBalancer
Headless
Fundamental purposes of Services:
Providing a single point of entry (gateway) to the pods
Load balancing across the pods
Inter-pod communication
Providing stability, as pods can die and restart with different IPs
and more
These Objects are stored in etcd as it is the single source of truth in the cluster.
Kube-proxy is responsible for implementing these Services on the nodes (the Service objects themselves are created through the API server). Services use selectors and labels: each pod object carries labels, and the service object has selectors to match those labels. The matching pods become the service's endpoints, and kube-proxy maps these endpoints (IP:port) to the service (IP:port). Kube-proxy uses iptables rules to do this magic.
Kube-proxy is deployed as a DaemonSet, so it runs on every cluster node, and all nodes share a consistent view of the services through the cluster state stored in etcd (accessed via the API server).
You can think of a service as an internal (and in some cases external) load balancer. The definition is stored in the Kubernetes API server, yet the fact that it exists there means nothing if nothing implements it. The most common component that works with services is kube-proxy, which implements services on nodes using iptables (meaning that every node has every service implemented in its local iptables rules), but there are also, for example, Ingress Controller implementations that use the Service concept from the API to find endpoints and direct traffic to them, effectively skipping the iptables implementation. Finally, there are service mesh solutions like Linkerd or Istio that can leverage Service definitions on their own.
Services load balance between pods in most implementations, meaning that as long as you have one backing pod alive (and with enough capacity), your "service" will respond (so you get HA as well, especially if you implement readiness/liveness probes that, among other things, remove unhealthy pods from services).
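As a hedged sketch (the paths, port and image are placeholders), probes like these are what lets a Service drop unhealthy pods from its endpoints and the kubelet restart dead ones:

```yaml
# Fragment of a Deployment's container spec; paths, port and image are placeholders.
containers:
- name: app
  image: example/app:latest
  ports:
  - containerPort: 8080
  readinessProbe:             # pod is removed from Service endpoints while this fails
    httpGet:
      path: /healthz/ready
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 10
  livenessProbe:              # container is restarted if this keeps failing
    httpGet:
      path: /healthz/live
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 20
```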
Kubernetes Service documentation provides pretty good insight on that