I am quite new to Kubernetes and I have a few questions regarding REST API request proxying and load balancing.
I have one master and two worker nodes, with some of the services on one worker node and a few on the other.
In the beginning I had just one worker node, and I accessed my pods using the worker node IP and the service NodePort. After adding another worker node to the cluster, Kubernetes "redistributed" my pods across both worker nodes.
Now I can again access my pods using either worker node IP and the service NodePorts. This is a bit confusing to me: how can I reach the REST APIs of pods that are not on the worker node whose IP address I am using?
Also, since I now have two worker nodes, how should load balancing be done properly across both of them? I know that I can set the Service type to LoadBalancer, but is that enough?
Thank you for your answers!
how can I reach the REST APIs of pods that are not on the worker node whose IP address I am using?
It is better to think of exposing your services to the outside world rather than your pods, and consequently to avoid relying on the IP addresses of the nodes your pods run on. The answer to this question depends on your setup. Many configurations are possible depending on the actual complexity and speed/availability requirements, but the basic setup boils down to:
If you are running in a supported cloud environment, then setting up a load-balanced ingress will expose it to the outside world without much fuss.
If, however, you are running on bare metal, then you have to provide your own ingress (a simple nginx or Apache proxy pod would suffice) and point its upstream to your service name (or FQDN in the case of another namespace). This exposes all pods within the service to the outside world regardless of the actual nodes they are running on, and leaves load balancing to the Kubernetes service.
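As an illustration of the bare-metal approach, a minimal nginx proxy could be configured through a ConfigMap whose upstream points at the service's in-cluster DNS name. This is only a sketch; the service and namespace names are placeholders:

```yaml
# Hypothetical example: nginx.conf mounted into a proxy pod via a ConfigMap.
# "my-service" and "my-namespace" are placeholder names.
apiVersion: v1
kind: ConfigMap
metadata:
  name: proxy-config
data:
  nginx.conf: |
    events {}
    http {
      server {
        listen 80;
        location / {
          # Route to the Service by its in-cluster DNS name; the Kubernetes
          # service layer then balances across all backing pods on any node.
          proxy_pass http://my-service.my-namespace.svc.cluster.local:8080;
        }
      }
    }
```

Because the proxy targets the service name rather than pod or node IPs, pods can move between nodes without any change to the proxy configuration.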
how should load balancing be done properly across both worker nodes?
This is a bit more complex a topic. With a uniform distribution of your pods across the nodes, you can make do with an external load balancer that is oblivious to pod distribution. For us, leaving load balancing to the Kubernetes service proved to be more accurate, since more often than not you can have two pods running on the same node (if the number of pods is larger than the number of nodes), in which case an external load balancer will not be able to balance uniformly, while the Kubernetes service layer will.
Related
Let's say I have a web application backend that I want to deploy with the help of Kubernetes. How exactly does scaling work in this case?
I understand scaling in Kubernetes as: we have one master node that orchestrates multiple worker nodes, where each of the worker nodes runs 0-n different containers with the same image. My question is, if this is correct, how does Kubernetes deal with the fact that instances of the same application use the same port within one worker node? Does the request reach the master node, which then handles this problem internally?
Does the request reach the master node which then handles this problem internally?
No, the master nodes do not handle traffic for your apps. Typically, traffic meant for your apps arrives at a load balancer or gateway, e.g. Google Cloud Load Balancer or AWS Elastic Load Balancer; the load balancer then forwards the request to a replica of a matching service. This is managed by the Kubernetes Ingress resource in your cluster.
The master nodes - the control plane - are only used for management, e.g. when you deploy a new image or service.
how does Kubernetes deal with the fact that instances of the same application use the same port within one worker node?
Kubernetes uses a container runtime to run your containers. You can try this on your own machine: e.g. when you use Docker, you can create multiple containers (instances) of your app, all listening on e.g. port 8080. This is a key feature of containers - they provide network isolation.
On Kubernetes, all containers are tied together with a custom container network. How this works depends on which Container Network Interface (CNI) plugin you use in your cluster. Each Pod in your cluster gets its own IP address. All your containers can listen on the same port if you want - this is an abstraction.
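For example, the following hypothetical Deployment (names and image are placeholders) runs three replicas that all listen on port 8080; since each Pod gets its own IP address, the port does not clash even when two replicas land on the same node:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-backend            # placeholder name
spec:
  replicas: 3                 # three instances of the same image
  selector:
    matchLabels:
      app: my-backend
  template:
    metadata:
      labels:
        app: my-backend
    spec:
      containers:
      - name: backend
        image: my-backend:1.0 # placeholder image
        ports:
        - containerPort: 8080 # same port in every replica; no conflict,
                              # because each Pod has its own IP address
```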
I am new to Kubernetes, and I'm trying to understand how I can apply it to my use-case scenario.
I managed to install a 3-node cluster on VMs within the same network. Despite searching K8s concepts and reading related articles, I still couldn't find an answer to the question below. Please let me know if you have knowledge of this:
I've noticed that the internal DNS service of K8s applies to pods, so that services can find each other by hostname instead of by IP.
Does this apply to communication between pods on different nodes, or only between services within a single node? (In other words, is there a DNS service at the node level in K8s, or is it only about pods?)
The reason for this question is the scenario that I have in mind:
I need to deploy a microservice application (written in Java) with K8s. I made Docker images from each service in my application and it works locally. Currently, these services are connected via pre-defined IP addresses.
Is there a way to run each of these services on a separate K8s node and use its DNS service to connect them without pre-defining IPs?
A service serves as an internal endpoint and (depending on the configuration) a load balancer for one or several pods behind it. All communication is typically done between services, not between pods. Pods run on nodes; services don't really run anything, they just route traffic to the appropriate pods.
A service is a cluster-wide configuration that does not depend on a node, so you can use a service name anywhere in the cluster, completely independently of where a pod is located.
So yes, your use case of running pods on different nodes and communicating via service names is a typical setup.
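A sketch of such a setup (all names here are placeholders): expose each microservice with a Service, then let the Java services talk to each other via the Service's DNS name instead of a hard-coded IP:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: user-service       # other pods can reach it simply as "user-service",
  namespace: default       # or as "user-service.default.svc.cluster.local"
spec:
  selector:
    app: user-service      # must match the labels on the backing pods
  ports:
  - port: 8080             # port the Service listens on
    targetPort: 8080       # port the container listens on
```

In your application configuration you would then replace a pre-defined address like `http://10.0.0.5:8080` with `http://user-service:8080`, and the cluster DNS resolves it regardless of which node the pods run on.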
I am creating a Kubernetes cluster (1 master and 2 worker nodes) which will host Netflix Eureka. Microservices will be created for applications; they will register themselves on the Eureka server and look up other microservices to communicate with in Eureka's service registry. I want to handle the scenario where a node fails: how can we achieve high availability in this case? Also, there should be load balancing so that requests are directed across the nodes in the cluster.
Can anybody suggest a solution for this?
I want to handle the scenario where a node fails: how can we achieve high availability in this case?
Creating a Pod directly is not a recommended approach. Let's say the node on which the Pod is running crashes; then the Pod is not rescheduled and the service is not accessible.
For HA (high availability), higher-level abstractions like Deployments should be used. A Deployment creates a ReplicaSet, which has multiple Pods associated with it. So, if the node on which a Pod is running crashes, the ReplicaSet will automatically reschedule the Pod on a healthy node and you will get HA.
Also, there should be load balancing so that requests are directed across the nodes in the cluster.
Create a Service of type LoadBalancer for the Deployment, and incoming requests will automatically be directed to the Pods on the different nodes. In this case a load balancer is created automatically, and there is a charge associated with it.
If you don't want to use a load balancer, another approach - a bit more complicated, but also more powerful - is to use Ingress. This will also load balance requests across multiple nodes.
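A minimal sketch of such a LoadBalancer Service (the names are placeholders, and port 8761 is only an assumption based on Eureka's usual default):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: eureka-lb          # placeholder name
spec:
  type: LoadBalancer       # cloud provider provisions an external load balancer
  selector:
    app: eureka            # must match the Deployment's pod labels
  ports:
  - port: 80               # port exposed by the load balancer
    targetPort: 8761       # assumed container port (Eureka's usual default)
```

The external load balancer spreads traffic across nodes, and the Service spreads it across the matching Pods, wherever they are scheduled.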
Here is a nice article explaining the difference between a Load Balancer and Ingress.
All the above queries are addressed directly or indirectly in the K8S documentation here.
Do worker nodes in a multi-master setup talk to the apiserver on the master nodes via the load balancer? It seems like the cluster is aware of the active apiserver endpoints via the endpoint reconciler, so I would think the logical and HA way would be for the worker nodes to talk to the active endpoints they know of. But according to the official documentation/diagram (https://kubernetes.io/docs/admin/high-availability/building/), the worker nodes go through the load balancer. Doesn't this mean that if, for whatever reason, the load balancer becomes unavailable, your worker nodes will also malfunction?
When your kubelet starts, it needs to connect to the apiserver. The location of the apiserver is provided as a configuration option and in most cases will be a non-changing domain name pointing to a load balancer. You cannot rely on a ClusterIP-based service for core Kubernetes components like kubelet or kube-proxy, as you would essentially run yourself into a chicken-and-egg situation / introduce additional dependencies.
Any reasonable environment should have a dependable load balancer, and if it is down, odds are that quite a lot of other things are down too (also keep in mind that in many cases Kubernetes will survive temporary inaccessibility of the control plane).
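For illustration, the kubelet's kubeconfig in an HA setup typically points at a stable DNS name resolving to the apiserver load balancer rather than at any single master node. All values below are placeholders:

```yaml
# Hypothetical kubeconfig fragment for a kubelet in a multi-master setup
apiVersion: v1
kind: Config
clusters:
- name: ha-cluster
  cluster:
    # Stable DNS name pointing at the apiserver load balancer,
    # not at an individual master node (placeholder hostname):
    server: https://api.example.internal:6443
    certificate-authority: /etc/kubernetes/pki/ca.crt
```

Because the name does not change, masters can be added or replaced behind the load balancer without reconfiguring every kubelet.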
I am learning Kubernetes and currently deep diving into high availability. While I understand that I can set up a highly available control plane (API server, controllers, scheduler) with local (or remote) etcd, as well as a highly available set of worker nodes (through Kubernetes itself), I am still not sure where services fit into this concept.
If they live in the control plane: Good I can set them up to be highly available.
If they live on a certain node: Ok, but what happens if the node goes down or becomes unavailable in any other way?
As I understand it, services are needed to expose my pods to the internet as well as for load balancing. Without an HA service, I risk that my application won't be reachable (even though it might be highly available in every other aspect of the system).
A Kubernetes Service is another REST object in the K8s cluster. There are the following types of services, each of which serves a different purpose in the cluster:
ClusterIP
NodePort
LoadBalancer
Headless
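As one illustration of these types, a NodePort service (placeholder names) exposes the same port on every node in the cluster, which also explains why pods become reachable via any worker node's IP:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-nodeport       # placeholder name
spec:
  type: NodePort
  selector:
    app: api               # must match the backing pods' labels
  ports:
  - port: 8080             # in-cluster Service port
    targetPort: 8080       # container port
    nodePort: 30080        # reachable as <any-node-IP>:30080
```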
Fundamental purposes of Services:
Providing a single point of entry (gateway) to the pods
Load balancing the pods
Inter-pod communication
Providing stability, as pods can die and restart with a different IP
and more
These Objects are stored in etcd as it is the single source of truth in the cluster.
Kube-proxy is responsible for implementing these objects on the nodes. It uses selectors and labels.
For instance, each Pod object has labels, and the Service object has selectors to match those labels. Each matching Pod contributes an endpoint, so kube-proxy essentially associates these endpoints (IP:Port) with the Service (IP:Port). Kube-proxy uses iptables rules to do this magic.
Kube-proxy is deployed as a DaemonSet, so it runs on every cluster node, and all instances see the same Service and Endpoints state through the API server, which is backed by etcd.
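This label/selector matching could look like the following sketch (names are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-1
  labels:
    app: web             # label on the pod...
spec:
  containers:
  - name: web
    image: nginx:1.25
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web             # ...matched by the service selector
  ports:
  - port: 80
    targetPort: 80       # each matching pod's IP:80 becomes an endpoint
```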
You can think of a service as an internal (and in some cases external) load balancer. The definition is stored in the Kubernetes API server, yet the fact that it exists there means nothing unless something implements it. The most common component that works with services is kube-proxy, which implements services on nodes using iptables (meaning that every node has every service implemented in its local iptables rules), but there are also e.g. Ingress controller implementations that use the Service concept from the API to find endpoints and direct traffic to them, effectively skipping the iptables implementation. Finally, there are service mesh solutions like Linkerd or Istio that can leverage Service definitions on their own.
Services load balance between pods in most implementations, meaning that as long as you have one backing pod alive (and with enough capacity) your "service" will respond (so you get HA as well, especially if you implement readiness/liveness probes that, among other things, remove unhealthy pods from services).
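A sketch of such a readiness probe on a container (the health path and port are assumptions for illustration):

```yaml
# Container-spec fragment: a pod failing this probe is removed from the
# service's endpoints until it passes again, so traffic only reaches
# healthy pods.
readinessProbe:
  httpGet:
    path: /healthz          # assumed health-check endpoint
    port: 8080              # assumed container port
  initialDelaySeconds: 5    # wait before the first check
  periodSeconds: 10         # check every 10 seconds
```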
Kubernetes Service documentation provides pretty good insight on that