Latency-based routing for Service endpoints in a Kubernetes cluster

We have a single Kubernetes cluster with worker nodes in multiple data centres located in different geographic areas.
We have a Service endpoint that connects to application pods in those different data centres. Let's say application A has 2 pods running in data centre Y, 2 pods in data centre Z, and 2 pods in data centre X. When a request lands on the Service endpoint, it is routed across all 6 pods regardless of data centre.
We want to implement latency-based routing for Service endpoints, so that when a request lands on a worker node it is routed to the nearest pod, or the pod with the lowest network latency.
Any suggestions or guidance are much appreciated.

Use kube-proxy in IPVS mode with the sed (shortest expected delay) scheduler.
Refer: https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-ipvs
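A minimal sketch of what that could look like, assuming kube-proxy is driven by a KubeProxyConfiguration file (on kubeadm clusters this typically lives in the kube-proxy ConfigMap in kube-system; your setup may differ):

    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration
    # switch from the default iptables mode to IPVS
    mode: "ipvs"
    ipvs:
      # sed = shortest expected delay: picks the backend whose
      # (active connections + 1) / weight is lowest
      scheduler: "sed"

Note that the IPVS kernel modules must be available on the nodes, and kube-proxy has to be restarted for the change to take effect.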

Related

Kubernetes Service not distributing the traffic evenly among pods

I am using Kubernetes v1.20.10 on a bare-metal installation. It has one master node and 3 worker nodes. The application simply serves HTTP requests.
I am scaling the deployment with the Horizontal Pod Autoscaler (HPA) and I noticed that the load is not distributed evenly across the pods. The first pod gets about 95% of the load and the other pod gets very little.
I tried the answer mentioned here, but it did not work: Kubernetes service does not distribute requests between pods
Based on the information provided, I assume that you are using HTTP keep-alive, which keeps the underlying TCP connection open between requests.
A Kubernetes Service distributes load per (new) TCP connection. If you have persistent connections, only the additional connections are distributed, which is the effect you observe.
Try disabling HTTP keep-alive, or set the maximum keep-alive time to something like 15 seconds and the maximum requests to 50.
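As a rough sketch of the second option, and purely as an assumption about the backend (a gunicorn-style server; most HTTP servers have equivalent settings), the limits could be passed in the Deployment's container command:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web                        # hypothetical name
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: web
            image: example/web:latest  # hypothetical image
            ports:
            - containerPort: 8080
            # --keep-alive 15: keep idle keep-alive connections for at most 15s
            # --max-requests 50: recycle a worker after 50 requests, forcing
            # clients to open fresh connections that the Service can re-balance
            command: ["gunicorn", "--keep-alive", "15", "--max-requests", "50",
                      "--bind", "0.0.0.0:8080", "app:app"]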

Cover for unhealthy pods

I am running multiple pods with Python/gunicorn serving web requests. At times, requests get really slow (up to 60s), which blocks all the workers and makes the livenessProbe fail.
In some instances, all pods are blocked in this state and are restarted at the same time (graceful shutdown takes up to 60s). This means that no pod is available to take new requests.
Is there a way of telling k8s to cover for pods that it is restarting? For example, starting a new pod when other pods are unhealthy.
You can have an ingress or a load balancer at the L7 layer that routes traffic to a Kubernetes Service, which in turn can have multiple backend pods (selected by the pods' labels and the Service's label selector) spread across different deployments running on different nodes. The ingress controller or load balancer can health-check the backends and stop routing traffic to unhealthy pods. This topology increases the overall availability and resiliency of the application.
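One way to get that behaviour without restarting pods is a readinessProbe: a pod that fails readiness is removed from the Service's (and therefore the ingress's) endpoints until it recovers, whereas a failed livenessProbe triggers a restart. A minimal sketch, assuming a hypothetical /healthz endpoint:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web                        # hypothetical name
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: web
            image: example/web:latest  # hypothetical image
            ports:
            - containerPort: 8000
            readinessProbe:
              httpGet:
                path: /healthz         # hypothetical health endpoint
                port: 8000
              periodSeconds: 5
              # after 2 consecutive failures the pod stops receiving traffic,
              # but it is not restarted
              failureThreshold: 2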

k8s: should traffic go to master nodes or worker nodes?

Should traffic from clients (the outside world) to a service inside k8s come in through the master nodes or the worker nodes, and why?
From what I have seen so far, the docs always show LB pools consisting of master nodes instead of worker nodes. Is there a reason for this?
In a big cluster, would it be more beneficial to send all traffic to a few designated worker nodes?
For example:
Let's say my k8s cluster has 2 master nodes, 4 worker nodes, and an external load balancer. Most examples out there load-balance incoming traffic to the 2 master nodes instead of the 4 worker nodes. Why is this? Is there a reason in terms of efficiency/performance?
Please advise. Thank you.
What do you mean by the traffic going through worker nodes or the master node? You expose the service backed by your pods to the outside world via NodePort or LoadBalancer, so whoever hits the LoadBalancer, or reaches a node on the relevant port, is redirected to the corresponding service.
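For illustration, a minimal Service of type LoadBalancer (all names here are assumptions). The external load balancer forwards to a node port that kube-proxy opens on every node it runs on, and traffic is then routed to a matching pod wherever it is scheduled:

    apiVersion: v1
    kind: Service
    metadata:
      name: web                # hypothetical name
    spec:
      type: LoadBalancer       # also allocates a NodePort on the nodes
      selector:
        app: web               # must match the pods' labels
      ports:
      - port: 80               # port exposed by the load balancer
        targetPort: 8080       # port the pods listen on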

Kubernetes Service IP entry in iptables

I deployed a pod using a replication controller with replicas set to 3. The cluster has 5 nodes. I created a service (type NodePort) for the pod. Now kube-proxy adds entries for the service to the iptables of all 5 nodes. Would that not be an overhead if there were 50 nodes in the cluster?
This is not an overhead. Every node needs to be able to communicate with services even if it does not host the pods of that service (i.e. it may host pods that connect to that service).
That said, in some very large clusters it has been reported that the performance of iptables updates can be poor (mind that this is at a very, very big scale). If that is the case, you might prefer to look into solutions like Linkerd (https://linkerd.io/) or Istio (https://istio.io/).

Kubernetes NodePort routing logic

I have a kubernetes setup that contains 4 minions (node1, 2, 3, 4). I created a service that exposes port 80 as node port 30010. There are 4 nginx pods that accept traffic from the above service; however, the distribution of pods among nodes may vary. For example, node 1 has 2 pods, node 2 has 1 pod, and node 3 has 1 pod, while node 4 doesn't have any pod deployed. My requirement is that whenever I send a request to node1:30010 it should hit only the 2 pods on node 1 and not the other pods. Traffic should be routed to other nodes if and only if there is no pod on the local node. For example, node 4 may have to route requests arriving at node4:30010 to other nodes because it has no suitable pod deployed on it. Can I facilitate this requirement by changing the configuration of kube-proxy?
As far as I'm aware, no. Hitting node1:30010 passes the traffic to the service, and the service then round-robins the requests across all of its pods.
Kubernetes is designed as a layer of abstraction above the nodes, so you don't have to worry about where traffic is being sent; trying to control which node traffic goes to works against that idea.
Could you explain your end goal? If your different pods serve different responses, you may want to create more services; or if you are worried about latency and want to serve traffic from the node closest to the user, you may want to look at federating your cluster.
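For reference, the Service described in the question would look roughly like this (the name and selector are assumptions). By default the node port is answered on every node, including node4, and traffic is forwarded to any matching pod in the cluster rather than only local ones:

    apiVersion: v1
    kind: Service
    metadata:
      name: nginx              # hypothetical name
    spec:
      type: NodePort
      selector:
        app: nginx             # assumed pod label
      ports:
      - port: 80               # cluster-internal service port
        targetPort: 80         # port the nginx pods listen on
        nodePort: 30010        # exposed on every node, even ones without pods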