Kubernetes NodePort routing logic - kubernetes

I have a kubernetes setup that contains 4 minions (node1,2,3,4). I created a service that exposes port 80 as node port of 30010. There are 4 nginx pods that accepts the traffic from above service. However distribution of pods among nodes may vary. For example node 1 has 2 pods, node 2 has 1 pod and node 3 has 1 pod. Node 4 doesn't have any pod deployed. My requirement is, whenever I send a request to node1:30010 it should hit only 2 pods on node 1 and it should not hit other pods. Traffic should be routed to other nodes if and only if there is no pod in local node. For example node4 may have to route requests to node4:30010 to other nodes because it has no suitable pod deployed on it. Can I facilitate this requirement by changing configurations of kube-proxy?

As far as I'm aware, no. Hitting node1:30010 will pass traffic to the service, the service will then round robin the response.
Kubernetes is designed as a layer of abstraction above nodes, so you don't have to worry about where traffic is being sent, trying to control which node traffic goes to goes against that idea.
Could you explain your end goal? If your different pods are serving different responses then you may want to create more services, or if you are worried about latency and want to serve traffic from the node closest to the user you may want to look at federating your cluster.

Related

How does the kubernetes cluster IP service pick end points?

How does the kubernetes cluster IP service select pods to route requests to? Does it act like a load balancer?
This is what I observe. I have a cluster IP service with 2 end points. let's say pod A and B.
I sent 100 http requests in a loop to the cluster ip service. All of them are directed pod A while B is idle. After about a minute I repeat the test and now all the requests are sent to pod B while pod A is idle. So it does not seem to behave like a load balancer.
I have tried changing the time required for a pod to respond to the request. That did not affect this behaviour.

latency based routing for service endpoints in kubernetes cluster

we have single kubernetes cluster which has worker nodes in multiple data-centres which are in different geography area.
we have a service endpoint which connect to the application pods which are in different data-centres. lets say application A has 2 pods running in Data-CentresY, 2 pods in Data-CentreZ and 2 pods in Data-CentreX. now when requests lands on a service endpoint it route traffic to all these 6 pods which are in different data-centres.
we want to implement a latency based routing for service endpoints where when requests lands on a workers node it should route traffic to its nearest pods or pod with low network latency.
any suggestion or guidance are much appreciated.
Use kube-proxy with ipvs mode and use sed - shortest expected delay
Refer: https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-ipvs

Bare-Metal K8s: How to preserve source IP of client and direct traffic to nginx replica on current server on

I would like to ask you about some assistance:
Entrypoint to cluster for http/https is NGINX: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.0 running as deamonset
I want to achieve 2 things:
preserve source IP of client
direct traffic to nginx replica on
current server (so if request is sent to server A, listed as
externalIP address, nginx on node A should handle it)
Questions:
How is it possible?
Is it possible without nodeport? Control plane can be started with custom --service-node-port-range so I can add nodeport for 80
and 443, but it looks a little bit like a hack (after reading about
nodeport intended usage)
I was considering using metallb, but layer2 configuration will cause bottleneck (high traffic on cluster). I am not sure if BGP will solve this problem.
Kubernetes v15
Bare-metal
Ubuntu 18.04
Docker (18.9) and WeaveNet (2.6)
You can preserve the source IP of client by using externalTrafficPolicy set to local, this will proxy requests to local endpoints. This is explained on Source IP for Services with Type=NodePort.
Can should also have a look at Using Source IP.
In case of MetalLB:
MetalLB respects the service’s externalTrafficPolicy option, and implements two different announcement modes depending on what policy you select. If you’re familiar with Google Cloud’s Kubernetes load balancers, you can probably skip this section: MetalLB’s behaviors and tradeoffs are identical.
“Local” traffic policy
With the Local traffic policy, nodes will only attract traffic if they are running one or more of the service’s pods locally. The BGP routers will load-balance incoming traffic only across those nodes that are currently hosting the service. On each node, the traffic is forwarded only to local pods by kube-proxy, there is no “horizontal” traffic flow between nodes.
This policy provides the most efficient flow of traffic to your service. Furthermore, because kube-proxy doesn’t need to send traffic between cluster nodes, your pods can see the real source IP address of incoming connections.
The downside of this policy is that it treats each cluster node as one “unit” of load-balancing, regardless of how many of the service’s pods are running on that node. This may result in traffic imbalances to your pods.
For example, if your service has 2 pods running on node A and one pod running on node B, the Local traffic policy will send 50% of the service’s traffic to each node. Node A will split the traffic it receives evenly between its two pods, so the final per-pod load distribution is 25% for each of node A’s pods, and 50% for node B’s pod. In contrast, if you used the Cluster traffic policy, each pod would receive 33% of the overall traffic.
In general, when using the Local traffic policy, it’s recommended to finely control the mapping of your pods to nodes, for example using node anti-affinity, so that an even traffic split across nodes translates to an even traffic split across pods.
You need to take for the account the limitations of BGP routing protocol for MetalLB.
Please also have a look at this blog post Using MetalLb with Kind.

k8s should traffic goes to master nodes or worker nodes?

Should traffic from clients (outside world) to service inside k8s comes in through master nodes or worker nodes? and why?
From what i seen so far, docs are always showing LB pools consisting of master nodes instead of worker nodes. is there a reason for this?
in a big bluster, would it be more beneficial to send all traffic to a few designated worker nodes?
for example:
let say my k8s cluster has 2 master nodes, 4 worker nodes, and an external load balancer. most examples out there load balance incoming traffic to the 2 master nodes instead of the 4 worker nodes. why is this? is there a reason in term of efficiency/performance?
please advise. thank you.
What do you mean the traffic goes through worker nodes or master node? You expose your service in the pods to the outside world via NodePort or LoadBalancer. So who ever hits the LoadBalancer or reach the node on a particular port would be redirected to the corresponding service.

Kubernetes Service IP entry in IP tables

Deployed pod using replication controller with replicas set to 3. Cluster has 5 nodes. Created a service (type nodeport) for the pod. Now kube-proxy adds entry about the service into ip-tables of all 5 nodes. Would it not be a overhead if there are 50 nodes in the cluster?
This is not an overhead. Every node needs to be able to communicate with services even if it does not host the pods of that service (ie. it may have pods that connect to that service).
That said, in some very large clusters it was reported that performance of iptables updates might be poor (mind that this is for a very, very big scale). If that is the case, you might prefer to look into solutions like Linkerd (https://linkerd.io/) or Istio (https://istio.io/)