Kubernetes: How to enable inter-pod DNS within the same Deployment?

I am new to Kubernetes, and I am trying to get inter-pod communication over DNS to work.
The Pods in my cluster are spawned by Deployments. My problem is that every Pod reports its hostname to Zookeeper, and the Pods use the hostnames found in Zookeeper to ping their peers. This always fails because the peers' hostnames are not resolvable between Pods.
The only solution so far is to manually add each Pod's hostname to its peers' /etc/hosts files, but this will not scale to large clusters.
A DNS solution for inter-pod communication that keeps a record of newly created Pods and removes dead ones would be great.
Thanks in advance.

One solution I found was to add hostname and subdomain under spec->template->spec. With that in place, communication over hostnames between the pods succeeds.
However, this solution is fairly clumsy, because I cannot set the replicas of a Deployment to more than 1, or I will get more than one pod with the same hostname in the cluster. If I have 10 slave nodes with the same function in a cluster, I will need to create 10 Deployments.
Any better solutions?

You need to use a Service definition that points to your pods:
https://kubernetes.io/docs/concepts/services-networking/service/
With that you have a load-balanced proxy in front of the pods, and the internal Kubernetes DNS resolves the Service name instead of each individual pod, no matter what state the pods are in.
If that simple solution doesn't fit your needs, you can replace kube-dns as the default internal DNS with CoreDNS.
https://coredns.io/
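If each peer really needs its own resolvable name (as in the Zookeeper registration described in the question), one commonly used pattern is a StatefulSet backed by a headless Service. The manifest below is only a minimal sketch: the names peers and worker, the image, and port 8080 are placeholders that do not come from the question.

```yaml
# Headless Service: no virtual IP; cluster DNS publishes one record per Pod.
apiVersion: v1
kind: Service
metadata:
  name: peers                 # hypothetical name
spec:
  clusterIP: None             # "None" is what makes the Service headless
  selector:
    app: worker
  ports:
    - name: peer
      port: 8080              # hypothetical port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: worker                # hypothetical name
spec:
  serviceName: peers          # must match the headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      containers:
        - name: worker
          image: example.com/worker:1.0   # placeholder image
          ports:
            - containerPort: 8080
```

With this layout the pods are named worker-0, worker-1 and worker-2, and each one should be reachable as worker-N.peers.<namespace>.svc.cluster.local (assuming the default cluster.local cluster domain). The bare hostname worker-0 is not resolvable by itself, so the application would probably have to register the longer name with Zookeeper.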

Related

ClusterIP service with one backend pod is equal to Headless service in kubernetes?

As per the Headless service definition:
Kubernetes allows clients to discover pod IPs through DNS lookups. Usually, when you perform a DNS lookup for a service, the DNS server returns a single IP which is the service’s cluster IP. But if you don’t need the cluster IP for your service, you can set ClusterIP to None, then the DNS server will return the individual pod IPs instead of the service IP. Then the client can connect to any of them.
It looks like this is similar to creating a ClusterIP service with one backend pod. If so, why should we use a ClusterIP service with one backend pod?
You don't usually control the Pod's name, and you can't guarantee that there will always be exactly one matching Pod.
There are a couple of standard reasons to not directly create Pods, but instead to rely on higher-level constructs. A Deployment can do zero-downtime upgrades; a StatefulSet manages associated PersistentVolumeClaims; a Job can retry on failure. A Pod also can't be edited once created, so you'd need to delete and recreate the Pod on any update. If you're using a Deployment you'll have Pod names like deployment-name-12345678-abcde, but you don't control the ending parts at all. A Service will give you a consistent name.
During an upgrade there won't necessarily be exactly the requested number of replicas. A Deployment by default will create an additional Pod, wait for it to pass its readiness checks, then tear down the old Pod; an associated Service will route traffic correctly in this case. The alternative is to tear down the old Pod first. In both cases, you could have zero or two matching Pods and not just one.
Having the Service as the standard pattern also helps if you do ever decide to increase the replica count; for example you might choose to run multiple replicas just for additional resiliency in case one of the Nodes fails. Communicating via a Service works exactly the same way whether you have just one Pod or several.
There is a difference. A DNS lookup for a headless Service returns as many IP addresses as there are pods running, while a DNS lookup for a ClusterIP Service always returns exactly one IP address.
So yes, if you only ever have one Pod then a headless Service may be sufficient.
But if you have more than one pod, the client must implement some kind of load balancing itself for a headless Service, while for a ClusterIP Service Kubernetes does the load balancing.
One use case for a headless Service with multiple pods is when you want to send requests to every pod, e.g. local cache draining.
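To make the difference concrete, here is a rough sketch of the two Service flavours selecting the same pods. The names cache and cache-headless, the app: cache label, and port 6379 are made up for illustration.

```yaml
# ClusterIP Service (the default type): DNS returns a single virtual IP
# and kube-proxy spreads connections across the matching pods.
apiVersion: v1
kind: Service
metadata:
  name: cache                 # hypothetical name
spec:
  selector:
    app: cache
  ports:
    - port: 6379              # hypothetical port
      targetPort: 6379
---
# Headless Service: DNS returns one address per ready pod,
# and the client decides which pod(s) to talk to.
apiVersion: v1
kind: Service
metadata:
  name: cache-headless        # hypothetical name
spec:
  clusterIP: None
  selector:
    app: cache
  ports:
    - port: 6379
      targetPort: 6379
```

Nothing prevents you from defining both for the same set of pods: ordinary clients would use cache, and a job that has to reach every pod (such as the cache draining mentioned above) would look up cache-headless.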

Kubernetes LoadBalancer type service's external IP is unreachable from pods within the cluster when externalTrafficPolicy is set to Local in GCE

The external IP is perfectly reachable from outside the cluster. It's perfectly reachable from all nodes within the cluster. However, when I try to telnet to the URL from a pod within the cluster that is not on the same node as a pod that is part of the service backend, the connection always times out.
The external IP is reachable by pods that run on the same node as a pod that is part of the service backend.
All pods can perfectly reach the cluster IP of the service.
When I set externalTrafficPolicy to Cluster, the pods are able to reach the external URL regardless of what node they're on.
I am using iptables proxying and Kubernetes 1.16.
I'm completely at a loss here as to why this is happening. Is someone able to shed some light on this?
From the official doc here,
service.spec.externalTrafficPolicy - denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints. There are two available options: Cluster (default) and Local. Cluster obscures the client source IP and may cause a second hop to another node, but should have good overall load-spreading. Local preserves the client source IP and avoids a second hop for LoadBalancer and NodePort type services, but risks potentially imbalanced traffic spreading.
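The policy selects between node-local and cluster-wide endpoints. When you set externalTrafficPolicy to Local, traffic sent to the external IP is only delivered to backend pods on the node that handles it, so a node with no backend pod for the service has nowhere to send it. That matches what you see: pods on nodes without a backend pod time out.
So, you will need to set externalTrafficPolicy to Cluster instead.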
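For reference, the field sits directly under spec of the Service. This is only a sketch; the name web, the label, and the port numbers are placeholders rather than values from the question.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                         # hypothetical name
spec:
  type: LoadBalancer
  # Cluster is the default: any node may forward to any backend pod,
  # at the cost of obscuring the client source IP.
  # Local keeps the source IP but only uses endpoints on the receiving node.
  externalTrafficPolicy: Cluster
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080              # hypothetical container port
```

If you need to keep Local for source IP preservation, another option is to have in-cluster clients use the cluster IP or the Service DNS name, which the question reports as reachable from every pod.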

EKS provisioned LoadBalancers reference all nodes in the cluster. If the pods reside on 1 or 2 nodes is this efficient as the ELB traffic increases?

In Kubernetes (on AWS EKS) when I create a service of type LoadBalancer the resultant EC2 LoadBalancer is associated with all nodes (instances) in the EKS cluster even though the selector in the service will only find the pods running on 1 or 2 of these nodes (ie. a much smaller subset of nodes).
I am keen to understand if this will be efficient as the volume of traffic increases.
I could not find any advice on this topic and am keen to understand if this is the correct approach.
With the default policy, a request that arrives at a node where the pod is not running is forwarded to another node, which introduces additional SNAT and does not preserve the source IP of the request. You can change externalTrafficPolicy to Local so that the load balancer only sends traffic to nodes that have the pods running.
You can get more information from the following links.
Preserve source IP
EKS load balancer support
On EKS, if you are using the AWS CNI, which is the default for EKS, you can use aws-alb-ingress-controller to create the ALB.
When creating the load balancer you can use the annotation below, so that traffic is routed only to your pods:
alb.ingress.kubernetes.io/target-type: ip
Reference:
https://github.com/aws/amazon-vpc-cni-k8s
https://github.com/kubernetes-sigs/aws-alb-ingress-controller
https://kubernetes-sigs.github.io/aws-alb-ingress-controller/guide/ingress/annotation/#target-type
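As a rough illustration, the annotation goes on the Ingress object that the controller watches. The Ingress name, the backend Service web, and port 80 are assumptions, and the apiVersion depends on your cluster version (older clusters used extensions/v1beta1 with a slightly different backend syntax).

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web                                   # hypothetical name
  annotations:
    kubernetes.io/ingress.class: alb          # hand this Ingress to the ALB controller
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip # register pod IPs instead of node instances
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web                     # hypothetical Service name
                port:
                  number: 80
```

With target-type ip the load balancer targets are the pod IPs themselves, which is why this mode depends on the AWS VPC CNI giving pods routable VPC addresses.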

Resolving a url as a different url inside kubernetes pod

My pod (pod1) internally can connect to another pod using its service like the following:
pod2-service.namespace.svc.cluster.local
However, I want pod1 to connect to pod2 using a URL like abc.com which is not registered in a DNS. Basically, I want pod1 to resolve abc.com as pod2-service.namespace.svc.cluster.local.
I was looking at hostAliases here:
https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/.
However, it needs an IP. How can I do this in Kubernetes?
I think you can assign a fixed IP as the Service IP for pod2, then use that IP in your hostAliases definition.
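A minimal sketch of that idea follows. The address 10.96.0.50 is an assumption and has to be a free address inside your cluster's Service CIDR; the labels, images, and ports are placeholders, and the namespace is literally called namespace only to match the DNS name in the question.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: pod2-service
  namespace: namespace
spec:
  clusterIP: 10.96.0.50        # assumed free IP in the Service CIDR; fixed at creation
  selector:
    app: pod2                  # hypothetical label
  ports:
    - port: 80
      targetPort: 8080         # hypothetical container port
---
apiVersion: v1
kind: Pod
metadata:
  name: pod1
  namespace: namespace
spec:
  hostAliases:
    - ip: "10.96.0.50"         # must match the clusterIP above
      hostnames:
        - "abc.com"
  containers:
    - name: app
      image: example.com/pod1:1.0   # placeholder image
```

The kubelet writes the hostAliases entries into the pod's /etc/hosts, so abc.com resolves to the Service IP inside pod1 for as long as the Service keeps that clusterIP.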
There are a couple of options:
StatefulSets, where you will always know the pod name and can find it based on its ordinal index.
Using the Pod hostname and subdomain spec fields (only works for standalone pods, afaik).
However, pod-to-pod DNS does not seem to be natively supported by Kubernetes for Deployments. My guess is that the rationale is that the pods can constantly change IP addresses and names. You could use the default Pod DNS entries, but those entries vary with the IP addresses assigned to the pods.
The other solution I can think of for Deployments is to use something like Consul with stub domains. Each pod would then need an initContainer or a Consul agent sidecar to register its IP with the Consul service, and every time a pod restarts it would need to overwrite its DNS registration in Consul.
If you don't want to use stub domain there's also the option of using Pod DNS Configs.
You can look up the service IP and append it to /etc/hosts in pod1 before your application code starts running:
echo "$(getent hosts pod2-service.namespace.svc.cluster.local | awk '{ print $1 }') abc.com" >> /etc/hosts
Note: this is pretty hacky because it relies on the service IP of pod2 staying fixed. If the service IP changes, pod1 will fail to resolve the host.

Kubernetes Service IP entry in IP tables

I deployed a pod using a replication controller with replicas set to 3. The cluster has 5 nodes. I created a Service (type NodePort) for the pod, and kube-proxy now adds an entry for the Service to the iptables of all 5 nodes. Would this not be an overhead if there are 50 nodes in the cluster?
This is not an overhead. Every node needs to be able to communicate with services even if it does not host the pods of that service (ie. it may have pods that connect to that service).
That said, in some very large clusters it has been reported that the performance of iptables updates can be poor (mind that this is at a very, very big scale). If that is the case, you might prefer to look into solutions like Linkerd (https://linkerd.io/) or Istio (https://istio.io/).