getaddrinfo: Temporary failure in name resolution kubernetes + coredns - kubernetes

We have a service that sends tons of events in bulks. It basically opens multiple http POST connections.
Since we moved the service to kubernetes, we're getaddrinfo: Temporary failure in name resolution errors from time to time. (most calls work but some fail and it's weird.
Can anyone explain why and how to fix?

Check the tinder post, they had a similar problem:
https://medium.com/tinder-engineering/tinders-move-to-kubernetes-cda2a6372f44
and the source for their dns info:
https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts
TLDR: check your arp tables cache gc_* host parameters, try to disable AAAA query in the containers /etc/gai.conf, move the DNS to a daemonset and inject the host IP as dns servers to the pods
Also, to help this and speed up dns resolve, add a ending dot to all domains (ie: database.example.com.), so coredns will try that query directly (one query, 2 with ipv6) instead of trying all the kubernetes search domain list ( about 5, 10 with ipv6). Only leave out that dot where you are querying for kubernetes resources or in apps that do not like it (if they exist, it's a bug, ALL DNS always end with a dot, while most of the time we can leave it out, with it is the correct way, must not fail)

Related

Is it possible to configure a pod to prioritize using `hostNetwork` but still reference internal service endpoints?

I have a statefulset that I need to run using the host network, purely for performance reasons. But I also want to be able to reference service-name endpoints. Is it possible to do this? ClusterFirstWithHostNet does not work because it doesn't prioritize using the host's network. The dnsConfig configuration might be promising, but I don't know how I would configure it to do what I'm asking about.
This is a community wiki answer. Feel free to expand it.
It might be possible if the app can select random port to listen during start and change if port is busy. However, Kubernetes is not involved in the selecting port for the application.
Statefulset requires a headless service, so it doesn't have an IP and works as a set of DNS records in coredns. A record would probably contain the same IP for the replicas on the same node, but SRV record may actually provide a proper endpoint.
For further reference, please take a look at the below sources:
How do I get individual pod hostnames in a Deployment registered and looked up in Kubernetes?
SRV records

kops kubernetes cluster with multiple DNS

There are two parts to this.
I am using kops v1.17.0 to standup kubernetes cluster on ec2 instances. I am followinf these docs for doing so. https://kubernetes.io/docs/setup/production-environment/tools/kops/
on of the points go as follows.
kops has a strong opinion on the cluster name: it should be a valid
DNS name.
this got me confused. Can my cluster serve requests to only one DNS and its subdomains?
I tried this on a domain example.com I created a hosted zone for it. created a cluster named example.com.k8s.local.
I pointed this domain to my clusters load balancer. and I can access example.com. All good till now.
now, I want one of the services in my cluster to be served on abc.com. I created another hosted zone, and a new record set within it which points to this load balancer. I am expecting to visit abc.com and see this service but all I see is nginx 404 not found
Is this happening because of the first point I mentioned or totally separate issue? If it is because of 1st point is there aa way around or one cluster is always tied to one domain in the kops world?
As far as the first part is concerned, Yes I can serve multiple domains from same kubernetes cluster with this setup. upto certain version there was a hard requirement of matching domain name with cluster name, its not the case anymore.
Couple of things you need to consider. while issuing a certificate from ACM, make sure all your domains are listed
example
example.com
*example.com
bar.com
*.bar.com
make sure that all of the domains are validated and are not in pending or any other state.
I think reason for second issue was one of the domains in my certificate generated by ACM was invalid state and thus in pending state.
#jt97 ^^

DNS problem on AWS EKS when running in private subnets

I have an EKS cluster setup in a VPC. The worker nodes are launched in private subnets. I can successfully deploy pods and services.
However, I'm not able to perform DNS resolution from within the pods. (It works fine on the worker nodes, outside the container.)
Troubleshooting using https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ results in the following from nslookup (timeout after a minute or so):
Server: 172.20.0.10
Address 1: 172.20.0.10
nslookup: can't resolve 'kubernetes.default'
When I launch the cluster in an all-public VPC, I don't have this problem. Am I missing any necessary steps for DNS resolution from within a private subnet?
Many thanks,
Daniel
I feel like I have to give this a proper answer because coming upon this question was the answer to 10 straight hours of debugging for me. As #Daniel said in his comment, the issue I found was with my ACL blocking outbound traffic on UDP port 53 which apparently kubernetes uses to resolve DNS records.
The process was especially confusing for me because one of my pods worked actually worked the entire time since (I think?) it happened to be in the same zone as the kubernetes DNS resolver.
To elaborate on the comment from #Daniel, you need:
an ingress rule for UDP port 53
an ingress rule for UDP on ephemeral ports (e.g. 1025–65535)
I hadn't added (2) and was seeing CoreDNS receiving requests and trying to respond, but the response wasn't getting back to the requester.
Some tips for others dealing with these kinds of issues, turn on CoreDNS logging by adding the log configuration to the configmap, which I was able to do with kubectl edit configmap -n kube-system coredns. See CoreDNS docs on this https://github.com/coredns/coredns/blob/master/README.md#examples This can help you figure out whether the issue is CoreDNS receiving queries or sending the response back.
I ran into this as well. I have multiple node groups, and each one was created from a CloudFormation template. The CloudFormation template created a security group for each node group that allowed the nodes in that group to communicate with each other.
The DNS error resulted from Pods running in separate node groups from the CoreDNS Pods, so the Pods were unable to reach CoreDNS (network communications were only permitted withing node groups). I will make a new CloudFormation template for the node security group so that all my node groups in my cluster can share the same security group.
I resolved the issue for now by allowing inbound UDP traffic on port 53 for each of my node group security groups.
So I been struggling for a couple of hours i think, lost track of time, with this issue as well.
Since i am using the default VPC but with the worker nodes inside the private subnet, it wasn't working.
I went through the amazon-vpc-cni-k8s and found the solution.
We have to sff the environment variable of the aws-node daemonset AWS_VPC_K8S_CNI_EXTERNALSNAT=true.
You can either get the new yaml and apply or just fix it through the dashboard. However for it to work you have to restart the worker node instance so the ip route tables are refreshed.
issue link is here
thankz
Re: AWS EKS Kube Cluster and Route53 internal/private Route53 queries from pods
Just wanted to post a note on what we needed to do to resolve our issues. Noting that YMMV and everyone has different environments and resolutions, etc.
Disclaimer:
We're using the community terraform eks module to deploy/manage vpcs and the eks clusters. We didn't need to modify any security groups. We are working with multiple clusters, regions, and VPC's.
ref:
Terraform EKS module
CoreDNS Changes:
We have a DNS relay for private internal, so we needed to modify coredns configmap and add in the dns-relay IP address
...
ec2.internal:53 {
errors
cache 30
forward . 10.1.1.245
}
foo.dev.com:53 {
errors
cache 30
forward . 10.1.1.245
}
foo.stage.com:53 {
errors
cache 30
forward . 10.1.1.245
}
...
VPC DHCP option sets:
Update with the IP of the above relay server if applicable--requires regeneration of the option set as they cannot be modified.
Our DHCP options set looks like this:
["AmazonProvidedDNS", "10.1.1.245", "169.254.169.253"]
ref: AWS DHCP Option Sets
Route-53 Updates:
Associate every route53 zone with the VPC-ID that you need to associate it with (where our kube cluster resides and the pods will make queries from).
there is also a terraform module for that:
https://www.terraform.io/docs/providers/aws/r/route53_zone_association.html
We had run into a similar issue where DNS resolution times out on some of the pods, but re-creating the pod couple of times resolves the problem. Also its not every pod on a given node showing issues, only some pods.
It turned out to be due to a bug in version 1.5.4 of Amazon VPC CNI, more details here -- https://github.com/aws/amazon-vpc-cni-k8s/issues/641.
Quick solution is to revert to the recommended version 1.5.3 - https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html
As many others, I've been struggling with this bug a few hours.
In my case the issue was this bug https://github.com/awslabs/amazon-eks-ami/issues/636 that basically sets up an incorrect DNS when you specify endpoint and certificate but not certificate.
To confirm, check
That you have connectivity (NACL and security groups) allowing DNS on TCP and UDP. For me the better way was to ssh into the cluster and see if it resolves (nslookup). If it doesn't resolve (most likely it is either NACL or SG), but check that the DNS nameserver in the node is well configured.
If you can get name resolution in the node, but not inside the pod, check that the nameserver in /etc/resolv.conf points to an IP in your service network (if you see 172.20.0.10, your service network should be 172.20.0.0/24 or so)

Expose each pod in a statefulset to the internet without a custom proxy

I have a StatefulSet with pods server-0, server-1, etc. I want to expose them directly to the internet with URLs like server-0.mydomain.com or like mydomain.com/server-0.
I want to be able to scale the StatefulSet and automatically be able to access the new pods from the internet. For example, if I scale it up to include a server-2, I want mydomain.com/server-2 to route requests to the new pod when it's ready. I don't want to have to also scale some other resource or create another Service to achieve that effect.
I could achieve this with a custom proxy service that just checks the request path and forwards to the correct pod internally, but this seems error-prone and wasteful.
Is there a way to cause an Ingress to automatically route to different pods within a StatefulSet, or some other built-in technique that would avoid custom code?
I don't think you can do it. Being part of the same statefulSet, all pods up to pod-x, are targeted by a service. As you can't define which pod is going to get a request, you can't force "pod-1.yourapp.com" or "yourapp.com/pod-1" to be sent to pod-1. It will be sent to the service, and the service might sent it to pod-4.
Even though if you could, you would need to dynamically update your ingress rules, which can cause a downtime of minutes, easily.
With the custom proxy, I see it impossible too. Note that it would need to basically replace the service behind the pods. If your ingress controller knows that it needs to deliver a packet to a service, now you have to force it to deliver to your proxy. But how?
A Kubernetes service is a set of iptables (or IPVS) rules that will redirect a packet with the ServiceIP as a destination address to ONE OF THE PODS that have the same label.
from Kubernetes Services documentation
The service installs iptables rules which select a backend Pod. By default, the choice of backend is random.
Which refers to the fact that a service is not able to distinguish between different pods in the same set.
If you want to force the selection of a specific Pod out of the set by changing the iprules (fairly simple), or by adding any type of proxy is problematic:
let's say you configured pod-1 and pod-2 (1.1.1.1 and 1.1.1.2 respectively), and you configured iptables rules to DNAT requests with destination pod-1.myserver.com to 1.1.1.1 and same for pod-2. (you may ask why the IP, and it's simply because it's the only way to distinguish between these pods)
This approach will fail whenever a pod restarts, let's say pod-1 failed, Kubernetes won't recreate the same pod with same IP and name, instead will create pod-3 with a different IP and updates the iptables accordingly. As a result, all the packets going toward 1.1.1.1 will be dropped until you update the proxy or iptables again.
In fact, that's one of the reasons why we use service to access pods instead of accessing them directly since the Pod IP can change however the service IP won't.
However, since this very specific part of kubernetes was my work for the last 4 months, I have developed a python script to edit the iptables and to choose a specific pod, my conclusion of that work was it's costy and time-consuming and will impose the server to go offline for a couple of seconds when the pods are changed, you can take a look at the code, it definitely works but its not recommended.
This problem is a kubernetes problem and the solution is changing the source code of Kube-proxy, which is my current work.
I suggest you read my answer explaining how kubernetes services exactly work in this question: Which service is doing load balancing between kubernetes nodes?

How to get the real ip in the request in the pod in kubernetes

I have to get the real ip from the request in my business.actually I got the 10.2.100.1 every time at my test environment. any way to do this ?
This is the same question as GCE + K8S - Accessing referral IP address and How to read client IP addresses from HTTP requests behind Kubernetes services?.
The answer, copied from them, is that this isn't yet possible in the released versions of Kubernetes.
Services go through kube_proxy, which answers the client connection and proxies through to the backend (your web server). The address that you'd see would be the IP of whichever kube-proxy the connection went through.
Work is being actively done on a solution that uses iptables as the proxy, which will cause your server to see the real client IP.
Try to get that service IP which service is associated with that pods.
One very roundabout way right now is to set up an HTTP liveness probe and watch the IP it originates from. Just be sure to also respond to it appropriately or it'll assume your pod is down.