We have a medium-sized Kubernetes cluster. Imagine a situation where approximately 70 pods are connecting to a single socket server. It works fine most of the time; however, from time to time one or two pods simply fail to resolve the in-cluster DNS name, and the lookup times out with the following error:
Error: dial tcp: lookup thishost.production.svc.cluster.local on 10.32.0.10:53: read udp 100.65.63.202:36638->100.64.209.61:53: i/o timeout at
What we noticed is that this is not the only service failing intermittently; other services experience it from time to time as well. We used to ignore it, since it was very random and rare, but in the case above it is very noticeable. The only solution is to actually kill the faulty pod (restarting it doesn't help).
Has anyone experienced this? Do you have any tips on how to debug or fix it?
It almost feels as if it's beyond our expertise and is fully related to the internals of the DNS resolver.
Kubernetes version: 1.23.4
Container Network: cilium
This issue is most probably related to the CNI.
I would suggest following the link to debug the issue:
https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
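For example, the first check on that page is to run a throwaway dnsutils pod and see whether the cluster DNS answers at all (the pod name and manifest are the ones used in the linked doc):

kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
kubectl exec -i -t dnsutils -- nslookup kubernetes.default
kubectl exec -i -t dnsutils -- cat /etc/resolv.conf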
To be able to help you, we need more information:
Is this cluster on-premises or in the cloud?
What are you using for the CNI?
How many nodes are running, and are they all in the same subnet? If yes, do they have other interfaces?
Share the result of the command below.
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o wide
When you restart the pod to temporarily solve the issue, does it stay on the same node or does it land on a different one?
We are testing out the Ambassador Edge Stack and started with a brand-new GKE private cluster in Autopilot mode.
We installed it from scratch following the quick start tour to get a feel for it, and ended up with the following error:
Error from server: error when creating "mapping-test.yaml": conversion webhook for getambassador.io/v3alpha1, Kind=Mapping failed: Post "https://emissary-apiext.emissary-system.svc:443/webhooks/crd-convert?timeout=30s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
We did a few rounds of DNS testing and deployed a few different test pods in different namespaces to validate that kube-dns is working properly, everything looks good at that end. Also the resolv.conf looks good.
Ambassador is using the hostname emissary-apiext.emissary-system.svc:443 (without the cluster.local suffix), which should resolve fine. A lookup with the FQDN (with cluster.local) works fine, by the way.
Any clues?
Thanks a lot and take care.
I think I found the solution; posting it here in case someone comes across this later on.
So I followed this guide to deploy Ambassador Edge Stack in an Autopilot private cluster. I was getting the same error when trying to deploy the Mapping object (step 2.2).
The issue is that the control plane (API server) is trying to call emissary-apiext.emissary-system.svc:443, but the pods behind that Service are listening on port 8443 (I figured that out by describing the Service).
So I edited the firewall rule to also allow the GKE control plane to talk to the nodes on port 8443.
The firewall rule in question is called gke-gke-ap-xxxxx-master (the xxxxx is the cluster hash and is different for each cluster). To make sure you are editing the proper rule, double-check that the source IP range matches the "Control plane address range" from the cluster details page, and that it is the rule whose name ends with master.
Just edit that rule and add 8443 to the allowed TCP ports. It should work.
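A rough sketch of the check and the fix from the command line (the rule name keeps the xxxxx placeholder, so substitute your cluster hash; note that --allow replaces the existing list, so keep whatever ports are already there):

kubectl describe service emissary-apiext -n emissary-system      # shows Port: 443, TargetPort: 8443
gcloud compute firewall-rules describe gke-gke-ap-xxxxx-master   # note the currently allowed ports
gcloud compute firewall-rules update gke-gke-ap-xxxxx-master --allow tcp:443,tcp:10250,tcp:8443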
That sounds like an issue related to the webhooks limitation in GKE Autopilot.
Which version of GKE are you on?
Also, there is a limitation on which resources and namespaces we allow webhooks to intercept.
Additionally, webhooks which specify one or more of the following resources (and any of their sub-resources) in the rules will be rejected:
group: "" resource: nodes
group: "" resource: persistentvolumes
group: certificates.k8s.io resource: certificatesigningrequests
group: authentication.k8s.io resource: tokenreviews
You probably have to check the manifests of Ambassador Edge Stack to figure this out.
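If you want to see which webhooks exist in the cluster and what rules they carry, you can dump them like this (the Mapping CRD name below is an assumption based on the error message, so adjust it if yours differs):

kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations
kubectl get crd mappings.getambassador.io -o yaml | grep -A 10 "conversion:"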
Deleting kube-apiserver from the kubernetes-master node does not prevent kubectl from querying pods. I have always understood that kube-apiserver is responsible for communication with the master.
My question: how is kubectl still able to query pods while kube-apiserver is restarting? Is there any official documentation that covers this behavior?
Your understanding is correct. The Kubernetes API server validates and configures data for the API objects, which include pods, services, replication controllers, and others. The API server services REST operations and provides the frontend to the cluster's shared state through which all other components interact. So if your api-server pod encounters issues, you will not be able to get your client communicating with it.
What is happening is that when you delete the api-server pod, it is immediately recreated by the kubelet (it is a static pod defined in /etc/kubernetes/manifests), hence your client is able to connect and fetch the data.
To provide an example, I simulated an api-server failure by fiddling a bit with the kube-apiserver.yaml file in /etc/kubernetes/manifests:
➜ manifests pwd
/etc/kubernetes/manifests
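The "fiddling" can be as simple as moving the static pod manifest out of that directory (this is just one way to simulate the failure; the kubelet watches this folder and stops or recreates the pod accordingly):

sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/   # kubelet stops the api-server pod
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/   # kubelet brings it back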
Immediately once I did that, I was no longer able to connect to the api-server:
➜ manifests kubectl get pods -A
The connection to the server 10.128.15.230:6443 was refused - did you specify the right host or port?
Getting to those manifests in Docker Desktop can be tricky depending on where you run it. Please have a look at this case, where the answer shows a solution to that.
I have an EKS cluster set up in a VPC. The worker nodes are launched in private subnets. I can successfully deploy pods and services.
However, I'm not able to perform DNS resolution from within the pods. (It works fine on the worker nodes, outside the container.)
Troubleshooting using https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ results in the following from nslookup (timeout after a minute or so):
Server: 172.20.0.10
Address 1: 172.20.0.10
nslookup: can't resolve 'kubernetes.default'
When I launch the cluster in an all-public VPC, I don't have this problem. Am I missing any necessary steps for DNS resolution from within a private subnet?
Many thanks,
Daniel
I feel like I have to give this a proper answer, because coming upon this question was the answer to 10 straight hours of debugging for me. As #Daniel said in his comment, the issue I found was with my ACL blocking outbound traffic on UDP port 53, which apparently Kubernetes uses to resolve DNS records.
The process was especially confusing for me because one of my pods actually worked the entire time, since (I think?) it happened to be in the same zone as the Kubernetes DNS resolver.
To elaborate on the comment from #Daniel, you need the following (see the example commands after the list):
an ingress rule for UDP port 53
an ingress rule for UDP on ephemeral ports (e.g. 1025–65535)
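A hedged sketch of those two rules with the AWS CLI (the ACL ID and CIDR are placeholders for your VPC's values; NACLs are stateless, so the return traffic on ephemeral ports needs its own rule):

aws ec2 create-network-acl-entry --network-acl-id acl-0123456789abcdef0 --ingress \
    --rule-number 100 --protocol udp --port-range From=53,To=53 \
    --cidr-block 10.0.0.0/16 --rule-action allow
aws ec2 create-network-acl-entry --network-acl-id acl-0123456789abcdef0 --ingress \
    --rule-number 110 --protocol udp --port-range From=1025,To=65535 \
    --cidr-block 10.0.0.0/16 --rule-action allow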
I hadn't added (2) and was seeing CoreDNS receiving requests and trying to respond, but the response wasn't getting back to the requester.
Some tips for others dealing with these kinds of issues: turn on CoreDNS logging by adding the log configuration to the ConfigMap, which I was able to do with kubectl edit configmap -n kube-system coredns. See the CoreDNS docs on this: https://github.com/coredns/coredns/blob/master/README.md#examples. This can help you figure out whether the issue is CoreDNS receiving queries or sending the response back.
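For reference, the log line goes at the top of the main server block in the Corefile held in that ConfigMap (a trimmed-down sketch of the default EKS Corefile):

.:53 {
    log
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
    cache 30
}

Then tail the logs with kubectl logs -n kube-system -l k8s-app=kube-dns while reproducing a failing lookup.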
I ran into this as well. I have multiple node groups, and each one was created from a CloudFormation template. The CloudFormation template created a security group for each node group that allowed the nodes in that group to communicate with each other.
The DNS error resulted from Pods running in separate node groups from the CoreDNS Pods, so those Pods were unable to reach CoreDNS (network communications were only permitted within node groups). I will make a new CloudFormation template for the node security group so that all the node groups in my cluster can share the same security group.
I resolved the issue for now by allowing inbound UDP traffic on port 53 for each of my node group security groups.
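For anyone scripting that interim fix, the per node group rule looks roughly like this (both security group IDs are placeholders; the group ID is the node group security group you are opening up, and the source group is the security group the DNS traffic comes from, e.g. the one attached to the nodes running CoreDNS):

aws ec2 authorize-security-group-ingress --group-id sg-0aaaaaaaaaaaaaaaa \
    --protocol udp --port 53 --source-group sg-0bbbbbbbbbbbbbbbb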
So I have been struggling with this issue as well for a couple of hours, I think; I lost track of time.
Since I am using the default VPC but with the worker nodes inside the private subnet, it wasn't working.
I went through the amazon-vpc-cni-k8s repository and found the solution.
We have to set the environment variable AWS_VPC_K8S_CNI_EXTERNALSNAT=true on the aws-node DaemonSet.
You can either get the new YAML and apply it, or just fix it through the dashboard. However, for it to work you have to restart the worker node instances so the IP route tables are refreshed.
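A quick way to apply that (kubectl set env is just one option; you still need to recycle the worker nodes afterwards):

kubectl set env daemonset aws-node -n kube-system AWS_VPC_K8S_CNI_EXTERNALSNAT=true
kubectl get daemonset aws-node -n kube-system -o yaml | grep -A 1 EXTERNALSNAT   # verify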
The issue link is here.
Thanks!
Re: AWS EKS Kube Cluster and Route53 internal/private Route53 queries from pods
Just wanted to post a note on what we needed to do to resolve our issues. Noting that YMMV and everyone has different environments and resolutions, etc.
Disclaimer:
We're using the community Terraform EKS module to deploy/manage the VPCs and the EKS clusters. We didn't need to modify any security groups. We are working with multiple clusters, regions, and VPCs.
ref:
Terraform EKS module
CoreDNS Changes:
We have a DNS relay for private internal zones, so we needed to modify the coredns ConfigMap and add the DNS relay IP address:
...
ec2.internal:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
foo.dev.com:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
foo.stage.com:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
...
VPC DHCP option sets:
Update it with the IP of the relay server above if applicable. This requires creating a new option set, as existing ones cannot be modified.
Our DHCP options set looks like this:
["AmazonProvidedDNS", "10.1.1.245", "169.254.169.253"]
ref: AWS DHCP Option Sets
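If you are doing that step outside of terraform, the CLI flow is roughly as follows (the IDs are placeholders; the option set has to be created fresh and then swapped in):

aws ec2 create-dhcp-options --dhcp-configurations \
    "Key=domain-name-servers,Values=AmazonProvidedDNS,10.1.1.245,169.254.169.253"
aws ec2 associate-dhcp-options --dhcp-options-id dopt-0123456789abcdef0 --vpc-id vpc-0123456789abcdef0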
Route-53 Updates:
Associate every Route53 zone with the VPC ID it needs to be resolvable from (i.e., where our kube cluster resides and the pods make their queries from).
There is also a terraform module for that:
https://www.terraform.io/docs/providers/aws/r/route53_zone_association.html
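The equivalent one-off association with the AWS CLI, if you are not wiring it through terraform (the zone and VPC IDs are placeholders):

aws route53 associate-vpc-with-hosted-zone --hosted-zone-id Z0123456789EXAMPLE \
    --vpc VPCRegion=us-east-1,VPCId=vpc-0123456789abcdef0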
We ran into a similar issue where DNS resolution times out on some of the pods, but re-creating the pod a couple of times resolves the problem. Also, it's not every pod on a given node that shows the issue, only some pods.
It turned out to be due to a bug in version 1.5.4 of Amazon VPC CNI, more details here -- https://github.com/aws/amazon-vpc-cni-k8s/issues/641.
The quick solution is to revert to the recommended version 1.5.3: https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html
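To check which CNI version your cluster is actually running (and confirm you are on the affected 1.5.4) before rolling back:

kubectl describe daemonset aws-node -n kube-system | grep Image
# rolling back means applying the v1.5.3 manifest from the amazon-vpc-cni-k8s repo;
# the exact manifest path varies per release, so check the project's release notes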
Like many others, I've been struggling with this bug for a few hours.
In my case the issue was this bug, https://github.com/awslabs/amazon-eks-ami/issues/636, which basically sets up an incorrect cluster DNS IP when you specify the API server endpoint and certificate but not the DNS cluster IP.
To confirm, check
That you have connectivity (NACLs and security groups) allowing DNS on both TCP and UDP. For me, the best way was to SSH into a node and see if it resolves (nslookup). If it doesn't resolve, it is most likely either the NACL or the SG, but also check that the DNS nameserver on the node is configured correctly.
If you get name resolution on the node but not inside the pod, check that the nameserver in the pod's /etc/resolv.conf points to an IP in your service network (if you see 172.20.0.10, your service CIDR should be something like 172.20.0.0/16).
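A quick way to run both checks (the dnsutils pod is the test pod from the Kubernetes DNS debugging guide linked above; swap in whatever test pod you have handy):

# from a worker node (via SSH): check that DNS resolves there at all
nslookup amazon.com
# from inside a pod: check which nameserver the pod actually uses, then try a lookup
kubectl exec -it dnsutils -- cat /etc/resolv.conf
kubectl exec -it dnsutils -- nslookup kubernetes.default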
How can I prevent kube-dns from forwarding requests to Google's name servers (8.8.8.8:53 and 8.8.4.4:53)?
I just want to launch pods for internal use only, which means the containers in those pods are not supposed to connect to the outside at all.
When a Zookeeper client connects to a Zookeeper server using hostname (e.g. zkCli.sh -server zk-1.zk-headless), it takes 10 seconds for the client to change its state from [Connecting] to [Connected].
The reason I suspect kube-dns is that, with pods' IP address, the client gets connected instantly.
When I take a look at the log of kube-dns, I found the following two lines:
07:25:35:170773 1 logs.go:41] skydns: failure to forward request "read udp 10.244.0.13:43455->8.8.8.8:53: i/o timeout"
07:25:39:172847 1 logs.go:41] skydns: failure to forward request "read udp 10.244.0.13:42388->8.8.8.8:53: i/o timeout"
It was around 07:25:30 when the client started connecting to the server.
I'm running Kubernetes on a private cluster where internal servers communicate with the internet via http_proxy/https_proxy, which means I cannot connect to 8.8.8.8 for name resolution, AFAIK.
I found the following at https://github.com/skynetservices/skydns:
The default value of an environment variable named SKYDNS_NAMESERVERS is "8.8.8.8:53,8.8.4.4:53".
I could achieve my purpose by setting no_rec to true
I've been initializing Kubernetes using kubeadm, and I couldn't find a way to modify that environment variable or set the skydns property.
How can I prevent kube-dns from forwarding request to the outside of an internal Kubernetes cluster which is deployed by kubeadm?
I don't think there is an option to completely prevent the kube-dns addon from forwarding requests. There certainly isn't an option directly in kubeadm for that.
Your best bet is to edit the kube-dns Deployment (e.g. kubectl edit -n kube-system deploy kube-dns) yourself after kubeadm has started the cluster and change things to work for you.
You may want to try changing the upstream nameserver to something other than 8.8.8.8 that is accessible by the cluster. You should be able to do that by adding --nameservers=x.x.x.x to the args for the kubedns container.
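A sketch of what that fragment of the kube-dns Deployment might look like after the edit (10.0.0.2:53 is a placeholder for an internal resolver your cluster can actually reach; the surrounding flags are the usual kubedns ones and may differ in your version):

containers:
- name: kubedns
  args:
  - --domain=cluster.local.
  - --dns-port=10053
  - --nameservers=10.0.0.2:53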