Kubernetes Node NotReady: ContainerGCFailed / ImageGCFailed context deadline exceeded

A worker node is getting into the "NotReady" state, with this error in the output of kubectl describe node:
ContainerGCFailed rpc error: code = DeadlineExceeded desc = context deadline exceeded
Environment:
Ubuntu 16.04 LTS
Kubernetes version: v1.13.3
Docker version: 18.06.1-ce
There is a closed issue about this on the Kubernetes GitHub (k8 git), which was closed on the grounds of being a Docker issue.
Steps done to troubleshoot the issue:
kubectl describe node - the error in question was found (the root cause isn't clear).
journalctl -u kubelet - shows this related message:
skipping pod synchronization - [container runtime status check may not have completed yet PLEG is not healthy: pleg has yet to be successful]
It is related to this open Kubernetes issue: Ready/NotReady with PLEG issues.
Checked node health on AWS with CloudWatch - everything seems to be fine.
journalctl -fu docker.service - checked Docker for errors/issues; the output doesn't show any errors related to this.
systemctl restart docker - after restarting Docker the node gets into the "Ready" state, but becomes "NotReady" again within 3-5 minutes.
It all seems to have started when I deployed more pods to the node (close to its resource capacity, although I don't think that is a direct dependency) or when I was stopping/starting instances (right after a restart the node is OK, but after some time it becomes NotReady).
Questions:
What is the root cause of the error?
How to monitor that kind of issue and make sure it doesn't happen?
Are there any workarounds to this problem?

What is the root cause of the error?
From what I was able to find, the error seems to happen when there is an issue contacting Docker, either because it is overloaded or because it is unresponsive. This is based on my experience and on what is mentioned in the GitHub issue you provided.
How to monitor that kind of issue and make sure it doesn't happen?
There seems to be no established mitigation or monitoring for this. The best approach appears to be making sure the node is not overloaded with pods. In my experience the problem does not always show up as disk or memory pressure on the node - more likely Docker is left with too few resources and fails to respond in time. The proposed solution is to set resource requests and limits for your pods to prevent overloading the node; an example is sketched below.
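A minimal sketch of setting requests and limits from the command line (the deployment name my-app and the values are placeholders, not taken from the question):

    # Hypothetical deployment "my-app": give each pod explicit requests and limits
    # so the scheduler stops packing more onto the node than it can handle.
    kubectl set resources deployment my-app \
      --requests=cpu=100m,memory=256Mi \
      --limits=cpu=500m,memory=512Mi

    # Verify what was applied
    kubectl describe deployment my-app | grep -E -A 4 'Requests|Limits'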
In the case of managed Kubernetes on GKE (other vendors probably have a similar feature) there is node auto-repair. It will not prevent node pressure or Docker-related issues, but when it detects an unhealthy node it can drain and redeploy it.
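On GKE, enabling auto-repair for a node pool should be a single command; the pool, cluster and zone names below are placeholders:

    # Hypothetical pool/cluster/zone names - enable GKE node auto-repair
    gcloud container node-pools update my-pool \
      --cluster my-cluster --zone us-central1-a \
      --enable-autorepair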
If you already have requests and limits set, it seems the best way to make sure this does not happen is to increase the memory requests for your pods. That means fewer pods per node, and the actual memory used on each node should be lower.
Another way of monitoring/recognizing this is to SSH into the node and check memory, check the processes with ps, monitor the syslog, and run docker stats --all.
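A rough sketch of what that check could look like on the node (nothing here is specific to the setup in the question):

    # Memory overview and the heaviest processes
    free -m
    ps aux --sort=-%mem | head -n 15

    # Per-container resource usage (one snapshot instead of a stream)
    docker stats --all --no-stream

    # Recent Docker and kubelet activity around the time the node flips to NotReady
    journalctl -u docker -u kubelet --since "15 min ago" --no-pager
    tail -n 100 /var/log/syslog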

I had the same issue. I cordoned the node and evicted the pods, then rebooted the server; the node automatically came back into the Ready state.

Related

Kubernetes memory leak troubleshooting

I upgraded our cluster to 1.20.14 and noticed memory leaks. We have 5 cronjobs running every minute, and I placed them on a node without any other workload running (other than system pods, e.g. flannel, kube-proxy, node exporter, etc.).
I found the memory usage is increasing over time.
The node becomes unresponsive when all memory is used; I had to reboot it to release the memory. If I cordon the node, memory usage stays at the same level. I think all of our nodes have this issue, but it is amplified when running the cronjobs, because pods are created and deleted frequently.
I also noticed that the pods are not cleaned up under /sys/fs/cgroup/kubepods/besteffort/. I have 8 pods running but about 7k directories there. Those pods are also listed with systemd-cgls -a. I wonder if it is related.
Here is the screenshot for systemd-cgtop:
Any suggestions on how I can troubleshoot this? Thanks!
System info:
Kubernetes: 1.20.14
OS: Flatcar Stable 3033.2.3
cgroup: v2
container runtime: containerd
cgroupfs: Systemd
containerd: 1.5.9
The issue is fixed by the following PRs: https://github.com/kubernetes/kubernetes/pull/100326, https://github.com/kubernetes/utils/pull/228 and https://github.com/kubernetes/kubernetes/pull/106473
Updating to 1.21.11 fixed the issue. The fixes are not backported to 1.20.
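A rough way to check whether a node is affected, before or after upgrading (this assumes the cgroup path reported in the question; the exact path depends on the cgroup driver):

    # Leftover besteffort pod cgroup directories on this node
    # (with the systemd driver the path may instead be under
    #  /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/)
    ls -d /sys/fs/cgroup/kubepods/besteffort/pod* 2>/dev/null | wc -l

    # Pods the API server actually knows about on this node
    # ($(hostname) must match the node name; adjust if it does not)
    kubectl get pods -A --field-selector spec.nodeName=$(hostname) --no-headers | wc -l

    # Stale pod slices/scopes are also visible via systemd
    systemd-cgls -a | grep -c pod

If the first number keeps growing while the second stays flat, the node is leaking pod cgroups.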

Kubernetes Deployment/Pod/Container statuses

I am currently working on a monitoring service that will monitor Kubernetes deployments and their pods. I want to notify users when a deployment is not running the expected number of replicas and also when pods' containers restart unexpectedly. These may not be the right things to monitor, and I would greatly appreciate some feedback on what I should be monitoring.
Anyways, the main question is the differences between all of the Statuses of pods. And when I say Statuses I mean the Status column when running kubectl get pods. The statuses in question are:
- ContainerCreating
- ImagePullBackOff
- Pending
- CrashLoopBackOff
- Error
- Running
What causes pod/containers to go into these states?
For the first four Statuses, are these states recoverable without user interaction?
What is the threshold for a CrashLoopBackOff?
Is Running the only status that has a Ready Condition of True?
Any feedback would be greatly appreciated!
Also, would it be bad practice to use kubectl in an automated script for monitoring purposes? For example, every minute log the results of kubectl get pods to Elasticsearch?
You can see the pod lifecycle details in the Kubernetes documentation.
The recommended way of monitoring a Kubernetes cluster and its applications is with Prometheus.
I will try to explain what I see hidden behind these terms:
ContainerCreating
Shown while we wait for the image to be downloaded and the container to be created by Docker or another container runtime.
ImagePullBackOff
Shown when there is a problem downloading the image from a registry - wrong credentials for logging in to Docker Hub, for example.
Pending
The container is starting (if the start takes time), or it has started but the readinessProbe failed.
CrashLoopBackOff
This status shows when container restarts occur too often. For example, a process tries to read a file that does not exist and crashes; the container is then recreated by Kubernetes, and this repeats.
Error
This is pretty clear: there was some error running the container.
Running
All is good: the container is running and the livenessProbe is OK.
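To see why a pod is stuck in any of these states, the Events section of kubectl describe is usually the quickest pointer; a minimal sketch (pod and container names are placeholders):

    # Current status and restart counts
    kubectl get pods -o wide

    # The Events at the bottom usually name the reason:
    # failed scheduling, image pull errors, failed probes, OOMKilled, ...
    kubectl describe pod <pod-name>

    # For CrashLoopBackOff / Error, read the logs of the previous (crashed) container
    kubectl logs <pod-name> -c <container-name> --previous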

Jenkins X builds fail with "The node was low on resource: [DiskPressure]."

My Jenkins X installation, mid-project, is now becoming very unstable. (Mainly) Jenkins pods are failing to start due to disk pressure.
Commonly, many pods are failing with
The node was low on resource: [DiskPressure].
or
0/4 nodes are available: 1 Insufficient cpu, 1 node(s) had disk pressure, 2 node(s) had no available volume zone.
Unable to mount volumes for pod "jenkins-x-chartmuseum-blah": timeout expired waiting for volumes to attach or mount for pod "jx"/"jenkins-x-chartmuseum-blah". list of unmounted volumes=[storage-volume]. list of unattached volumes=[storage-volume default-token-blah]
Multi-Attach error for volume "pvc-blah" Volume is already exclusively attached to one node and can't be attached to another
This may have become more pronounced with more preview builds for projects using npm and the massive node_modules directories it generates. I'm also not sure whether Jenkins is cleaning up after itself.
Rebooting the nodes helps, but not for very long.
Let's approach this from the Kubernetes side.
There are a few things you could do to fix this:
As mentioned by @Vasily, check what is causing disk pressure on the nodes. You may also need to check logs from:
kubectl logs: kube-scheduler event logs
journalctl -u kubelet: kubelet logs
/var/log/kube-scheduler.log
More on why these logs matter below.
Check your eviction thresholds. Adjust the kubelet and kube-scheduler configuration if needed. See what is happening with both of them (the logs mentioned earlier might be useful now); a sketch of commands for checking node conditions and eviction settings is included after these suggestions. More info can be found here
Check that you have a correctly running Horizontal Pod Autoscaler: kubectl get hpa
You can use standard kubectl commands to set up and manage your HPA.
Finally, the volume-related errors you receive indicate that there might be a problem with the PVC and/or PV. Make sure your volume is in the same zone as the node. If you want to mount the volume to a specific pod, make sure it is not exclusively attached to another node. More info can be found here and here
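As a rough sketch of the checks mentioned above (node names are placeholders, and the kubelet's eviction thresholds may live in a config file rather than on the command line):

    # Node conditions - look for DiskPressure=True
    kubectl describe node <node-name>
    kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}'

    # On the node itself: what is actually filling the disk
    df -h
    sudo du -sh /var/lib/docker /var/lib/kubelet 2>/dev/null

    # Which eviction thresholds the kubelet is running with (if set as flags)
    ps aux | grep -o 'eviction-hard=[^ ]*'

    # Is the HPA present and behaving?
    kubectl get hpa --all-namespaces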
I did not test this myself because more info is needed to reproduce the whole scenario, but I hope the suggestions above will be useful.
Please let me know if that helped.

Heapster status stuck in Container Creating or Pending status

I am new to Kubernetes and have been working with it for the past month.
When creating the cluster setup, I sometimes see Heapster get stuck in the ContainerCreating or Pending status. When this happens, the only way I have found is to re-install everything from scratch, which has solved the problem. If I then run Heapster, it runs without any problem. But I don't think this is the optimal solution every time, so please help with solving the issue when it occurs again.
The Heapster image is pulled from GitHub for our use. Right now the cluster is running fine, so I could not send a screenshot of Heapster failing with its status stuck in ContainerCreating or Pending.
Please suggest an alternative way to solve the problem if it occurs again.
Thanks in advance for your time.
A pod stuck in the Pending state can mean more than one thing. Next time it happens you should run 'kubectl get pods' and then 'kubectl describe pod <pod-name>'. However, since it works sometimes, the most likely cause is that the cluster doesn't have enough resources on any of its nodes to schedule the pod. If the cluster is low on remaining resources you should get an indication of this from 'kubectl top nodes' and 'kubectl describe nodes'. (Or with GKE, if you are on Google Cloud, you often get a low-resource warning in the web UI console.)
(Or if in Azure then be wary of https://github.com/Azure/ACS/issues/29 )
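A minimal sketch of those checks (pod and namespace names are placeholders; kubectl top needs Heapster or metrics-server to be running, which may itself be the problem here):

    # Why is the pod Pending? The Events section at the end of describe usually says.
    kubectl get pods --all-namespaces
    kubectl describe pod <pod-name> -n <namespace>

    # Is the cluster short on resources?
    kubectl describe nodes | grep -A 5 "Allocated resources"
    kubectl top nodes   # requires Heapster / metrics-server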

Minions can't rejoin cluster on reboot of AWS instance

The Kubernetes cluster, running v1.3.4, starts a master and 2 minions
The cluster starts fine, and pods can be started and controlled without issue
As soon as one of the minions is rebooted, or any of the dependent services, such as the kubelet, is restarted, the minions will not rejoin the cluster
The error from the kubelet service is of the form:
Aug 08 08:21:15 ip-10-16-1-20 kubelet[911]: E0808 08:21:15.955309 911 kubelet.go:2875] Error updating node status, will retry: error getting node "ip-10-16-1-20.us-west-2.compute.internal": nodes "ip-10-16-1-20.us-west-2.compute.internal" not found
The only way that we can see to rectify this issue at the moment is to tear down the whole cluster and rebuild it
UPDATE:
I had a look at the controller manager log and got the following
W0815 13:36:39.087991 1 nodecontroller.go:433] Unable to find Node: ip-10-16-1-25.us-west-2.compute.internal, deleting all assigned Pods.
W0815 13:37:39.123811 1 nodecontroller.go:433] Unable to find Node: ip-10-16-1-25.us-west-2.compute.internal, deleting all assigned Pods.
E0815 13:37:39.133045 1 nodecontroller.go:434] pods "kube-proxy-ip-10-16-1-25.us-west-2.compute.internal" not found
This is actually a CoreOS issue, although it is difficult to ascertain what the problem actually is. It is more than likely the low-level OS host resolution code being called from the AWS Go layers, but that is purely a guess. Upgrading the CoreOS AMI to a later version solved many of the issues we were facing.
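For anyone hitting the same "node not found" loop, a minimal diagnostic sketch before rebuilding everything (names are placeholders; whether the kubelet re-registers depends on its registration settings):

    # On the master: has the Node object been deleted by the node controller?
    kubectl get nodes -o wide

    # On the affected minion: which name is the kubelet updating status for?
    hostname -f
    ps aux | grep -o 'hostname-override=[^ ]*'

    # Restart the kubelet and watch whether it re-registers or keeps failing
    sudo systemctl restart kubelet
    journalctl -u kubelet -f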