I upgraded our cluster to 1.20.14 and noticed a memory leak. We have 5 cronjobs running every minute, and I placed them on the same node without any other workload running (other than system pods, e.g. flannel, kube-proxy, and node exporter).
Memory usage on that node keeps increasing over time.
The node becomes unresponsive once all memory is used, and I have to reboot it to release the memory. If I cordon the node, memory usage stays at the same level. I think all of our nodes have this issue, but it is amplified by the cronjobs because their pods are created and deleted frequently.
I also noticed that pods are not cleaned up under /sys/fs/cgroup/kubepods/besteffort/. I have 8 pods running but 7k directories there. Those pods are also listed by systemd-cgls -a. I wonder if it is related.
Here is the screenshot for systemd-cgtop:
Any suggestions on how to troubleshoot this? Thanks!
System info:
Kubernetes: 1.20.14
OS: Flatcar Stable 3033.2.3
cgroup: v2
container runtime: containerd
cgroup driver: systemd
containerd: 1.5.9
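One way to put a number on the leftover cgroups is to count the pod directories under the cgroup path and compare that with the number of pods actually running on the node. The following is a minimal sketch, not a definitive tool: the helper name `count_cgroup_dirs` and the `pod` name prefix are assumptions, and the directory naming may differ on your hierarchy, so adjust the prefix as needed.

```python
import os


def count_cgroup_dirs(root, prefix="pod"):
    """Count immediate subdirectories of `root` whose name starts with `prefix`.

    The function name and the default prefix are illustrative assumptions.
    """
    count = 0
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False) and entry.name.startswith(prefix):
                count += 1
    return count


if __name__ == "__main__":
    # Path taken from the question; change it if your hierarchy differs.
    path = "/sys/fs/cgroup/kubepods/besteffort"
    if os.path.isdir(path):
        print(path, count_cgroup_dirs(path))
    else:
        print(path, "not found on this machine")
```

Running this periodically and watching the count grow while the number of running pods stays flat would support the suspicion that cgroups are not being removed.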
The issue is fixed by the following PRs: https://github.com/kubernetes/kubernetes/pull/100326, https://github.com/kubernetes/utils/pull/228 and https://github.com/kubernetes/kubernetes/pull/106473
Updating to 1.21.11 fixed the issue. The fixes are not backported to 1.20.
I am using Terraform to deploy Helm releases. It took about 4 minutes until I saw a couple of pods show up (this issue is about pods showing up slowly, not starting up slowly).
A couple of things I have checked:
liveness probe -> probably not related, as it only matters after the pod is created; my current issue is pods being created/showing up slowly
resource requests/limits -> probably not related, as they only matter after the pod is created, while the pod has a Pending status
I need some ideas on why it behaves like this. Thanks.
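To separate "the pod object was created late" from "the pod was created on time but started late", it can help to compare each pod's creationTimestamp with the moment the deployment was kicked off. Here is a rough sketch under stated assumptions: the input is the output of `kubectl get pods -o json`, the start time is something you record yourself, and the helper name `creation_delays` is made up for illustration.

```python
import json
from datetime import datetime, timezone


def creation_delays(pods_json, started_at):
    """Return {pod name: seconds between `started_at` and the pod's creationTimestamp}.

    `pods_json` is assumed to be the text printed by `kubectl get pods -o json`.
    `started_at` must be a timezone-aware datetime.
    """
    delays = {}
    for item in json.loads(pods_json).get("items", []):
        meta = item["metadata"]
        created = datetime.strptime(
            meta["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        delays[meta["name"]] = (created - started_at).total_seconds()
    return delays
```

If the delays are already in the range of minutes, the time is being spent before the pod objects exist, which would point away from probes and scheduling and toward whatever creates the workloads.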
I ran into this error message while installing Rancher's RKE2 tools on my Ubuntu 20.04 (Hirsute) VirtualBox VM (running on a Windows 10 laptop). I've allocated 128 MB of video memory, 145GB of storage, and 8196MB of memory.
Container runtime network not ready: NetworkReady=false reason NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
My research (Googling and trying to read Rancher's documentation) keeps leading me down rabbit holes, and nothing I read clarifies the problem or what "CNI plugin not initialized" means. I'm not sure where to go from here.
What I have tried:
Installing the flannel kube
Updating /etc/NetworkDevices/rke2-canal.conf as described in the RKE2 documentation
I also noticed a taint on my node when I run kubectl describe node: node.kubernetes.io/not-ready:NoSchedule. Again, I feel like I am chasing my own tail when I research this issue: the people who reported the same taint are not all using the same platform, VM, or Kubernetes distribution as I am, so I can't tell whether their solutions apply to me.
Where should I go from here? I'm lost.
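The not-ready taint and the CNI message usually describe the same state: the taint is applied while the node's Ready condition is not True, and the condition's message carries the reason. A small sketch that pulls both out of `kubectl get node <name> -o json` output is shown here; the function names are hypothetical and the code only reads fields of the standard Node object.

```python
import json


def node_taints(node_json):
    """Return the node's taints as 'key[=value]:effect' strings."""
    spec = json.loads(node_json).get("spec", {})
    result = []
    for taint in spec.get("taints") or []:
        label = taint["key"]
        if taint.get("value"):
            label += "=" + taint["value"]
        result.append(label + ":" + taint.get("effect", ""))
    return result


def ready_condition(node_json):
    """Return (status, message) of the node's Ready condition, or None if absent."""
    status = json.loads(node_json).get("status", {})
    for cond in status.get("conditions") or []:
        if cond.get("type") == "Ready":
            return cond.get("status"), cond.get("message", "")
    return None
```

If the Ready message still mentions the CNI plugin, the taint is a symptom of the network plugin problem rather than a second issue to chase.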
We are running multiple kubespray-deployed clusters with 10-100 nodes.
With 1.20, Kubernetes deprecates dockershim support -> https://github.com/kubernetes/kubernetes/blob/ab32085bf36fc7af1ded30456e2f09399dc1115f/CHANGELOG/CHANGELOG-1.20.md#deprecation
How can we change the container runtime to containerd without removing nodes and without destroying the master?
I am not panicking, I just want to be prepared. We are at 1.19 already, so 1.22 is not so far away.
Anyway, I tested it with a smaller cluster, and it was way easier than expected.
Change: set container_manager to containerd.
Run the kubespray cluster.yml playbook over all nodes and boom.
I only needed a simple Ansible playbook to uninstall Docker et al., but it also works with Docker still installed.
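After a migration like the one described above, it is worth confirming which runtime each node actually reports. The sketch below is an illustration only: it assumes the input is the output of `kubectl get nodes -o json` and reads the `containerRuntimeVersion` field of each node's `nodeInfo`; the helper names are invented for this example.

```python
import json


def runtimes_by_node(nodes_json):
    """Map node name -> container runtime name (the part before '://')."""
    runtimes = {}
    for item in json.loads(nodes_json).get("items", []):
        version = item["status"]["nodeInfo"]["containerRuntimeVersion"]
        runtimes[item["metadata"]["name"]] = version.split("://", 1)[0]
    return runtimes


def nodes_not_on(nodes_json, wanted="containerd"):
    """Return the sorted names of nodes whose runtime differs from `wanted`."""
    return sorted(
        name for name, runtime in runtimes_by_node(nodes_json).items()
        if runtime != wanted
    )
```

An empty list from `nodes_not_on` would suggest that every node has been switched over.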
Please treat this answer as friendly advice.
First of all, as suggested in yesterday's fresh article Don't Panic: Kubernetes and Docker:
You do not need to panic :)
Kubernetes is only deprecating Docker as a container runtime after v1.20. They are currently only planning to remove Docker runtime support in the 1.22 release in late 2021 (almost a year away!), so please don't break your 100-node clusters until a working solution appears :)
Several days ago I faced a problem where my nodes were rebooting constantly.
My stack:
1 master, 2 workers k8s-cluster built with kubeadm (v1.17.1-00)
Ubuntu 18.04 x86_64 4.15.0-74-generic
Flannel cni plugin (v0.11.0)
Rook (v1.2) cephfs for storage. Ceph was deployed in the same cluster, where my application lives
I was able to run the Ceph cluster, but when I tried to deploy my application, which was using my Rook volumes, my pods suddenly started to die.
I got this message when I ran the kubectl describe pods/name command:
Pod sandbox changed, it will be killed and re-created
In the k8s events I got:
<Node name> has been rebooted
After some time the node comes back to life, but it eventually dies again within 2-3 minutes.
I tried to drain the node and rejoin it to the cluster, but after that another node started getting the same error.
I looked into the system error logs of a failed node with the command journalctl -p 3.
I found that the logs were flooded with these messages: kernel: cache_from_obj: Wrong slab cache. inode_cache but object is from ceph_inode_info.
After googling this problem, I found this issue:
https://github.com/coreos/bugs/issues/2616
It turned out that CephFS just doesn't work with some versions of the Linux kernel!
For me neither of these worked:
Ubuntu 19.04 x86_64 5.0.0-32-generic
Ubuntu 18.04 x86_64 4.15.0-74-generic
Solution
CephFS doesn't work with some versions of the Linux kernel. Upgrade your kernel. I finally got it working on Ubuntu 18.04 x86_64 5.0.0-38-generic
GitHub issue that helped me:
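For checking a fleet of nodes against the kernel that worked here, a small comparison helper can be handy. This is only a sketch, and it rests on a loud assumption: it treats 5.0.0-38, the one release reported working in this post, as a lower bound, which is not guaranteed because distribution kernels carry their own patches. The function names are hypothetical, and the input is expected to look like the output of `uname -r` on Ubuntu.

```python
import re

# Assumption: the only release reported working in this post, used as a lower bound.
REPORTED_WORKING = (5, 0, 0, 38)


def parse_release(release):
    """Parse an Ubuntu-style kernel release such as '5.0.0-38-generic' into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: " + release)
    return tuple(int(part) for part in match.groups())


def at_least_reported_working(release):
    """True if `release` is the same as or newer than the release reported working."""
    return parse_release(release) >= REPORTED_WORKING
```

Both kernels listed as failing above come out as older than the working one, which is consistent with the report but does not prove that every newer kernel is safe.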
https://github.com/coreos/bugs/issues/2616
This is indeed a tricky issue. I struggled to find a solution and spent A LOT of time trying to understand what was happening. I hope this information will help someone, because there is not much information about it on Google.
A worker node is getting into the "NotReady" state with an error in the output of kubectl describe node:
ContainerGCFailed rpc error: code = DeadlineExceeded desc = context deadline exceeded
Environment:
Ubuntu 16.04 LTS
Kubernetes version: v1.13.3
Docker version: 18.06.1-ce
There is an issue about this on the Kubernetes GitHub (k8 git), which was closed on the grounds that it is related to a Docker issue.
Steps done to troubleshoot the issue:
kubectl describe node - the error in question was found (root cause isn't clear).
journalctl -u kubelet - shows this related message:
skipping pod synchronization - [container runtime status check may not have completed yet PLEG is not healthy: pleg has yet to be successful]
It is related to this open k8 issue: Ready/NotReady with PLEG issues
Check node health on AWS with CloudWatch - everything seems to be fine.
journalctl -fu docker.service - check Docker for errors/issues; the output doesn't show any errors related to that.
systemctl restart docker - after restarting Docker, the node gets into the "Ready" state but in 3-5 minutes becomes "NotReady" again.
It all seems to have started when I deployed more pods to the node (close to its resource capacity, but I don't think that is a direct dependency) or when I was stopping/starting instances (after a restart it is OK, but after some time the node is NotReady again).
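Since the kubelet message is the clearest signal found in the steps above, it can be filtered out of the journal automatically. The following is a minimal sketch, assuming the log text comes from something like `journalctl -u kubelet --no-pager`; the helper name is invented for illustration and the match string is taken from the message quoted above.

```python
import re

# Match string taken from the kubelet message quoted in the question.
PLEG_PATTERN = re.compile(r"PLEG is not healthy")


def pleg_unhealthy_lines(log_text):
    """Return the log lines that report an unhealthy PLEG, in their original order."""
    return [line for line in log_text.splitlines() if PLEG_PATTERN.search(line)]
```

Comparing the timestamps of the returned lines with the times the node flipped to NotReady could show whether the two always coincide.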
Questions:
What is the root cause of the error?
How to monitor that kind of issue and make sure it doesn't happen?
Are there any workarounds to this problem?
What is the root cause of the error?
From what I was able to find it seems like the error happens when there is an issue contacting Docker, either because it is overloaded or because it is unresponsive. This is based on my experience and what has been mentioned in the GitHub issue you provided.
How to monitor that kind of issue and make sure it doesn't happen?
There seems to be no clear mitigation or monitoring for this. The best approach seems to be making sure your node is not overloaded with pods. I have seen that it does not always show up as disk or memory pressure on the node, but it is probably a problem of Docker not having enough resources and failing to respond in time. The proposed solution is to set limits for your pods to prevent overloading the node.
In the case of managed Kubernetes on GKE (I'm not sure, but other vendors probably have a similar feature) there is a feature called node auto-repair. It will not prevent node pressure or Docker-related issues, but when it detects an unhealthy node it can drain and redeploy the node(s).
If you already have requests and limits set, the best way to make sure this does not happen seems to be to increase the memory requests for pods. This will mean fewer pods per node, and the memory actually used on each node should be lower.
Another way of monitoring/recognizing this is to SSH into the node and check the memory, check the processes with ps, watch the syslog, and run the command docker stats --all
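The effect of raising memory requests can be estimated with simple arithmetic, because a pod is only scheduled onto a node if its requests still fit into what the node has left. The numbers below are hypothetical and the helper name is made up; this is a back-of-the-envelope sketch that ignores CPU, system pods, and per-node pod limits.

```python
def max_pods_by_memory(allocatable_mib, request_mib):
    """Upper bound on pods per node when memory requests are the only constraint."""
    if request_mib <= 0:
        raise ValueError("request_mib must be positive")
    return allocatable_mib // request_mib


# Hypothetical node with 7000 MiB allocatable memory.
print(max_pods_by_memory(7000, 256))  # 27
print(max_pods_by_memory(7000, 512))  # 13
```

Doubling the request roughly halves the number of pods that can land on the node, which is the intended effect here.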
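The memory check mentioned above can also be scripted so it runs on a schedule instead of by hand. This sketch parses text in the format of /proc/meminfo and reports the fraction of memory still available; the function names are illustrative, and it assumes a kernel recent enough to expose the MemAvailable field.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_fraction(text):
    """Return MemAvailable / MemTotal as a float between 0 and 1."""
    info = parse_meminfo(text)
    return info["MemAvailable"] / info["MemTotal"]
```

On a node you could read the file with `open("/proc/meminfo").read()`, pass the text to `available_fraction`, and alert when the value drops below a threshold of your choosing.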
I had the same issue. I cordoned the node and evicted the pods.
Then I rebooted the server, and the node automatically came back into the Ready state.