After some time, automatic deletion of orphaned resources stops working on some of our clusters. If I delete a deployment, neither its ReplicaSet nor its pods are removed; if I delete a ReplicaSet, a new one is created but the pods from the previous one are still there.
I can't even update some deployments, because that would create a new ReplicaSet and pods alongside the old ones.
This is a real problem for us, as we create and remove resources regularly and rely on children being removed automatically.
The thing is, destroying and recreating a cluster makes it work perfectly again, and we haven't been able to trace the problem back to anything we did.
I tried upgrading both master and agent nodes to a newer version and restarting the kubelet on the agent nodes, but that didn't solve anything.
Does anyone know where the problem could be, or which component is in charge of cascading deletion of orphaned resources?
Has this happened to anyone else? It has already happened to us in 3 different clusters with different Kubernetes versions.
I tested it by creating the test deployment from the K8s documentation and then deleting it:
kubectl apply -f https://k8s.io/examples/application/deployment.yaml
kubectl delete deployments.apps nginx-deployment
But the pods are still there.
Thanks in advance
The problem was caused by a faulty CRD / admission webhook. It may seem strange, but a wrong CRD or a faulty pod acting as a webhook can make kube-controller-manager fail for all resources (at least in AKS). After removing the CRDs and the faulty webhook, it started to work again. (Why the webhook was failing is a separate matter.)
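One way to confirm that garbage collection is what's stuck (rather than something recreating the pods) is to look for pods whose owning ReplicaSet no longer exists. The sketch below assumes two text files produced from kubectl output as shown in the comments; the function name dangling_pods and the file names are made up for illustration.

```shell
# Produce the inputs from a live cluster (not run here):
#   kubectl get pods --no-headers \
#     -o 'custom-columns=POD:.metadata.name,KIND:.metadata.ownerReferences[0].kind,OWNER:.metadata.ownerReferences[0].name' > pods.txt
#   kubectl get rs --no-headers -o 'custom-columns=NAME:.metadata.name' > rs.txt

# Print pods that name a ReplicaSet as owner when that ReplicaSet is not in the list.
# $1: file with "pod kind owner" per line; $2: file with one ReplicaSet name per line.
dangling_pods() {
  pods_file=$1
  rs_file=$2
  # FILENAME is compared instead of the usual NR == FNR idiom so that an
  # empty ReplicaSet list (everything already deleted) is handled correctly.
  awk 'FILENAME == ARGV[1] { alive[$1] = 1; next }
       $2 == "ReplicaSet" && !($3 in alive) { print $1 }' "$rs_file" "$pods_file"
}
```

If dangling_pods pods.txt rs.txt keeps printing the same pods minutes after the deployment was deleted, the garbage collector is probably not doing its job, which matches the symptom in the question.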
In my Kubernetes cluster, there are multiple deployments in a namespace.
For a specific deployment, pods in the "CrashLoopBackOff" state must not be allowed to exist.
So basically, when any pod gets into this state, I want it to be deleted and a new pod to be created later; the latter is already handled by the ReplicaSet.
I tried a custom controller, with the idea that a SharedInformer would notify me about the pod's state and I would then delete the pod from that loop.
However, this introduces a dependency on the pod in which the custom controller runs.
I also searched for an option that could be configured in the manifest itself, but could not find any.
I am pretty new to Kubernetes, so I need help implementing this behaviour.
Firstly, you should address the reason why the pod has entered the CrashLoopBackOff state rather than just deleting it. If you only delete it, you'll potentially just recreate the problem and end up deleting pods repeatedly. For example, if your pod is trying to access an external DB and that DB is down, it'll CrashLoop, and deleting and restarting the pod won't help fix that.
Secondly, if you want to do this deleting in an automated manner, an easy way would be to run a CronJob resource that goes through your deployment and deletes the CrashLooped pods. You could set the cronjob to run once an hour or whatever schedule you wish.
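As a rough sketch of what such a CronJob could run, the filter below picks the crash-looping pods out of the default kubectl get pods table. The function name crashloop_pods and the label app=my-app are placeholders, and the job's service account is assumed to be allowed to list and delete pods.

```shell
# Print the names of pods whose STATUS column is exactly CrashLoopBackOff.
# Reads the default table output of `kubectl get pods` on stdin
# (single namespace, so NAME is column 1 and STATUS is column 3).
crashloop_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Possible use inside the CronJob's container (xargs -r is a GNU extension
# that skips the command when there is no input):
#   kubectl get pods -l app=my-app | crashloop_pods | xargs -r kubectl delete pod
```

Matching on the exact status string keeps the header row and healthy pods out, but it also means variants such as Init:CrashLoopBackOff are not caught by this sketch.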
Deleting the pod and waiting for a new one is much like restarting the deployment or pod.
Kubernetes automatically restarts a CrashLoopBackOff pod when it fails; you can check the restart count:
NAME READY STATUS RESTARTS AGE
te-pod-1 0/1 CrashLoopBackOff 2 1m44s
These restarts are similar to what you described:
"when any pod gets to this state, I would want it to be deleted and later a new pod to be created which is already handled by the ReplicaSet."
If you want to get rid of the crashing pod entirely, rather than wait for a new pod to come up, you have to roll back the deployment.
If there is an issue with your ReplicaSet that makes the pod crash, deleting is useless: no matter how many times you delete and restart the pod, it will keep crashing until you check the logs and debug the real issue in the ReplicaSet (Deployment).
I am trying to deploy updates to pods. However, I want the current pods to terminate only when all the containers inside the pod have terminated and their processes are complete.
The new pods can wait to start until all containers in the old pods have completed. We have a mechanism to stop old pods from picking up new tasks, so they should eventually terminate.
It's okay if twice as many pods exist at some point in time. I tried to find a solution for this in the Kubernetes docs but wasn't successful. Pointers on how, or whether, this is possible would be helpful.
Well, I guess you may have to create a duplicate deployment with the new image and change the selector in the service to point at the new deployment. That prevents external traffic from reaching the pre-existing pods, and new calls go to the new pods. Later you can check with something like:
kubectl top pods --containers
and if the load appears static and low, you can then delete the old deployment and its pods.
With this approach, though, the service selector has to be updated every time; to keep track of things, you can append the git commit hash to the selector value so that it is unique for every release.
Rolling back to a previous version from inside the Kubernetes cluster will be difficult, so preferably you trigger the wanted build again.
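A minimal sketch of the selector switch, assuming the pods of each release carry a label whose value is the commit hash. The label key version, the function name selector_patch and the service name my-service are all placeholders, not something from the setup described above.

```shell
# Build a patch that points a Service at pods labelled version=<value>.
# The label key "version" is an assumption; use whatever key your pod templates carry.
# The value is inserted as-is, which is fine for a hex commit hash but would
# need escaping for arbitrary strings.
selector_patch() {
  printf '{"spec":{"selector":{"version":"%s"}}}' "$1"
}

# Possible use once the new deployment is up (not run here):
#   kubectl patch service my-service -p "$(selector_patch "$(git rev-parse --short HEAD)")"
```

Because only the version key is patched, any other keys already in the service selector stay as they are, so the old pods stop receiving new traffic as soon as the patch is applied.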
I hope this makes some sense !!
I'm running into an issue managing my Kubernetes pods.
I had a deployment which I removed, and then I created a new one. The pod tied to the old deployment shut down as expected, and a new one came up when I created the new deployment, as expected.
However, once I changed the deployment, a second pod began running. I tried "kubectl delete pod pod-id", but the pod would just be recreated.
I went through the same process again and now I'm stuck with 3 pods and no deployment. I removed the deployment completely and tried to delete the pods, but they keep being recreated. This is an issue because I am exhausting the resources available on my Kubernetes cluster.
Does anyone know how to force remove these pods? I do not know how they are being recreated if there's no deployment to go by.
The root cause could be either an existing deployment, replicaset, daemonset, statefulset or a static pod. Check if any of these exist in the affected namespace using kubectl get <RESOURCE-TYPE>
I've had this happen after issuing a rollout restart deployment while a pod was already in an error or creating state, and explicitly deleting the second pod only resulted in a new one getting scheduled (trick birthday candle situation).
I find that almost any time I have an issue like this, it can be fixed by simply setting replicas to zero in the deployment, applying, then restoring replicas to the original value.
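The same scale-to-zero-and-back trick can be sketched with kubectl scale instead of editing and re-applying the manifest. That swap is my own shortcut, and the function name bounce_deployment is made up.

```shell
# Scale a deployment to zero and then back to its previous replica count.
# $1: deployment name (in the current namespace).
bounce_deployment() {
  name=$1
  # Remember the current replica count so it can be restored afterwards.
  original=$(kubectl get deployment "$name" -o jsonpath='{.spec.replicas}') || return 1
  kubectl scale deployment "$name" --replicas=0 || return 1
  # In practice, wait here until `kubectl get pods` shows the old pods gone.
  kubectl scale deployment "$name" --replicas="$original"
}
```

Note that if the manifest lives in version control, a later kubectl apply will reset replicas to whatever the manifest says, so this is only a live-cluster workaround.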
I am new to Kubernetes and have been working with it for the past month.
When setting up a cluster, I sometimes see Heapster get stuck in ContainerCreating or Pending status. When this happens, the only fix I have found is to reinstall everything from scratch. After that, Heapster runs without any problem. But I don't think this is a good solution to apply every time, so please help me solve the issue when it occurs again.
The Heapster image we use is pulled from GitHub. Right now the cluster is running fine, so I could not attach a screenshot of Heapster stuck in ContainerCreating or Pending status.
Please suggest any alternative way to solve the problem if it occurs again.
Thanks in advance for your time.
A pod stuck in Pending state can mean more than one thing. Next time it happens, run 'kubectl get pods' and then 'kubectl describe pod <pod-name>'. However, since it works sometimes, the most likely cause is that no node in the cluster has enough free resources to schedule the pod. If the cluster is low on remaining resources, 'kubectl top nodes' and 'kubectl describe nodes' should give an indication of this. (Or with GKE, if you are on Google Cloud, you often get a low-resource warning in the web UI console.)
(Or if in Azure then be wary of https://github.com/Azure/ACS/issues/29 )
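To make the resource check a bit more mechanical, the filter below flags nodes whose CPU% or MEMORY% in the kubectl top nodes table is at or above a threshold. The function name busy_nodes and the threshold of 90 are arbitrary choices for illustration.

```shell
# Print nodes whose CPU% or MEMORY% is at or above the given threshold.
# $1: threshold in percent; stdin: table output of `kubectl top nodes`
# with columns NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%.
busy_nodes() {
  awk -v t="$1" 'NR > 1 {
    cpu = $3; mem = $5
    sub(/%/, "", cpu); sub(/%/, "", mem)
    # Non-numeric values such as <unknown> evaluate to 0 and are never flagged.
    if (cpu + 0 >= t || mem + 0 >= t) print $1
  }'
}

# Possible use (not run here):
#   kubectl top nodes | busy_nodes 90
```

Keep in mind that kubectl top reports actual usage, while the scheduler decides based on resource requests; the "Allocated resources" section of kubectl describe nodes is the place to check the latter.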
I am running k8s on AWS, and I updated the nginx deployment. Normally this works fine, but this time the nginx deployment no longer shows up in "kubectl get deployments".
I want to kill all the pods related to nginx, but they keep being recreated. I deleted all deployments with "kubectl delete --all deployments"; the other pods got terminated, but not the nginx ones.
I have no idea how to stop the pods from being recreated.
Any idea where to start?
Check for a deployment, replication controller or replica set that owns the pods, and remove it:
kubectl get deploy,rc,rs
There is also an annotation, kubernetes.io/created-by, on the Pod showing its "owner", as seen here, but I can't lay my hands on the documentation link right now. However, I found a pastebin containing a concrete example of the contents of the annotation. (Kubernetes 1.9 and later no longer set this annotation; on those versions the same information is in the Pod's metadata.ownerReferences.)
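On clusters where that annotation is not present, the owner can be read from metadata.ownerReferences instead. A small sketch, with owner_of being a made-up helper name:

```shell
# Print "Kind/name" of the first owner of a resource.
# $1: resource kind (e.g. pod, replicaset); $2: resource name.
# A resource without any owner prints just "/".
owner_of() {
  kubectl get "$1" "$2" \
    -o jsonpath='{.metadata.ownerReferences[0].kind}{"/"}{.metadata.ownerReferences[0].name}'
}

# Possible use, walking up from a pod (names are placeholders):
#   owner_of pod nginx-1              # e.g. a ReplicaSet
#   owner_of replicaset <that-name>   # e.g. a Deployment
```

Whatever sits at the top of that chain is the object to delete if the pods should stop coming back.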