using kubectl delete command to remove core-dns pod blocked / No activity - kubernetes

I found my coredns pod throw error: Readiness probe failed: Get http://172.30.224.7:8080/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers) . I am delete pod using this command:
kubectl delete pod coredns-89764d78c-mbcbz -n kube-system
but the command keep waiting and nothing response,how to know the progress of deleting? this is output:
[root#ops001 ~]# kubectl delete pod coredns-89764d78c-mbcbz -n kube-system
pod "coredns-89764d78c-mbcbz" deleted
and the terminal hangs or blocked,when I use browser UI with using kubernetes dashboard the pod exits.how to force delete it? or fix it the right way?

You are deleting a pod which is monitored by deployment controller. That's why when you delete one of the pods, the controller create another to make sure the number of pods equal to the replica count. If you really want to delete the coredns[not recommended], delete the deployment instead of the pods.
$ kubectl delete deployment coredns -n kube-system

Answering another part of your question:
but the command keep waiting and nothing response,how to know the
progress of deleting? this is output:
[root#ops001 ~]# kubectl delete pod coredns-89764d78c-mbcbz -n kube-system
pod "coredns-89764d78c-mbcbz" deleted
and the terminal blocked...
When you're deleting a Pod and you want to see what's going on under the hood, you can additionally provide -v flag and specify the desired verbosity level e.g.:
kubectl delete pod coredns-89764d78c-mbcbz -n kube-system -v 8
If there is some issue with the deletion of specific Pod, it should tell you the details.
I totally agree with #P Ekambaram's comment:
if coredns is not started. you need to check logs and find out why it
is not getting started – P Ekambaram
You can always delete the whole coredns Deployment and re-deploy it again but generally you shouldn't do that. Looking at Pod logs:
kubectl logs coredns-89764d78c-mbcbz -n kube-system
should also tell you some details explaining why it doesn't work properly. I would say that deleting the whole coredns Deployment is a last resort command.

Related

Pod not terminating

We have a Rancher Kubernetes cluster where sometimes the pods get stuck in terminating status when we try to delete the corresponding deployment, as shown below.
$ kubectl get deployments
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
...
storage-manager-deployment 1 0 0 0 1d
...
$ kubectl delete deployments storage-manager-deployment
kubectl delete deployments storage-manager-deployment
deployment.extensions "storage-manager-deployment" deleted
C-c C-c^C
$ kubectl get po
NAME READY STATUS RESTARTS AGE
...
storage-manager-deployment-6d56967cdd-7bgv5 0/1 Terminating 0 23h
...
$ kubectl delete pods storage-manager-deployment-6d56967cdd-7bgv5 --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "storage-manager-deployment-6d56967cdd-7bgv5" force deleted
C-c C-c^C
Both the delete commands (for the deployment and the pod) get stuck and need to be stopped manually.
We have tried both
kubectl delete pod NAME --grace-period=0 --force
and
kubectl delete pod NAME --now
without any luck.
We have also set fs.may_detach_mounts=1, so it seems that all the similar questions already on StackOverflow don't apply to our problem.
If we check the node on which the incriminated pod runs, it does not appear in the docker ps list.
Any suggestion?
Thanks
Check the pod spec for an array: 'finalizers'
finalizers:
- cattle-system
If this exists, remove it, and the pod will terminate.

"Ghost" kubernetes pod stuck in terminating

The situation
I have a kubernetes pod stuck in "Terminating" state that resists pod deletions
NAME READY STATUS RESTARTS AGE
...
funny-turtle-myservice-xxx-yyy 1/1 Terminating 1 11d
...
Where funny-turtle is the name of the helm release that have since been deleted.
What I have tried
try to delete the pod.
Output: pod "funny-turtle-myservice-xxx-yyy" deleted
Outcome: it still shows up in the same state.
- also tried with --force --grace-period=0, same outcome with extra warning
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
try to read the logs (kubectl logs ...).
Outcome: Error from server (NotFound): nodes "ip-xxx.yyy.compute.internal" not found
try to delete the kubernetes deployment.
but it does not exist.
So I assume this pod somehow got "disconnected" from the aws API, reasoning from the error message that kubectl logs printed.
I'll take any suggestions or guidance to explain what happened here and how I can get rid of it.
EDIT 1
Tried to see if the "ghost" node was still there (kubectl delete node ip-xxx.yyy.compute.internal) but it does not exist.
Try removing the finalizers from the pod:
kubectl patch pod funny-turtle-myservice-xxx-yyy -p '{"metadata":{"finalizers":null}}'
In my case, the solution proposed by the accepted answer did not work, it kept stuck in "Terminating" status. What did the trick for me was:
kubectl delete pods <pod> --grace-period=0 --force
The above solutions did not work in my case, except I didn't try restarting all the nodes.
The error state for my pod was as follows (extra lines omitted):
$ kubectl -n myns describe pod/mypod
Status: Terminating (lasts 41h)
Containers:
runner:
State: Waiting
Reason: ContainerCreating
Last State: Terminated
Reason: ContainerStatusUnknown
Message: The container could not be located when the pod was deleted.
The container used to be Running
Exit Code: 137
$ kubectl -n myns get pod/mypod -o json
"metadata": {
"deletionGracePeriodSeconds": 0,
"deletionTimestamp": "2022-06-07T22:17:20Z",
"finalizers": [
"actions.summerwind.dev/runner-pod"
],
I removed the entry under finalizers (leaving finalizers as empty array) and then the pod was finally gone.
$ kubectl -n myns edit pod/mypod
pod/mypod edited
In my case nothing worked, no logs, no delete, absolutely nothing. I had to restart all the nodes, then the situation cleared up, no more Terminating pods.

How to list Kubernetes recently deleted pods?

Is there a way to get some details about Kubernetes pod that was deleted (stopped, replaced by new version).
I am investigating bug. I have logs with my pod name. That pod does not exist anymore, it was replaced by another one (with different configuration). New pod resides in same namespace, replication controller and service as old one.
Commands like
kubectl get pods
kubectl get pod <pod-name>
work only with current pods (live or stopped).
How I could get more details about old pods? I would like to see
when they were created
which environment variables they had when created
why and when they were stopped
As of today, kubectl get pods -a is deprecated, and as a result you cannot get deleted pods.
What you can do though, is to get a list of recently deleted pod names - up to 1 hour in the past unless you changed the ttl for kubernetes events - by running:
kubectl get event -o custom-columns=NAME:.metadata.name | cut -d "." -f1
You can then investigate further issues within your logging pipeline if you have one in place.
As far as I know you cannot get the Pod details once the Pod is deleted. Can I know what is the usecase?
Example:
if a Pod is created using kubectl run busybox-test-pod-status --image=busybox --restart=Never -- /bin/false
you will have a Pod with status terminated:error
if a Pod is created using kubectl run busybox-test-pod-status --image=busybox --restart=Never -- /bin/true
you will have a Pod with status terminated:Completed
if a container in a Pod restarts: the Pod will be alive and you can get the logs of previous container (only the previous container) using
kubectl logs --container <container name> --previous=true <pod name>
if you doing an upgrade of you app and you are creating Pods using Deployments. If the update deployment "say a new image", the Pod will be terminated and new Pod will be created. You can get the Pod details from the Deployment's YAML. if you want to get details of previous Pod you have see "spec" section of previous Deployment's YAML
You can try kubectl logs --previous to list the logs of a previously stopped pod
http://kubernetes.io/docs/user-guide/kubectl/kubectl_logs/
You may also want to check out these debugging tips
http://kubernetes.io/docs/user-guide/debugging-pods-and-replication-controllers/
There is a way to find out why pods were deleted and who deleted them.
The only way to find out something is to set the ttl for k8s to be greater than the default 1h and search through the events:
kubectl get event -o custom-columns=NAME:.metadata.name | cut -d "." -f1
If your container has previously crashed, you can access the previous container’s crash log with:
kubectl logs --previous ${POD_NAME} ${CONTAINER_NAME}
There is this flag:
-a, --show-all=false: When printing, show all resources (default hide terminated pods.)
But this may not help in all cases of old pods.
kubectl get pods -a
you will get the list of running pods and the terminated pods in case you are searching for this
If you want to see all the previously deleted pods and you are trying to fetch the previous pods.
Command line:
kubectl get pods
in which you will get all the pod details, because every service has one or more pods and they have unique ip address
Here you can check the lifecycle of pods and what phases of pod has.
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle
and you can see the previous pod logs by typing a command:
kubectl logs --previous

Kubernetes pod gets recreated when deleted

I have started pods with command
$ kubectl run busybox \
--image=busybox \
--restart=Never \
--tty \
-i \
--generator=run-pod/v1
Something went wrong, and now I can't delete this Pod.
I tried using the methods described below but the Pod keeps being recreated.
$ kubectl delete pods busybox-na3tm
pod "busybox-na3tm" deleted
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
busybox-vlzh3 0/1 ContainerCreating 0 14s
$ kubectl delete pod busybox-vlzh3 --grace-period=0
$ kubectl delete pods --all
pod "busybox-131cq" deleted
pod "busybox-136x9" deleted
pod "busybox-13f8a" deleted
pod "busybox-13svg" deleted
pod "busybox-1465m" deleted
pod "busybox-14uz1" deleted
pod "busybox-15raj" deleted
pod "busybox-160to" deleted
pod "busybox-16191" deleted
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default busybox-c9rnx 0/1 RunContainerError 0 23s
You need to delete the deployment, which should in turn delete the pods and the replica sets https://github.com/kubernetes/kubernetes/issues/24137
To list all deployments:
kubectl get deployments --all-namespaces
Then to delete the deployment:
kubectl delete -n NAMESPACE deployment DEPLOYMENT
Where NAMESPACE is the namespace it's in, and DEPLOYMENT is the name of the deployment. If NAMESPACE is default, leave off the -n option altogether.
In some cases it could also be running due to a job or daemonset.
Check the following and run their appropriate delete command.
kubectl get jobs
kubectl get daemonsets.app --all-namespaces
kubectl get daemonsets.extensions --all-namespaces
Instead of trying to figure out whether it is a deployment, deamonset, statefulset... or what (in my case it was a replication controller that kept spanning new pods :)
In order to determine what it was that kept spanning up the image I got all the resources with this command:
kubectl get all
Of course you could also get all resources from all namespaces:
kubectl get all --all-namespaces
or define the namespace you would like to inspect:
kubectl get all -n NAMESPACE_NAME
Once I saw that the replication controller was responsible for my trouble I deleted it:
kubectl delete replicationcontroller/CONTROLLER_NAME
If your pod has name like name-xxx-yyy, it could be controlled by a replicasets.apps named name-xxx, you should delete that replicaset first before deleting the pod:
kubectl delete replicasets.apps name-xxx
Obviously something is respawning the pod. While a lot of the other answers have you looking at everything (replica sets, jobs, deployments, stateful sets, ...) to find what may be respawning the pod, you can instead just look at the pod to see what spawned it. For example do:
$ kubectl describe pod $mypod | grep 'Controlled By:'
Controlled By: ReplicaSet/foobar
This tells you exactly what created the pod. You can then go and delete that.
Look out for stateful sets as well
kubectl get sts --all-namespaces
to delete all the stateful sets in a namespace
kubectl --namespace <yournamespace> delete sts --all
to delete them one by one
kubectl --namespace ag1 delete sts mssql1
kubectl --namespace ag1 delete sts mssql2
kubectl --namespace ag1 delete sts mssql3
This will provide information about all the pods,deployments, services and jobs
in the namespace.
kubectl get pods,services,deployments,jobs
pods can either be created by deployments or jobs
kubectl delete job [job_name]
kubectl delete deployment [deployment_name]
If you delete the deployment or job then restart of the pods can be stopped.
Many answers here tells to delete a specific k8s object, but you can delete multiple objects at once, instead of one by one:
kubectl delete deployments,jobs,services,pods --all -n <namespace>
In my case, I'm running OpenShift cluster with OLM - Operator Lifecycle Manager. OLM is the one who controls the deployment, so when I deleted the deployment, it was not sufficient to stop the pods from restarting.
Only when I deleted OLM and its subscription, the deployment, services and pods were gone.
First list all k8s objects in your namespace:
$ kubectl get all -n openshift-submariner
NAME READY STATUS RESTARTS AGE
pod/submariner-operator-847f545595-jwv27 1/1 Running 0 8d
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/submariner-operator-metrics ClusterIP 101.34.190.249 <none> 8383/TCP 8d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/submariner-operator 1/1 1 1 8d
NAME DESIRED CURRENT READY AGE
replicaset.apps/submariner-operator-847f545595 1 1 1 8d
OLM is not listed with get all, so I search for it specifically:
$ kubectl get olm -n openshift-submariner
NAME AGE
operatorgroup.operators.coreos.com/openshift-submariner 8d
NAME DISPLAY VERSION
clusterserviceversion.operators.coreos.com/submariner-operator Submariner 0.0.1
Now delete all objects, including OLMs, subscriptions, deployments, replica-sets, etc:
$ kubectl delete olm,svc,rs,rc,subs,deploy,jobs,pods --all -n openshift-submariner
operatorgroup.operators.coreos.com "openshift-submariner" deleted
clusterserviceversion.operators.coreos.com "submariner-operator" deleted
deployment.extensions "submariner-operator" deleted
subscription.operators.coreos.com "submariner" deleted
service "submariner-operator-metrics" deleted
replicaset.extensions "submariner-operator-847f545595" deleted
pod "submariner-operator-847f545595-jwv27" deleted
List objects again - all gone:
$ kubectl get all -n openshift-submariner
No resources found.
$ kubectl get olm -n openshift-submariner
No resources found.
After taking an interactive tutorial I ended up with a bunch of pods, services, deployments:
me#pooh ~ > kubectl get pods,services
NAME READY STATUS RESTARTS AGE
pod/kubernetes-bootcamp-5c69669756-lzft5 1/1 Running 0 43s
pod/kubernetes-bootcamp-5c69669756-n947m 1/1 Running 0 43s
pod/kubernetes-bootcamp-5c69669756-s2jhl 1/1 Running 0 43s
pod/kubernetes-bootcamp-5c69669756-v8vd4 1/1 Running 0 43s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 37s
me#pooh ~ > kubectl get deployments --all-namespaces
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
default kubernetes-bootcamp 4 4 4 4 1h
docker compose 1 1 1 1 1d
docker compose-api 1 1 1 1 1d
kube-system kube-dns 1 1 1 1 1d
To clean up everything, delete --all worked fine:
me#pooh ~ > kubectl delete pods,services,deployments --all
pod "kubernetes-bootcamp-5c69669756-lzft5" deleted
pod "kubernetes-bootcamp-5c69669756-n947m" deleted
pod "kubernetes-bootcamp-5c69669756-s2jhl" deleted
pod "kubernetes-bootcamp-5c69669756-v8vd4" deleted
service "kubernetes" deleted
deployment.extensions "kubernetes-bootcamp" deleted
That left me with (what I think is) an empty Kubernetes cluster:
me#pooh ~ > kubectl get pods,services,deployments
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 8m
In some cases the pods will still not go away even when deleting the deployment. In that case to force delete them you can run the below command.
kubectl delete pods podname --grace-period=0 --force
When the pod is recreating automatically even after the deletion of the pod manually, then those pods have been created using the Deployment.
When you create a deployment, it automatically creates ReplicaSet and Pods. Depending upon how many replicas of your pod you mentioned in the deployment script, it will create those number of pods initially.
When you try to delete any pod manually, it will automatically create those pod again.
Yes, sometimes you need to delete the pods with force. But in this case force command doesn’t work.
Instead of removing NS you can try removing replicaSet
kubectl get rs --all-namespaces
Then delete the replicaSet
kubectl delete rs your_app_name
The root cause for the question asked was the deployment/job/replicasets spec attribute strategy->type which defines what should happen when the pod will be destroyed (either implicitly or explicitly). In my case, it was Recreate.
As per #nomad's answer, deleting the deployment/job/replicasets is the simple fix to avoid experimenting with deadly combos before messing up the cluster as a novice user.
Try the following commands to understand the behind the scene actions before jumping into debugging :
kubectl get all -A -o name
kubectl get events -A | grep <pod-name>
In my case I deployed via a YAML file like kubectl apply -f deployment.yaml and the solution appears to be to delete via kubectl delete -f deployment.yaml
Firstly list the deployments
kubectl get deployments
After that delete the deployment
kubectl delete deployment <deployment_name>
If you have a job that continues running, you need to search the job and delete it:
kubectl get job --all-namespaces | grep <name>
and
kubectl delete job <job-name>
You can do kubectl get replicasets check for old deployment based on age or time
Delete old deployment based on time if you want to delete same current running pod of application
kubectl delete replicasets <Name of replicaset>
I also faced the issue, I have used below command to delete deployment.
kubectl delete deployments DEPLOYMENT_NAME
but still pods was recreating, So I crossed check the Replica Set by using below command
kubectl get rs
then edit the replicaset to 1 to 0
kubectl edit rs REPICASET_NAME
With deployments that have stateful sets (or services, jobs, etc.) you can use this command:
This command terminates anything that runs in the specified <NAMESPACE>
kubectl -n <NAMESPACE> delete replicasets,deployments,jobs,service,pods,statefulsets --all
And forceful
kubectl -n <NAMESPACE> delete replicasets,deployments,jobs,service,pods,statefulsets --all --cascade=true --grace-period=0 --force
There is basically two ways to remove PODS
kubectl scale --replicas=0 deploy name_of_deployment.
This will set the number of replica to 0 and hence it will not restart the pods again.
Use helm to uninstall the chart which you have implemented in your pipeline.
Do not delete the deployment directly, instead use helm to uninstall the chart which will remove all objects it created.
The fastest solution for me was installing Lens IDE and removing the service under de DEPLOYMENTS tab. Just delete from this tab and the replica will be deleted too.
Best regards
Kubernetes always works in the format like:
deployments >>> replicasets >>> pods
first edit deployment with 0 replicas and then scale deployment with desired replicas(run below command).You will see new replicaset has been created and pods will also run with desired count.
*
IN-Linux:~ anuragmanikkame$ kubectl scale deploy tomcat -n
dev-namespace --replicas=2 deployment.extensions/tomcat scaled
I experienced a similar problem: after deleting the deployment (kubectl delete deploy <name>), the pods kept "Running" and where automatically re-created after deletion (kubectl delete po <name>).
It turned out that the associated replica set was not deleted automatically for some reason, and after deleting that (kubectl delete rs <name>), it was possible to delete the pods.
This has happened to me with some broken 'helm' installs. You might have a bit of a messed up deployment. If none of the previous suggestions work, look for a daemonset and delete that.
eg
kubectl get daemonset --namespace
then delete daemonset
kubectl delete daemonset --namespace <NAMESPACE> --all --force
then try to delete the pods.
kubectl delete pod --namespace <NAMESPACE> --all --force
Check if pods are gone.
kubectl get pods --all-namespaces
In my case I use these below
kubectl get all --all-namespaces
kubectl delete deployment statefulset-deploymentnament(choose your deployment name)
kubectl delete sts -n default(choose your namespace) --all
kubectl get pods --all-namespaces
Problem got resolved

Error while creating pods in Kubernetes

I have installed Kubernetes in Ubuntu server using instructions here. I am trying to create pods using kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --hostport=8000 --port=8080 as listed in the example. However, when I do kubectl get pod I get the status of the container as pending. I further did kubectl describe pod for debugging and I see the message:
FailedScheduling pod (hello-minikube-3383150820-1r4f7) failed to fit in any node fit failure on node (minikubevm): PodFitsHostPorts.
I am further trying to delete this pod by kubectl delete pod hello-minikube-3383150820-1r4f7 but when I further do kubectl get pod I see another pod with prefix "hello-minikube-3383150820-" that I havent created. Does anyone know how to fix this problem? Thank you in advance.
The PodFitsHostPorts predicate is failing because you have something else on your nodes using port 8000. You might be able to find what it is by running kubectl describe svc.
kubectl run creates a deployment object (you can see it with kubectl describe deployments) which makes sure that you always keep the intended number of replicas of the pod running (in this case 1). When you delete the pod, the deployment controller automatically creates another for you. If you want to delete the deployment and the pods it keeps creating, you can run kubectl delete deployments hello-minikube.