How to debug why my pods are pending in GCE - kubernetes

I'#m trying to get a pod running on GCE. The pod has an init container, and is created by me applying a manifest with a deployment that creates 1 replica of the pod.
When I look at my workloads on the cloud console, I can see that under 'Active revisions' my deployment is in the state of 'Pods are pending', and under 'Managed pods', the status is 'PodsInitializing'.
The container logs are empty, and the audit logs contain a single entry for the creation of the deployment.
My pods seem to be stuck in the above state, and I'm not really sure why. How do I go about debugging that?
Edit:
kubectl get pods --namespace=my-namespace
Outputs:
NAME READY STATUS RESTARTS AGE
my-pod-v77jm 0/1 Init:0/1 0 55m
But when I run:
kubectl describe pod my-pod-v77jm
I get
Error from server (NotFound): pods "my-pod-v77jm" not found

If you have access to kube-api via kubectl:
Use describe see details about the pod and containers
kubectl describe myPod --namespace mynamespace
To view container logs (including init containers)
kubectl logs myPod --namespace mynamespace -c initContainerName
You can get more information about pod statuses and how to debug init containers here

Related

Does `kubectl log deployment/name` get all pods or just one pod?

I need to see the logs of all the pods in a deployment with N worker pods
When I do kubectl logs deployment/name --tail=0 --follow the command syntax makes me assume that it will tail all pods in the deployment
However when I go to process I don't see any output as expected until I manually view the logs for all N pods in the deployment
Does kubectl log deployment/name get all pods or just one pod?
Yes, if you run kubectl logs with a deployment, it will return the logs of only one pod from the deployment.
However, you can accomplish what you are trying to achieve by using the -l flag to return the logs of all pods matching a label.
For example, let's say you create a deployment using:
kubectl create deployment my-dep --image=nginx --replicas=3
Each of the pods gets a label app=my-dep, as seen here:
$ kubectl get pods -l app=my-dep
NAME READY STATUS RESTARTS AGE
my-dep-6d4ddbf4f7-8jnsw 1/1 Running 0 6m36s
my-dep-6d4ddbf4f7-9jd7g 1/1 Running 0 6m36s
my-dep-6d4ddbf4f7-pqx2w 1/1 Running 0 6m36s
So, if you want to get the combined logs of all pods in this deployment you can use this command:
kubectl logs -l app=my-dep
only one pod seems to be the answer.
i went here How do I get logs from all pods of a Kubernetes replication controller? and it seems that the command kubectl logs deployment/name only shows one pod of N
also when you do execute the kubectl logs on a deployment it does say it only print to console that it is for one pod (not all the pods)

How to see logs of terminated pods

I am running selenium hubs and my pods are getting terminated frequently. I would like to look at the logs of the pods which are terminated. How to do it?
NAME READY STATUS RESTARTS AGE
chrome-75-0-0e5d3b3d-3580-49d1-bc25-3296fdb52666 0/2 Terminating 0 49s
chrome-75-0-29bea6df-1b1a-458c-ad10-701fe44bb478 0/2 Terminating 0 23s
chrome-75-0-8929d8c8-1f7b-4eba-96f2-918f7a0d77f5 0/2 ContainerCreating 0 7s
kubectl logs chrome-75-0-8929d8c8-1f7b-4eba-96f2-918f7a0d77f5
Error from server (NotFound): pods "chrome-75-0-8929d8c8-1f7b-4eba-96f2-918f7a0d77f5" not found
$ kubectl logs chrome-75-0-8929d8c8-1f7b-4eba-96f2-918f7a0d77f5 --previous
Error from server (NotFound): pods "chrome-75-0-8929d8c8-1f7b-4eba-96f2-918f7a0d77f5" not found
Running kubectl logs -p will fetch logs from existing resources at API level. This means that terminated pods' logs will be unavailable using this command.
As mentioned in other answers, the best way is to have your logs centralized via logging agents or directly pushing these logs into an external service.
Alternatively and given the logging architecture in Kubernetes, you might be able to fetch the logs directly from the log-rotate files in the node hosting the pods. However, this option might depend on the Kubernetes implementation as log files might be deleted when the pod eviction is triggered.
From kubernetes docs:
Examples
# Return snapshot logs from pod nginx with only one container
kubectl logs nginx
# Return snapshot of previous terminated ruby container logs from pod web-1
kubectl logs -p -c ruby web-1
# Begin streaming the logs of the ruby container in pod web-1
kubectl logs -f -c ruby web-1
# Display only the most recent 20 lines of output in pod nginx
kubectl logs --tail=20 nginx
# Show all logs from pod nginx written in the last hour
kubectl logs --since=1h nginx
Options
-c, --container="": Print the logs of this container
-f, --follow[=false]: Specify if the logs should be streamed.
--limit-bytes=0: Maximum bytes of logs to return. Defaults to no limit.
-p, --previous[=false]: If true, print the logs for the previous instance of the container in a pod if it exists.
--since=0: Only return logs newer than a relative duration like 5s, 2m, or 3h. Defaults to all logs. Only one of since-time / since may be used.
--since-time="": Only return logs after a specific date (RFC3339). Defaults to all logs. Only one of since-time / since may be used.
--tail=-1: Lines of recent log file to display. Defaults to -1, showing all log lines.
--timestamps[=false]: Include timestamps on each line in the log output
That is just a simple way of doing it. But in production , I would send all the logs of all the pods to a centeral Log management system such as ELK by deploying a log sending client on the kubernetes cluster as daemon-set such as fluentbit , that will keep sending logs to ELk where I am able to filter things base don the namespace , pod , container , or any other label.
kubectl get event -o custom-columns=NAME:.metadata.name -n <namespace> --no-headers
use the above command to get the list of terminated pods in your namespace and use
kubectl logs -f pod-name -n <namespace> -p
to see the terminated pod's logs
P.S.: The above command to fetch terminated pod details will give you the pods which were terminated 1 hour ag
You can try --previous flag on the logs
i.e.
kubectl --namespace namespace logs pod_name --previous
This will show the logs from dump pod logs (stdout) for a previous container accroding to kubernetes docs
A combination of a flag --previous and a container name for a container that was terminated with reason: CrashLoopBackOff:
First find a pod in a namespace. Its status is CrashLoopBackOff:
kubectl get pods -n namespace_name
NAME READY STATUS RESTARTS AGE
crashing_pod_name 0/9 Init:CrashLoopBackOff 17 (105s ago) 63m
Then use describe to find out a name of a container that failed:
kubectl describe pod -n namespace_name crashing_pod_name
Find a name of a container that is terminated with a reason CrashLoopBackOff
Then list logs:
kubectl logs -n namespace_name crashing_pod_name -c failing_container_name --previous

How to kill pods on Kubernetes local setup

I am starting exploring runnign docker containers with Kubernetes. I did the following
Docker run etcd
docker run master
docker run service proxy
kubectl run web --image=nginx
To cleanup the state, I first stopped all the containers and cleared the downloaded images. However I still see pods running.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
web-3476088249-w66jr 1/1 Running 0 16m
How can I remove this?
To delete the pod:
kubectl delete pods web-3476088249-w66jr
If this pod is started via some replicaSet or deployment or anything that is creating replicas then find that and delete that first.
kubectl get all
This will list all the resources that have been created in your k8s cluster. To get information with respect to resources created in your namespace kubectl get all --namespace=<your_namespace>
To get info about the resource that is controlling this pod, you can do
kubectl describe web-3476088249-w66jr
There will be a field "Controlled By", or some owner field using which you can identify which resource created it.
When you do kubectl run ..., that's a deployment you create, not a pod directly. You can check this with kubectl get deploy. If you want to delete the pod, you need to delete the deployment with kubectl delete deploy DEPLOYMENT.
I would recommend you to create a namespace for testing when doing this kind of things. You just do kubectl create ns test, then you do all your tests in this namespace (by adding -n test). Once you have finished, you just do kubectl delete ns test, and you are done.
If you defined your object as Pod then
kubectl delete pod <--all | pod name>
will remove all of the generated Pod. But, If wrapped your Pod to Deployment object then running the command above only will trigger a re-creation of them.
In that case, you need to run
kubectl delete deployment <--all | deployment name>
That will also remove the Service object that is related to the deleted Deployment

Kubernetes pod gets recreated when deleted

I have started pods with command
$ kubectl run busybox \
--image=busybox \
--restart=Never \
--tty \
-i \
--generator=run-pod/v1
Something went wrong, and now I can't delete this Pod.
I tried using the methods described below but the Pod keeps being recreated.
$ kubectl delete pods busybox-na3tm
pod "busybox-na3tm" deleted
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
busybox-vlzh3 0/1 ContainerCreating 0 14s
$ kubectl delete pod busybox-vlzh3 --grace-period=0
$ kubectl delete pods --all
pod "busybox-131cq" deleted
pod "busybox-136x9" deleted
pod "busybox-13f8a" deleted
pod "busybox-13svg" deleted
pod "busybox-1465m" deleted
pod "busybox-14uz1" deleted
pod "busybox-15raj" deleted
pod "busybox-160to" deleted
pod "busybox-16191" deleted
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default busybox-c9rnx 0/1 RunContainerError 0 23s
You need to delete the deployment, which should in turn delete the pods and the replica sets https://github.com/kubernetes/kubernetes/issues/24137
To list all deployments:
kubectl get deployments --all-namespaces
Then to delete the deployment:
kubectl delete -n NAMESPACE deployment DEPLOYMENT
Where NAMESPACE is the namespace it's in, and DEPLOYMENT is the name of the deployment. If NAMESPACE is default, leave off the -n option altogether.
In some cases it could also be running due to a job or daemonset.
Check the following and run their appropriate delete command.
kubectl get jobs
kubectl get daemonsets.app --all-namespaces
kubectl get daemonsets.extensions --all-namespaces
Instead of trying to figure out whether it is a deployment, deamonset, statefulset... or what (in my case it was a replication controller that kept spanning new pods :)
In order to determine what it was that kept spanning up the image I got all the resources with this command:
kubectl get all
Of course you could also get all resources from all namespaces:
kubectl get all --all-namespaces
or define the namespace you would like to inspect:
kubectl get all -n NAMESPACE_NAME
Once I saw that the replication controller was responsible for my trouble I deleted it:
kubectl delete replicationcontroller/CONTROLLER_NAME
If your pod has name like name-xxx-yyy, it could be controlled by a replicasets.apps named name-xxx, you should delete that replicaset first before deleting the pod:
kubectl delete replicasets.apps name-xxx
Obviously something is respawning the pod. While a lot of the other answers have you looking at everything (replica sets, jobs, deployments, stateful sets, ...) to find what may be respawning the pod, you can instead just look at the pod to see what spawned it. For example do:
$ kubectl describe pod $mypod | grep 'Controlled By:'
Controlled By: ReplicaSet/foobar
This tells you exactly what created the pod. You can then go and delete that.
Look out for stateful sets as well
kubectl get sts --all-namespaces
to delete all the stateful sets in a namespace
kubectl --namespace <yournamespace> delete sts --all
to delete them one by one
kubectl --namespace ag1 delete sts mssql1
kubectl --namespace ag1 delete sts mssql2
kubectl --namespace ag1 delete sts mssql3
This will provide information about all the pods,deployments, services and jobs
in the namespace.
kubectl get pods,services,deployments,jobs
pods can either be created by deployments or jobs
kubectl delete job [job_name]
kubectl delete deployment [deployment_name]
If you delete the deployment or job then restart of the pods can be stopped.
Many answers here tells to delete a specific k8s object, but you can delete multiple objects at once, instead of one by one:
kubectl delete deployments,jobs,services,pods --all -n <namespace>
In my case, I'm running OpenShift cluster with OLM - Operator Lifecycle Manager. OLM is the one who controls the deployment, so when I deleted the deployment, it was not sufficient to stop the pods from restarting.
Only when I deleted OLM and its subscription, the deployment, services and pods were gone.
First list all k8s objects in your namespace:
$ kubectl get all -n openshift-submariner
NAME READY STATUS RESTARTS AGE
pod/submariner-operator-847f545595-jwv27 1/1 Running 0 8d
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/submariner-operator-metrics ClusterIP 101.34.190.249 <none> 8383/TCP 8d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/submariner-operator 1/1 1 1 8d
NAME DESIRED CURRENT READY AGE
replicaset.apps/submariner-operator-847f545595 1 1 1 8d
OLM is not listed with get all, so I search for it specifically:
$ kubectl get olm -n openshift-submariner
NAME AGE
operatorgroup.operators.coreos.com/openshift-submariner 8d
NAME DISPLAY VERSION
clusterserviceversion.operators.coreos.com/submariner-operator Submariner 0.0.1
Now delete all objects, including OLMs, subscriptions, deployments, replica-sets, etc:
$ kubectl delete olm,svc,rs,rc,subs,deploy,jobs,pods --all -n openshift-submariner
operatorgroup.operators.coreos.com "openshift-submariner" deleted
clusterserviceversion.operators.coreos.com "submariner-operator" deleted
deployment.extensions "submariner-operator" deleted
subscription.operators.coreos.com "submariner" deleted
service "submariner-operator-metrics" deleted
replicaset.extensions "submariner-operator-847f545595" deleted
pod "submariner-operator-847f545595-jwv27" deleted
List objects again - all gone:
$ kubectl get all -n openshift-submariner
No resources found.
$ kubectl get olm -n openshift-submariner
No resources found.
After taking an interactive tutorial I ended up with a bunch of pods, services, deployments:
me#pooh ~ > kubectl get pods,services
NAME READY STATUS RESTARTS AGE
pod/kubernetes-bootcamp-5c69669756-lzft5 1/1 Running 0 43s
pod/kubernetes-bootcamp-5c69669756-n947m 1/1 Running 0 43s
pod/kubernetes-bootcamp-5c69669756-s2jhl 1/1 Running 0 43s
pod/kubernetes-bootcamp-5c69669756-v8vd4 1/1 Running 0 43s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 37s
me#pooh ~ > kubectl get deployments --all-namespaces
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
default kubernetes-bootcamp 4 4 4 4 1h
docker compose 1 1 1 1 1d
docker compose-api 1 1 1 1 1d
kube-system kube-dns 1 1 1 1 1d
To clean up everything, delete --all worked fine:
me#pooh ~ > kubectl delete pods,services,deployments --all
pod "kubernetes-bootcamp-5c69669756-lzft5" deleted
pod "kubernetes-bootcamp-5c69669756-n947m" deleted
pod "kubernetes-bootcamp-5c69669756-s2jhl" deleted
pod "kubernetes-bootcamp-5c69669756-v8vd4" deleted
service "kubernetes" deleted
deployment.extensions "kubernetes-bootcamp" deleted
That left me with (what I think is) an empty Kubernetes cluster:
me#pooh ~ > kubectl get pods,services,deployments
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 8m
In some cases the pods will still not go away even when deleting the deployment. In that case to force delete them you can run the below command.
kubectl delete pods podname --grace-period=0 --force
When the pod is recreating automatically even after the deletion of the pod manually, then those pods have been created using the Deployment.
When you create a deployment, it automatically creates ReplicaSet and Pods. Depending upon how many replicas of your pod you mentioned in the deployment script, it will create those number of pods initially.
When you try to delete any pod manually, it will automatically create those pod again.
Yes, sometimes you need to delete the pods with force. But in this case force command doesn’t work.
Instead of removing NS you can try removing replicaSet
kubectl get rs --all-namespaces
Then delete the replicaSet
kubectl delete rs your_app_name
The root cause for the question asked was the deployment/job/replicasets spec attribute strategy->type which defines what should happen when the pod will be destroyed (either implicitly or explicitly). In my case, it was Recreate.
As per #nomad's answer, deleting the deployment/job/replicasets is the simple fix to avoid experimenting with deadly combos before messing up the cluster as a novice user.
Try the following commands to understand the behind the scene actions before jumping into debugging :
kubectl get all -A -o name
kubectl get events -A | grep <pod-name>
In my case I deployed via a YAML file like kubectl apply -f deployment.yaml and the solution appears to be to delete via kubectl delete -f deployment.yaml
Firstly list the deployments
kubectl get deployments
After that delete the deployment
kubectl delete deployment <deployment_name>
If you have a job that continues running, you need to search the job and delete it:
kubectl get job --all-namespaces | grep <name>
and
kubectl delete job <job-name>
You can do kubectl get replicasets check for old deployment based on age or time
Delete old deployment based on time if you want to delete same current running pod of application
kubectl delete replicasets <Name of replicaset>
I also faced the issue, I have used below command to delete deployment.
kubectl delete deployments DEPLOYMENT_NAME
but still pods was recreating, So I crossed check the Replica Set by using below command
kubectl get rs
then edit the replicaset to 1 to 0
kubectl edit rs REPICASET_NAME
With deployments that have stateful sets (or services, jobs, etc.) you can use this command:
This command terminates anything that runs in the specified <NAMESPACE>
kubectl -n <NAMESPACE> delete replicasets,deployments,jobs,service,pods,statefulsets --all
And forceful
kubectl -n <NAMESPACE> delete replicasets,deployments,jobs,service,pods,statefulsets --all --cascade=true --grace-period=0 --force
There is basically two ways to remove PODS
kubectl scale --replicas=0 deploy name_of_deployment.
This will set the number of replica to 0 and hence it will not restart the pods again.
Use helm to uninstall the chart which you have implemented in your pipeline.
Do not delete the deployment directly, instead use helm to uninstall the chart which will remove all objects it created.
The fastest solution for me was installing Lens IDE and removing the service under de DEPLOYMENTS tab. Just delete from this tab and the replica will be deleted too.
Best regards
Kubernetes always works in the format like:
deployments >>> replicasets >>> pods
first edit deployment with 0 replicas and then scale deployment with desired replicas(run below command).You will see new replicaset has been created and pods will also run with desired count.
*
IN-Linux:~ anuragmanikkame$ kubectl scale deploy tomcat -n
dev-namespace --replicas=2 deployment.extensions/tomcat scaled
I experienced a similar problem: after deleting the deployment (kubectl delete deploy <name>), the pods kept "Running" and where automatically re-created after deletion (kubectl delete po <name>).
It turned out that the associated replica set was not deleted automatically for some reason, and after deleting that (kubectl delete rs <name>), it was possible to delete the pods.
This has happened to me with some broken 'helm' installs. You might have a bit of a messed up deployment. If none of the previous suggestions work, look for a daemonset and delete that.
eg
kubectl get daemonset --namespace
then delete daemonset
kubectl delete daemonset --namespace <NAMESPACE> --all --force
then try to delete the pods.
kubectl delete pod --namespace <NAMESPACE> --all --force
Check if pods are gone.
kubectl get pods --all-namespaces
In my case I use these below
kubectl get all --all-namespaces
kubectl delete deployment statefulset-deploymentnament(choose your deployment name)
kubectl delete sts -n default(choose your namespace) --all
kubectl get pods --all-namespaces
Problem got resolved

Error while creating pods in Kubernetes

I have installed Kubernetes in Ubuntu server using instructions here. I am trying to create pods using kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --hostport=8000 --port=8080 as listed in the example. However, when I do kubectl get pod I get the status of the container as pending. I further did kubectl describe pod for debugging and I see the message:
FailedScheduling pod (hello-minikube-3383150820-1r4f7) failed to fit in any node fit failure on node (minikubevm): PodFitsHostPorts.
I am further trying to delete this pod by kubectl delete pod hello-minikube-3383150820-1r4f7 but when I further do kubectl get pod I see another pod with prefix "hello-minikube-3383150820-" that I havent created. Does anyone know how to fix this problem? Thank you in advance.
The PodFitsHostPorts predicate is failing because you have something else on your nodes using port 8000. You might be able to find what it is by running kubectl describe svc.
kubectl run creates a deployment object (you can see it with kubectl describe deployments) which makes sure that you always keep the intended number of replicas of the pod running (in this case 1). When you delete the pod, the deployment controller automatically creates another for you. If you want to delete the deployment and the pods it keeps creating, you can run kubectl delete deployments hello-minikube.