Reasons of Pod Status Failed - kubernetes

If a Pod's status is Failed, Kubernetes will try to create new Pods until it reaches terminated-pod-gc-threshold in kube-controller-manager. This leaves many Failed Pods in the cluster that need to be cleaned up.
Are there other reasons besides Evicted that cause a Pod to be Failed?

There can be many causes for a Pod's status to be Failed. You need to check for problems (if there are any) by running the command
kubectl -n <namespace> describe pod <pod-name>
Carefully check the Events section, which lists all the events that occurred during Pod creation. Hopefully you can pinpoint the cause of the failure from there.
There are several possible reasons for a Pod failure; some of them are the following:
Wrong image used for POD.
Wrong command/arguments are passed to the POD.
The kubelet's liveness check on the Pod failed (i.e., the liveness probe failed).
POD failed health check.
Problem in network CNI plugin (misconfiguration of CNI plugin used for networking).
For example:
In the above example, the image "not-so-busybox" couldn't be pulled as it doesn't exist, so the pod FAILED to run. The pod status and events clearly describe the problem.

Simply do this:
kubectl get pods <pod_name> -o yaml
And in the output, towards the end, you can see something like this:
This will give you a good idea of where exactly the pod failed and what happened.
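As a minimal sketch of what to look for in that output, the snippet below pulls the failure-related fields out of a Pod object. It assumes the Pod was fetched as JSON (`-o json` instead of `-o yaml`) and parsed into a dict; the function name `summarize_pod_status` and the sample data are hypothetical, while `phase`, `reason`, `message` and `containerStatuses[].state` are fields of the Pod status.

```python
def summarize_pod_status(pod: dict) -> list:
    """Collect the failure-related fields from a Pod object's status."""
    status = pod.get("status", {})
    lines = [f"phase={status.get('phase', 'Unknown')}"]
    # Pod-level reason/message are set in cases such as eviction.
    if status.get("reason"):
        lines.append(f"reason={status['reason']}")
    if status.get("message"):
        lines.append(f"message={status['message']}")
    for cs in status.get("containerStatuses", []):
        # "state" holds one key: waiting, running, or terminated.
        for state_name, detail in cs.get("state", {}).items():
            reason = detail.get("reason")
            if reason:
                lines.append(f"container={cs['name']} {state_name}: {reason}")
    return lines


# Hypothetical sample, shaped like `kubectl get pod <pod_name> -o json` output.
sample_pod = {
    "status": {
        "phase": "Pending",
        "containerStatuses": [
            {"name": "app", "state": {"waiting": {"reason": "ImagePullBackOff"}}}
        ],
    }
}

print("\n".join(summarize_pod_status(sample_pod)))
```

On a real cluster the dict would come from something like `json.loads` applied to the kubectl output rather than from a hard-coded sample.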

PODs will not survive scheduling failures, node failures, or other evictions, such as lack of resources, or in the case of node maintenance.
Pods should not be created manually but almost always via controllers like Deployments (self-healing, replication etc).
The reason why a pod failed or was terminated can be obtained with
kubectl describe pod <pod_name>
Other situations I have encountered in which a pod Failed:
Issues with the image (e.g. it no longer exists)
The pod attempts to access e.g. a ConfigMap or Secret that is not found in the namespace.
Liveness Probe Failure
Persistent Volume fails to mount
Validation Error
In addition, eviction is based on resources - EvictionPolicy
It can be also caused by DRAINing the Node/Pod. You can read about DRAIN here.
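Since the question mentions that Failed Pods pile up and need cleaning, here is a small sketch that builds the kubectl command for deleting them in one namespace. The helper name `delete_failed_pods_command` is hypothetical; the `--field-selector status.phase=Failed` selector is a kubectl feature. The command is only printed here, not executed.

```python
def delete_failed_pods_command(namespace: str) -> list:
    """Build the kubectl argv that deletes all Pods in phase Failed."""
    return [
        "kubectl", "delete", "pods",
        "--namespace", namespace,
        "--field-selector", "status.phase=Failed",
    ]


cmd = delete_failed_pods_command("default")
print(" ".join(cmd))
# To run it against a cluster, pass the list to subprocess.run(cmd, check=True).
```

Building the argument list separately makes it easy to review the command before anything is deleted.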

Related

Kubernetes on GCP - Getting message "Does not have minimum availability"

I deployed the images of microservices currency-conversion and currency-exchange on Google cloud but in the Kubernetes Engine, I see that the pods/replica sets are not available.
When I check under Workload tab, I see that the service shows a message "Does not have minimum availability"
I added additional availability zone to increase the resources but that did not help.
How do I fix this ?
There could be many reasons behind the failure:
Low resources, so the Pods are not starting or stay Pending
Liveness or readiness probes failing for the Pods
A ConfigMap or Secret that the Pod requires to start is not available
You can describe the Pod or check its logs to debug the issue further
kubectl describe pod <POD name> -n <Namespace name>
The pod is crashing, which is why you're getting "Does not have minimum availability"
You should look at the logs of the container first and see why it's crashing
kubectl logs -n default {name of pod}
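If the container has already restarted, the previous termination is recorded in the Pod status (and `kubectl logs --previous` shows the logs of the previous instance). The sketch below reads that record from a Pod object parsed from JSON; `crash_report` and the sample values are hypothetical, while `restartCount` and `lastState.terminated` are container status fields.

```python
def crash_report(pod: dict) -> list:
    """Return one entry per container that has a recorded previous termination."""
    reports = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        last = cs.get("lastState", {}).get("terminated")
        if last:
            reports.append({
                "container": cs["name"],
                "restarts": cs.get("restartCount", 0),
                "reason": last.get("reason"),
                "exit_code": last.get("exitCode"),
            })
    return reports


# Hypothetical sample, shaped like `kubectl get pod <pod> -o json` output.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "currency-exchange",
                "restartCount": 5,
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    }
}

for entry in crash_report(sample_pod):
    print(entry)
```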

what should I do to find the pod evicted reason

Today when I checked the Kubernetes cluster, some of the pods showed the status Evicted. But I only see the Evicted status and could not find detailed logs explaining why the pods were evicted. Disk pressure? CPU pressure? What should I do to find the reason a pod was evicted?
You can try to look at the logs of that particular pod.
Do a describe on that pod and see if you find anything.
kubectl get pods -o wide
Try the above command to see which node the pod was running on, then run a describe on that node; you should find at least some information related to the eviction.
Eviction is a process where a Pod assigned to a Node is asked to terminate. One common case in Kubernetes is preemption, where, in order to schedule a new Pod on a Node with limited resources, another Pod needs to be terminated to free resources for the new one.
So, to answer your question, the pod was most likely evicted because its node ran short of a resource such as memory or disk space; the kubelet evicts pods under that kind of node pressure.
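An evicted pod normally carries the reason in its own status, so a quick way to see it is to list the Failed pods whose reason is Evicted along with their status message. This is a rough sketch over a pod list parsed from `kubectl get pods -o json`; the function name `evicted_pods` and the sample pods are hypothetical.

```python
def evicted_pods(pod_list: dict) -> list:
    """Return (name, message) for every Pod that is Failed with reason Evicted."""
    found = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
            found.append((pod["metadata"]["name"], status.get("message", "")))
    return found


# Hypothetical sample, shaped like `kubectl get pods -o json` output.
sample_list = {
    "items": [
        {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}},
        {
            "metadata": {"name": "web-2"},
            "status": {
                "phase": "Failed",
                "reason": "Evicted",
                "message": "The node was low on resource: memory.",
            },
        },
    ]
}

for name, message in evicted_pods(sample_list):
    print(f"{name}: {message}")
```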

How do I know why my SonarQube helm chart is getting auto-killed by Kubernetes

This question is about logging/monitoring.
I'm running a 3 node cluster on AKS, with 3 orgs, Dev, Test and Prod. The chart worked fine in Dev, but the same chart keeps getting killed by Kubernetes in Test, and it keeps getting recreated, and re-killed. Is there a way to extract details on why this is happening? All I see when I describe the pod is Reason: Killed
Please tell me more details on this or can give some suggestions. Thanks!
List Events sorted by timestamp
kubectl get events --sort-by=.metadata.creationTimestamp
There might be various reasons for it to be killed, e.g. not sufficient resources or failed liveness probe.
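When the event list is long, it can help to keep only the Warning events for the pod in question. The following is a sketch under the assumption that the events were exported with `kubectl get events -o json`; `warnings_for` and the sample events are hypothetical, while `type`, `reason`, `message` and `involvedObject` are fields of the Event object.

```python
def warnings_for(events: dict, pod_name: str) -> list:
    """Return 'reason: message' for Warning events of one pod, oldest first."""
    items = [
        e for e in events.get("items", [])
        if e.get("type") == "Warning"
        and e.get("involvedObject", {}).get("name") == pod_name
    ]
    # RFC 3339 timestamps in UTC sort correctly as plain strings.
    items.sort(key=lambda e: e["metadata"]["creationTimestamp"])
    return [f"{e['reason']}: {e['message']}" for e in items]


def _event(kind, reason, message, name, created):
    # Helper for building the hypothetical sample below.
    return {
        "type": kind, "reason": reason, "message": message,
        "involvedObject": {"name": name},
        "metadata": {"creationTimestamp": created},
    }


sample_events = {
    "items": [
        _event("Normal", "Pulled", "Image pulled", "sonarqube-0", "2020-01-01T10:00:00Z"),
        _event("Warning", "Unhealthy", "Readiness probe failed", "sonarqube-0", "2020-01-01T10:02:00Z"),
        _event("Warning", "Unhealthy", "Liveness probe failed", "sonarqube-0", "2020-01-01T10:01:00Z"),
        _event("Warning", "FailedMount", "Mount failed", "other-pod", "2020-01-01T10:03:00Z"),
    ]
}

print(warnings_for(sample_events, "sonarqube-0"))
```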
For SonarQube there is a liveness and readiness probe configured so it might fail. Also as described in helm's chart values:
If an ingress path other than the root (/) is defined, it should be reflected here
A trailing "/" must be included
You can also check if there are sufficient resources on node:
Check which node the pods are running on with kubectl get pods -n <namespace> -o wide, and then run kubectl describe node <node-name> to check whether there is disk or memory pressure.
You can also run kubectl logs <pod-name> and kubectl describe pod <pod-name>, which might give you some insight into the kill reason.
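The pressure check on the node can also be done on the node object itself. A minimal sketch, assuming the node was fetched with `kubectl get node <node-name> -o json` and parsed into a dict; `active_pressure` and the sample node are hypothetical, and the condition types are the pressure conditions reported in a node's status.

```python
PRESSURE_TYPES = {"MemoryPressure", "DiskPressure", "PIDPressure"}


def active_pressure(node: dict) -> list:
    """Return the pressure conditions that are currently True on a node."""
    conditions = node.get("status", {}).get("conditions", [])
    return sorted(
        c["type"] for c in conditions
        if c.get("type") in PRESSURE_TYPES and c.get("status") == "True"
    )


# Hypothetical sample, shaped like `kubectl get node <node-name> -o json` output.
sample_node = {
    "status": {
        "conditions": [
            {"type": "MemoryPressure", "status": "True"},
            {"type": "DiskPressure", "status": "False"},
            {"type": "PIDPressure", "status": "False"},
            {"type": "Ready", "status": "True"},
        ]
    }
}

print(active_pressure(sample_node))
```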

kubectl get pod status always ContainerCreating

k8s version: 1.12.1
I created pod with api on node and allocated an IP (through flanneld). When I used the kubectl describe pod command, I could not get the pod IP, and there was no such IP in etcd storage.
It was only a few minutes later that the IP could be obtained, and then kubectl get pod STATUS was Running.
Has anyone ever encountered this problem?
As MatthiasSommer mentioned in a comment, the process of creating a pod might take a while.
If the pod stays in ContainerCreating status for a longer time, you can check what is stopping it from changing to Running with the command:
kubectl describe pod <pod_name>
Why may creating a pod take a longer time?
Depending on what is included in the manifest, a pod can share namespaces and storage volumes, and use secrets, assigned resources, configmaps, etc.
kube-apiserver validates and configures data for API objects.
kube-scheduler needs to check and collect resource requirements, constraints, etc. and assign the pod to a node.
kubelet runs on each node and ensures that all containers fulfill the pod specification and are healthy.
kube-proxy also runs on each node and maintains the network rules on that node.
As you can see, there are many requests, validations and syncs, so it takes a while to create a pod that fulfills all requirements.
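To see how far a pod has progressed through these steps, its status conditions can be inspected. The sketch below lists the conditions that are not yet True, in the order they are usually reached; it assumes the pod was fetched as JSON and parsed into a dict, and `unmet_conditions` and the sample pod are hypothetical.

```python
# Pod condition types in the order they normally become True.
CONDITION_ORDER = ["PodScheduled", "Initialized", "ContainersReady", "Ready"]


def unmet_conditions(pod: dict) -> list:
    """Return the Pod condition types that are missing or not True."""
    by_type = {c["type"]: c for c in pod.get("status", {}).get("conditions", [])}
    return [t for t in CONDITION_ORDER if by_type.get(t, {}).get("status") != "True"]


# Hypothetical sample of a pod that is scheduled but still creating containers.
sample_pod = {
    "status": {
        "phase": "Pending",
        "conditions": [
            {"type": "PodScheduled", "status": "True"},
            {"type": "Initialized", "status": "True"},
            {"type": "ContainersReady", "status": "False"},
            {"type": "Ready", "status": "False"},
        ],
    }
}

print(unmet_conditions(sample_pod))
```

A pod stuck before `PodScheduled` points at the scheduler, while one stuck at `ContainersReady` points at the node (image pull, volumes, networking).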

Error while creating pods in Kubernetes

I have installed Kubernetes in Ubuntu server using instructions here. I am trying to create pods using kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --hostport=8000 --port=8080 as listed in the example. However, when I do kubectl get pod I get the status of the container as pending. I further did kubectl describe pod for debugging and I see the message:
FailedScheduling pod (hello-minikube-3383150820-1r4f7) failed to fit in any node fit failure on node (minikubevm): PodFitsHostPorts.
I then tried to delete this pod with kubectl delete pod hello-minikube-3383150820-1r4f7, but when I run kubectl get pod again I see another pod with the prefix "hello-minikube-3383150820-" that I haven't created. Does anyone know how to fix this problem? Thank you in advance.
The PodFitsHostPorts predicate is failing because you have something else on your nodes using port 8000. You might be able to find what it is by running kubectl describe svc.
In this version, kubectl run creates a deployment object (you can see it with kubectl describe deployments), which makes sure that you always keep the intended number of replicas of the pod running (in this case 1). Since kubectl 1.18, kubectl run creates only a bare pod instead. When you delete the pod, the deployment controller automatically creates another for you. If you want to delete the deployment and the pods it keeps creating, you can run kubectl delete deployments hello-minikube.
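Whether a pod will be recreated after deletion can be read from its `metadata.ownerReferences`: a pod managed by a controller lists that controller as its owner. A small sketch, assuming the pod was fetched as JSON and parsed into a dict; `controlling_owner` is a hypothetical helper and the sample owner name is only inferred from the pod name in the question.

```python
def controlling_owner(pod: dict):
    """Return 'Kind/name' of the controller that owns the pod, or None."""
    for ref in pod.get("metadata", {}).get("ownerReferences", []):
        if ref.get("controller"):
            return f"{ref['kind']}/{ref['name']}"
    return None


# Hypothetical sample; the ReplicaSet name is an assumption based on the pod name.
sample_pod = {
    "metadata": {
        "name": "hello-minikube-3383150820-1r4f7",
        "ownerReferences": [
            {"kind": "ReplicaSet", "name": "hello-minikube-3383150820", "controller": True}
        ],
    }
}

print(controlling_owner(sample_pod))
```

If this returns a ReplicaSet, deleting the pod alone will not help; the owning deployment has to be deleted or scaled down.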