I deployed the images of the microservices currency-conversion and currency-exchange on Google Cloud, but in Kubernetes Engine I see that the pods/replica sets are not available.
When I check under the Workloads tab, I see that the service shows the message "Does not have minimum availability".
I added an additional availability zone to increase the resources, but that did not help.
How do I fix this?
There could be many reasons behind the failure:
Low resources, so the pods are not starting or are stuck in Pending
Liveness or readiness probes failing for the pods
A ConfigMap or Secret that the pod requires to start is not available
You can describe the pod or check its logs to debug the issue further:
kubectl describe pod <POD name> -n <Namespace name>
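Relating to the low-resources and probe points above, here is a minimal sketch of what resource requests and a readiness probe could look like in the Deployment (the image, port and the /actuator/health path are assumptions, adjust them to your services):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: currency-exchange
spec:
  replicas: 1
  selector:
    matchLabels:
      app: currency-exchange
  template:
    metadata:
      labels:
        app: currency-exchange
    spec:
      containers:
        - name: currency-exchange
          image: <your-image>              # replace with your pushed image
          ports:
            - containerPort: 8000          # assumed container port
          resources:
            requests:
              memory: "256Mi"              # the scheduler needs a node with this much free memory
              cpu: "250m"
          readinessProbe:
            httpGet:
              path: /actuator/health       # assumed Spring Boot health endpoint
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10

If the requests are larger than what any node can offer, the pods stay Pending and the workload never reaches minimum availability.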
The pod is crashing, which is why you're getting "Does not have minimum availability".
You should look at the logs of the container first and see why it's crashing:
kubectl logs -n default <pod-name>
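If the container has already restarted, the logs of the previous (crashed) instance can be pulled with the --previous flag, e.g.:

kubectl logs -n default <pod-name> --previous
kubectl logs -n default <pod-name> -c <container-name> --previous   # if the pod has several containers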
Related
I am working on deploying a certain pod to GKE, but my backend services end up in an unhealthy state.
The deployment went through via the helm install process, but the ingress reports a warning that says Some backend services are in UNHEALTHY state. I have tried to access the logs but do not know exactly what to look for. Also, I already have liveness and readiness probes running.
What could I do to make the ingress come back to a healthy state? Thanks
Picture of warning error on GKE UI
Without more details it is hard to determine the exact cause.
As a first point I want to mention that your error message is Some backend services are in UNHEALTHY state, not All backend services are in UNHEALTHY state. It indicates that only a few of your backends are affected.
There might be many reasons: whether you are using GCP Ingress or Nginx Ingress, your configuration of externalTrafficPolicy, whether you are using preemptible nodes, your livenessProbe and readinessProbe, health checks, etc.
Since in your scenario only a few backends are affected, the only thing I can suggest with the current information is some debug options:
Using $ kubectl get po -n <namespace>, check if all your pods are working correctly, that all containers within the pods are Ready, and that the pod status is Running. If needed, check the logs of a suspicious pod with $ kubectl logs <podname> -c <containerName>. In general you should check all pods the load balancer is pointing to,
Confirm that livenessProbe and readinessProbe are configured properly and respond with 200 (see the probe sketch after this list),
Describe your ingress $ kubectl describe ingress <yourIngressName> and check backends,
Check if you've configured your health checks properly according to GKE Ingress for HTTP(S) Load Balancing - Health Checks guide.
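As a rough sketch of the probe point above, the container section could contain something like this (the /healthz path and port 8080 are placeholders, use whatever endpoint your app serves with HTTP 200; GKE can derive the load balancer's health check from the readinessProbe):

readinessProbe:
  httpGet:
    path: /healthz        # must answer with HTTP 200
    port: 8080
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10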
If you still can't solve this issue with the above debug options, please provide more details about your environment with logs (without private information).
Useful links:
kubernetes unhealthy ingress backend
GKE Ingress shows unhealthy backend services
In GKE you can define a BackendConfig to set up custom health checks. You can configure this as described in the link below to bring the ingress backends to a HEALTHY state.
https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-features#direct_health
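As an illustration, a minimal BackendConfig with a custom health check and the annotation that attaches it to the Service could look like this (names, port and requestPath are placeholders):

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backendconfig            # placeholder name
spec:
  healthCheck:
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 2
    type: HTTP
    requestPath: /healthz           # a path in your app that returns HTTP 200
    port: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-service                  # placeholder name
  annotations:
    cloud.google.com/backend-config: '{"default": "my-backendconfig"}'
spec:
  selector:
    app: my-app                     # placeholder selector
  ports:
    - port: 80
      targetPort: 8080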
If you have kubectl access to your pods, you can run kubectl get pod, and then kubectl logs -f <pod-name>.
Review the logs and find the error(s).
This question is about logging/monitoring.
I'm running a 3 node cluster on AKS, with 3 orgs: Dev, Test and Prod. The chart worked fine in Dev, but in Test the pod from the same chart keeps getting killed by Kubernetes, recreated, and re-killed. Is there a way to extract details on why this is happening? All I see when I describe the pod is Reason: Killed
Please share more details on this or give some suggestions. Thanks!
List Events sorted by timestamp
kubectl get events --sort-by=.metadata.creationTimestamp
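To narrow the events down to a single pod, a field selector can be added, for example:

kubectl get events --field-selector involvedObject.name=<pod-name> --sort-by=.metadata.creationTimestamp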
There might be various reasons for it to be killed, e.g. insufficient resources or a failed liveness probe.
For SonarQube there are liveness and readiness probes configured, so they might fail. Also, as described in the Helm chart's values:
If an ingress path other than the root (/) is defined, it should be reflected here
A trailing "/" must be included
You can also check if there are sufficient resources on the node: check which node the pods are running on with kubectl get pods -o wide, and then run kubectl describe node <node-name> to check that there is no disk/memory pressure.
You can also run kubectl logs <pod-name> and kubectl describe pod <pod-name>, which might give you some insight into the kill reason.
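If the container was killed by the OOM killer, its last termination reason usually records that; a quick way to read it (the jsonpath only prints the containers' last terminated reason):

kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'

An output of OOMKilled would point to the memory limits in the chart being too low for that environment.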
If a Pod's status is Failed, Kubernetes will keep creating new Pods until terminated-pod-gc-threshold in kube-controller-manager is reached. This leaves many Failed Pods in the cluster that need to be cleaned up.
Are there other reasons except Evicted that will cause Pod Failed?
There can be many causes for the POD status to be FAILED. You just need to check for problems (if any exist) by running the command
kubectl -n <namespace> describe pod <pod-name>
Carefully check the EVENTS section, where all the events that occurred during POD creation are listed. Hopefully you can pinpoint the cause of failure from there.
However, there are several reasons for POD failure; some of them are the following:
Wrong image used for POD.
Wrong command/arguments are passed to the POD.
Kubelet failed to check POD liveness (i.e., the liveness probe failed).
POD failed health check.
Problem in network CNI plugin (misconfiguration of CNI plugin used for networking).
For example:
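A minimal way to reproduce this kind of failure, using a pod that references a non-existent image:

kubectl run not-so-busybox --image=not-so-busybox
kubectl get pod not-so-busybox        # STATUS shows ErrImagePull and then ImagePullBackOff
kubectl describe pod not-so-busybox   # the Events section shows the failed image pull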
In the above example, the image "not-so-busybox" couldn't be pulled as it doesn't exist, so the pod FAILED to run. The pod status and events clearly describe the problem.
Simply do this:
kubectl get pods <pod_name> -o yaml
And in the output, towards the end, you can see something like this:
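An illustrative excerpt of that status section (the field values here are only an example, yours will differ):

status:
  phase: Failed
  containerStatuses:
    - name: my-container              # placeholder container name
      restartCount: 0
      state:
        terminated:
          exitCode: 1                 # non-zero exit code of the failed process
          reason: Error
          startedAt: "2021-03-01T10:00:00Z"
          finishedAt: "2021-03-01T10:00:05Z"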
This will give you a good idea of where exactly the pod failed and what happened.
PODs will not survive scheduling failures, node failures, or other evictions, such as lack of resources, or in the case of node maintenance.
Pods should not be created manually but almost always via controllers like Deployments (self-healing, replication etc).
The reason why a pod failed or was terminated can be obtained by
kubectl describe pod <pod_name>
Other situations I have encountered when a pod Failed:
Issues with the image (it no longer exists)
The pod is attempting to access e.g. a ConfigMap or Secret, but it is not found in the namespace.
Liveness Probe Failure
Persistent Volume fails to mount
Validation Error
In addition, eviction is based on resources - EvictionPolicy
It can also be caused by DRAINing the Node/Pod. You can read about DRAIN here.
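For reference, draining a node (which evicts its pods) and bringing it back is typically done with something like:

kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
kubectl uncordon <node-name>    # make the node schedulable again after maintenance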
I'm using DigitalOcean's Kubernetes cluster service and have deployed 9 nodes in the cluster, but when I try to deploy the Kafka ZooKeeper pods, a few pods get deployed and the others remain in Pending state. I've tried doing
kubectl describe pods podname -n namespace
it shows
it's not getting assigned to any node
Check if your Deployment/StatefulSet has node selectors and/or node/pod affinity that might prevent it from running.
It would also be helpful to see more of the pod describe output, since it might give more details.
There is a message in your screenshot about the PersistentVolumeClaims, so I would also check the status of the PVC objects to see whether they are bound or not.
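A few commands that might help with those checks (namespace and names are placeholders):

kubectl get pvc -n <namespace>                      # Pending PVCs keep the pods unschedulable
kubectl describe pvc <pvc-name> -n <namespace>      # the events explain why a claim is not bound
kubectl get statefulset <name> -n <namespace> -o yaml | grep -iE -A5 'nodeSelector|affinity'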
good luck
I have a Kubernetes cluster on Google Cloud Platform. The Kubernetes cluster contains a deployment which has one pod. The pod has two containers. I have observed that the pod has been replaced by a new pod and the entire data is wiped out. I am not able to identify the reason behind it.
I have tried the below two commands:
kubectl logs [podname] -c [containername] --previous
Result: previous terminated container [containername] in pod [podname] not found
kubectl get pods
Result: I see that the number of restarts for my pod equals 0.
Is there anything I could do to get the logs from my old pod?
Try the command below to see the pod info:
kubectl describe po
There is not much chance you will retrieve this information, but try the following:
1) If you know your failed container ID, try to find the old logs here:
/var/lib/docker/containers/<container id>/<container id>-json.log
2) Look at the kubelet's logs:
journalctl -u kubelet
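To narrow the kubelet logs down to the pod in question, something like this can help:

journalctl -u kubelet --since "1 hour ago" | grep <pod-name>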