How to find the pod that led to an error in GKE - kubernetes

If I look at my logs in GCP logs, I see for instance that I got a request that gave 500
log_message: "Method: some_cloud_goo.Endpoint failed: INTERNAL_SERVER_ERROR"
I would like to quickly go to that pod and do a kubectl logs on it. But I did not find a way to do this.
I am fairly new to k8s and GKE, any way to traceback the pod that handled that request?

You could run command "kubectl get pods " on each node to check the status of all pods and could figure out accordingly by running for detail description of an error " kubectl describe pod pod-name"

As mentioned in #Neelam answer, you can can get the pod names with the command kubectl get pods -A and log into all your pods to find the error.
Or, alternatively, you could deploy a custom monitoring system like Elastic GKE Logging available in GCP github Click-to-deploy.
See here to install from MarketPlace with few clicks.
It is a free alternative to have a complete monitoring system and you can filter your logs in Kibana dashboard after deployed.

Related

deployed a service on k8s but not showing any pods weven when it failed

I have deployed a k8s service, however its not showing any pods. This is what I see
kubectl get deployments
It should create on the default namespace
kubectl get nodes (this shows me nothing)
How do I troubleshoot a failed deployment. The test-control-plane is the one deployed by kind this is the k8s one I'm using.
kubectl get nodes
If above command is not showing anything which mean there is no Nodes in your cluster so where your workload will run ?
You need to have at least one worker node in K8s cluster so deployment can schedule the POD on it and run the application.
You can check worker node using same command
kubectl get nodes
You can debug more and check the reason of issue further using
kubectl describe deployment <name of your deployment>
To find out what really went wrong, first follow the steps described in Harsh Manvar in his answer. Perhaps obtaining that information can help you find the problem. If not, check the logs of your deployment. Try to list your pods and see which ones did not boot properly, then check their logs.
You can also use the kubectl describe on pods to see in more detail what went wrong. Since you are using kind, I include a list of known errors for you.
You can also see this visual guide on troubleshooting Kubernetes deployments and 5 Tips for Troubleshooting Kubernetes Deployments.

How to debug a kubernetes cluster?

As the question shows, I have very low knowledge about kubernetes. Following a tutorial, I made a Kubernetes cluster to run a web app on a local server using Minikube. I have applied the kubernetes components and they are running but the Web-Server does not respond to HTTP requests. My problem is that all the system that I have created is like a black box for me and I have literally no idea how to open it and see where the problem is. Can you explain how I can debug such implementaions in a wise way. Thanks.
use a tool like https://github.com/kubernetes/kubernetes-dashboard
You can install kubectl and kubernetes-dashboard in a k8s cluster (https://kubernetes.io/docs/tasks/tools/install-kubectl/), and then use the kubectl command to query information about a pod or container, or use the kubernetes-dashboard web UI to query information about the cluster.
For more information, please refer to https://kubernetes.io/
kubectl get pods
will show you all your pods and their status. A quick check to make sure that all is at least running.
If there are pods that are unhealthy, then
kubectl describe pod <pod name>
will give some more information.. eg image not found etc
kubectl log <pod name> --all
is often the next step , use -f to follow the logs as you exercise your api.
It is possible to link up images running in a pod with most ide debuggers, but instructions will differ depending on language and ide used...

Check failed pods logs in a Kubernetes cluster

I have a Kubernetes cluster, in which different pods are running in different namespaces. How do I know if any pod failed?
Is there any single command to check the failed pod list or restated pod list?
And reason for the restart(logs)?
Depends if you want to have detailed information or you just want to check a few last failed pods.
I would recommend you to read about Logging Architecture.
In case you would like to have this detailed information you should use 3rd party software, as its described in Kubernetes Documentation - Logging Using Elasticsearch and Kibana or another one FluentD.
If you are using Cloud environment you can use Integrated with Cloud Logging tools (i.e. in Google Cloud Platform you can use Stackdriver).
In case you want to check logs to find reason why pod failed, it's good described in K8s docs Debug Running Pods.
If you want to get logs from specific pod
$ kubectl logs ${POD_NAME} -n {NAMESPACE}
First, look at the logs of the affected container:
$ kubectl logs ${POD_NAME} ${CONTAINER_NAME}
If your container has previously crashed, you can access the previous container's crash log with:
$ kubectl logs --previous ${POD_NAME} ${CONTAINER_NAME}
Additional information you can obtain using
$ kubectl get events -o wide --all-namespaces | grep <your condition>
Similar question was posted in this SO thread, you can check if for more details.
This'll work: kubectl get pods --all-namespaces | | grep -Ev '([0-9]+)/\1'
Also, Lens is pretty good in these situations.
Most of the times, the reason for app failure is printed in the lasting logs of the previous pod. You can see them by simply putting --previous flag along with your kubectl logs ... cmd.

Restart server running inside Kubernetes Node

I am having a IBM cloud powered kubernetes cluster. That cluster currently have only 1 node.
I verified running the command kubectl get nodes.
There are few servers which are running in that node. I want to restart one of those server.
How can I get into the node and perform a restart for the required server?
I tried ssh, but this link says it cannot be done directly.
Seems like your main questions are:
"how to restart a pod", "how to ssh to a entity in which my service is running" and "how to see if I deleted a Pod".
First of all, most of this questions are already answered on StackOverflow. Second of all you need to get familiar with Kubernetes basic terminology and how things work in here. You can do that in any Kubernetes introduction or in documentation.
Answering the questions:
1) About restarting you can find information here. Or if you have running deployment, deleting a pod will result in pod recreation.
2) you can use kubectl execas described here:
kubectl exec -ti pod_name sh(or bash)
3) to see your pods, run kubectl get pods after you run kubectl delete pod name -n namespace you can run kubectl get pods -w to see changing status of deleted pod and new one being spawned. Or you will notice that there is a new pod running but with different NAME.

Kubernetes 1.11 could not find heapster for metrics

I'm using Kubernetes 1.11 on Digital Ocean, when I try to use kubectl top node I get this error:
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
but as stated in the doc, heapster is deprecated and no longer required from kubernetes 1.10
If you are running a newer version of Kubernetes and still receiving this error, there is probably a problem with your installation.
Please note that to install metrics server on kubernetes, you should first clone it by typing:
git clone https://github.com/kodekloudhub/kubernetes-metrics-server.git
then you should install it, WITHOUT GOING INTO THE CREATED FOLDER AND WITHOUT MENTIONING AN SPECIFIC YAML FILE , only via:
kubectl create -f kubernetes-metrics-server/
In this way all services and components are installed correctly and you can run:
kubectl top nodes
or
kubectl top pods
and get the correct result.
For kubectl top node/pod to work you either need the heapster or the metrics server installed on your cluster.
Like the warning says: heapster is being deprecated so the recommended choice now is the metrics server.
So follow the directions here to install the metrics server