I need to create a shell script which examines the cluster status.
I saw that kubectl describe nodes provides lots of data.
I can output it to JSON and then parse it, but maybe that's overkill.
Is there a simple way, with a kubectl command, to get the status of the cluster? Just whether it's up or down.
The least expensive way to check whether you can reach the API server is kubectl version. In addition, kubectl cluster-info gives you some more information.
In addition to Michael's answer: that would only tell you about the API server or master and internal services like KubeDNS etc., but not about the nodes.
It depends on your need and your definition of "status" here. You could run kubectl cluster-info followed by kubectl get nodes and check the STATUS column for all nodes, using parsing tools like awk, jq or kubectl's own -o jsonpath option, to verify that all nodes are ready.
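A minimal sketch of such a check script, assuming "up" means the API server answers and every node reports Ready (the exact checks are your call):

#!/bin/bash
# exit non-zero if the API server cannot be reached at all
if ! kubectl version >/dev/null 2>&1; then
    echo "cluster DOWN: API server unreachable"
    exit 1
fi
# any node whose STATUS column is not exactly "Ready" is treated as a failure here
NOT_READY=$(kubectl get nodes --no-headers | awk '$2 != "Ready" {print $1}')
if [ -n "$NOT_READY" ]; then
    echo "cluster DOWN: nodes not ready: $NOT_READY"
    exit 1
fi
echo "cluster UP"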
The command below displays the health of the scheduler, the controller-manager and etcd:
kubectl get cs
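On a healthy cluster the output looks roughly like this (exact names and messages vary by version and setup):

NAME                 STATUS    MESSAGE              ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}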
The command below lists Kubernetes core components such as etcd, the controller-manager, the scheduler, kube-proxy, CoreDNS and the network plugin. All of those pods should be running for Kubernetes to be considered healthy.
kubectl get pod -n kube-system
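As a quick sketch, you can list only the kube-system pods that are not in the Running phase; an empty result is what you want to see:

kubectl get pods -n kube-system --field-selector=status.phase!=Running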
Finally, deploy one front-end and one back-end Pod and verify the inter-pod communication to ensure that the cluster is up and working correctly, as sketched below.
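A rough sketch of such a smoke test (the image choices and names here are arbitrary, and kubectl run behaves slightly differently across versions):

# back-end: an nginx pod exposed through a ClusterIP service
kubectl run backend --image=nginx --port=80
kubectl expose pod backend --port=80 --name=backend-svc
# front-end: a throwaway busybox pod that calls the back-end service
kubectl run frontend --image=busybox --restart=Never -- sleep 3600
kubectl exec frontend -- wget -qO- http://backend-svc
# clean up
kubectl delete pod frontend backend
kubectl delete svc backend-svc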
Below are the commands to get the cluster status, depending on what you need:
To see where your Kubernetes master, CoreDNS and kubernetes-dashboard are running, use
kubectl cluster-info
To get detailed information for further debugging and diagnosing cluster problems, use kubectl cluster-info dump
To get only the health status of the control plane components, use kubectl get componentstatus or kubectl get cs
To show detailed information about a resource, use kubectl describe node <node>
Related
We have kubectl describe nodes.
Likewise, do we have any command, like the one mentioned below, to describe cluster information?
kubectl describe cluster
"Kubectl describe <api-resource_type> <api_resource_name> "command is used to describe a specific resources running in your kubernetes cluster, Actually you need to verify different components separately as a developer to check your pods, nodes services and other tools that you have applied/created.
If you are the cluster administrator and you are asking about useful commands to check the actual kube-system configuration, it depends on your Kubernetes cluster type. For example, if you used the kubeadm package to initialize a cluster on premises, you can check and change the default cluster configuration using this command:
kubeadm config print init-defaults
After initializing your cluster, all main control-plane configuration files, a.k.a. manifests, are located in /etc/kubernetes/manifests (and they are watched in real time: change anything there and the cluster will redeploy the component automatically).
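On a typical kubeadm control-plane node, that directory contains one static-pod manifest per core component, roughly:

ls /etc/kubernetes/manifests
# etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml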
Useful kubectl commands :
For cluster info (API server endpoint and DNS), run:
kubectl cluster-info
Either way, you can list all the API resources and check them one by one using these commands:
kubectl api-resources (list all api-resource names and types)
kubectl get <api_resource_name> (specific to your cluster)
kubectl explain <api_resource_name> (explain the resource object with docs link)
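For example, kubectl explain can also drill into nested fields of a resource (the field paths below are just illustrations):

kubectl explain pod.spec.containers
kubectl explain deployment.spec.strategy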
For extra info you can add specific flags, for example:
kubectl get nodes -o wide
kubectl get pods -n <specific-name-space> -o wide
kubectl describe pods <pod_name>
...
For more information about the kubectl command line, check the kubectl_cheatsheet
We are using Airflow to schedule Spark jobs on Kubernetes. Recently, I encountered a scenario where:
Airflow received error 404 with the message "pods pod-name not found"
I manually checked that the pod was actually working fine at that time. In fact, I was able to collect logs using kubectl logs -f -n namespace podname.
Because of this, Airflow created another pod for running the same job, which resulted in a race condition.
Airflow is using the Kubernetes Python client's read_namespaced_pod() API:
def read_pod(self, pod):
    """Read POD information"""
    try:
        return self._client.read_namespaced_pod(pod.metadata.name, pod.metadata.namespace)
    except BaseHTTPError as e:
        raise AirflowException(
            'There was an error reading the kubernetes API: {}'.format(e)
        )
I believe read_namespaced_pod() calls the Kubernetes API. In order to investigate this further, I would like to check the logs of the Kubernetes API server.
Can you please share steps to check what is happening on the Kubernetes side?
Note: Kubernetes version is 1.18 and Airflow version is 1.10.10.
Answering the question from the perspective of logs/troubleshooting:
I believe read_namespaced_pod() calls the Kubernetes API. In order to investigate this further, I would like to check the logs of the Kubernetes API server.
Yes, you are correct, this function calls the Kubernetes API. You can check the logs of the Kubernetes API server by running:
$ kubectl logs -n kube-system KUBERNETES_API_SERVER_POD_NAME
I would also consider checking the kube-controller-manager:
$ kubectl logs -n kube-system KUBERNETES_CONTROLLER_MANAGER_POD_NAME
Example output:
I0413 12:33:12.840270 1 event.go:291] "Event occurred" object="default/nginx-6799fc88d8" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: nginx-6799fc88d8-kchp7"
A side note!
The above commands will work assuming that your kubernetes-apiserver and kubernetes-controller-manager Pods are visible to you.
Can you please share steps to check what is happening on the Kubernetes side?
This part of the question targets the basics of troubleshooting/log checking.
For that you can use the following commands (in addition to the ones mentioned earlier):
$ kubectl get RESOURCE RESOURCE_NAME:
example: $ kubectl get pod airflow-pod-name
also you can add -o yaml for more information
$ kubectl describe RESOURCE RESOURCE_NAME:
example: $ kubectl describe pod airflow-pod-name
$ kubectl logs POD_NAME:
example: $ kubectl logs airflow-pod-name
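Since the 404 that Airflow reported came from read_namespaced_pod(), you can also hit the same API path directly and compare it with what Airflow saw (namespace and pod name below are placeholders):

$ kubectl get --raw /api/v1/namespaces/NAMESPACE/pods/POD_NAME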
Additional resources:
Kubernetes.io: Docs: Concepts: Cluster administration: Logging Architecture
Kubernetes.io: Docs: Tasks: Debug application cluster: Debug cluster
Hi, I'm a newbie to k8s and I was wondering where and how kubectl sends requests to the kube-apiserver.
So, for example, if I'm sending a request such as kubectl get pods --all-namespaces (and my default Kubernetes endpoint is set to 192.168.64.2:8443), my understanding is that this would translate to an HTTPS request such as https://192.168.64.2:8443/api/v1/pods... etc., and kubectl would use the authentication stored in the .kube/config file. Am I right?
And I also have a metrics-server up and running on endpoint 172.17.0.8:4443, but how does kubectl know to use this IP when I run kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes/<NODE_NAME> | jq? Are all kubectl commands directed to one IP?
Thanks in advance.
Authentication is one of the steps that kubectl performs. You can see what happens when a command runs by increasing the verbosity, for example:
kubectl get pods -v9 --all-namespaces
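With -v9 the output includes, roughly, lines like the following (abridged; timestamps and headers omitted), showing the HTTP request kubectl builds and the response it gets back:

curl -k -v -XGET -H "Accept: application/json" 'https://192.168.64.2:8443/api/v1/pods?limit=500'
GET https://192.168.64.2:8443/api/v1/pods?limit=500 200 OK in 10 milliseconds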
Kubernetes knows the resource definitions and their implementations; you can check the resource types with
kubectl api-resources
So the Kubernetes api-server knows which resources belong to the metrics-server and how to call it.
All kubectl requests go to the api-server. The api-server can either answer by itself or delegate to other components, for example an extension API server.
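As a sketch of that delegation: the aggregation layer registers the metrics-server under an APIService object (the name below is the standard one for metrics-server), and the api-server proxies matching requests to it:

kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes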
I have created a Kubernetes cluster and one of the instances in the cluster is inactive.
I want to review the configuration of the Kubernetes Engine cluster and of that inactive instance. Which command should I use to check it?
Should I use this "kubectl config get-contexts"?
or
kubectl config use-context and kubectl config view?
I am a beginner to cloud, so can anyone please explain?
The kubectl config get-contexts command will not help you debug why the instance is failing. Basically, it will just show you the list of contexts. A context is a group of cluster access parameters: each context contains a Kubernetes cluster, a user, and a namespace. The current context is the cluster that is currently the default for kubectl. On the other hand, kubectl config view will just print your kubeconfig settings.
The best place to start is the official Kubernetes documentation. It provides good basic steps for troubleshooting your cluster. Some of the steps can be applied to GKE as well as to kubeadm or Minikube clusters.
If you're using GKE, then you can read the node logs from Stackdriver. This document is an excellent starting point when you want to check the logs directly in the log viewer.
If one of your instances reports NotReady after listing them with kubectl get nodes, I suggest you ssh to that instance and check the Kubernetes components (kubelet and kube-proxy). You can view the GKE nodes from the instances page.
Kube Proxy logs:
/var/log/kube-proxy.log
If you want to check the kubelet logs, they're a systemd unit in COS that can be accessed using journalctl.
Kubelet logs:
sudo journalctl -u kubelet
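A short sketch, assuming you can ssh to the GKE node with gcloud (node name and zone are placeholders):

gcloud compute ssh NODE_NAME --zone ZONE
sudo journalctl -u kubelet --no-pager | tail -n 100
sudo tail -n 100 /var/log/kube-proxy.log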
For further debugging it is worth mentioning that the GKE master is a node inside a Google-managed project and is different from your cluster project.
For the detailed master logs you will have to open a Google support ticket. Here is more information about how the GKE cluster architecture works, in case there's something related to the api-server.
Let me know if that was helpful.
You can run the commands below to check the status of all the nodes of a Kubernetes cluster. Please note that if you are using the GKE managed service, you will not be able to see the status of the master nodes; you will only see the status of the worker nodes.
kubectl get nodes -o wide
kubectl describe node nodename
You can also run the command below to check the status of the control plane components.
kubectl get componentstatus
You can use the below command to get list of all the nodes in GKE cluster:
kubectl get nodes -o wide
Once you have the list of nodes, you can describe a node to get its events:
kubectl describe node <Node-Name>
Based on the events you can debug the node.
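If the describe output is noisy, here is a sketch of filtering the event list down to a single node (the field selector names are standard; the node name is a placeholder):

kubectl get events --all-namespaces --field-selector involvedObject.kind=Node,involvedObject.name=<Node-Name>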
I have installed Kubernetes on an Ubuntu server using the instructions here. I am trying to create pods using kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --hostport=8000 --port=8080 as listed in the example. However, when I do kubectl get pod I get the status of the container as Pending. I further did kubectl describe pod for debugging and I see the message:
FailedScheduling pod (hello-minikube-3383150820-1r4f7) failed to fit in any node fit failure on node (minikubevm): PodFitsHostPorts.
I then tried to delete this pod with kubectl delete pod hello-minikube-3383150820-1r4f7, but when I run kubectl get pod again I see another pod with the prefix "hello-minikube-3383150820-" that I haven't created. Does anyone know how to fix this problem? Thank you in advance.
The PodFitsHostPorts predicate is failing because you have something else on your nodes using port 8000. You might be able to find what it is by running kubectl describe svc.
kubectl run creates a Deployment object (you can see it with kubectl describe deployments), which makes sure that you always keep the intended number of replicas of the pod running (in this case 1). When you delete the pod, the deployment controller automatically creates another one for you. If you want to delete the deployment and the pods it keeps creating, you can run kubectl delete deployments hello-minikube.
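As a sketch, if the goal is simply to get the example running, you could remove the deployment and re-create it on a host port that is free on your node (8001 below is an arbitrary choice):

kubectl delete deployments hello-minikube
kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --hostport=8001 --port=8080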