How to get node resource reserved/capacity in Kubernetes (kubelet flags/configuration)? - kubernetes

There is a documentation article here explaining how one can reserve resources on a node for system use.
What I did not manage to figure out is how one can get these values. If I understand things correctly, kubectl top nodes will return available resources, but I would like to see kube-reserved, system-reserved and eviction-threshold as well.
Is it possible?

By checking the kubelet's flags, we can get the values of kube-reserved, system-reserved and eviction-threshold.
SSH into the $NODE and run ps aux | grep kubelet to list the running kubelet process and its flags.
The kube-reserved and system-reserved values are only used for scheduling, since the scheduler sees the node's allocatable resources.
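The relationship between these values and the allocatable figure the scheduler sees is just subtraction: allocatable = capacity - kube-reserved - system-reserved - eviction-threshold. A quick sketch with made-up numbers (not taken from a real node):

```shell
# Node Allocatable arithmetic with made-up values (memory in Mi):
capacity=4096; kube_reserved=512; system_reserved=512; eviction_threshold=100
allocatable=$((capacity - kube_reserved - system_reserved - eviction_threshold))
echo "allocatable memory: ${allocatable}Mi"
```

The same subtraction is what populates the Allocatable block shown by kubectl describe node.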

To see your eviction threshold (evictionHard) and systemReserved values, log in to the master node and start kubectl proxy in the background with the following command:
kubectl proxy --port=8001 &
After that, run the following command to see the config of your desired node (replace NODE_NAME with your node's name, e.g. VAR="worker-2"):
VAR="NODE_NAME"; curl -sSL "http://localhost:8001/api/v1/nodes/$VAR/proxy/configz"
You should see a result that looks like:
"evictionHard":{"imagefs.available":"15%","memory.available":"100Mi","nodefs.available":"10%","nodefs.inodesFree":"5%"},
"systemReserved":{"cpu":"600m","memory":"0.5Gi"}
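If jq is installed, the interesting fields can be filtered straight out of the configz response. A sketch; the canned JSON below only mirrors the shape of the output shown above, in practice you would pipe the curl output in:

```shell
# Extract evictionHard and systemReserved from a configz-style response.
configz='{"kubeletconfig":{"evictionHard":{"memory.available":"100Mi","nodefs.available":"10%"},"systemReserved":{"cpu":"600m","memory":"0.5Gi"}}}'
echo "$configz" | jq '.kubeletconfig | {evictionHard, systemReserved}'

# On a live cluster:
# curl -sSL "http://localhost:8001/api/v1/nodes/$VAR/proxy/configz" \
#   | jq '.kubeletconfig | {evictionHard, systemReserved}'
```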
Enjoy ;)

Related

Check failed pods logs in a Kubernetes cluster

I have a Kubernetes cluster, in which different pods are running in different namespaces. How do I know if any pod failed?
Is there any single command to check the failed pod list or restarted pod list?
And reason for the restart(logs)?
It depends on whether you want detailed information or just want to check the last few failed pods.
I would recommend you to read about Logging Architecture.
In case you would like to have this detailed information, you should use third-party software, as described in the Kubernetes documentation - Logging Using Elasticsearch and Kibana - or another option, FluentD.
If you are using a cloud environment, you can use its integrated logging tools (e.g., in Google Cloud Platform you can use Stackdriver).
In case you want to check logs to find the reason why a pod failed, it's well described in the K8s docs Debug Running Pods.
If you want to get logs from a specific pod:
$ kubectl logs ${POD_NAME} -n ${NAMESPACE}
First, look at the logs of the affected container:
$ kubectl logs ${POD_NAME} ${CONTAINER_NAME}
If your container has previously crashed, you can access the previous container's crash log with:
$ kubectl logs --previous ${POD_NAME} ${CONTAINER_NAME}
You can obtain additional information using:
$ kubectl get events -o wide --all-namespaces | grep <your condition>
A similar question was posted in this SO thread; you can check it for more details.
This'll work: kubectl get pods --all-namespaces | grep -Ev '([0-9]+)/\1'
Also, Lens is pretty good in these situations.
Most of the time, the reason for app failure is printed in the last logs of the previous pod. You can see them by simply adding the --previous flag to your kubectl logs ... command.
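The one-liner above works because healthy pods report a READY column like 2/2; the backreference in grep -Ev '([0-9]+)/\1' drops every line whose ready count equals the desired count, leaving only unhealthy pods. A local demo with canned output (pod names invented):

```shell
# Canned `kubectl get pods --all-namespaces` output; grep keeps only not-fully-ready pods.
sample='NAMESPACE   NAME    READY   STATUS             RESTARTS
app         web-1   2/2     Running            0
app         web-2   1/2     CrashLoopBackOff   7'
echo "$sample" | grep -Ev '([0-9]+)/\1'
```

One caveat: the pattern is unanchored, so a pod showing e.g. 12/2 would be wrongly filtered out, because its tail 2/2 still matches.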

Ensure services exist

I am going to deploy Keycloak on my K8S cluster and as a database I have chosen PostgreSQL.
To adjust to the business requirements, we have to add additional features to Keycloak, for example a custom theme, etc. That means that for every change to Keycloak we are going to trigger the CI/CD pipeline. We use Drone for CI and ArgoCD for CD.
In the pipeline, before it hits the CD part, we would like to ensure that PostgreSQL is up and running.
The question is: does there exist a tool for K8s with which we can validate that particular services are up and running?
"Up and running" != "Exists"
1: To check if a service exists, just do a kubectl get service <svc>
2: To check if it has active endpoints do kubectl get endpoints <svc>
3: You can also check if backing pods are in ready state.
2 & 3 require a readiness probe to be properly configured on the pod/deployment.
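For the CI/CD gate in the question, steps 2 and 3 above collapse into a single check: a Service's Endpoints object listing at least one address means the Service exists and has ready backends. A sketch assuming jq is available; the kubectl call is commented out and the check is demonstrated on canned Endpoints JSON:

```shell
# Succeed only if the Endpoints JSON on stdin contains at least one ready address.
has_ready_endpoints() {
  jq -e '[.subsets[]?.addresses[]?] | length > 0' >/dev/null
}

# In the pipeline (hypothetical service name):
# kubectl get endpoints postgresql -o json | has_ready_endpoints || exit 1

# Demo with canned data:
echo '{"subsets":[{"addresses":[{"ip":"10.0.0.5"}]}]}' | has_ready_endpoints && echo "up"
```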
Radek is right in his answer but I would like to expand on it with the help of the official docs. To make sure that the service exists and is working properly you need to:
Make sure that Pods are actually running and serving: kubectl get pods -o go-template='{{range .items}}{{.status.podIP}}{{"\n"}}{{end}}'
Check if Service exists: kubectl get svc
Check if Endpoints exist: kubectl get endpoints
If needed, check if the Service is working by DNS name: nslookup hostnames (from a Pod in the same Namespace) or nslookup hostnames.<namespace> (if it is in a different one)
If needed, check if the Service is working by IP: for i in $(seq 1 3); do wget -qO- <IP:port>; done
Make sure that the Service is defined correctly: kubectl get service <service name> -o json
Check if kube-proxy is working: ps auxw | grep kube-proxy
If any of the above is causing a problem, you can find the troubleshooting steps in the link above.
Regarding your question in the comments: I don't think there is an easier way, considering that you need to make sure that everything is working fine. You can skip some of the steps, but that would depend on your use case.
I hope it helps.

What is the way to make kubernetes nodes have `providerID` spec after creation (manually)?

I'm expecting kubectl get nodes <node> -o yaml to show the spec.providerID (see reference below) once the kubelet has been provided the additional flag --provider-id=provider://nodeID. I've used the /etc/default/kubelet file to add more flags to the command line when the kubelet is started/restarted. (On a k8s 1.16 cluster) I see the additional flags via a systemctl status kubelet --no-pager call, so the file is respected.
However, I've not seen the value get returned by the kubectl get node <node> -o yaml call. I was thinking it had to be that the node was already registered, but I think the kubelet re-registers when it starts up. I've seen a log line via journalctl -u kubelet suggesting that it has gone through registration.
How can I add a provider ID to a node manually?
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.17/#nodespec-v1-core
How a kubelet is configured on the node itself is separate (AFAIK) from its definition in the master control plane, which is responsible for updating state in the central etcd store; so it's possible for these to fall out of sync, i.e., you need to communicate with the control plane to update its records.
In addition to Subramanian's suggestion, kubectl patch node would also work, and has the added benefit of being easily reproducible/scriptable compared to manually editing the YAML manifest; it also leaves a "paper trail" in your shell history should you need to refer back. Take your pick :) For example,
$ kubectl patch node my-node -p '{"spec":{"providerID":"foo"}}'
node/my-node patched
$ kubectl describe node my-node | grep ProviderID
ProviderID: foo
Hope this helps!
You can edit the node config and append the providerID information under the spec section.
kubectl edit node <Node Name>
...
spec:
podCIDR:
providerID:
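If several nodes need a providerID, the patch shown in the other answer can be applied in a loop. A sketch; the node names and the mycloud:// scheme are invented, jq is assumed to be available, and note that on many clusters spec.providerID is immutable once set:

```shell
# Build a providerID patch per node with jq; the kubectl call stays commented out.
for node in worker-1 worker-2; do
  patch=$(jq -cn --arg id "mycloud://${node}" '{spec:{providerID:$id}}')
  echo "$node -> $patch"
  # kubectl patch node "$node" -p "$patch"
done
```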

Changing the CPU Manager Policy in Kubernetes

I'm trying to change the CPU Manager Policy for a Kubernetes cluster that I manage, as described here; however, I've run into numerous issues while doing so.
The cluster is running in DigitalOcean and here is what I've tried so far.
1. Since the article mentions that --cpu-manager-policy is a kubelet option, I assume that I cannot change it via the API server and have to change it manually on each node. (Is this assumption correct, BTW?)
2. I ssh into one of the nodes (droplets in DigitalOcean lingo) and run kubelet --cpu-manager-policy=static command as described in the kubelet CLI reference here. It gives me the message Flag --cpu-manager-policy has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
3. So I check the file pointed at by the --config flag by running ps aux | grep kubelet and find that it's /etc/kubernetes/kubelet.conf.
4. I edit the file and add a line cpuManagerPolicy: static to it, and also kubeReserved and systemReserved because they become required fields if specifying cpuManagerPolicy.
5. Then I kill the running kubelet process and restart it. A couple of other things came up (delete this file, drain the node, etc.) that I was able to work through, and I was ultimately able to restart the kubelet.
I'm a little lost about the following things
Do I need to do this for all nodes? My cluster has 12 of them, and doing all of these steps for each seems very inefficient.
Is there any way I can set these params globally, i.e. cluster-wide, rather than doing it node by node?
How can I even confirm that what I did actually changed the CPU Manager policy?
One issue with Dynamic Configuration is that in case the node fails to restart, the API does not give a reasonable response back that tells you what you did wrong; you'll have to SSH into the node and tail the kubelet logs. Plus, you have to SSH into every node and set the --dynamic-config-dir flag anyway.
The following worked best for me.
SSH into the node and edit the kubelet service unit:
vim /etc/systemd/system/kubelet.service
Add the following lines
--cpu-manager-policy=static \
--kube-reserved=cpu=1,memory=2Gi,ephemeral-storage=1Gi \
--system-reserved=cpu=1,memory=2Gi,ephemeral-storage=1Gi \
We need to set the --kube-reserved and --system-reserved flags because they're prerequisites to setting the --cpu-manager-policy flag.
Then drain the node and delete the CPU manager state file:
rm -rf /var/lib/kubelet/cpu_manager_state
Restart the kubelet
sudo systemctl daemon-reload
sudo systemctl stop kubelet
sudo systemctl start kubelet
Uncordon the node and check the policy. This assumes that you're running kubectl proxy on port 8001.
curl -sSL "http://localhost:8001/api/v1/nodes/${NODE_NAME}/proxy/configz" | grep cpuManager
If you use a newer k8s version and the kubelet is configured by a kubelet configuration file, e.g. config.yml, you can follow the same steps @satnam mentioned above, but instead of adding --kube-reserved, --system-reserved and --cpu-manager-policy, you need to add kubeReserved, systemReserved and cpuManagerPolicy in your config.yml. For example:
systemReserved:
  cpu: "1"
  memory: "100Mi"
kubeReserved:
  cpu: "1"
  memory: "100Mi"
cpuManagerPolicy: "static"
Meanwhile, be sure your CPUManager is enabled.
It might not be the global way of doing stuff, but I think it will be much more comfortable than what you are currently doing.
First you need to run
kubectl proxy --port=8001 &
Download the configuration:
NODE_NAME="the-name-of-the-node-you-are-reconfiguring"; curl -sSL "http://localhost:8001/api/v1/nodes/${NODE_NAME}/proxy/configz" | jq '.kubeletconfig|.kind="KubeletConfiguration"|.apiVersion="kubelet.config.k8s.io/v1beta1"' > kubelet_configz_${NODE_NAME}
Edit it accordingly and push the configuration to the control plane. You will see a valid response if everything went well. Then you will have to edit the Node so it starts to use the new ConfigMap. There are many more possibilities; for example, you can go back to the default settings if anything goes wrong.
This process is described with all the details in this documentation section.
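Concretely, the push-and-point part of that flow looked roughly like the sketch below. This relies on the Dynamic Kubelet Configuration feature, which existed around the Kubernetes version in question but has since been removed; the ConfigMap name with its hash suffix is invented, and jq is only used here to build a patch that can be reviewed before applying:

```shell
NODE_NAME="the-name-of-the-node-you-are-reconfiguring"
# Push the edited file as a ConfigMap (uncomment on a real cluster):
# kubectl -n kube-system create configmap my-node-config \
#   --from-file=kubelet=kubelet_configz_${NODE_NAME} --append-hash

# Point the Node at the ConfigMap; build the patch locally so it can be inspected first:
patch=$(jq -cn '{spec:{configSource:{configMap:{name:"my-node-config-abc123",namespace:"kube-system",kubeletConfigKey:"kubelet"}}}}')
echo "$patch"
# kubectl patch node "${NODE_NAME}" -p "$patch"
```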
Hope this helps.

kubernetes pods spawn across all servers but kubectl only shows 1 running and 1 pending

I have a new setup of Kubernetes, and I created a replication controller with a replica count of 2. However, what I see when I do "kubectl get pods" is that one pod is running and another is "pending". Yet when I go to my 7 test nodes and run docker ps, I see that all of them are running.
What I think is happening is that I had to change the default insecure port from 8080 to 7080 (the docker app actually runs on 8080); however, I don't know how to tell if I am right, or where else to look.
Along the same vein, is there any way to set up config for kubectl where I can specify the port? Doing kubectl --server="" is a bit annoying (yes, I know I can alias this).
If you changed the API port, did you also update the nodes to point them at the new port?
For the kubectl --server=... question, you can use kubectl config set-cluster to set cluster info in your ~/.kube/config file to avoid having to use --server all the time. See the following docs for details:
http://kubernetes.io/v1.0/docs/user-guide/kubectl/kubectl_config.html
http://kubernetes.io/v1.0/docs/user-guide/kubectl/kubectl_config_set-cluster.html
http://kubernetes.io/v1.0/docs/user-guide/kubectl/kubectl_config_set-context.html
http://kubernetes.io/v1.0/docs/user-guide/kubectl/kubectl_config_use-context.html