I have a Kubernetes cluster and everything worked fine. After some time I drained my worker nodes, reset them, and joined them back to the master, but:
# kubectl get nodes
NAME      STATUS                     ROLES    AGE    VERSION
ubuntu    Ready                      master   159m   v1.14.0
ubuntu1   Ready,SchedulingDisabled   <none>   125m   v1.14.0
ubuntu2   Ready,SchedulingDisabled   <none>   96m    v1.14.0
What should I do?
To prevent a node from scheduling new pods use:
kubectl cordon <node-name>
This will cause the node to have the status Ready,SchedulingDisabled.
To tell it to resume scheduling, use:
kubectl uncordon <node-name>
More information about draining a node can be found here, and about manual node administration here.
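For completeness, the usual maintenance cycle looks roughly like the sketch below (the exact drain flags for daemonsets and local data vary between kubectl versions, so treat it as an outline rather than exact v1.14 syntax):
# Evict the pods and mark the node unschedulable before maintenance
kubectl drain <node-name> --ignore-daemonsets
# ... reset the node, rejoin it to the cluster ...
# Allow the scheduler to place pods on the node again
kubectl uncordon <node-name>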
I fixed it using:
kubectl uncordon <node-name>
SchedulingDisabled means the node went into maintenance mode: you drained the nodes of their pods, and draining requires the node to be marked unschedulable first. Check the spec of those nodes:
kubectl get node <node-name> -o yaml
taints:
- effect: NoSchedule
key: node.kubernetes.io/unschedulable
timeAdded: "2022-06-29T13:05:11Z"
unschedulable: true
So when you want the node back in a schedulable state, you have to take it out of maintenance, which means uncordoning it:
kubectl uncordon <node-name>
I know it's similar to the above answers, but one should know what's happening behind the scenes.
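To double-check that the node really is schedulable again, you can inspect the same fields as in the YAML above (a quick sketch; the node.kubernetes.io/unschedulable taint and the unschedulable: true flag should no longer appear after uncordoning):
kubectl get node <node-name> -o jsonpath='{.spec.taints}'
kubectl get node <node-name> -o jsonpath='{.spec.unschedulable}'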
For Openshift (see oc administrator commands) you can use: oc adm uncordon <node-name>
Here's a quick way to reschedule all nodes:
oc get nodes --no-headers | awk '{print $1}' | xargs oc adm uncordon
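A plain-kubectl equivalent of that bulk uncordon, assuming you really do want every node schedulable again, would be:
kubectl get nodes --no-headers | awk '{print $1}' | xargs -n1 kubectl uncordon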
Related
I have created a Google Kubernetes Engine cluster with autoscaling enabled and minimum and maximum node counts set. A few days ago I deployed a couple of services to production, which increased the node count as expected. But when I deleted those deployments, I expected it to scale the nodes back down. I waited more than an hour and it still did not scale down.
All my other pods are controlled by a ReplicaSet, since I deployed them with kind: Deployment.
All my StatefulSet pods use a PVC as their volume.
I'm not sure what prevented the nodes from scaling down, so I manually scaled them for now. Since I made the changes manually, I cannot get the autoscaler logs anymore.
Does anyone know what could be the issue here?
GKE version is 1.16.15-gke.4300
As mentioned in this link
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node
I'm not using any local storage.
My pods do not have a PodDisruptionBudget (I don't know what that is).
My pods are created by Deployments (Helm charts).
The only thing is that I don't have the "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" annotation. Is it a must?
I have tested the Cluster Autoscaler on my GKE cluster. It works a bit differently than you expected.
Background
You can enable autoscaling using a command or enable it during cluster creation, as described in this documentation.
In the Cluster Autoscaler documentation you can find various information, such as operation criteria, limitations, etc.
As I mentioned in the comment section, per the Cluster Autoscaler - Frequently Asked Questions, CA won't scale a node down if it encounters one of the situations below (a short example of the safe-to-evict annotation mentioned in the last item follows the list):
Pods with restrictive PodDisruptionBudget.
Kube-system pods that:
are not run on the node by default, *
don't have a pod disruption budget set or their PDB is too restrictive (since CA 0.6).
Pods that are not backed by a controller object (so not created by deployment, replica set, job, statefulset etc). *
Pods with local storage. *
Pods that cannot be moved elsewhere due to various constraints (lack of resources, non-matching node selectors or affinity, matching anti-affinity, etc)
Pods that have the following annotation set:
"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
For my tests I've used 6 nodes, with autoscaling range 1-6 and nginx application with requests cpu: 200m and memory: 128Mi.
As the OP mentioned they are not able to provide autoscaler logs, I will paste my logs from Logs Explorer. A description of how to access them is in the Viewing cluster autoscaler events documentation.
In those logs you should search for noScaleDown events. You will find a few pieces of information there; however, the most important is:
reason: {
parameters: [
0: "kube-dns-66d6b7c877-hddgs"
]
messageId: "no.scale.down.node.pod.kube.system.unmovable"
As it's described in NoScaleDown node-level reasons for "no.scale.down.node.pod.kube.system.unmovable":
Pod is blocking scale down because it's a non-daemonset, non-mirrored, non-pdb-assigned kube-system pod. See the Kubernetes Cluster Autoscaler FAQ for more details.
Solution
If you want to make the Cluster Autoscaler able to move kube-system pods and scale down on GKE, you have to create PodDisruptionBudgets with the proper information; how to create them can be found in How to set PDBs to enable CA to move kube-system pods?
kubectl create poddisruptionbudget <pdb name> --namespace=kube-system --selector app=<app name> --max-unavailable 1
where you have to specify the correct selector, and --max-unavailable or --min-available depending on your needs. For more details, please read the Specifying a PodDisruptionBudget documentation.
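The declarative equivalent of that command looks roughly like the manifest below (a sketch only: the k8s-app=kube-dns label is an assumption, so verify the labels on your kube-system pods first, e.g. with kubectl get pods -n kube-system --show-labels; on clusters newer than 1.20 the apiVersion would be policy/v1):
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: kubedns
  namespace: kube-system
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns   # assumed label, check your own pods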
Tests
$ kubectl get deploy,nodes
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 16/16 16 16 66m
NAME STATUS ROLES AGE VERSION
node/gke-cluster-1-default-pool-6d42fa0a-1ckn Ready <none> 11m v1.16.15-gke.6000
node/gke-cluster-1-default-pool-6d42fa0a-2j4j Ready <none> 11m v1.16.15-gke.6000
node/gke-cluster-1-default-pool-6d42fa0a-388n Ready <none> 3h33m v1.16.15-gke.6000
node/gke-cluster-1-default-pool-6d42fa0a-5x35 Ready <none> 3h33m v1.16.15-gke.6000
node/gke-cluster-1-default-pool-6d42fa0a-pdfk Ready <none> 3h33m v1.16.15-gke.6000
node/gke-cluster-1-default-pool-6d42fa0a-wqtm Ready <none> 11m v1.16.15-gke.6000
$ kubectl get pdb -A
NAMESPACE NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
kube-system kubedns 1 N/A 1 43m
Scale down the deployment
$ kubectl scale deploy nginx-deployment --replicas=2
deployment.apps/nginx-deployment scaled
After a while (~10-15 minutes), in the event viewer you will find the Decision event, and inside it the information that the node was deleted.
...
scaleDown: {
nodesToBeRemoved: [
0: {
node: {
mig: {
zone: "europe-west2-c"
nodepool: "default-pool"
name: "gke-cluster-1-default-pool-6d42fa0a-grp"
}
name: "gke-cluster-1-default-pool-6d42fa0a-wqtm"
Number of nodes decreased:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-cluster-1-default-pool-6d42fa0a-2j4j Ready <none> 30m v1.16.15-gke.6000
gke-cluster-1-default-pool-6d42fa0a-388n Ready <none> 3h51m v1.16.15-gke.6000
gke-cluster-1-default-pool-6d42fa0a-5x35 Ready <none> 3h51m v1.16.15-gke.6000
gke-cluster-1-default-pool-6d42fa0a-pdfk Ready <none> 3h51m v1.16.15-gke.6000
Another place where you can confirm it's scaling down is kubectl get events --sort-by='.metadata.creationTimestamp'
Output:
5m16s Normal NodeNotReady node/gke-cluster-1-default-pool-6d42fa0a-wqtm Node gke-cluster-1-default-pool-6d42fa0a-wqtm status is now: NodeNotReady
4m56s Normal NodeNotReady node/gke-cluster-1-default-pool-6d42fa0a-1ckn Node gke-cluster-1-default-pool-6d42fa0a-1ckn status is now: NodeNotReady
4m Normal Deleting node gke-cluster-1-default-pool-6d42fa0a-wqtm because it does not exist in the cloud provider node/gke-cluster-1-default-pool-6d42fa0a-wqtm Node gke-cluster-1-default-pool-6d42fa0a-wqtm event: DeletingNode
3m55s Normal RemovingNode node/gke-cluster-1-default-pool-6d42fa0a-wqtm Node gke-cluster-1-default-pool-6d42fa0a-wqtm event: Removing Node gke-cluster-1-default-pool-6d42fa0a-wqtm from Controller
3m50s Normal Deleting node gke-cluster-1-default-pool-6d42fa0a-1ckn because it does not exist in the cloud provider node/gke-cluster-1-default-pool-6d42fa0a-1ckn Node gke-cluster-1-default-pool-6d42fa0a-1ckn event: DeletingNode
3m45s Normal RemovingNode node/gke-cluster-1-default-pool-6d42fa0a-1ckn Node gke-cluster-1-default-pool-6d42fa0a-1ckn event: Removing Node gke-cluster-1-default-pool-6d42fa0a-1ckn from Controller
Conclusion
By default, kube-system pods prevent CA from removing nodes on which they are running. Users can manually add PDBs for the kube-system pods that can be safely rescheduled elsewhere. It can be achieved using:
kubectl create poddisruptionbudget <pdb name> --namespace=kube-system --selector app=<app name> --max-unavailable 1
A list of possible reasons why CA won't scale down can be found in the Cluster Autoscaler - Frequently Asked Questions.
To verify which pods could still block CA downscale, you can use Autoscaler Events.
When using the kubelet kubeconfig, only the worker nodes are displayed and the master node is not, like the following output on an AWS EKS worker node:
kubectl get node --kubeconfig /var/lib/kubelet/kubeconfig
NAME STATUS ROLES AGE VERSION
ip-172-31-12-2.ap-east-1.compute.internal Ready <none> 30m v1.18.9-eks-d1db3c
ip-172-31-42-138.ap-east-1.compute.internal Ready <none> 4m7s v1.18.9-eks-d1db3c
For some reasons, I need to hide the information of the other worker and master nodes and only display the worker node on which the kubectl command is currently executed.
What should I do?
I really appreciate your help.
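A rough sketch of one workaround is to narrow the listing on the client side; this only filters the query (truly hiding the other nodes would require RBAC restrictions on the credentials), and it assumes the node name equals the instance's private DNS name, as in the default EKS naming shown above:
# Resolve this instance's private DNS name (IMDSv1; IMDSv2 needs a token)
NODE_NAME=$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)
# List only the local node
kubectl get node "$NODE_NAME" --kubeconfig /var/lib/kubelet/kubeconfig
# Or equivalently with a field selector
kubectl get nodes --field-selector "metadata.name=$NODE_NAME" --kubeconfig /var/lib/kubelet/kubeconfig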
After I update the backend code (pushing the update to gcr.io), I delete the pod. Usually a new pod spins up.
But today the whole cluster just broke down. I really cannot comprehend what is happening here (I did not touch any of the other items).
I am really looking in the dark here. Where do I start looking?
I see that the logs show:
0/2 nodes are available: 2 node(s) had taints that the pod didn't tolerate.
when I look this up:
kubectl describe node | grep -i taint
Taints: node.kubernetes.io/unreachable:NoSchedule
Taints: node.kubernetes.io/unreachable:NoSchedule
But I have no clue what this is or how they even get there.
EDIT:
It looks like I need to remove the taints, but I am not able to (taint not found?)
kubectl taint nodes --all node-role.kubernetes.io/unreachable-
taint "node-role.kubernetes.io/unreachable" not found
taint "node-role.kubernetes.io/unreachable" not found
Likely problem with the nodes. Debug with some of these (sample):
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 1d v1.14.2
k8s-node1 NotReady <none> 1d v1.14.2
k8s-node2 NotReady <none> 1d v1.14.2 <-- Does it say NotReady?
$ kubectl describe node k8s-node1
...
# Do you see something like this? What's the event message?
MemoryPressure...
DiskPressure...
PIDPressure...
Check if the kubelet is running on every node (it might be crashing and restarting)
ssh k8s-node1
# ps -Af | grep kubelet
# systemctl status kubelet
# journalctl -xeu kubelet
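Side note on the taint "not found" errors from the edit: the key in the removal command (node-role.kubernetes.io/unreachable) doesn't match the key reported by describe (node.kubernetes.io/unreachable). The matching removal would be the command below, but this taint is added automatically by the node lifecycle controller whenever a node is unreachable, so it will simply come back until the kubelet problem itself is fixed:
kubectl taint nodes --all node.kubernetes.io/unreachable:NoSchedule-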
Nuclear option:
If you are using a node pool, delete your nodes and let the autoscaler restart brand new nodes.
Related question/answer.
✌️
I want to know how to get the events that occur on a specific node.
In my case, my k8s cluster is made up of 3 worker nodes (node1, node2, node3). I want to get a list of all the events that happen on node2.
I know I can get namespace-specific events with:
kubectl get event --namespace default
Is there a way/option to get something like:
kubectl get event --nodename node2
This should work
kubectl get events --all-namespaces | grep -i node01
This command gives me pod scheduling events too:
master $ kubectl get events --all-namespaces | grep -i node01
default   46s   Normal   Scheduled                 pod/nginx-dashrath   Successfully assigned default/nginx-dashrath to node01
default   10m   Normal   Scheduled                 pod/nginx            Successfully assigned default/nginx to node01
default   11m   Normal   NodeHasSufficientMemory   node/node01          Node node01 status is now: NodeHasSufficientMemory
This is what works
$ kubectl get events --all-namespaces -o wide | grep -i node01
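As a rough alternative that avoids grep, events can also be filtered server-side with a field selector; note that this only returns events whose involved object is the Node itself (e.g. NodeHasSufficientMemory), not the pod scheduling events, so it is narrower than the grep above:
# Only events attached to the Node object node01
kubectl get events --all-namespaces --field-selector involvedObject.kind=Node,involvedObject.name=node01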
I'm aware that running a pod(s) on a master node(s) is against kubernetes best practices! Nevertheless, in my virtual environment, I'd like to run a pod on a master node. How can I do that?
I found a solution. You can remove the taint that prevents the Kubernetes scheduler from scheduling pods on the master node(s).
# Get all nodes.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
compute01 Ready compute 16d v1.15.0
master Ready master 16d v1.15.0
web01 Ready web 16d v1.15.0
# Check if there is a NoSchedule taint on master node.
$ kubectl get node master -o json
...
"taints": [
{
"effect": "NoSchedule",
"key": "node-role.kubernetes.io/master"
}
]
...
# Delete node-role.kubernetes.io/master taint from all nodes that have it.
$ kubectl taint nodes --all node-role.kubernetes.io/master-
node "node/master" untainted
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
If you want to make your master node unschedulable again, you will have to recreate the deleted taint with the below command.
$ kubectl taint node master node-role.kubernetes.io/master=:NoSchedule
node/master tainted
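As an alternative to removing the taint from the master entirely, a single workload can be allowed onto it by giving the pod a matching toleration and pinning it with a node selector. A minimal sketch, assuming the node-role.kubernetes.io/master label that kubeadm sets on masters of this era (newer releases use node-role.kubernetes.io/control-plane instead):
apiVersion: v1
kind: Pod
metadata:
  name: on-master-demo
spec:
  # Tolerate the master NoSchedule taint instead of deleting it
  tolerations:
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule
  # Pin the pod to the master via its role label (assumed present)
  nodeSelector:
    node-role.kubernetes.io/master: ""
  containers:
  - name: demo
    image: nginx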