Node powered off, but DaemonSet pods still show Running - kubernetes

One worker node was powered off, and kubectl get nodes shows that node as NotReady.
But kubectl get po -o wide --all-namespaces | egrep 'daemonSet-pod|node-hostname' shows some DaemonSet pods still Running on the NotReady node, and I cannot connect to these pods.
Why do DaemonSet pods still show Running even though the node is NotReady?

Since Kubernetes version 1.13, tolerations such as node.kubernetes.io/not-ready are added automatically to DaemonSet pods. That means DaemonSet pods are not evicted when there are node problems like the one you describe.
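To check which tolerations a given DaemonSet pod actually carries, a sketch along these lines may help. The function name show_tolerations is invented for illustration, and it assumes kubectl is already configured for the cluster; the namespace falls back to default when omitted.

```shell
# Hypothetical helper: print key, operator and effect of each toleration on a pod.
# Usage: show_tolerations POD_NAME [NAMESPACE]
show_tolerations() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{range .spec.tolerations[*]}{.key}{"\t"}{.operator}{"\t"}{.effect}{"\n"}{end}'
}
```

For a DaemonSet pod, the output would be expected to include node.kubernetes.io/not-ready with the NoExecute effect.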

Related

How to reschedule the pod from node in kubernetes ( baremetal servers )?

Kubernetes nodes become unschedulable when I initiate a drain or cordon, but the pods on the node are not moved to a different node immediately.
I mean, these pods are not created by a DaemonSet.
So how can a pod running an application stay 100% available when a node becomes faulty or has some issue?
Any inputs?
Commands used:
To drain / cordon to make the node unavailable:
kubectl drain node1
kubectl cordon node1
To check the node status:
kubectl get nodes
To check the pod status before / after cordon or drain:
kubectl get pods -o wide
kubectl describe pod <pod-name>
The surprising part is that even when the node is unavailable, the pod status always shows Running. :-)
Pods by themselves do not migrate to another node.
You can use workload resources to create and manage multiple Pods for you. A controller for the resource handles replication and rollout and automatic healing in case of Pod failure. For example, if a Node fails, a controller notices that Pods on that Node have stopped working and creates a replacement Pod. The scheduler places the replacement Pod onto a healthy Node.
Some examples of controllers are:
Deployment
DaemonSet
StatefulSet
Check this link for more information.
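To tell whether a pod is managed by a controller or is a bare pod that nothing will recreate, one option is to look at its owner references. This is only a sketch: pod_owner_kind is an invented name, and kubectl is assumed to be configured for the cluster. Note that pods created through a Deployment report ReplicaSet as their owner.

```shell
# Hypothetical helper: print the kind of the controller that owns a pod
# (e.g. ReplicaSet, DaemonSet, StatefulSet). Empty output means a bare pod.
# Usage: pod_owner_kind POD_NAME [NAMESPACE]
pod_owner_kind() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.metadata.ownerReferences[*].kind}'
}
```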

How to simulate nodeNotReady for a node in Kubernetes

My ceph cluster is running on AWS with a 3 masters, 3 workers configuration. When I run kubectl get nodes, it shows all the nodes in the Ready state.
Is there any way I can manually simulate a NodeNotReady error for a node?
Just stop the kubelet service on the node that you want to see as NodeNotReady.
If you just want NodeNotReady, you can delete the CNI you have installed.
Run kubectl get all -n kube-system, find the DaemonSet of your CNI and delete it, or just do the reverse of installing it: kubectl delete -f link_to_your_CNI_yaml
You could also try to overwhelm the node with too many pods (resources). You can also share your main goal so we can adjust the answer.
As for the answer from P Ekambaram, you could just ssh to a node and then stop the kubelet.
To do that in kops you can just:
ssh -A admin@Node_PublicDNS_name
systemctl stop kubelet
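Once the kubelet is stopped, it can take a little while before the node is reported as NotReady. A small sketch for checking from a machine with cluster access follows; not_ready_nodes is an invented name, and it assumes the default kubectl get nodes column layout (NAME, STATUS, ...).

```shell
# Hypothetical helper: print the names of nodes whose STATUS column
# starts with NotReady (this also matches NotReady,SchedulingDisabled).
not_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 ~ /^NotReady/ { print $1 }'
}
```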
EDIT:
Another way is to overload the node, which will cause "System OOM encountered" and result in the node going into the NotReady state.
This is just one way to achieve it:
SSH into the node you want to put into NotReady
Install stress
Run stress: stress --cpu 8 --io 4 --hdd 10 --vm 4 --vm-bytes 1024M --timeout 5m (you can adjust the values of course)
Wait until the node crashes.
After you stop stress, the node should return to a healthy state automatically.
Not sure what the purpose of simulating NotReady is.
If the purpose is to not schedule any new pods, then you can use kubectl cordon NODE_NAME. This will add the unschedulable taint to the node and prevent new pods from being scheduled there.
If the purpose is to evict existing pods, then you can use kubectl drain NODE_NAME
In general, you can play with taints and tolerations to achieve your goal related to the above, and you can do much more with those!
Now, the NotReady status goes together with the taint node.kubernetes.io/not-ready (Ref), which is set by the NodeController: in version 1.13, the TaintBasedEvictions feature was promoted to beta and enabled by default, so the taints are added automatically.
Therefore, if you manually set that taint with kubectl taint node NODE_NAME node.kubernetes.io/not-ready=:NoExecute, the NodeController will reset it automatically!
So, to absolutely see the NotReady status, this is the best way.
Lastly, if you want to remove your networking on a particular node, then you can taint it like this: kubectl taint node NODE_NAME dedicated/not-ready=:NoExecute
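When experimenting with taints like these, it helps to list what is currently set on a node, including taints added or reset by the NodeController. The following is a sketch only; node_taints is an invented name and kubectl is assumed to be configured for the cluster.

```shell
# Hypothetical helper: print each taint on a node as key=value:effect.
# Usage: node_taints NODE_NAME
node_taints() {
  kubectl get node "$1" \
    -o jsonpath='{range .spec.taints[*]}{.key}{"="}{.value}{":"}{.effect}{"\n"}{end}'
}
```

A taint with an empty value shows up as key=:effect, the same form used in the kubectl taint commands above.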

How to autoscale with GKE

I have a GKE cluster with an autoscaling node pool.
After adding some pods, the cluster starts to autoscale and creates a new node, but the old running pods start to crash randomly:
I don't think it's directly related to autoscaling unless some of your old nodes are being removed. The autoscaling is triggered by adding more pods, but most likely there is something wrong with your application or its connectivity to external services (a db, for example). I would check what's going on in the pod logs:
$ kubectl logs <pod-id-that-is-crashing>
You can also check for any other event in the pods or deployment (if you are using a deployment)
$ kubectl describe deployment <deployment-name>
$ kubectl describe pod <pod-id>
Hope it helps!

Does kubectl drain remove pod first or create pod first

Kubernetes version 1.12.3. Does kubectl drain remove the pod first or create the pod first?
You can use kubectl drain to safely evict all of your pods from a node before you perform maintenance on the node (e.g. kernel upgrade, hardware maintenance, etc.)
When kubectl drain returns successfully, it means it has removed all the pods from that node and it is safe to bring that node down (physically shut it off, or start maintenance).
Now, if you turn the machine back on and want to schedule pods on that node again, you need to run:
kubectl uncordon <node name>
So kubectl drain removes pods from the node and does not schedule any pods on it until you uncordon that node.
kubectl drain will ignore certain system pods on the node that cannot be killed.
The given node will be marked unschedulable to prevent new pods from arriving.
When you are ready to put the node back into service, use kubectl uncordon, which will make the node schedulable again.
For more details, use the command:
kubectl drain --help
With this, I hope you get the information you are looking for.
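The drain, maintain, uncordon sequence described above could be wrapped roughly as follows. This is a sketch under assumptions: with_node_drained is an invented name, the maintenance step is whatever command the caller passes in, and leaving the node cordoned when that step fails is a design choice of this example, not something kubectl does for you.

```shell
# Hypothetical helper: drain a node, run a maintenance command, then uncordon.
# Usage: with_node_drained NODE_NAME COMMAND [ARGS...]
with_node_drained() {
  node="$1"
  shift
  # Evict pods first; stop here if the drain fails.
  kubectl drain "$node" --ignore-daemonsets || return 1
  # Run the caller's maintenance step while the node is unschedulable.
  # On failure the node stays cordoned so it can be inspected.
  "$@" || return 1
  # Make the node schedulable again.
  kubectl uncordon "$node"
}
```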

Remove Daemonset pod from a node

I have a running DaemonSet which is running on all nodes. I want to remove it from a node in order to completely drain it, as kubectl drain doesn't get rid of them. Without deleting my DaemonSet, what's a good way to temporarily remove those pods from the node? I've tried draining it and deleting the DaemonSet pods, but the DaemonSet will still reschedule them, disregarding that the node is set as Unschedulable: true.
You need to use the --ignore-daemonsets flag when you drain a Kubernetes node:
--ignore-daemonsets=false: Ignore DaemonSet-managed pods.
So, in order to drain a Kubernetes node with DaemonSets in the cluster, you need to execute:
kubectl drain <node_name> --ignore-daemonsets
If you need to remove a DaemonSet pod from a node completely, you can specify a .spec.template.spec.nodeSelector in the DaemonSet (the DaemonSet controller will create Pods only on nodes which match that node selector) and set that label on all nodes except the one you need to completely drain.
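The nodeSelector approach could be applied with a patch like the one sketched here. The function name restrict_daemonset and the label used in the usage note are invented for illustration; kubectl is assumed to be configured for the cluster, and the nodes that should keep the pods must be labelled before the patch is applied.

```shell
# Hypothetical helper: restrict a DaemonSet to nodes carrying a given label.
# Usage: restrict_daemonset DAEMONSET NAMESPACE LABEL_KEY LABEL_VALUE
restrict_daemonset() {
  kubectl patch daemonset "$1" -n "$2" --type merge \
    -p "{\"spec\":{\"template\":{\"spec\":{\"nodeSelector\":{\"$3\":\"$4\"}}}}}"
}
```

For example, after running kubectl label node <node_name> run-ds=true on every node except the one being drained, restrict_daemonset my-ds kube-system run-ds true would limit the DaemonSet to the labelled nodes (my-ds and run-ds are placeholders).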