Evaluate affinity rules - Kubernetes

I updated a StatefulSet and the deleted pods of that StatefulSet are pending forever. So I described the pods and saw that they cannot be scheduled because no node matches the pod affinity/anti-affinity rules. However, this StatefulSet has no affinity rules at all.
My question
How can I evaluate the affinity rules that apply to my StatefulSet, so that I can see which rules are preventing these pods from starting?
I believe a different deployment must be hindering these pods from starting up, but I have no idea which one it might be.

Check the following to determine the possible root cause (a sketch for surfacing every affinity rule in the cluster follows this list):
Check whether your nodes have taints (kubectl describe node {Node_Name} | grep Taint); if they do, add matching tolerations so the workload can be scheduled on those nodes.
Check whether the pod definition has a nodeName field pointing to a node that does not exist.
As Prateek Jain recommended above, check your pod with kubectl describe to see what exactly is being overridden in your definition.
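Since the StatefulSet itself declares no affinity, the blocking rule may come from another workload's pods. A sketch for listing every affinity/anti-affinity rule in the cluster, assuming jq is installed:

    # list every pod that carries affinity/anti-affinity rules
    kubectl get pods -A -o json \
      | jq '.items[] | select(.spec.affinity != null)
            | {namespace: .metadata.namespace, pod: .metadata.name, affinity: .spec.affinity}'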

The StatefulSet's pods might also be stuck because of pv-protection (PersistentVolume protection); the best way to troubleshoot that situation is to run kubectl get events -n ${yournamespace}, which lists every event in your namespace.
Look for any warning or error messages.
NOTE: If you get too many events, try to filter using --field-selector=type!=Normal,reason!=Unhealthy
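Putting the note above together with a namespace and a stable ordering, the call might look like this (a sketch; ${yournamespace} is a placeholder):

    kubectl get events -n ${yournamespace} \
      --field-selector=type!=Normal,reason!=Unhealthy \
      --sort-by=.lastTimestamp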

Related

Google cloud system kube-system namespaced deployed daemonsets not working

We are having problems with several deployments in our cluster that do not seem to be working. But I am a bit apprehensive about touching them, since they are part of the kube-system namespace. I am also unsure what the correct approach is to get them into an OK state.
I currently have two daemonsets that have warnings with the message
DaemonSet has no nodes selected
Does anyone have any idea what the correct approach is?
A DaemonSet creates a pod on each node of your Kubernetes cluster.
If the Kubernetes scheduler cannot schedule a pod, there are several possibilities:
The pod spec may request more memory than the node has capacity for; look at the value of spec.containers[].resources.requests.memory
The nodes may have a taint, so the DaemonSet declaration must have a matching toleration (kubernetes documentation about taints and tolerations)
The pod spec may have a nodeSelector field (kubernetes documentation about node selector)
The pod spec may have an enforced node affinity or anti-affinity (kubernetes documentation about node affinity)
If Pod Security Policies are enabled on the cluster, a security policy may be blocking access to a resource that the pod needs to run
These are not the only possible causes. More generally, a good start is to look at the events associated with the DaemonSet (a toleration sketch for the taint case follows the command below):
> kubectl describe daemonsets NAME_OF_YOUR_DAEMON_SET
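For example, if the nodes carry a taint, the DaemonSet pod template needs a matching toleration. A minimal sketch, assuming a hypothetical taint dedicated=monitoring:NoSchedule on the nodes:

    spec:
      template:
        spec:
          tolerations:
            - key: "dedicated"        # must match the taint key on the node
              operator: "Equal"
              value: "monitoring"     # must match the taint value
              effect: "NoSchedule"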

How do I debug kubernetes scheduling?

I have added podAntiAffinity to my DeploymentConfig template.
However, pods are being scheduled on nodes that I expected would be excluded by the rules.
How can I view logs of the kubernetes scheduler to understand why it chose the node it did for a given pod?
PodAntiAffinity has more to do with other pods than with nodes specifically. That is, PodAntiAffinity excludes nodes based on which pods are already scheduled on them, and even there you can make it a hard requirement or just a preference. To directly pick the nodes on which a pod is or is not scheduled, you want NodeAffinity. See the guide.
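For contrast, a minimal nodeAffinity sketch that pins scheduling directly to a node label (disktype=ssd is a hypothetical label):

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:   # hard requirement
            nodeSelectorTerms:
              - matchExpressions:
                  - key: disktype      # a node label, not a pod label
                    operator: In
                    values:
                      - ssd

The preferredDuringSchedulingIgnoredDuringExecution variant expresses the same idea as a soft preference instead of a requirement.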

How to convert Daemonsets to kind Deployment

I have already deployed pods using a DaemonSet with a nodeSelector. My requirement is to use kind Deployment, but at the same time I want to retain the DaemonSet's behavior.
I have a nodeSelector defined so that the same pod is installed on every labelled node.
Any help on how to achieve this is appreciated.
My requirement is that pods should be placed automatically based on the nodeSelector, but with kind Deployment.
In other words:
Using a replication controller, when I schedule 2 (two) replicas of a pod I expect 1 (one) replica on each node (VM). Instead I find that both replicas are created on the same node. This makes one node a single point of failure, which I need to avoid.
I have labelled two nodes properly, and I can see both pods spawned on a single node. How can I ensure the pods are always scheduled across both nodes?
Look into affinity and anti-affinity, specifically, inter-pod affinity and anti-affinity.
From official documentation:
Inter-pod affinity and anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled based on labels on pods that are already running on the node rather than based on labels on nodes. The rules are of the form “this pod should (or, in the case of anti-affinity, should not) run in an X if that X is already running one or more pods that meet rule Y”.
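For example, a minimal sketch of a two-replica Deployment that uses required pod anti-affinity so the replicas repel each other onto separate nodes (the app=myapp label and nginx image are placeholders):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchLabels:
                      app: myapp                      # repel pods carrying our own label
                  topologyKey: kubernetes.io/hostname # at most one such pod per node
          containers:
            - name: myapp
              image: nginx   # placeholder image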

Can podaffinity schedule two pods to run on the same node?

Both pods are scheduled on the same node with podAffinity when each pod is in a different namespace. Once I try to deploy both of them in the same namespace, podAffinity fails: one pod is running while the other remains pending with a podAffinity error.
Thanks!
From your comment, I suspect that you have a label collision that is only apparent when you try to run the pods in the same namespace.
Take a look at your nodeSelectorTerms and matchExpressions.
From the docs:
If you specify multiple matchExpressions associated with nodeSelectorTerms, then the pod can be scheduled onto a node only if all matchExpressions can be satisfied.
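To illustrate the ANDed matchExpressions the quote describes, a sketch where two hypothetical node labels must both match the same node:

    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:        # both expressions must hold for one node
                - key: zone
                  operator: In
                  values: ["us-east-1a"]
                - key: disktype
                  operator: In
                  values: ["ssd"]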

What's the purpose of Kubernetes DaemonSet when replication controllers have node anti-affinity

DaemonSet is a Kubernetes beta resource that can ensure that exactly one pod is scheduled to each node in a group of nodes. The group of nodes is all nodes by default, but can be limited to a subset using nodeSelector or the Alpha feature of node affinity/anti-affinity.
It seems that DaemonSet functionality can be achieved with replication controllers/replica sets with proper node affinity and anti-affinity.
Am I missing something? If that's correct should DaemonSet be deprecated before it even leaves Beta?
As you said, DaemonSet guarantees one pod per node for a subset of the nodes in the cluster. If you use ReplicaSet instead, you need to do the following (a combined sketch follows at the end of this answer):
1. Use node affinity/anti-affinity and/or a node selector to control the set of nodes to run on (similar to how DaemonSet does it).
2. Use inter-pod anti-affinity to spread the pods across the nodes.
3. Make sure the number of pods > the number of nodes in the set, so that every node has one pod scheduled.
However, ensuring (3) is a chore as the set of nodes can change over time. With DaemonSet, you don't have to worry about that, nor would you need to create extra, unschedulable pods. On top of that, DaemonSet does not rely on the scheduler to assign its pods, which makes it useful for cluster bootstrap (see How Daemon Pods are scheduled).
See the "Alternative to DaemonSet" section in the DaemonSet doc for more comparisons. DaemonSet is still the easiest way to run a per-node daemon without external tools.