Will pods running on a PreferNoSchedule node migrate to an untainted node?

If a Kubernetes cluster is built with a single node that carries a PreferNoSchedule taint and runs some number of pods, it would make sense to migrate those pods and workloads to more suitable, untainted nodes once such nodes are added to the cluster.
Will this happen automatically in Kubernetes >= 1.6, or will it need to be triggered? How is it triggered?

In this scenario, nothing triggers the kube-scheduler to reschedule existing pods when a new worker is added to the cluster.
For the pods to be moved to a new worker, a new scheduling decision has to be triggered.
A simple solution is to scale each deployment down to 0 and back up to the desired number of pods:
kubectl scale --replicas=<expected_replica_num> deployment <deployment_name>
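For example, assuming a deployment named my-app that should end up with 3 replicas (both the name and the count are placeholders), the sequence would look roughly like this:
kubectl scale --replicas=0 deployment my-app
kubectl scale --replicas=3 deployment my-app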

As far as I know, this doesn't happen automatically with node taints. You can trigger it using kubectl rollout restart deployment/<name>.
I was unable to find sufficient literature on this in the official Kubernetes documentation. The closest thing I could find is kubernetes-sigs/descheduler.
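For completeness, a minimal sketch of a descheduler policy that evicts pods violating node taints. This assumes the v1alpha1 policy format and the RemovePodsViolatingNodeTaints strategy from kubernetes-sigs/descheduler; whether PreferNoSchedule taints are considered depends on the descheduler version and its options, so treat this as a starting point rather than a drop-in config.
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemovePodsViolatingNodeTaints":   # evicts pods that no longer tolerate a node's taints
    enabled: true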

Related

Kubernetes StatefulSets - run pod on every worker node

What is the easiest way to run a single Pod on every available worker node as part of a StatefulSet, i.e. a one-to-one mapping?
Am I right to say that every Pod will run on a different Node by default with a StatefulSet? In that case, is it sufficient to give the StatefulSet x replicas where x worker nodes exist in the cluster?
Thanks.
Use a DaemonSet instead.
A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.
If you really want to use a StatefulSet, you can take a look at features like nodeSelector or affinity and anti-affinity.
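If you do go the StatefulSet route, a rough sketch using podAntiAffinity so that no two replicas land on the same node might look like the following (the name, labels and image are placeholders, and replicas has to be set by hand to the number of worker nodes):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: per-node-app               # hypothetical name
spec:
  serviceName: per-node-app
  replicas: 3                      # set manually to the number of worker nodes
  selector:
    matchLabels:
      app: per-node-app
  template:
    metadata:
      labels:
        app: per-node-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: per-node-app
            topologyKey: kubernetes.io/hostname   # at most one replica per node
      containers:
      - name: app
        image: nginx               # placeholder image
A DaemonSet avoids this manual replica bookkeeping entirely, which is why it is the better fit here.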

Difference between daemonsets and deployments

In Kelsey Hightower's Kubernetes Up and Running, he gives two commands:
kubectl get daemonSets --namespace=kube-system kube-proxy
and
kubectl get deployments --namespace=kube-system kube-dns
Why does one use a DaemonSet and the other a Deployment?
And what's the difference?
Kubernetes Deployments manage stateless services running on your cluster (as opposed to, for example, StatefulSets, which manage stateful services). Their purpose is to keep a set of identical pods running and to upgrade them in a controlled way. For example, you define how many replicas (pods) of your app you want to run in the deployment definition, and Kubernetes will spread that many replicas of your application over the nodes. If you ask for 5 replicas over 3 nodes, some nodes will have more than one replica of your app running.
DaemonSets also manage groups of replicated Pods, but they adhere to a one-Pod-per-node model, either across the entire cluster or across a subset of nodes. A DaemonSet will not run more than one replica per node. Another advantage of a DaemonSet is that if you add a node to the cluster, the DaemonSet will automatically spawn a pod on that node, which a Deployment will not do.
DaemonSets are useful for deploying ongoing background tasks that you need to run on all or certain nodes and which do not require user intervention. Examples of such tasks include storage daemons like ceph, log collection daemons like fluentd, and node monitoring daemons like collectd.
Let's take the example you mentioned in your question: why is kube-dns a Deployment and kube-proxy a DaemonSet?
The reason is that kube-proxy is needed on every node in the cluster to maintain iptables rules, so that every node can reach every pod no matter which node it runs on. Hence, when we make kube-proxy a DaemonSet and another node is added to the cluster later, kube-proxy is automatically spawned on that node.
kube-dns's responsibility is to resolve a service name to its IP, and one replica of kube-dns is enough to do that. Hence we make kube-dns a Deployment, because we don't need kube-dns on every node.
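To make the contrast concrete, here is a minimal sketch of a DaemonSet (names and image are placeholders). Note that, unlike a Deployment, it has no replicas field at all; the controller runs exactly one pod per matching node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent             # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      containers:
      - name: agent
        image: fluentd         # e.g. a per-node log collection daemon
# no replicas field: the DaemonSet controller creates one pod per (matching) node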

How to deploy specific pod to all nodes including master, but only for specific pod

I have a security pod that needs to run everywhere, including on the master. However, I do not want the master to run any other (non-Kubernetes) pods.
I know I can taint the master node, and I know I can set up affinity for a pod. Yet (unless I am misunderstanding something) that isn't quite what I want.
What I want is to set up affinity in a way that this security pod runs on every single node, including the master, as part of the same DaemonSet. It is important that I only have a single definition due to how this security pod gets deployed.
Can this be done?
I am running Kubernetes 1.8
I think this is more or less a duplicate of this question.
What you need is a combination of two features:
A DaemonSet will allow you to schedule a Pod to run on every node.
Tolerations on the DaemonSet's Pods will allow this workload to run even on the node that has the master taint.
That way your security pods will run everywhere, even on the tainted master, because they can tolerate the taint. I think there is an example directly in the DaemonSet documentation.
But other pods without this toleration will not be scheduled on master because they do not tolerate the taint.
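For reference, a rough sketch of such a DaemonSet follows. The taint key node-role.kubernetes.io/master with effect NoSchedule is what kubeadm applied to masters around Kubernetes 1.8 (newer clusters use node-role.kubernetes.io/control-plane instead), the apiVersion on a 1.8 cluster may be apps/v1beta2 rather than apps/v1, and the names and image are placeholders:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: security-agent                        # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: security-agent
  template:
    metadata:
      labels:
        app: security-agent
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master   # master taint on kubeadm-era clusters
        effect: NoSchedule
      containers:
      - name: agent
        image: security-agent:latest          # placeholder image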

How do I debug kubernetes scheduling?

I have added podAntiAffinity to my DeploymentConfig template.
However, pods are being scheduled on nodes that I expected would be excluded by the rules.
How can I view logs of the kubernetes scheduler to understand why it chose the node it did for a given pod?
PodAntiAffinity has more to do with other pods than with nodes specifically. That is, podAntiAffinity excludes nodes based on which pods are already scheduled on them, and even there you can make it a hard requirement rather than just a preference. To directly pick the node on which a pod is or is not scheduled, you want to use nodeAffinity instead. See the guide.
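For reference, a rough sketch of the two, expressed as hard requirements inside a pod template (the app label and the disktype node label are placeholders):
spec:
  affinity:
    podAntiAffinity:                       # keyed on pods already on the node
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: my-app                    # avoid nodes already running a my-app pod
        topologyKey: kubernetes.io/hostname
    nodeAffinity:                          # keyed on node labels directly
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype                  # placeholder node label
            operator: In
            values: ["ssd"]
As for inspecting decisions, kubectl describe pod <pod-name> shows the scheduler's events (for example FailedScheduling with a reason), which is usually enough to see why a node was or was not chosen.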

How to migrate the pods automatically to another node in kubernetes?

I am a newbie to Kubernetes. I am wondering whether Kubernetes automatically moves pods to another node if that node's resources become critical.
For example, say Pod A, Pod B and Pod C are running on Node A and Pod D is running on Node B, and the resources of Node A used by its pods become high. In this case, will Kubernetes migrate any of the pods running on Node A to Node B?
I have learnt about node affinity and node selectors, which are used to run pods on certain nodes. It would be helpful if Kubernetes offered a feature to migrate pods to another node automatically when resource usage is high.
Does anyone know how we can achieve this in Kubernetes?
Thanks
-S
Yes, Kubernetes can move pods to another node automatically when a node's resources run critically low: the pod is killed and a new pod is started on another node. You will probably want to learn about Quality of Service classes to understand which pods get killed first.
That said, you may also want to read about Horizontal Pod Autoscaling. This may give you more control.
With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization (or, with alpha support, on some other, application-provided metrics).
As load increases, it makes more sense to spin up a new pod than to move a pod between nodes, which would disrupt the processes currently running inside the pod on the busy node.
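A minimal sketch of such an autoscaler, assuming a Deployment named my-app and the autoscaling/v1 API (the name and thresholds are placeholders):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                           # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                         # the deployment to scale
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80     # add pods when average CPU exceeds 80%
The equivalent one-liner is kubectl autoscale deployment my-app --cpu-percent=80 --min=2 --max=10.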
You can set a nodeSelector in the Deployment to move the pods onto a specific node:
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
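A rough sketch of what that looks like; the node first needs a matching label, e.g. kubectl label nodes <node-name> disktype=ssd (the label and names here are placeholders):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                 # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      nodeSelector:
        disktype: ssd          # pods schedule only onto nodes carrying this label
      containers:
      - name: app
        image: nginx           # placeholder image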