Many of the run-throughs for deploying Kubernetes master nodes suggest you use --register-schedulable=false to prevent user pods being scheduled to the master node (e.g. https://coreos.com/kubernetes/docs/latest/deploy-master.html). On a very small Kubernetes cluster it seems somewhat a wasteful of compute resources to effectively prevent an entire node from being used for pod scheduling unless absolutely essential.
The answer to this question (Will (can) Kubernetes run Docker containers on the master node(s)?) suggests that it is indeed possible to run user pods on a master node - but doesn't address whether there are any issues associated with allowing this.
The only information that I've been able to find to date that suggests there might be issues associated with allowing this is that it appears that pods on master nodes communicate insecurely (see http://kubernetes.io/docs/admin/master-node-communication/ and https://github.com/kubernetes/kubernetes/issues/13598). I assume that this would potentially allow a rogue pod running on a master node to access/hijack Kubernetes functionality not normally accessible to pods on non-master nodes. Probably not a big deal with if only running pods/containers developed internally - although I guess there's always the possibility of someone hacking access to a pod/container and thereby gaining access to the master node.
Does this sound like a viable potential risk associated with this scenario (allowing user pods to run on a Kubernetes master node)? Are there any other potential issues associated with such a setup?
Running pods on the master node is definitely possible.
The security risk you mention is one issue, but if you configure service accounts, it isn't actually much different for all deployed pods to have secure remote access to the apiserver vs. insecure local access.
Another issue is resource contention. If you run a rogue pod on your master node that disrupts the master components, it can destabilize your entire cluster. Clearly this is a concern for production deployments, but if you are looking to maximize utilization of a small number of nodes in a development / experimentation environment, then it should be fine to run a couple of extra pods on the master.
Finally, you need to make sure the master node has a sufficiently large pod cidr allocated to it. In some deployments, the master only gets a /30 which isn't going to allow you to run very many pods.
Now Kubernetes and some Kubernetes distribution have what it calls taint.
taint can decide if the master can run a pod or not.
although running the pod on the master node is not the best practice but it's possible to do so. medium
in Kubernetes, we can read the explanation about taint here and I believe this is also related to scheduler
in Kubernetes or K3S, we can check if the nodes set the taint or not by describing the nodes.
# kubectl describe nodes | grep Taints
Taints: node.kubernetes.io/unreachable:NoExecute
Taints: node.kubernetes.io/unreachable:NoSchedule
Taints: node.kubernetes.io/unreachable:NoExecute
Taints: <none>
NoSchedule: Pods that do not tolerate this taint are not scheduled on the node.
PreferNoSchedule: Kubernetes avoids scheduling Pods that do not tolerate this taint onto the node.
NoExecute: Pod is evicted from the node if it is already running on the node, and is not scheduled onto the node if it is not yet running on the node.
source
if you want to specify one of your nodes, rather master or agent, just mention the nodes
# kubectl describe nodes agent3 | grep Taints
Taints: <none>
# kubectl describe nodes master | grep Taints
Taints: <none>
this is how you apply the taint to your nodes
kubectl taint nodes agent1 key1=value1:NoSchedule
kubectl taint nodes agent2 key1=value1:NoExecute
when your nodes are not running automatically it will show NoSchedule or NoExecute, make sure to check your nodes before checking the taint.
#robert have given a clear answer. I'm just trying to explain in a metaphorical way with a real-time example.
Your company's MANAGER is a better coder. If he starts coding, your company's MANAGER kind of work will be stalled/less efficient, because he can handle one thing in an efficient way. that will put your entire company at risks.
To operate efficiently, Hire more devs to code and don't make your MANAGER code(in order to get the works for the amount you are paying him).
Related
I'm a little confused, I've been ramping up on Kubernetes and I've been reading about all the different objects ReplicaSet, Deployment, Service, Pods etc.
In the documentation it mentions that the kubelet manages liveness and readiness checks which are defined in our ReplicaSet manifests.
Reference: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
If this is the case does the kubelet also manage the replicas? Or does that stay with the controller?
Or do I have it all wrong and it's the kubelet that is creating and managing all these resources on a pod?
Thanks in advance.
Basically kubelet is called "node agent" that runs on each node. It get notified through kube apiserver, then it start the container through container runtime, it works in terms of Pod Spec. It ensures the containers described in the Pod Specs are running and healthy.
The flow of kubelet tasks is like: kube apiserver <--> kubelet <--> CRI
To ensure whether the pod is running healthy it uses liveness probe, if it gets an error it restarts the pod.
kubelet does not maintain replicas, replicas are maintained by replicaset. As k8s doc said: A ReplicaSet's purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.
See more of ReplicaSet
For more info you can see: kubelet
When starting your journey with Kubernetes it is important to understand its main components for both Control Planes and Worker Nodes.
Based on your question we will focus on two of them:
kube-controller-manager:
Logically, each controller is a separate process, but to reduce
complexity, they are all compiled into a single binary and run in a
single process.
Some types of these controllers are:
Node controller: Responsible for noticing and responding when nodes go down.
Job controller: Watches for Job objects that represent one-off tasks, then creates Pods to run those tasks to completion.
Endpoints controller: Populates the Endpoints object (that is, joins Services & Pods).
Service Account & Token controllers: Create default accounts and API access tokens for new namespaces.
kubelet:
An agent that runs on each node in the cluster. It makes sure that
containers are running in a Pod.
The kubelet takes a set of PodSpecs that are provided through various
mechanisms and ensures that the containers described in those PodSpecs
are running and healthy. The kubelet doesn't manage containers which
were not created by Kubernetes.
So answering your question:
If this is the case does the kubelet also manage the replicas? Or does
that stay with the controller?
No, replication can be managed by the Replication Controller, a ReplicaSet or a more recommended Deployment. Kubelet runs on Nodes and makes sure that the Pods are running according too their PodSpecs.
You can find synopsis for kubelet and kube-controller-manager in the linked docs.
EDIT:
There is one exception however in a form of Static Pods:
Static Pods are managed directly by the kubelet daemon on a specific
node, without the API server observing them. Unlike Pods that are
managed by the control plane (for example, a Deployment); instead, the
kubelet watches each static Pod (and restarts it if it fails).
Note that it does not apply to multiple replicas.
In Kelsey Hightower's Kubernetes Up and Running, he gives two commands :
kubectl get daemonSets --namespace=kube-system kube-proxy
and
kubectl get deployments --namespace=kube-system kube-dns
Why does one use daemonSets and the other deployments?
And what's the difference?
Kubernetes deployments manage stateless services running on your cluster (as opposed to for example StatefulSets which manage stateful services). Their purpose is to keep a set of identical pods running and upgrade them in a controlled way. For example, you define how many replicas(pods) of your app you want to run in the deployment definition and kubernetes will make that many replicas of your application spread over nodes. If you say 5 replica's over 3 nodes, then some nodes will have more than one replica of your app running.
DaemonSets manage groups of replicated Pods. However, DaemonSets attempt to adhere to a one-Pod-per-node model, either across the entire cluster or a subset of nodes. A Daemonset will not run more than one replica per node. Another advantage of using a Daemonset is that, if you add a node to the cluster, then the Daemonset will automatically spawn a pod on that node, which a deployment will not do.
DaemonSets are useful for deploying ongoing background tasks that you need to run on all or certain nodes, and which do not require user intervention. Examples of such tasks include storage daemons like ceph, log collection daemons like fluentd, and node monitoring daemons like collectd
Lets take the example you mentioned in your question: why iskube-dns a deployment andkube-proxy a daemonset?
The reason behind that is that kube-proxy is needed on every node in the cluster to run IP tables, so that every node can access every pod no matter on which node it resides. Hence, when we make kube-proxy a daemonset and another node is added to the cluster at a later time, kube-proxy is automatically spawned on that node.
Kube-dns responsibility is to discover a service IP using its name and only one replica of kube-dns is enough to resolve the service name to its IP. Hence we make kube-dns a deployment, because we don't need kube-dns on every node.
I have a security pod that needs to run everywhere including master. I do not want, however, master to run any other (non kubernetes) pods.
I know I can taint master node, and I know I can setup affinity for a pod. Yet (unless I am misunderstanding something) that isn't quite what I want.
What I want is to setup affinity in a way that this security pod runs on every single node including master as a part of same daemon set. It is important that I only have a single definition due to how this security pod gets deployed.
Can this be done?
I am running Kubernetes 1.8
I think this is more or less duplicate to this question.
What you need is a combination of two features:
DaemonSet will allow you to schedule Pod to run on every node
Tolerations in the DaemonSet Pods will allow this workload to run even on the node which has the master taint.
That way your security pods will run everywhere even on the master with the taint because they can tolerate it. I think there is an example directly on the DaemonSet website.
But other pods without this toleration will not be scheduled on master because they do not tolerate the taint.
We use local persistent storage as storage backend for SOLR pods. The pods are redundantly scheduled to multiple kubernetes nodes. If one of the nodes go down there are always enough instances on other nodes.
How can we drain these nodes (without "migrating" the SOLR pods to other nodes) in case we want to do a maintenance on a node? The most important thing for us would be that kube-proxy would no longer send new requests to the pods on the node in question so that after some time we could do the maintenance without interrupting service for running requests.
We tried cordon but cordon will only make sure no new pods are scheduled to a node. Drain does not seem to work with pods with local persistent volumes.
You can check out pod anti-affinity.
https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
These constructs allow you to repel or attract pods when certain conditions are met.
In your case the pod anti-affinity 'requiredDuringSchedulingIgnoredDuringExecution' maybe your best bet. I haven't personally used it yet, i hope it can lead you to the right direction.
I am a new cookie to kubernetes . I am wondering if kubernetes have automatically switch the pods to another node if that node resources are on critical.
For example if Pod A , Pod B , Pod C is running on Node A and Pod D is running on Node B. The resources of Node A used by pods would be high. In these case whether kubernetes will migrate the any of the pods running in Node A to Node B.
I have learnt about node affinity and node selector which is used to run the pods in certain nodes. It would be helpfull if kubernetes offer this feature to migrate the pods to another node automatically if resources are used highly.
Can any one know how can we achieve this in kubernetes ?
Thanks
-S
Yes, Kubernetes can migrate the pods to another node automatically if resources are used highly. The pod would be killed and a new pod would be started on another node. You would probably want to learn about Quality of Service Classes, to understand which pod would be killed first.
That said, you may want to read about Automatic Horizontal Pod Autoscaling. This may give you more control.
With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization (or, with alpha support, on some other, application-provided metrics).
With increase of load it makes more sense to spin up a new pod rather than moving pod between different nodes to avoid distraction of currently running processes inside pod on busy node.
you can do node selector in deployment and move the node
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/