kubernetes: specifying maxUnavailable in both Deployment and PDB

Assume I have a Deployment with a specific value set for the .spec.strategy.rollingUpdate.maxUnavailable field.
Then I deploy a PodDisruptionBudget attached to that Deployment's Pods, setting its spec.maxUnavailable field to a different value.
Which one will prevail?

By interpreting the documentation, it seems that it depends on the event.
For a rolling update, the Deployment's maxUnavailable will be in effect, even if the PodDisruptionBudget specifies a smaller value.
But for an eviction, the PodDisruptionBudget's maxUnavailable will prevail, even if the Deployment specifies a smaller value.
The documentation does not explicitly compare the two settings, but from the way it is written it can be deduced that they are separate settings for different events and do not interact with each other.
For example:
Updating a Deployment
Output of kubectl explain deploy.spec.strategy.rollingUpdate.maxUnavailable
Specifying a PodDisruptionBudget
Output of kubectl explain pdb.spec.maxUnavailable
Also, this is more in the spirit of how Kubernetes works. The Deployment Controller is not going to read a field of a PodDisruptionBudget, and vice versa.
But to be really sure, you would just need to try it out.
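For illustration, a minimal sketch of the two settings side by side; the app name and the concrete numbers are made up:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 2   # only consulted during rolling updates
      maxSurge: 1
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:1.0
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 1       # only consulted by the eviction API (e.g. node drain)
  selector:
    matchLabels:
      app: my-app

With this pair, a rolling update may take down up to 2 Pods at once, while a drain may only evict 1 Pod of the set at a time.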

I believe they updated the docs clarifying your doubt:
Involuntary disruptions cannot be prevented by PDBs; however they do count against the budget.
Pods which are deleted or unavailable due to a rolling upgrade to an application do count against the disruption budget, but workload resources (such as Deployment and StatefulSet) are not limited by PDBs when doing rolling upgrades. Instead, the handling of failures during application updates is configured in the spec for the specific workload resource.
Caution: Not all voluntary disruptions are constrained by Pod Disruption Budgets. For example, deleting deployments or pods bypasses Pod Disruption Budgets.
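As a rough illustration of that distinction (the node and Deployment names are placeholders):

# Voluntary disruption: evictions issued by a drain are constrained by the PDB
kubectl drain <node-name> --ignore-daemonsets

# Rolling update: governed only by the Deployment's rollingUpdate settings
kubectl rollout restart deployment/my-app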

Related

Rolling update to achieve zero down time vertical pod autoscaler in Kubernetes

The Kubernetes Vertical Pod Autoscaler (which autoscales the memory and CPU resources of pods) necessitates a restart of the pod to apply the newly assigned resources, which might add a small window of unavailability.
My question is: if the pod's Deployment performs a rolling update, would that ensure zero downtime and no window of unavailability when the VPA recommendation is applied?
Thank you.
From the official documentation:
Rolling updates allow Deployments' update to take place with zero downtime by incrementally updating Pods instances with new ones. The new Pods will be scheduled on Nodes with available resources.
In this documentation, you will find a very good rolling update overview:
Rolling updates allow the following actions:
Promote an application from one environment to another (via container image updates)
Rollback to previous versions
Continuous Integration and Continuous Delivery of applications with zero downtime
Here you can find information about Rolling update deployment:
The Deployment updates Pods in a rolling update fashion when .spec.strategy.type==RollingUpdate. You can specify maxUnavailable and maxSurge to control the rolling update process.
These two fields work as follows:
.spec.strategy.rollingUpdate.maxUnavailable is an optional field that specifies the maximum number of Pods that can be unavailable during the update process.
.spec.strategy.rollingUpdate.maxSurge is an optional field that specifies the maximum number of Pods that can be created over the desired number of Pods.
Now it's up to you how you set these values. Here are some options (a sketch of the corresponding strategy block follows the list):
Deploy by adding a Pod, then removing an old one: maxSurge = 1, maxUnavailable = 0. With this configuration, Kubernetes will spin up an additional Pod, then stop an “old” one.
Deploy by removing a Pod, then adding a new one: maxSurge = 0, maxUnavailable = 1. In that case, Kubernetes will first stop a Pod before starting up a new one.
Deploy by updating Pods as fast as possible: maxSurge = 1, maxUnavailable = 1. This configuration drastically reduces the time needed to switch between application versions, but combines the drawbacks of both previous options.
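For reference, the first option would look roughly like this in the Deployment spec (the values are illustrative):

spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # allow one extra Pod during the update
      maxUnavailable: 0  # never drop below the desired replica count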
See also:
good article about zero downtime
guide with examples
Yes. The default RollingUpdate behavior of a Deployment should automatically do that. It brings up some new replicas first, then deletes some old replicas once the new ones are ready. You can control how many Pods can be unavailable at once, and how many new Pods will be created, using the maxUnavailable and maxSurge fields. You can tune these values to achieve your zero-downtime goal.
Ref:
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#rolling-update-deployment
https://kubernetes.io/blog/2018/04/30/zero-downtime-deployment-kubernetes-jenkins/

changing Pod priority without restarting the Pod?

I am trying to change the priority of an existing Kubernetes Pod using the 'patch' command, but it returns an error saying that this is not one of the fields that can be modified. I can patch the priority in the Deployment spec, but that would cause the Pod to be recreated (following the defined update strategy).
The basic idea is to implement a mechanism conceptually similar to nice levels (for my application), so that certain Pods can be de-prioritized based on certain conditions (by my controller), and preempted by the default scheduler in case of resource congestion. But I don't want them to be restarted if there is no congestion.
Is there a way around it, or there is something inherent in the way scheduler works that would prevent something like this from working properly?
Priority values are applied to a pod based on the priority value of the PriorityClass assigned to their deployment at the time that the pod is scheduled. Any changes made to the PriorityClass will not be applied to pods which have already been scheduled, so you would have to redeploy the pod for the priority to take effect anyway.
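For reference, a PriorityClass is a cluster-scoped object; a minimal sketch (the name and value are made up):

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 1000              # higher value means higher scheduling priority
globalDefault: false
description: Priority class for preemptible workloads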
As far as I know,
Pod priority only takes effect when the pod is being scheduled.
First you need to create the PriorityClasses.
Then create the Pod with priorityClassName, referencing that PriorityClass in the pod definition.
If you are trying to add a priority to an already scheduled pod, it will not work.
For reference: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#pod-priority
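Following those steps, the Pod (or the Deployment's Pod template) references the class by name; a minimal sketch with made-up names:

apiVersion: v1
kind: Pod
metadata:
  name: nice-pod
spec:
  priorityClassName: low-priority   # must name an existing PriorityClass
  containers:
  - name: app
    image: nginx:1.25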

Limiting the maximum number of pods per deployment within a namespace / cluster

I am attempting to enforce a limit on the number of pods per deployment within a namespace / cluster from a single configuration point. I have two non-production environments, and I want to deploy fewer replicas on these as they do not need the redundancy.
I've read this but I'm unable to find out how to do it on the deployment level. Is it possible to enforce this globally without going into each one of them?
Since replicas is a fixed value in a Deployment YAML file, you may be better off using Helm to template it, which adds some flexibility to your application.
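A minimal sketch of that approach, assuming a chart with a replicaCount value (all names are hypothetical):

# values-prod.yaml
replicaCount: 5

# values-staging.yaml
replicaCount: 1

# templates/deployment.yaml (excerpt)
spec:
  replicas: {{ .Values.replicaCount }}

You would then pick the replica count per environment at install time, e.g. helm upgrade --install my-app ./chart -f values-staging.yaml.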
Are you looking for something like this?
$ kubectl autoscale deployment <deployment-name> --min=1 --max=3 --cpu-percent=80 --namespace=<dev|staging>
Here, you can set the upper limit of the pod count for your deployment.

How does k8s know which pod to update?

I'm currently getting started with Kubernetes, and so far, I have a question that I could not find answered anywhere.
Until now, I have learned what containers, pods, and replica sets are. I basically understand these things, but one thing I did not get is: if I update a manifest of a pod (or of a replica set) and re-POST it to k8s, how does k8s know which existing manifest this refers to?
Is this matching done by the manifest's name, i.e. by the name of the pod or the replica set? Or …?
In other words: If I update a manifest, how does k8s know that it is an updated one, and how does it detect which one is the one with the previous version?
You are right, k8s uses metadata.name for identifying resources. That name is unique per resource type (Pod/ReplicaSet/...) and namespace.
Well, for starters let's get things straight. When you update a manifest, it is obvious what to update in the first place: the object you updated, i.e. the Deployment or ReplicaSet. Once that is updated, the RollingUpdate kicks in, which I assume is what you are really wondering about, along with how ownership of a Pod is established in general. If you run kubectl get pod -o yaml you will find keys like ownerReferences, pod-template-hash and kubernetes.io/created-by, whose meaning should be rather obvious once you see the content. In the other direction (so not from the Pod but from the Deployment) you have a selector field which defines what labels are used to filter Pods and find the right ones.
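For example, the Deployment side of that link looks like this (the labels are made up), and the Pod side can be inspected with kubectl:

# Deployment excerpt: the selector defines which Pods the Deployment owns
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app        # must match the selector above

# From the Pod side, show the generated ownership metadata
kubectl get pod <pod-name> -o jsonpath='{.metadata.ownerReferences}'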

Is there the concept of uploading a Deployment without causing pods to start?

(I am (all things considered) a Kubernetes rookie.)
I know that kubectl create -f myDeployment.yaml will send my deployment specification off to the cluster to be reified, and if it says to start three replicas of its contained pod template then Kubernetes will set about starting up three pods.
I wonder: is there a Kubernetes concept or practice of somehow uploading the deployment for reference later and then "activating" it later? Perhaps by, say, changing replicas from zero to some positive number? If this is not a meaningful question, or this isn't the Right Way To Think About Things, I'd appreciate pointers as well.
I don't think your idea would work well with Kubernetes. Firstly, there is no way of "pausing" a Deployment or any other ReplicationController or ReplicaSet, besides setting the replicas to 0, as you mentioned.
The next issue is that the YAML you would get back from the apiserver isn't the same as the one you created. The controller manager adds some annotations, default values and statuses, so it would be hard to verify the Deployment that way.
IMO a better way to verify Deployments is to add them to a version control system and peer-review the YAML files. Then you can create or update them on the apiserver with kubectl apply -f myDeployment.yaml. If the Deployment is wrong in terms of syntax, kubectl will complain about it and you can patch the Deployment accordingly. This also simplifies the update procedure for Deployments.
A Deployment can be paused; please refer to https://kubernetes.io/docs/user-guide/deployments/#pausing-and-resuming-a-deployment, or see kubectl rollout pause -h for details.
You can adjust the replicas of a paused Deployment, but changes to the Pod template will not trigger a rollout. If the Deployment is paused in the middle of a rollout, it will not continue until you resume it.
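A short sketch of both workflows with kubectl (the Deployment name is made up):

# Create the Deployment but keep it at zero Pods, then "activate" it later
kubectl apply -f myDeployment.yaml              # with spec.replicas: 0
kubectl scale deployment my-app --replicas=3

# Or pause rollouts while you stage changes to the spec
kubectl rollout pause deployment/my-app
kubectl apply -f myDeployment.yaml              # template changes do not roll out yet
kubectl rollout resume deployment/my-app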