Kubernetes Automated Rollbacks

I was trying to find an alternative to the docker-swarm rollback command, which lets you specify a rollback strategy in the deployment file.
In k8s, ideally it would use the readinessProbe, and if the probe doesn't pass within failureThreshold it should roll back before starting deployment of the next pod (to avoid downtime).
Currently, in my deployment script, I'm using the hook kubectl rollout status deployment $DEPLOYMENT_NAME || kubectl rollout undo deployment $DEPLOYMENT_NAME, which works, but it's not ideal: the first rollout command only reports an error well after the unhealthy pod has been deployed and a healthy one has been destroyed, which causes downtime.
Ideally, it shouldn't even kill the current pod before a new one passes its readinessProbe.
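For reference, the hook expanded into a standalone script looks roughly like this (the deployment name and timeout are placeholders):

    #!/bin/sh
    # Hypothetical deploy hook: wait for the rollout to finish and undo it on failure.
    DEPLOYMENT_NAME=my-app   # placeholder
    kubectl rollout status deployment "$DEPLOYMENT_NAME" --timeout=300s \
      || kubectl rollout undo deployment "$DEPLOYMENT_NAME"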

There is no specific rollback strategy in a Kubernetes Deployment. You could try a combination of RollingUpdate with maxUnavailable (a.k.a. Proportional Scaling), then at some point pause your deployment, resume it if everything looks good, and roll back if something went wrong.
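As a minimal sketch of that approach (names, image, and probe settings are placeholders, not taken from the question):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app                  # placeholder name
    spec:
      replicas: 3
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 0         # never go below the desired replica count
          maxSurge: 1               # bring up one new pod at a time
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
          - name: my-app
            image: registry.example.com/my-app:1.2.3   # placeholder image
            readinessProbe:
              httpGet:
                path: /healthz      # placeholder probe path
                port: 8080
              periodSeconds: 10
              failureThreshold: 3

With maxUnavailable: 0, a new pod must become Ready before an old one is terminated, which covers the "don't kill the current pod first" requirement; it does not roll back automatically, though, so the rollout status || rollout undo hook is still needed. You can pause and resume with kubectl rollout pause deployment my-app / kubectl rollout resume deployment my-app.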
The recommended way is really to use another deployment as a canary: split the traffic through a load balancer between canary and non-canary, then if everything goes well, upgrade the non-canary and shut down the canary. If something goes wrong, shut down the canary and keep the non-canary until the issue is fixed.
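One simple way to get such a split, sketched here with made-up names and labels, is to have both Deployments share an app label while only the canary carries track: canary, and to point a single Service at the shared label so traffic is divided roughly in proportion to replica counts:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-app                # placeholder
    spec:
      selector:
        app: my-app               # matches both the stable and the canary Deployment
      ports:
      - port: 80
        targetPort: 8080
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app-canary         # placeholder
    spec:
      replicas: 1                 # small slice of the traffic
      selector:
        matchLabels:
          app: my-app
          track: canary
      template:
        metadata:
          labels:
            app: my-app
            track: canary
        spec:
          containers:
          - name: my-app
            image: registry.example.com/my-app:1.3.0   # candidate version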
Another strategy is to use something like Istio that facilitates canary deployments.

Related

How will a scheduled (rolling) restart of a service be affected by an ongoing upgrade (and vice versa)

Due to a memory leak in one of our services I am planning to add a k8s CronJob to schedule a periodic restart of the leaking service. Right now we do not have the resources to look into the mem leak properly, so we need a temporary solution to quickly minimize the issues caused by the leak. It will be a rolling restart, as outlined here:
How to schedule pods restart
I have already tested this in our test cluster, and it seems to work as expected. The service has 2 replicas in test, and 3 in production.
My plan is to schedule the CronJob to run every 2 hours.
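Roughly, the CronJob I tested looks like the following; the names, image, and the service account (which needs RBAC permission to patch the Deployment) are simplified placeholders:

    apiVersion: batch/v1                     # batch/v1beta1 on clusters older than 1.21
    kind: CronJob
    metadata:
      name: restart-my-service               # placeholder name
    spec:
      schedule: "0 */2 * * *"                # every 2 hours
      concurrencyPolicy: Forbid              # don't start a new restart while one is still running
      jobTemplate:
        spec:
          template:
            spec:
              serviceAccountName: deployment-restarter   # assumed SA allowed to get/patch the Deployment
              restartPolicy: Never
              containers:
              - name: kubectl
                image: bitnami/kubectl                   # any image that ships kubectl
                command:
                - kubectl
                - rollout
                - restart
                - deployment/my-service                  # placeholder deployment name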
I am now wondering: How will the new CronJob behave if it should happen to execute while a service upgrade is already running? We do rolling upgrades to achieve zero downtime, and we sometimes roll out upgrades several times a day. I don't want to limit the people who deploy upgrades by saying "please ensure you never deploy near to 08:00, 10:00, 12:00 etc". That will never work in the long term.
And vice versa, I am also wondering what will happen if an upgrade is started while the CronJob is already running and the pods are restarting.
Does kubernetes have something built-in to handle this kind of conflict?
This answer to the linked question recommends using kubectl rollout restart from a CronJob pod. That command internally works by adding an annotation to the deployment's pod spec; since the pod spec is different, it triggers a new rolling upgrade of the deployment.
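Concretely, after kubectl rollout restart deployment/my-service (placeholder name), the Deployment's pod template ends up with an annotation roughly like this (the timestamp is an example), and that change alone is enough to trigger a new rollout:

    spec:
      template:
        metadata:
          annotations:
            kubectl.kubernetes.io/restartedAt: "2024-01-01T08:00:00Z"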
Say you're running an ordinary redeployment; that will change the image: setting in the pod spec. At about the same time, the kubectl rollout restart happens that changes an annotation setting in the pod spec. The Kubernetes API forces these two changes to be serialized, so the final deployment object will always have both changes in it.
This question then reduces to "what happens if a deployment changes and needs to trigger a redeployment, while a redeployment is already running?" The Deployment documentation covers this case: it will start deploying new pods on the newest version of the pod spec and treat all older ones as "old", so a pod with the intermediate state might only exist for a couple of minutes before getting replaced.
In short: this should work consistently and you shouldn't need to take any special precautions.

Kubernetes Deployment Rolling Updates

I have an application that I deploy on Kubernetes.
This application has 4 replicas and I'm doing a rolling update on each deployment.
This application has a graceful shutdown which can take tens of minutes (it has to wait for running tasks to finish).
My problem is that during updates, I have over-capacity since all the older version pods are stuck at "Terminating" status while all the new pods are created.
During the updates, I end up running with 8 containers and it is something I'm trying to avoid.
I tried to set maxSurge to 0, but this setting doesn't take into consideration the "Terminating" pods, so the load on my servers during the deployment is too high.
The behaviour I'm trying to get is that new pods will only get created after the old-version pods have finished successfully, so that at no point do I exceed the number of replicas I set.
I wonder if there is a way to achieve such behaviour.
What I ended up doing is creating a StatefulSet with podManagementPolicy: Parallel and updateStrategy set to OnDelete.
I also set terminationGracePeriodSeconds to the maximum time it takes for a pod to terminate.
As a part of my deployment process, I apply the new StatefulSet with the new image and then delete all the running pods.
This way all the pods enter the Terminating state, and whenever a pod finishes its tasks and terminates, a new pod with the new image replaces it.
This way I'm able to keep a static number of replicas during the whole deployment process.
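For illustration, a trimmed-down sketch of such a StatefulSet; the names, image, and grace period are placeholders, not taken from the question:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: my-worker                        # placeholder name
    spec:
      replicas: 4
      serviceName: my-worker                 # a headless Service is still required for a StatefulSet
      podManagementPolicy: Parallel
      updateStrategy:
        type: OnDelete                       # new pods are only created once old ones are deleted
      selector:
        matchLabels:
          app: my-worker
      template:
        metadata:
          labels:
            app: my-worker
        spec:
          terminationGracePeriodSeconds: 3600   # worst-case time a pod needs to finish its tasks
          containers:
          - name: worker
            image: registry.example.com/worker:2.0.0   # placeholder image

The deployment step is then something like kubectl apply -f statefulset.yaml followed by kubectl delete pod -l app=my-worker.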
Let me suggest the following strategy:
Deployments implement the concept of ready pods to aid rolling updates. Readiness probes allow the deployment to gradually update pods while giving you the control to determine when the rolling update can proceed.
A Ready pod is one that is considered successfully updated by the Deployment and will no longer count towards the surge count for deployment. A pod will be considered ready if its readiness probe is successful and spec.minReadySeconds have passed since the pod was created. The default for these options will result in a pod that is ready as soon as its containers start.
So, what you can do is implement (if you haven't already) a readiness probe for your pods, in addition to setting spec.minReadySeconds to a value that makes sense for the worst-case time it takes your pods to terminate.
This will ensure the rollout happens gradually and in accordance with your requirements.
In addition to that, don't forget to configure a deadline for the rollout.
By default, if the rollout can't make any progress within 10 minutes, it's considered failed. The time after which the Deployment is considered failed is configurable through the progressDeadlineSeconds property in the Deployment spec.
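Put together, the relevant Deployment fields look roughly like this; the values are placeholders to be tuned to your workload:

    # Deployment spec fragment (illustrative values only)
    spec:
      minReadySeconds: 120                 # how long a new pod must stay Ready before it counts as updated
      progressDeadlineSeconds: 3600        # mark the rollout as failed if it stalls for this long
      template:
        spec:
          containers:
          - name: my-app
            readinessProbe:
              httpGet:
                path: /ready               # placeholder probe path
                port: 8080
              periodSeconds: 10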

Skipping a pod deployment in statefulset

I have a stateful set of pods, and due to their stateful nature, one of them cannot be recreated because of some state error that deleting it wouldn't fix.
Since it's a StatefulSet, Kubernetes will block creation of additional pods until it's able to get the stuck one running.
StatefulSets have podManagementPolicy: "Parallel", but it cannot be changed at runtime.
The question is if there's a way to make kubernetes skip the stuck one?
I believe you are looking for a workaround for a known issue which is still open:
StatefulSet will continue to wait for the broken Pod to become Ready (which never happens) before it will attempt to revert it back to the working configuration.
In terms of upgrades, I found this on GitHub, quoted below from the official docs:
The Pods in the StatefulSet are updated in reverse ordinal order. The StatefulSet controller terminates each Pod, and waits for it to transition to Running and Ready prior to updating the next Pod.
Note that, even though the StatefulSet controller will not proceed to update the next Pod until its ordinal successor is Running and Ready, it will restore any Pod that fails during the update to its current version. Pods that have already received the update will be restored to the updated version, and Pods that have not yet received the update will be restored to the previous version. In this way, the controller attempts to continue to keep the application healthy and the update consistent in the presence of intermittent failures.
Read Forced Rollback
When using Rolling Updates with the default Pod Management Policy (OrderedReady), it’s possible to get into a broken state that requires manual intervention to repair.
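In practice, the forced rollback described there amounts to reverting the pod template and then deleting the stuck pod by hand, roughly like this (file and pod names are placeholders):

    # Revert the StatefulSet's pod template to the last known-good version.
    kubectl apply -f statefulset-known-good.yaml
    # The controller will not replace the stuck pod on its own, so delete it manually;
    # it is then recreated from the reverted template.
    kubectl delete pod my-statefulset-2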

Is it possible to make restartPolicy never during a kubernetes deployment, but only during the deployment?

Whenever I do a Kubernetes deployment with some sort of configuration error, the pod ends up in CrashLoopBackOff, constantly restarting the (totally broken) pod. What I would like is for any sort of errors during a deployment to immediately fail the deployment, rather than just blindly retrying until the deployment times out.
Deploy with restartPolicy: Never and then use kubectl patch to modify the restart policy of that deployment.
To avoid continuous restart attempts of a failing pod, there is an open issue.
There is also an open pull request to add this feature, which is close to being merged and which will give you the ability to specify max retries for the OnFailure restart policy.
Until that feature gets merged and released, kubectl patch seems to be the only way.
You can first deploy your workload with restartPolicy: Never, then use kubectl patch to modify the restart policy of the running deployment.

Is there the concept of uploading a Deployment without causing pods to start?

(I am (all things considered) a Kubernetes rookie.)
I know that kubectl create -f myDeployment.yaml will send my deployment specification off to the cluster to be reified, and if it says to start three replicas of its contained pod template then Kubernetes will set about starting up three pods.
I wonder: is there a Kubernetes concept or practice of somehow uploading the deployment for reference later and then "activating" it later? Perhaps by, say, changing replicas from zero to some positive number? If this is not a meaningful question, or this isn't the Right Way To Think About Things, I'd appreciate pointers as well.
I don't think your idea would work well with Kubernetes. Firstly, there is no way of "pausing" a Deployment or any other ReplicationController or ReplicaSet, besides setting the replicas to 0, as you mentioned.
The next issue is that the YAML you would get from the apiserver isn't the same as what you created. The controller manager adds some annotations, default values, and statuses, so it would be hard to verify the Deployment that way.
IMO a better way to verify Deployments is to add them to a version control system and peer-review the YAML files. Then you can create or update them on the apiserver with kubectl apply -f myDeployment.yaml. If the Deployment is wrong in terms of syntax, kubectl will complain about it and you can patch the Deployment accordingly. This also simplifies the update procedure of Deployments.
Deployments can be paused; please refer to https://kubernetes.io/docs/user-guide/deployments/#pausing-and-resuming-a-deployment , or see kubectl rollout pause -h.
You can adjust the replicas of a paused deployment, but changes to the pod template will not trigger a rollout. If the deployment is paused in the middle of a rollout, it will not continue until you resume it.
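Both approaches can be sketched roughly as follows; the deployment name is a placeholder:

    # Option 1: create the Deployment "inactive" with spec.replicas: 0, then scale it up later.
    kubectl apply -f myDeployment.yaml          # with replicas: 0 in the file
    kubectl scale deployment my-deployment --replicas=3

    # Option 2: pause the rollout; pod-template changes accumulate but only roll out on resume.
    kubectl rollout pause deployment my-deployment
    kubectl apply -f myDeployment.yaml          # e.g. a new image
    kubectl rollout resume deployment my-deployment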