How will a scheduled (rolling) restart of a service be affected by an ongoing upgrade (and vice versa) - kubernetes

Due to a memory leak in one of our services I am planning to add a k8s CronJob to schedule a periodic restart of the leaking service. Right now we do not have the resources to look into the mem leak properly, so we need a temporary solution to quickly minimize the issues caused by the leak. It will be a rolling restart, as outlined here:
How to schedule pods restart
I have already tested this in our test cluster, and it seems to work as expected. The service has 2 replicas in test, and 3 in production.
My plan is to schedule the CronJob to run every 2 hours.
I am now wondering: How will the new CronJob behave if it should happen to execute while a service upgrade is already running? We do rolling upgrades to achieve zero downtime, and we sometimes roll out upgrades several times a day. I don't want to limit the people who deploy upgrades by saying "please ensure you never deploy near to 08:00, 10:00, 12:00 etc". That will never work in the long term.
And vice versa, I am also wondering what will happen if an upgrade is started while the CronJob is already running and the pods are restarting.
Does kubernetes have something built-in to handle this kind of conflict?

This answer to the linked question recommends using kubectl rollout restart from a CronJob pod. That command internally works by adding an annotation to the deployment's pod spec; since the pod spec is different, it triggers a new rolling upgrade of the deployment.
Say you're running an ordinary redeployment; that will change the image: setting in the pod spec. At about the same time, the kubectl rollout restart happens that changes an annotation setting in the pod spec. The Kubernetes API forces these two changes to be serialized, so the final deployment object will always have both changes in it.
This question then reduces to "what happens if a deployment changes and needs to trigger a redeployment, while a redeployment is already running?" The Deployment documentation covers this case: it will start deploying new pods on the newest version of the pod spec and treat all older ones as "old", so a pod with the intermediate state might only exist for a couple of minutes before getting replaced.
In short: this should work consistently and you shouldn't need to take any special precautions.
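For reference, a minimal sketch of such a CronJob, assuming a hypothetical Deployment name (my-service) and a ServiceAccount that is allowed to get and patch Deployments; neither of those comes from the question:

    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: restart-my-service            # hypothetical name
    spec:
      schedule: "0 */2 * * *"             # every 2 hours, as planned in the question
      concurrencyPolicy: Forbid           # don't start a new restart while one is still running
      jobTemplate:
        spec:
          template:
            spec:
              serviceAccountName: deployment-restarter   # assumed RBAC: get/patch on deployments
              restartPolicy: Never
              containers:
              - name: kubectl
                image: bitnami/kubectl:latest            # any image that ships kubectl works
                command:
                - kubectl
                - rollout
                - restart
                - deployment/my-service

Because the restart is just another pod-template change, it interleaves with an ordinary upgrade exactly as described above.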

Related

Kubernetes Deployment Terminate Oldest Pod

I'm using Azure Kubernetes Service to run a Go application that pulls from RabbitMQ, runs some processing, and returns it. The pods scale to handle an increase of jobs. Pretty run-of-the-mill stuff.
The HPA is set up like this:
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
production   Deployment/production   79%/80%   2         10        10         4d11h
staging      Deployment/staging      2%/80%    1         2         1          4d11h
What happens is that as the HPA scales up and down, there are always 2 pods that stay running. We've found that after running for long enough, the Go app on those pods will time out. Sometimes that takes days, sometimes weeks. Yes, we could probably dig into the code and figure this out, but it's a low priority for that team.
Another solution I've thought of would be to have the HPA remove the oldest pods first. This would mean that the oldest pod would never be more than a few hours old: a first-in, first-out model.
However, I don't see any clear way to do that. It's entirely possible that it isn't supported, but it seems like something that could work.
Am I missing something? Is there a way to make this work?
In my opinion (as I also mentioned in a comment), the simplest (if not the most elegant) way is to have a CronJob that periodically cleans up the timed-out pods.
One CronJob object is like one line of a crontab (cron table) file. It runs a job periodically on a given schedule, written in Cron format. CronJobs are useful for creating periodic and recurring tasks, like running backups or sending emails. CronJobs can also schedule individual tasks for a specific time, such as scheduling a Job for when your cluster is likely to be idle.
CronJob examples and how-tos:
How To Use Kubernetes’ Job and CronJob
Kubernetes: Delete pods older than X days
https://github.com/dignajar/clean-pods (a real-world example)
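As a hedged sketch of the first-in, first-out idea: a CronJob can simply delete the oldest pod matching the Deployment's label on a fixed interval, and the Deployment/HPA replaces it with a fresh one. The label selector, schedule, image and ServiceAccount below are assumptions, not values taken from your cluster:

    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: recycle-oldest-pod              # hypothetical name
    spec:
      schedule: "0 */4 * * *"               # every 4 hours; tune to taste
      concurrencyPolicy: Forbid
      jobTemplate:
        spec:
          template:
            spec:
              serviceAccountName: pod-recycler     # assumed RBAC: list/delete on pods
              restartPolicy: Never
              containers:
              - name: kubectl
                image: bitnami/kubectl:latest      # any image with kubectl and a shell works
                command:
                - /bin/sh
                - -c
                - >
                  kubectl get pods -l app=production
                  --sort-by=.metadata.creationTimestamp -o name
                  | head -n 1
                  | xargs kubectl delete

The sort is ascending by creation time, so head -n 1 picks the oldest pod; deleting it on a schedule caps the age of the longest-running replica at roughly the CronJob interval.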

Kubernetes rolling deploy: terminate a pod only when there are no containers running

I am trying to deploy updates to pods. However, I want the current pods to terminate only when all the containers inside the pod have terminated and their processes are complete.
The new pods can keep waiting to start until all containers in the old pods have completed. We have a mechanism to stop old pods from picking up new tasks, so they should eventually terminate.
It's okay if twice the number of pods exists for some period of time. I tried finding a solution for this in the Kubernetes docs but wasn't successful. Pointers on how / whether this is possible would be helpful.
Well, I guess then you may have to create a duplicate Deployment with the new image as required and change the selector in the Service to point to the new Deployment. That prevents external traffic from reaching the pre-existing pods, while new calls go to the new ones. Later you can check something like
kubectl top pods --containers
and, if the load on the old pods appears to be static and low, delete the old Deployment.
With this approach the Service selector has to be updated on every release, and to keep track of things you can append the git commit hash to the selector so it is unique each time (see the sketch below).
However, rolling back to a previous version from inside the Kubernetes cluster will be difficult, so preferably you would trigger the desired build again.
I hope this makes some sense!
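A minimal sketch of that selector-switching idea, assuming hypothetical names (myapp, commit hash abc1234) and that every Deployment carries a version label:

    # The Service routes only to pods whose labels match both app and version
    apiVersion: v1
    kind: Service
    metadata:
      name: myapp
    spec:
      selector:
        app: myapp
        version: abc1234          # switch this to the new commit hash to cut traffic over
      ports:
      - port: 80
        targetPort: 8080
    ---
    # The "new" Deployment; the old one keeps running (and draining) until you delete it
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp-abc1234
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: myapp
          version: abc1234
      template:
        metadata:
          labels:
            app: myapp
            version: abc1234
        spec:
          containers:
          - name: myapp
            image: registry.example.com/myapp:abc1234    # placeholder image
            ports:
            - containerPort: 8080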

Skipping a pod deployment in statefulset

I have a StatefulSet of pods, and due to their stateful nature one of them cannot be recreated because of a state error that deleting the pod doesn't fix.
Since it's a StatefulSet, Kubernetes will block creation of additional pods until it's able to get the stuck one running.
StatefulSets have podManagementPolicy: "Parallel", but it cannot be changed at runtime.
The question is: is there a way to make Kubernetes skip the stuck one?
I believe you are looking for a workaround for a known issue which is still open:
StatefulSet will continue to wait for the broken Pod to become Ready (which never happens) before it will attempt to revert it back to the working configuration.
In terms of upgrades, I found the following on GitHub, quoted below from the official docs:
The Pods in the StatefulSet are updated in reverse ordinal order. The StatefulSet controller terminates each Pod, and waits for it to transition to Running and Ready prior to updating the next Pod.
Note that, even though the StatefulSet controller will not proceed to update the next Pod until its ordinal successor is Running and Ready, it will restore any Pod that fails during the update to its current version. Pods that have already received the update will be restored to the updated version, and Pods that have not yet received the update will be restored to the previous version. In this way, the controller attempts to continue to keep the application healthy and the update consistent in the presence of intermittent failures.
Read Forced Rollback
When using Rolling Updates with the default Pod Management Policy (OrderedReady), it’s possible to get into a broken state that requires manual intervention to repair.
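The Forced Rollback procedure boils down to reverting the Pod template to a known-good configuration and then deleting the stuck Pod so the controller recreates it from that template. A hedged sketch, borrowing the web/nginx names from the official StatefulSet tutorial as placeholders:

    # 1. Revert the Pod template to a configuration that is known to work
    kubectl patch statefulset web --type='json' \
      -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"registry.k8s.io/nginx-slim:0.8"}]'

    # 2. Delete the Pod that is stuck with the broken configuration;
    #    the StatefulSet controller recreates it from the reverted template
    kubectl delete pod web-0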

Apply changed limits in Kubernetes?

I changed the limits (default requested amount of CPU) on my Kubernetes cluster. Of course the new limits don't affect already running Pods. So, how can I apply the new (lower) limits to already running Pods.
Is there any way to update the limits in the running Pods without restarting them?
If I have to restart the Pods, how can this be done without deleting and recreating them? (I am really using plain Pods, no Deployments or anything like that.)
You need to restart the Pods:
You can't update the resources field of a running Pod. The update would be rejected.
You need to create new Pods and delete the old ones. You can create the new ones first and delete the old ones when the new ones are running, if this allows you to avoid downtime.
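A hedged sketch of that create-first, delete-second approach for a bare Pod, with my-pod / my-pod-v2 as placeholder names:

    # Export the current Pod spec, then edit resources.requests/limits in the file;
    # also change metadata.name (e.g. to my-pod-v2) so old and new can coexist,
    # and strip status/uid fields added by the API server
    kubectl get pod my-pod -o yaml > my-pod-v2.yaml

    kubectl apply -f my-pod-v2.yaml                   # create the replacement first
    kubectl wait --for=condition=Ready pod/my-pod-v2  # make sure it is actually serving
    kubectl delete pod my-pod                         # then remove the old one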
You can do that only when you run it as a Deployment, or at least run a Pod with restartPolicy: Always, so you can always scale down to zero and back up to one for a safe restart. In your case, considering that you just run your Pod using kubectl run without any restart policy, or with restartPolicy: Never, I would run another Pod, test it, and kill the already running one. I expect others will have better answers.
You can't change the resource properties of a running Pod; the API would reject the changes. Instead you can create a Deployment, whose rolling-update feature ensures that at least one Pod keeps running during the update of the limits, so there won't be any downtime for your Pods.
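Once the workload is a Deployment, changing the limits is just another pod-template change, so it triggers a rolling update. A hedged sketch, with myapp and the values as placeholders:

    # Update requests/limits on the Deployment's pod template;
    # new Pods roll out and the old ones are terminated gradually
    kubectl set resources deployment myapp \
      --requests=cpu=100m,memory=128Mi \
      --limits=cpu=500m,memory=256Mi

    # Watch the rolling update complete
    kubectl rollout status deployment/myapp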

What is a rollout in Kubernetes?

I just started to learn Kubernetes. I know what a rollback is, but I have never heard of a rollout. Is "rollout" related to rollback in any way? Or is "rollout" similar to deploying something?
Rollout simply means a rolling update of an application. A rolling update means that the application is updated gradually, gracefully and with no downtime. So when you push a new version of your application's Docker image and then trigger a rollout of your Deployment, Kubernetes first launches a new pod with the new image while keeping the old version running. When the new pod settles down (passes its readiness probe), Kubernetes kills the old pod and switches the Service endpoints to point to the new version. When you have multiple replicas, this happens gradually until all replicas are replaced with the new version.
This behavior, however, is not the only one possible. You can tune the rolling-update settings in your Deployment's spec.strategy settings.
The official docs even have an interactive tutorial on the Rolling Update feature that explains perfectly how it works: https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/
Rollout is the opposite of rollback. Yes, it means deploying a new application or upgrading an existing one.
Note: some more details on the paragraph you referred to. Let's say we have 5 replicas. On rollout, we can configure how many Pods should be upgraded at a time, and what should happen if there is a failure with the new configuration, using maxUnavailable, maxSurge and readinessProbe. Read up on all of these parameters and tune them accordingly; a sketch follows below.
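A hedged sketch of where those knobs live in a Deployment spec; the names, image and values are placeholders:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 5
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1            # at most 1 extra pod above the desired count during a rollout
          maxUnavailable: 1      # at most 1 pod may be unavailable at any time
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          containers:
          - name: myapp
            image: registry.example.com/myapp:v2     # placeholder image
            readinessProbe:                          # gates the cut-over on the new pod being ready
              httpGet:
                path: /healthz
                port: 8080
              initialDelaySeconds: 5
              periodSeconds: 10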