Kubernetes Queue with Pod Per Work Item autoscaling

I want an application to pull an item off a queue, process it, and then destroy itself: Pull -> Process -> Destroy.
I've looked at the Job pattern Queue with Pod Per Work Item, which fits the use case, but it isn't appropriate because I need it to autoscale: 0/1 pods when the queue is empty, scaling up as items are added. The only way I can see to do this is via a Deployment, but that loses the Queue with Pod Per Work Item pattern. There must be a fresh container per item.
Is there a way to have the job pattern Queue with Pod Per Work Item but with auto-scaling?

I am a bit confused, so I'll just say this: if you don't mind a failed pod and you want a failed pod not to be recreated by Kubernetes, you can do that in your code by catching all errors and exiting gracefully (not advised).
Please also note that for Deployments the only accepted restartPolicy is Always. Pods of a Deployment that crash will always be restarted by Kubernetes, will probably fail again for the same reason, and will end up in a CrashLoopBackOff.
If you want to scale a Deployment based on the length of a RabbitMQ queue, check out KEDA. It is an event-driven autoscaling platform.
Make sure to also check their RabbitMQ example.
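A minimal sketch of what that could look like, assuming KEDA is installed; the Deployment name, queue name, and connection env var below are placeholders, and the field names follow recent KEDA versions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-worker-scaler
spec:
  scaleTargetRef:
    name: queue-worker            # hypothetical Deployment to scale
  minReplicaCount: 0              # scale to zero when the queue is empty
  maxReplicaCount: 20
  triggers:
  - type: rabbitmq
    metadata:
      queueName: work-items       # hypothetical queue name
      mode: QueueLength
      value: "1"                  # aim for roughly one pod per queued message
      hostFromEnv: RABBITMQ_HOST  # AMQP connection string taken from a container env var
```

With minReplicaCount: 0, the Deployment stays at zero pods while the queue is empty and scales up as messages arrive.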
Another possibility is a Job/Deployment that routinely checks the length of the queue in question and runs kubectl commands to scale your Deployment.
Here is the cleanest one I could find, at least for my taste
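If you go that manual route, a rough sketch (not the linked example) is a CronJob that reads the queue depth from RabbitMQ's management API and scales the Deployment to match; the queue name, credentials, ServiceAccount, and tooling image below are placeholders:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: queue-autoscaler
spec:
  schedule: "*/1 * * * *"            # poll once a minute
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler # needs RBAC permission to scale Deployments
          restartPolicy: Never
          containers:
          - name: scale
            image: example.com/kubectl-curl-jq:latest  # hypothetical image shipping kubectl, curl and jq
            command:
            - /bin/sh
            - -c
            - |
              # RABBIT_USER / RABBIT_PASS are assumed to be injected from a Secret
              # Read the message count from RabbitMQ's management API (default vhost)
              DEPTH=$(curl -s -u "$RABBIT_USER:$RABBIT_PASS" \
                "http://rabbitmq:15672/api/queues/%2F/work-items" | jq '.messages')
              # Scale the worker Deployment to match the backlog
              kubectl scale deployment/queue-worker --replicas="${DEPTH:-0}"
```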

Related

Ensuring graceful shutdown of a pod without ordinal pod ids

When a pod is in a restart loop, is it eligible to be removed during scale-down before it restarts successfully? (without StatefulSets)
Also, what happens if a pod's container exits with a non-zero exit code while that pod is being scaled down? Will it be restarted and shut down again, or just removed? (with or without StatefulSets)
Can I ensure that a pod is always gracefully shut down without using StatefulSets (because I want lifetime-unique UIDs instead of distinct reusable ordinal IDs)?
Pods which are part of Job or Cronjob resources will run until all of the containers in the pod complete. However, the Linkerd proxy container runs continuously until it receives a TERM signal. Since Kubernetes does not give the proxy a means to know when the Cronjob has completed, by default, Job and Cronjob pods which have been meshed will continue to run even once the main container has completed.
In other words, graceful shutdown can be blocked this way.
You can achieve this in three steps:
Add a label to all pods except the one you want to delete. Because the labels of those pods still satisfy the ReplicaSet's selector, no new pods will be created.
Update the ReplicaSet: add the new label to its selector and decrease its replica count in a single, atomic update. The pod you want to delete won't be selected by the ReplicaSet because it doesn't have the new label.
Delete that pod.
For more information, refer to these documents.

Kubernetes rolling deploy: terminate a pod only when there are no containers running

I am trying to deploy updates to pods. However, I want the current pods to terminate only when all the containers inside the pod have terminated and their process is complete.
The new pods can keep waiting to start until all containers in the old pods have completed. We have a mechanism to stop old pods from picking up new tasks, so they should eventually terminate.
It's okay if twice the number of pods exist at some instant in time. I tried finding a solution for this in the Kubernetes docs but wasn't successful. Pointers on how / whether this is possible would be helpful.
Well, I guess then you may have to create a duplicate Deployment with the new image and change the selector in the Service to point at the new Deployment, which will prevent external traffic from reaching the pre-existing pods while new calls go to the new pods. Then later you can check something like:
kubectl top pods --containers
and if the load appears to be static and low, you can delete the old Deployment and its pods later.
But with this approach the Service selector has to be updated every time, and to keep track of things you can append the git commit hash to the selector label so it stays unique for each rollout.
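As a rough sketch of that selector switch, with hypothetical names and a commit-hash label:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: "a1b2c3d"   # commit hash of the Deployment that should receive traffic
  ports:
  - port: 80
    targetPort: 8080
```

Cutting traffic over to the new Deployment is then a matter of patching the selector, e.g. kubectl patch service my-app -p '{"spec":{"selector":{"version":"<new-hash>"}}}'.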
Rolling back to previous versions from inside the Kubernetes cluster will be difficult, though, so it is preferable to trigger the desired build again.
I hope this makes some sense !!

Kubernetes Deployment Rolling Updates

I have an application that I deploy on Kubernetes.
This application has 4 replicas and I'm doing a rolling update on each deployment.
This application has a graceful shutdown which can take tens of minutes (it has to wait for running tasks to finish).
My problem is that during updates, I have over-capacity since all the older version pods are stuck at "Terminating" status while all the new pods are created.
During the updates, I end up running with 8 containers and it is something I'm trying to avoid.
I tried to set maxSurge to 0, but this setting doesn't take into consideration the "Terminating" pods, so the load on my servers during the deployment is too high.
The behaviour I'm trying to get is that new pods will only get created after the old version pods finished successfully, so at all times I'm not exceeding the number of replicas I set.
I wonder if there is a way to achieve such behaviour.
What I ended up doing is creating a StatefulSet with podManagementPolicy: Parallel and updateStrategy set to OnDelete.
I also set terminationGracePeriodSeconds to the maximum time it takes for a pod to terminate.
As a part of my deployment process, I apply the new StatefulSet with the new image and then delete all the running pods.
This way all the pods enter the Terminating state, and whenever a pod finishes its task and terminates, a new pod with the new image replaces it.
This way I'm able to keep a static number of replicas during the whole deployment process.
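A minimal sketch of that StatefulSet, with hypothetical names, image tag, and grace period:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-worker
spec:
  replicas: 4
  serviceName: my-worker
  podManagementPolicy: Parallel   # create/delete pods in parallel rather than one by one
  updateStrategy:
    type: OnDelete                # pods only pick up the new image when they are deleted
  selector:
    matchLabels:
      app: my-worker
  template:
    metadata:
      labels:
        app: my-worker
    spec:
      terminationGracePeriodSeconds: 3600   # longest time a pod may need to finish its tasks
      containers:
      - name: worker
        image: registry.example.com/my-worker:v2   # hypothetical image tag
```

The deploy step is then roughly kubectl apply of the updated StatefulSet followed by kubectl delete pods -l app=my-worker; each pod drains within its grace period and is replaced by one running the new image.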
Let me suggest the following strategy:
Deployments implement the concept of ready pods to aid rolling updates. Readiness probes allow the Deployment to gradually update pods while giving you control over when the rolling update can proceed.
A Ready pod is one that is considered successfully updated by the Deployment and no longer counts towards the surge count for the rollout. A pod is considered ready if its readiness probe is successful and spec.minReadySeconds have passed since the pod was created. The default for these options results in a pod that is ready as soon as its containers start.
So what you can do is implement (if you haven't done so yet) a readiness probe for your pods, in addition to setting spec.minReadySeconds to a value that reflects, worst case, the time it takes for your pods to terminate.
This will ensure the rollout happens gradually and in line with your requirements.
In addition to that, don't forget to configure a deadline for the rollout.
By default, if the rollout can't make any progress for 10 minutes, it's considered failed. The time after which the Deployment is considered failed is configurable through the progressDeadlineSeconds property in the Deployment spec.
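Put together, a sketch of those settings on a Deployment could look like this; the image, probe path, and timing values are placeholders to adjust to your workload:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 4
  minReadySeconds: 600             # worst-case time you expect a pod to need before it counts as available
  progressDeadlineSeconds: 3600    # consider the rollout failed after an hour without progress
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                  # bring up at most one extra pod at a time
      maxUnavailable: 0
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: registry.example.com/my-app:latest   # hypothetical image
        readinessProbe:
          httpGet:
            path: /healthz         # hypothetical health endpoint
            port: 8080
          periodSeconds: 10
```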

Set a Pod as owner reference for another pod in client go program

Is it possible to set a running pod as the owner of another pod that is about to be created? I tried, but in that case pod creation fails.
This is not directly supported by Kubernetes. When you have a Pod that depends on the existence of another one (e.g. it needs a database or similar), you could use an init container. This delays the start of the main containers until the init container finishes, which is a good way to implement waiting conditions.
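For example, a minimal sketch of a pod whose main container waits, via an init container, for a hypothetical database service to become reachable:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  initContainers:
  - name: wait-for-db
    image: busybox:1.36
    # Block until the (hypothetical) database service accepts TCP connections
    command: ['sh', '-c', 'until nc -z my-database 5432; do echo waiting for db; sleep 2; done']
  containers:
  - name: app
    image: registry.example.com/my-app:latest   # hypothetical image
```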
I think you can use Kubernetes Jobs.
A Job creates one or more Pods and ensures that a specified number of them successfully terminate. As pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (ie, Job) is complete. Deleting a Job will clean up the Pods it created.
A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot).
You can find more information here: jobs-kubernetes.
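A minimal Job sketch for running one pod to completion (the image is a placeholder):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: process-item
spec:
  backoffLimit: 4            # retry the pod up to 4 times if it fails
  template:
    spec:
      restartPolicy: Never   # Jobs require Never or OnFailure
      containers:
      - name: worker
        image: registry.example.com/worker:latest   # hypothetical image
```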

changing Pod priority without restarting the Pod?

I am trying to change the priority of an existing Kubernetes Pod using the 'patch' command, but it returns an error saying that this is not one of the fields that can be modified. I can patch the priority in the Deployment spec, but that would cause the Pod to be recreated (following the defined update strategy).
The basic idea is to implement a mechanism conceptually similar to nice levels (for my application), so that certain Pods can be de-prioritized based on certain conditions (by my controller), and preempted by the default scheduler in case of resource congestion. But I don't want them to be restarted if there is no congestion.
Is there a way around it, or there is something inherent in the way scheduler works that would prevent something like this from working properly?
Priority values are applied to a pod based on the priority value of the PriorityClass assigned to their deployment at the time that the pod is scheduled. Any changes made to the PriorityClass will not be applied to pods which have already been scheduled, so you would have to redeploy the pod for the priority to take effect anyway.
As far as I know, pod priority only takes effect when the pod is scheduled.
First, you need to create a PriorityClass.
Then create the Pod with priorityClassName referencing that PriorityClass in the pod definition.
If you are trying to add a priority to an already-scheduled pod, it will not work.
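A sketch of those two steps, with a hypothetical class name, value, and image:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 1000                  # higher value means higher priority
globalDefault: false
description: "Workloads that may be preempted first under resource pressure"
---
apiVersion: v1
kind: Pod
metadata:
  name: background-task
spec:
  priorityClassName: low-priority   # resolved into .spec.priority when the pod is admitted
  containers:
  - name: app
    image: registry.example.com/background-task:latest   # hypothetical image
```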
For reference: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#pod-priority