Should we keep or delete completed jobs or pods in Kubernetes?

I can still see jobs and pods in Kubernetes after they have completed. That leaves me with two choices:
1. Delete them, but then I can't find the history anymore; maybe I should build a history server myself?
2. Keep them, but I wonder whether there is a performance problem when lots of pods and jobs accumulate.
Which one is Kubernetes designed for? Or is there a third choice?
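For what it's worth, Kubernetes does offer a third option for Jobs: the TTL-after-finished mechanism, which garbage-collects a finished Job (together with its pods) after a delay. A minimal sketch, assuming your cluster version supports ttlSecondsAfterFinished; the name, image, and TTL value are only illustrative:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job                # illustrative name
spec:
  ttlSecondsAfterFinished: 86400   # remove the Job and its pods one day after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox           # placeholder image
          command: ["sh", "-c", "echo done"]
```

Once the TTL expires the history is gone, so if you need a long-term record you would still need to ship logs or results somewhere else.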

Related

Automatically delete pods with status 'completed' periodically and across namespaces

I have Spring Cloud Dataflow deployed in multiple namespaces of my Kubernetes cluster.
Additionally, a task is registered there which is executed from time to time.
Executing a task in SCDF on Kubernetes creates a pod for each execution, and if it succeeds the pod is not deleted but set to 'Completed' instead.
I am looking for a way to automatically remove those completed pods regularly, after a given amount of time (e.g. days). Ideally this would also work across namespaces, but I am not sure if that is possible.
Do you know of any way to achieve this?
My first thought was a CronJob with busybox, but I am not sure whether I can give a CronJob the permissions required to delete resources in a cluster, and it would probably need to be deployed in each namespace that I want to delete resources from.
Let me know your thoughts, thanks in advance.
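As a hedged sketch of that CronJob idea (not a tested setup; the names, namespace, image, and schedule are all assumptions): a ServiceAccount bound to a ClusterRole that may list and delete pods lets a single CronJob clean up succeeded pods in every namespace, so it does not have to be deployed per namespace. Note that this deletes by phase rather than by age; filtering by age would need extra scripting.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-janitor              # illustrative name
  namespace: ops                 # illustrative namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-janitor
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pod-janitor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pod-janitor
subjects:
  - kind: ServiceAccount
    name: pod-janitor
    namespace: ops
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pod-janitor
  namespace: ops
spec:
  schedule: "0 3 * * *"          # once a day at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: pod-janitor
          restartPolicy: Never
          containers:
            - name: kubectl
              image: bitnami/kubectl          # any image containing kubectl would do
              command:
                - /bin/sh
                - -c
                # delete pods that finished successfully, in all namespaces
                - kubectl delete pods --all-namespaces --field-selector=status.phase==Succeeded
```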

Schedule as many pods as will fit in the cluster?

I've got a batch job to run: process a large number of media files. I have a Kubernetes cluster to run it on, but I don't want to change the size of the cluster. I want to run the processing as a low-priority job. Any time there are spare compute resources, they should work on media-processing. Any time there are other jobs that need resources, the media process should be suspended.
Currently, I'm running a Deployment with one replica for each node in my cluster. I defined a PriorityClass for the batch-job and a different PriorityClass (with higher priority) for everything else. That seems to be working to evict running batch-jobs when something else needs the resources.
I also defined an affinity, specifically a weighted pod anti-affinity, to discourage the batch-job pods from being scheduled on the same machine.
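For reference, a rough sketch of that kind of setup; the class value, labels, replica count, image, and resource requests are made-up illustrations, not the asker's actual manifests:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: batch-low                 # illustrative name
value: -1000                      # lower than the priority used for everything else
preemptionPolicy: Never           # only fill spare room, never preempt other pods
globalDefault: false
description: "Low-priority batch processing"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: media-worker
spec:
  replicas: 8                     # roughly one replica per node, as described above
  selector:
    matchLabels:
      app: media-worker
  template:
    metadata:
      labels:
        app: media-worker
    spec:
      priorityClassName: batch-low
      affinity:
        podAntiAffinity:
          # "weighted" (preferred) anti-affinity: spread workers across nodes, but allow doubling up
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: media-worker
                topologyKey: kubernetes.io/hostname
      containers:
        - name: worker
          image: example/media-worker   # placeholder image
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
```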
The code itself is a queue worker: it pulls one work item off a shared queue, processes it, and then goes back for the next. If it gets interrupted (because it's being evicted), the partial work is lost, which is fine.
This is working OK, but I'm still leaving a lot of resources on the table. Is there some way to define my replica count as "as many as you can schedule"? I could ask for far more replicas than the cluster can handle; would that be a good solution? Or are there problems with Kubernetes having 10 pods stuck "pending" for months at a time?
I think there's no harm in asking for more pods than the cluster can handle and keeping them pending forever. My only concern is whether the scheduler will be able to distinguish normal-priority pending pods from low-priority pending pods and give precedence to the more urgent ones.
The pro way to go about this, IMHO, is to leverage the Prometheus adapter and use an HPA that targets the current capacity of your cluster via a Prometheus query. This gives you a continuous view of cluster capacity and the ability to autoscale accordingly. This Medium article has a pretty good introduction to the concept.
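A very rough sketch of what that could look like, assuming the Prometheus adapter is installed and already exposes a hypothetical metric named cluster_spare_cpu_cores through the external metrics API; the names and target value are assumptions:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: media-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: media-worker            # the low-priority batch Deployment
  minReplicas: 1
  maxReplicas: 50
  metrics:
    - type: External
      external:
        metric:
          name: cluster_spare_cpu_cores   # hypothetical metric served by prometheus-adapter
        target:
          type: AverageValue
          averageValue: "1"               # aim for roughly one spare core per worker pod
```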

Can a Job have multiple distinct tasks running in different pods under it?

In a K8s cluster, when a Job has multiple pods under it, are these all replicas?
Can a Job have 5 pods running under it where each pod is basically a different task?
Yes, there is provision for running multiple pods under a Job, either sequentially or in parallel. In the spec section of a Job you can set completions, which is the number of pods to run (one after another by default).
If you want to run them in parallel, you can similarly set parallelism to some value in the spec part of the Job YAML.
The former defines how many pods must complete execution successfully, and the latter limits how many pods can run at the same time.
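For example, a Job spec along these lines (the name, image, and numbers are arbitrary) runs 10 completions with at most 3 pods at a time:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-example            # illustrative name
spec:
  completions: 10                # total number of pods that must finish successfully
  parallelism: 3                 # at most 3 pods run at the same time
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox         # placeholder image
          command: ["sh", "-c", "echo processing one work item"]
```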
I also started a discussion about this topic on the Kubernetes community forum: https://discuss.kubernetes.io/t/can-a-job-have-multiple-distinct-tasks-running-in-different-pods-under-it/11575
As per that discussion, a Job can only have a single pod spec. As a result, even if a Job has multiple pods deployed under it, all of them will be mirror images of each other. There needs to be a separate arrangement to make each of the pods do different things.
If your steps are sequential rather than parallel, you can just define multiple containers under the same Job.
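A hedged sketch of that idea; keep in mind that ordinary containers in a pod start concurrently, so strict step ordering is expressed here with init containers (names, images, and commands are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: multi-step-job           # illustrative name
spec:
  template:
    spec:
      restartPolicy: Never
      initContainers:            # init containers run one after another, in order
        - name: step-1
          image: busybox
          command: ["sh", "-c", "echo step 1"]
        - name: step-2
          image: busybox
          command: ["sh", "-c", "echo step 2"]
      containers:                # the main container runs once the init steps succeed
        - name: step-3
          image: busybox
          command: ["sh", "-c", "echo step 3"]
```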

Pod is still running after I delete the parent job

I created a Job in my Kubernetes cluster. The Job takes a long time to finish, and I decided to cancel it, so I deleted the Job, but I noticed the associated pod is NOT automatically deleted. Is this the expected behavior? Why is it not consistent with Deployment deletion? Is there a way to have the pod deleted automatically?
If you're deleting a Deployment, chances are you don't want any of the underlying pods, so it most likely forcefully deletes the pods by default. Also, the desired state of the pods would be unknown.
On the other hand, if you're deleting a pod, Kubernetes doesn't know what kind of replication controller may be attached to it or what it will do next. So it signals a shutdown to the container so that it can clean up gracefully. There may be processes still using the pod, like an in-flight web request, and it would not be good to kill that request if it only needs a second more to complete. This is what happens when you scale your pods or roll out a new deployment and don't want users to experience any downtime. This is in fact one of the benefits of Kubernetes over a traditional application server, which requires you to shut down the system to upgrade (or to juggle load balancers to redirect traffic), which may negatively affect users.

Stateful jobs in Kubernetes

I have a requirement to run an ad-hoc job, once in a while. The job needs some state to work. Building the state takes a lot of time. So, it is desired to keep the state persistent and reusable in subsequent runs, for a fast turnaround time. I want this job to be managed as K8s pods.
This is a complete set of requirements:
Pods will go down after the work finishes. The K8s controller should not try to bring up the pods.
Each pod should have a persistent volume attached to it. There should be 1 volume per pod. I am planning to use EBS.
We should be able to manually bring the pods back up in future.
Future runs may have more or less replicas than the past runs.
I know K8s supports both Jobs and StatefulSets. Is there any controller which supports both at the same time?
Pods will go down after the work finishes. The K8s controller should not try to bring up the pods.
This is what Jobs do: run to completion. You only control whether you want to retry when the container exits with a non-zero code.
Pods should have a persistent volume attached to them.
Same volume for all of them? Will they write or only read? What volume backend do you have, AWS EBS or similar? Depending on the answers, you might want to split the input data between a few volumes, or use separate volumes for writing and then a finalization job to assemble the results into one volume (a kind of map-reduce). Or use a volume backend which supports multi-mount read-write: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes (see the table for ReadWriteMany).
We should be able to manually bring the pods back up in future.
Jobs fit here: you launch one when you need it, and it runs till completion.
Future runs may have more or less replicas than the past runs.
Jobs fit here. Specify different completions or parallelism when you launch a job: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/#parallel-jobs
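Putting the Job-with-volume pieces together, a rough sketch could look like the following; the PVC name, size, image, and mount path are assumptions, and with parallelism greater than one you would need a separate claim per pod, which this sketch does not cover:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: job-state                # illustrative name; e.g. backed by an EBS volume
spec:
  accessModes: ["ReadWriteOnce"] # EBS volumes attach to a single node at a time
  resources:
    requests:
      storage: 50Gi
---
apiVersion: batch/v1
kind: Job
metadata:
  name: stateful-adhoc-job       # illustrative name
spec:
  completions: 1                 # adjust completions/parallelism per run
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: example/worker  # placeholder image
          volumeMounts:
            - name: state
              mountPath: /state  # the reusable state survives across runs on this volume
      volumes:
        - name: state
          persistentVolumeClaim:
            claimName: job-state
```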
StatefulSets are a different concept; they are mostly used for clustered software which you run continuously and where each pod needs to keep its role (e.g. a shard).