I want to execute a script immediately after a container in the pod completes successfully or terminates due to an error.
I tried attaching handlers to container lifecycle events such as preStop, but that hook is only called when a container is terminated due to an API request or a management event such as a liveness probe failure, preemption, resource contention, and so on.
Reference - Kubernetes Doc: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
Is there an alternative approach to this?
From the official docs, as you said:
Kubernetes only sends the preStop event when a Pod is terminated. This means that the preStop hook is not invoked when the Pod is completed.
Also, the use of bare Pods is not recommended; consider using a Job controller instead:
A Job creates one or more Pods and ensures that a specified number of them successfully terminate. As pods successfully complete, the Job tracks the successful completions.
You can check the Job's conditions and wait for completion with:
kubectl wait --for=condition=complete job/your-job, and then run your script. In the meantime, add a preStop hook to your Pod definition so a script also runs if the Pods are terminated. You can also write an extra script that works in the background, checks whether the Job has completed, and then runs your script:
until kubectl get jobs your-job -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}' | grep True ; do sleep 1 ; done && <run your main script>
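If you also want a cleanup script when the Pod is killed, a minimal sketch of the preStop part on the Job's pod template could look like this (the Job name your-job and the path /scripts/cleanup.sh are hypothetical placeholders):
apiVersion: batch/v1
kind: Job
metadata:
  name: your-job                     # placeholder name used in the commands above
spec:
  template:
    spec:
      containers:
      - name: main
        image: busybox
        command: ["sh", "-c", "echo doing work; sleep 30"]
        lifecycle:
          preStop:
            exec:
              # only runs when the container is terminated, not when it completes
              command: ["sh", "-c", "/scripts/cleanup.sh"]
      restartPolicy: Never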
See more: job-completion-task.
I'm using:
kubectl rollout restart deployment my_cool_workers
This terminates the workers and starts new ones.
However, I want to roll out in a way where, if a task is running on a specific worker, I let it finish - I don't want to kill the tasks (so the worker should finish its tasks but not accept new ones).
Meaning: roll out new workers -> old workers no longer accept traffic -> when an old worker is no longer running anything, terminate it.
How can this be done?
If a Pod gets killed, whether manually via kubectl or by any Kubernetes controller (for example during a Deployment rollout), it instantly changes from the Running to the Terminating state. At the same time, the SIGTERM signal is sent to all containers inside that Pod.
While in the Terminating state, the containers of a Pod are not restarted if they exit. While a Pod is in the Running state, however, a container that stops is restarted, because a Pod should always be running unless an error occurred.
For more information refer to this document.
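To get the behavior you describe, the worker process itself needs to react to SIGTERM by finishing in-flight tasks and not accepting new ones, while Kubernetes waits up to terminationGracePeriodSeconds before sending SIGKILL. A minimal sketch of the Deployment side (the name, image, and the 3600-second grace period are placeholders):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-cool-workers
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-cool-workers
  template:
    metadata:
      labels:
        app: my-cool-workers
    spec:
      # how long Kubernetes waits after SIGTERM before sending SIGKILL
      terminationGracePeriodSeconds: 3600
      containers:
      - name: worker
        image: my-worker-image:1.0.0   # placeholder image
kubectl rollout restart will then send SIGTERM to the old workers and wait up to the grace period for them to finish before killing them.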
How can I clean up the failed and completed pods created by a Kubernetes Job automatically, without using a CronJob? I want to keep only the last pod created by the Job.
How can we accomplish that?
...clean up the failed and completed pods created by kubernetes job automatically without using cronjob
If you set ttlSecondsAfterFinished to the same period as the Job schedule, you should see only the last pod until the next Job starts. You can prolong the duration to keep more pods in the system this way, instead of waiting until they are explicitly deleted.
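A minimal sketch of where the field goes, assuming a Job that runs roughly hourly; the 3600-second TTL is a placeholder you would align with your own schedule:
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job                  # hypothetical name
spec:
  # delete the finished Job (and its pods) 3600s after it succeeds or fails
  ttlSecondsAfterFinished: 3600
  template:
    spec:
      containers:
      - name: main
        image: busybox
        command: ["sh", "-c", "echo doing work"]
      restartPolicy: Never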
I am running into a situation where init container execution to completion has to be time-bounded. Can someone recommend a strategy to achieve this? What I have tried so far:
activeDeadlineSeconds - this attribute is supported on a Pod but not on a ReplicaSet, so it cannot be used inside a Deployment object.
killing the init container from inside when the timer expires - this is not working as expected; please refer to the link.
progressDeadlineSeconds - this doesn't take init containers into account.
One possible solution could be adding lifecycle hooks; a sketch follows after the notes below.
Pods also allow you to define two lifecycle hooks:
1: Post-start hooks: K8 docs
Remember: Until the hook completes, the container will stay in the Waiting state with the reason ContainerCreating. Because of this, the pod’s status will be Pending instead of Running. If the hook fails to run or returns a non-zero exit code, the main container will be killed.
2: Pre-stop hooks: K8 docs - a pre-stop hook is executed immediately before a container is terminated.
Note:
1: These lifecycle hooks are specified per container, unlike init containers, which apply to the whole pod.
2: As their names suggest, they’re executed when the container starts and before it stops.
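A minimal sketch of both hooks on a single container (the commands are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-hooks-demo         # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    lifecycle:
      postStart:
        exec:
          # runs right after the container starts; a non-zero exit code kills the container
          command: ["sh", "-c", "echo post-start"]
      preStop:
        exec:
          # runs immediately before the container is terminated
          command: ["sh", "-c", "echo pre-stop"]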
I hope this helps you land on a new approach!
Is it possible to set a running pod as the owner of another pod that is to be created? I tried, but in that case pod creation fails.
This is not directly supported by Kubernetes. When you have a Pod that depends on the existence of another one (e.g. it needs a database or similar), you could use an init container. This will delay the main container's start until the init container finishes, which is a good way to implement waiting conditions, as in the sketch below.
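A minimal sketch of such a waiting condition, assuming the other Pod is reachable through a Service named my-database on port 5432 (both hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: app-with-dependency          # hypothetical name
spec:
  initContainers:
  - name: wait-for-database
    image: busybox
    # block until the my-database Service answers on port 5432
    command: ["sh", "-c", "until nc -z my-database 5432; do echo waiting; sleep 2; done"]
  containers:
  - name: app
    image: my-app:1.0.0              # placeholder image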
I think you can use Kubernetes Jobs.
A Job creates one or more Pods and ensures that a specified number of them successfully terminate. As pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (ie, Job) is complete. Deleting a Job will clean up the Pods it created.
A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot).
More information you can find here: jobs-kubernetes.
Context
We have long running kubernetes jobs based on docker containers.
The containers need resources (e.g. 15 GB memory, 2 CPUs), and we use the autoscaler to scale up new worker nodes on demand.
Scenario
Users can select the version of the docker image to be used for a job, e.g. 1.0.0, 1.1.0, or even a commit hash of the code the image was built from in the test environment.
Since we leave the docker tag as free text, the user can type a non-existing docker tag. Because of this, the job's pod ends up in the ImagePullBackOff state. The pod stays in this state and keeps the resources locked, so that they cannot be reused by any other job.
Question
What is the right solution, applicable within Kubernetes itself, for failing the pod immediately, or at least quickly, if a pull fails due to a non-existent docker image:tag?
Possibilities
I looked into backoffLimit. I have set it to 0, but this doesn't fail or remove the job, and the resources are of course kept as well.
Maybe they can be killed by a cron job. Not sure how to do so.
Ideally, resources should not even be allocated for a job with a non-existent docker image. But I'm not sure if there is a way to easily achieve this.
Any other?
After looking at your design, I would recommend adding an init container to the Job specification to check the existence of the docker image with the given tag.
If the image with that tag doesn't exist in the registry, the init container can report an error and fail the Job's Pod by exiting with a non-zero exit code.
After that, the Job's Pod will be restarted, and after a certain number of attempts the Job will reach the Failed state. By configuring the .spec.ttlSecondsAfterFinished option, failed Jobs can be wiped out automatically.
If a Pod’s init container fails, Kubernetes repeatedly restarts the Pod until the init container succeeds. However, if the Pod has a restartPolicy of Never, Kubernetes does not restart the Pod.
If the image exists, the init container's script exits with a zero exit code, the main Job container image is pulled, and the container starts.
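A minimal sketch of such a check, here using skopeo to query the registry without pulling the image (the registry, image name, tag, and timings are placeholders; any tool that can inspect a remote image would do):
apiVersion: batch/v1
kind: Job
metadata:
  name: user-job                     # hypothetical name
spec:
  backoffLimit: 2
  ttlSecondsAfterFinished: 600       # wipe out the Job 10 minutes after it finishes
  template:
    spec:
      initContainers:
      - name: check-image
        image: quay.io/skopeo/stable
        # exits non-zero (and fails the Pod) if the tag does not exist in the registry
        command: ["skopeo", "inspect", "docker://my-registry/my-image:1.0.0"]
      containers:
      - name: main
        image: my-registry/my-image:1.0.0
      restartPolicy: Never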
When a Job completes, no more Pods are created, but the Pods are not deleted either.
By default, a Job will run uninterrupted unless a Pod fails (restartPolicy=Never) or a Container exits in error (restartPolicy=OnFailure), at which point the Job defers to the .spec.backoffLimit described above. Once .spec.backoffLimit has been reached the Job will be marked as failed and any running Pods will be terminated.
Another way to terminate a Job is by setting an active deadline. Do this by setting the .spec.activeDeadlineSeconds field of the Job to a number of seconds. The activeDeadlineSeconds applies to the duration of the job, no matter how many Pods are created. Once a Job reaches activeDeadlineSeconds, all of its running Pods are terminated and the Job status will become type: Failed with reason: DeadlineExceeded.
Note that a Job’s .spec.activeDeadlineSeconds takes precedence over its .spec.backoffLimit. Therefore, a Job that is retrying one or more failed Pods will not deploy additional Pods once it reaches the time limit specified by activeDeadlineSeconds, even if the backoffLimit is not yet reached.
Here is more information: jobs.
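A minimal sketch combining both limits on a Job (the values are placeholders):
apiVersion: batch/v1
kind: Job
metadata:
  name: deadline-example             # hypothetical name
spec:
  backoffLimit: 0                    # no retries
  # after 5 minutes, terminate all running pods and mark the Job as Failed
  activeDeadlineSeconds: 300
  template:
    spec:
      containers:
      - name: main
        image: busybox
        command: ["sh", "-c", "sleep 600"]
      restartPolicy: Never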
You can also set the concurrencyPolicy of a CronJob to Replace, which replaces the currently running job with a new one.
Here is an example:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/2 * * * *"
  concurrencyPolicy: Replace
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster && sleep 420
          restartPolicy: Never
Setting the Replace value for the concurrencyPolicy field means that if it is time for a new job run and the previous job run hasn't finished yet, the CronJob replaces the currently running job run with a new one.
Regardless of these solutions, your problem lies in the wrong images, so automatically deleting pods or jobs doesn't solve it: if you don't change anything in the definition of the jobs and images, your pods will still fail after the job is created again.
Here is an example of troubleshooting for Error: ImagePullBackOff / Normal BackOff: ImagePullBackOff.
You can use failedJobsHistoryLimit for failed jobs and successfulJobsHistoryLimit for successful jobs.
With these two parameters, you can keep your job history clean.
Use .spec.backoffLimit to specify the number of retries before considering a Job as failed.
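A minimal sketch showing where these fields live on a CronJob (names and values are placeholders):
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: history-example              # hypothetical name
spec:
  schedule: "0 * * * *"
  successfulJobsHistoryLimit: 1      # keep only the last successful Job
  failedJobsHistoryLimit: 1          # keep only the last failed Job
  jobTemplate:
    spec:
      backoffLimit: 3                # retries before the Job is considered failed
      template:
        spec:
          containers:
          - name: main
            image: busybox
            command: ["sh", "-c", "date"]
          restartPolicy: Never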