Confusion about how to update kubernetes jobs - kubernetes

I am eagerly awaiting the release of Kubernetes v1.3 in mid to late June, so that I can access cron scheduling for jobs. In the meantime, what I plan to do is the following:
Deploy a job on my Kubernetes cluster
Use jenkins as a cron tool to trigger the job in defined intervals (e.g. 1 hour).
I have two questions:
How do I update a job? For replication controllers, I would simply do a rolling update, but in the jobs API spec (http://kubernetes.io/docs/user-guide/jobs/) there are no details about how to do this. For example, lets say that I want to use my jenkins deploy system to update the job whenever I do a git commit.
Is it possible to use the kubernetes API to trigger jobs? For example, I have a job that runs and then the pod is terminated on completion. Then, 1 hour later, I want to use jenkins to trigger the job again.
Thanks so much!

I am not sure if there is any fancy way to trigger a completed job, but one way to do it can be to delete and recreate the job.
Re: rolling-update: that is required for long running pods, which is what RCs control.
For jobs: You can update the podTemplateSpec in jobSpec and that will ensure that any new pod created by the job after the update will have the updated podTemplateSpec (note: already running pods will not be affected).
Hope this helps!

Related

Argo Workflows semaphore with value 0

I'm learning about semaphores in the Argo project workflows to avoid concurrent workflows using the same resource.
My use case is that I have several external resources which only one workflow can use one at a time. So far so good, but sometimes the resource needs some maintenance and during that period I don't want to Argo to start any workflow.
I guess I have two options:
I tested manually setting the semaphore value in the configMap to the value 0, but Argo started one workflow anyway.
I can start a workflow that runs forever, until it is deleted, claiming the synchronization lock, but that requires some extra overhead to have workflows running that don't do anything.
So I wonder how it is supposed to work if I set the semaphore value to 0, I think it should not start the workflow then since it says 0. Anyone have any info about this?
This is the steps I carried out:
First I apply my configmap with kubectl -f.
I then submit some workflows and since they all use the same semaphore Argo will start one and the rest will be executed in order one at a time.
I then change value of the semapore with kubectl edit configMap
Submit new job which then Argo will execute.
Perhaps Argo will not reload the configMap when I update the configMap through kubectl edit? I would like to update the configmap programatically in the future but used kubectl edit now for testing.
Quick fix: after applying the ConfigMap change, cycle the workflow-controller pod. That will force it to reload semaphore state.
I couldn't reproduce your exact issue. After using kubectl edit to set the semaphore to 0, any newly submitted workflows remained Pending.
I did encounter an issue where using kubectl edit to bump up the semaphore limit did not automatically kick off any of the Pending workflows. Cycling the workflow controller pod allowed the workflows to start running again.
Besides using the quick fix, I'd recommend submitting an issue. Synchronization is a newer feature, and it's possible it's not 100% robust yet.

Kubernetes CronJob - Do ConcurrencyPolicy and manual job execution/creation communicate with one another?

I have a Kube cronjob that has a concurrencyPolicy of Replace. As I'd have expected, documentation suggests this means if there is a job running when the next cycle in the schedule is met while the previous job is running that the previous job would be killed off / cancelled.
What I want to know is, if I manually kick off a job with kubectl create job --from, does the concurrencyPolicy still play a part? It seems as though the answer is no from the testing I've been doing (and then I'll have multiple concurrent jobs), but would like to confirm.
If I'm correct and they don't work together, is there a way to have this functionality? Basically wanting to be able to deploy a job and then test it without having to wait around for it to kick off, but also don't want to have two jobs running at the same time.
Thanks!

Is it possible to stop a job in Kubernetes without deleting it

Because Kubernetes handles situations where there's a typo in the job spec, and therefore a container image can't be found, by leaving the job in a running state forever, I've got a process that monitors job events to detect cases like this and deletes the job when one occurs.
I'd prefer to just stop the job so there's a record of it. Is there a way to stop a job?
1) According to the K8S documentation here.
Finished Jobs are usually no longer needed in the system. Keeping them around in the system will put pressure on the API server. If the Jobs are managed directly by a higher level controller, such as CronJobs, the Jobs can be cleaned up by CronJobs based on the specified capacity-based cleanup policy.
Here are the details for the failedJobsHistoryLimit property in the CronJobSpec.
This is another way of retaining the details of the failed job for a specific duration. The failedJobsHistoryLimit property can be set based on the approximate number of jobs run per day and the number of days the logs have to be retained. Agree that the Jobs will be still there and put pressure on the API server.
This is interesting. Once the job completes with failure as in the case of a wrong typo for image, the pod is getting deleted and the resources are not blocked or consumed anymore. Not sure exactly what kubectl job stop will achieve in this case. But, when the Job with a proper image is run with success, I can still see the pod in kubectl get pods.
2) Another approach without using the CronJob is to specify the ttlSecondsAfterFinished as mentioned here.
Another way to clean up finished Jobs (either Complete or Failed) automatically is to use a TTL mechanism provided by a TTL controller for finished resources, by specifying the .spec.ttlSecondsAfterFinished field of the Job.
Not really, no such mechanism exists in Kubernetes yet afaik.
You can workaround is to ssh into the machine and run a: (if you're are using Docker)
# Save the logs
$ docker log <container-id-that-is-running-your-job> 2>&1 > save.log
$ docker stop <main-container-id-for-your-job>
It's better to stream log with something like Fluentd, or logspout, or Filebeat and forward the logs to an ELK or EFK stack.
In any case, I've opened this
You can suspend cronjobs by using the suspend attribute. From the Kubernetes documentation:
https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#suspend
Documentation says:
The .spec.suspend field is also optional. If it is set to true, all
subsequent executions are suspended. This setting does not apply to
already started executions. Defaults to false.
So, to pause a cron you could:
run and edit "suspend" from False to True.
kubectl edit cronjob CRON_NAME (if not in default namespace, then add "-n NAMESPACE_NAME" at the end)
you could potentially create a loop using "for" or whatever you like, and have them all changed at once.
you could just save the yaml file locally and then just run:
kubectl create -f cron_YAML
and this would recreate the cron.
The other answers hint around the .spec.suspend solution for the CronJob API, which works, but since the OP asked specifically about Jobs it is worth noting the solution that does not require a CronJob.
As of Kubernetes 1.21, there alpha support for the .spec.suspend field in the Job API as well, (see docs here). The feature is behind the SuspendJob feature gate.

How to run kubernetes pod for a set period of time each day?

I'm looking for a way to deploy a pod on kubernetes to run for a few hours each day. Essentially I want it to run every morning at 8AM and continue running until about 5:30 PM.
I've been researching a lot and haven't found a way to deploy the pod with a specific timeframe in mind. I've found cron jobs, but that seems to be to be for pods that terminate themselves, whereas mine should be running constantly.
Is there any way to deploy my pod on kubernetes this way? Or should I just set up the pod itself to run its intended application based on its internal clock?
According to the Kubernetes architecture, a Job creates one or more pods and ensures that a specified number of them successfully terminate. As pods successfully complete, the job tracks the successful completions. When a specified number of successful completions is reached, the job itself is complete.
In simple words, Jobs run until completion or failure. That's why there is no option to schedule a Cron Job termination in Kubernetes.
In your case, you can start a Cron Job regularly and terminate it using one of the following options:
A better way is to terminate a container by itself, so you can add such functionality to your application or use Cron. More information about how to add Cron to the Docker container, you can find here.
You can use another Cron Job to terminate your Cron Job. You need to run a command inside a Pod to find and delete a Pod related to your Job. For more information, you can look through this link. But it is not a good way, because your Cron Job will always have failed status.
In both cases, you need to check with what status your Cron Job was finished and use the correct RestartPolicy accordingly.
It seems you can implement using a cronjob object,
[ https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#creating-a-cron-job ]

Jenkins trigger job by another which are running on offline node

Is there any way to do the following:
I have 2 jobs. One job on offline node has to trigger the second one. Are there any plugins in Jenkins that can do this. I know that TeamCity has a way of achieving this, but I think that Jenkins is more constrictive
When you configure your node, you can set Availability to Take this slave on-line when in demand and off-line when idle.
Set Usage as Leave this machine for tied jobs only
Finally, configure the job to be executed only on that node.
This way, when the job goes to queue and cannot execute (because the node is offline), Jenkins will try to bring this node online. After the job is finished, the node will go back to offline.
This of course relies on the fact that Jenkins is configured to be able to start this node.
One instance will always be turn on, on which the main job can be run. And have created the job which will look in DB and if in the DB no running instances, it will prepare one node. And the third job after running tests will clean up my environment.