I was trying to find a solution how to run a job handled by 2 pods in a cluster.
The job is ran by the cronjob scheduler, to run every (say) 15 mins. This job is to fetch records from the db table and process it. There is only READ permission provided to access the table records. I am trying to see, is there any way to configure in k8s, that only one pod run the job.
This way I want to prevent the duplicate processing.
The alternate is have a temporary lock file in the persistent storage and the application in the pod puts a lock to it and releases after processing.
If there is any out of box solution available with in k8s, please let me know.
This is implemented using a traditional resource lock mechanism. A lock file is created during the process and the pods do no run if there is any lock file exists.
This way only one pods will run the job any point of time.
Related
I'm getting stuck in the configuration of a deployment. The problem is the following.
The application in the deployment is using a database which is stored in a file. While this database is open, it's locked (there's no way for read/write access for many).
If I delete the running POD the new one can't get in ready state, because the database is still locked. I read about preStop-Hook and tried to use it without success.
I could delete the lock file, which seems to be pretty harsh. What's the right way to solve this in Kubernetes?
This really isn't different than running this process outside of Kubernetes. When the pod is killed, it will be given a chance to shutdown cleanly. So the lock should be cleaned up. If the lock isn't cleaned up, there's not a lot of ways you can determined if the lock remains because an unclean shutdown was made, or a node is unhealthy, or if there is a network partition. So deleting the lock at pod startup does seem to be unwise.
I think the first step for you should be trying to determine why this lock file isn't getting cleaned up correctly. (Rather than trying to address the symptom.)
Current Setup
We have kubernetes cluster setup with 3 kubernetes pods which run spring boot application. We run a job every 12 hrs using spring boot scheduler to get some data and cache it.(there is queue setup but I will not go on those details as my query is for the setup before we get to queue)
Problem
Because we have 3 pods and scheduler is at application level , we make 3 calls for data set and each pod gets the response and pod which processes at caches it first becomes the master and other 2 pods replicate the data from that instance.
I see this as a problem because we will increase number of jobs for get more datasets , so this will multiply the number of calls made.
I am not from Devops side and have limited azure knowledge hence I need some help from community
Need
What are the options available to improve this? I want to separate out Cron schedule to run only once and not for each pod
1 - Can I keep cronjob at cluster level , i have read about it here https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
Will this solve a problem?
2 - I googled and found other option is to run a Cronjob which will schedule a job to completion, will that help and not sure what it really means.
Thanks in Advance to taking out time to read it.
Based on my understanding of your problem, it looks like you have following two choices (at least) -
If you continue to have scheduling logic within your springboot main app, then you may want to explore something like shedlock that helps make sure your scheduled job through app code executes only once via an external lock provider like MySQL, Redis, etc. when the app code is running on multiple nodes (or kubernetes pods in your case).
If you can separate out the scheduler specific app code into its own executable process (i.e. that code can run in separate set of pods than your main application code pods), then you can levarage kubernetes cronjob to schedule kubernetes job that internally creates pods and runs your application logic. Benefit of this approach is that you can use native kubernetes cronjob parameters like concurrency and few others to ensure the job runs only once during scheduled time through single pod.
With approach (1), you get to couple your scheduler code with your main app and run them together in same pods.
With approach (2), you'd have to separate your code (that runs in scheduler) from overall application code, containerize it into its own image, and then configure kubernetes cronjob schedule with this new image referring official guide example and kubernetes cronjob best practices (authored by me but can find other examples).
Both approaches have their own merits and de-merits, so you can evaluate them to suit your needs best.
i am trying to setup a complete GitLab Routine to setup my Kubernetes Cluster with all installations and configurations automatically incl. decommissioning at the end.
However the Creation and Decommissioning Progress is one of the most time consuming because i am basically waiting for the provisioning till i can execute further commands.
as i have some times troubles in the bases of the Kubernetes Setup, i currently decomission my cluster and create a new one. But this is pretty un-economical and time consuming.
Question:
Is there a command or a series of commands to completely reset a Kubernetes to his state after creation ?
The closest is probably to do all your work in a new namespace and then delete that namespace when you are done. That automatically deletes all objects inside it.
Because Kubernetes handles situations where there's a typo in the job spec, and therefore a container image can't be found, by leaving the job in a running state forever, I've got a process that monitors job events to detect cases like this and deletes the job when one occurs.
I'd prefer to just stop the job so there's a record of it. Is there a way to stop a job?
1) According to the K8S documentation here.
Finished Jobs are usually no longer needed in the system. Keeping them around in the system will put pressure on the API server. If the Jobs are managed directly by a higher level controller, such as CronJobs, the Jobs can be cleaned up by CronJobs based on the specified capacity-based cleanup policy.
Here are the details for the failedJobsHistoryLimit property in the CronJobSpec.
This is another way of retaining the details of the failed job for a specific duration. The failedJobsHistoryLimit property can be set based on the approximate number of jobs run per day and the number of days the logs have to be retained. Agree that the Jobs will be still there and put pressure on the API server.
This is interesting. Once the job completes with failure as in the case of a wrong typo for image, the pod is getting deleted and the resources are not blocked or consumed anymore. Not sure exactly what kubectl job stop will achieve in this case. But, when the Job with a proper image is run with success, I can still see the pod in kubectl get pods.
2) Another approach without using the CronJob is to specify the ttlSecondsAfterFinished as mentioned here.
Another way to clean up finished Jobs (either Complete or Failed) automatically is to use a TTL mechanism provided by a TTL controller for finished resources, by specifying the .spec.ttlSecondsAfterFinished field of the Job.
Not really, no such mechanism exists in Kubernetes yet afaik.
You can workaround is to ssh into the machine and run a: (if you're are using Docker)
# Save the logs
$ docker log <container-id-that-is-running-your-job> 2>&1 > save.log
$ docker stop <main-container-id-for-your-job>
It's better to stream log with something like Fluentd, or logspout, or Filebeat and forward the logs to an ELK or EFK stack.
In any case, I've opened this
You can suspend cronjobs by using the suspend attribute. From the Kubernetes documentation:
https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#suspend
Documentation says:
The .spec.suspend field is also optional. If it is set to true, all
subsequent executions are suspended. This setting does not apply to
already started executions. Defaults to false.
So, to pause a cron you could:
run and edit "suspend" from False to True.
kubectl edit cronjob CRON_NAME (if not in default namespace, then add "-n NAMESPACE_NAME" at the end)
you could potentially create a loop using "for" or whatever you like, and have them all changed at once.
you could just save the yaml file locally and then just run:
kubectl create -f cron_YAML
and this would recreate the cron.
The other answers hint around the .spec.suspend solution for the CronJob API, which works, but since the OP asked specifically about Jobs it is worth noting the solution that does not require a CronJob.
As of Kubernetes 1.21, there alpha support for the .spec.suspend field in the Job API as well, (see docs here). The feature is behind the SuspendJob feature gate.
I'm looking for a way to deploy a pod on kubernetes to run for a few hours each day. Essentially I want it to run every morning at 8AM and continue running until about 5:30 PM.
I've been researching a lot and haven't found a way to deploy the pod with a specific timeframe in mind. I've found cron jobs, but that seems to be to be for pods that terminate themselves, whereas mine should be running constantly.
Is there any way to deploy my pod on kubernetes this way? Or should I just set up the pod itself to run its intended application based on its internal clock?
According to the Kubernetes architecture, a Job creates one or more pods and ensures that a specified number of them successfully terminate. As pods successfully complete, the job tracks the successful completions. When a specified number of successful completions is reached, the job itself is complete.
In simple words, Jobs run until completion or failure. That's why there is no option to schedule a Cron Job termination in Kubernetes.
In your case, you can start a Cron Job regularly and terminate it using one of the following options:
A better way is to terminate a container by itself, so you can add such functionality to your application or use Cron. More information about how to add Cron to the Docker container, you can find here.
You can use another Cron Job to terminate your Cron Job. You need to run a command inside a Pod to find and delete a Pod related to your Job. For more information, you can look through this link. But it is not a good way, because your Cron Job will always have failed status.
In both cases, you need to check with what status your Cron Job was finished and use the correct RestartPolicy accordingly.
It seems you can implement using a cronjob object,
[ https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#creating-a-cron-job ]