How to share volume mounts between different cron jobs in Kubernetes

I have a unique use case where I need to run a bunch of cron jobs every minute. These cron jobs have numerous volume mounts (50 or so) defined within them via PVCs and PVs. Clearly, mounting and un-mounting them on every cron run is inefficient. Given the constraint that I cannot move away from cron jobs for my use case, is there a better way to share the volume mounts?
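For context, a cron job of the kind described, with one mount per pre-provisioned PVC, might look roughly like the sketch below; the names, schedule, and image are hypothetical, and only two of the ~50 mounts are shown. Each run creates a fresh pod, so every PVC is attached and mounted again on every run.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: minutely-job                 # hypothetical name
spec:
  schedule: "* * * * *"              # every minute, as described
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: worker
              image: example/worker:latest   # hypothetical image
              volumeMounts:
                - name: data-1
                  mountPath: /mnt/data-1
                - name: data-2
                  mountPath: /mnt/data-2
                # ...and so on for the remaining PVCs
          volumes:
            - name: data-1
              persistentVolumeClaim:
                claimName: data-pvc-1        # hypothetical pre-provisioned PVCs
            - name: data-2
              persistentVolumeClaim:
                claimName: data-pvc-2
```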
Thanks
K

Related

Kubernetes cluster running Cronjob triggering only one pod

I am trying to find a solution for how to run a job handled by 2 pods in a cluster.
The job is run by the CronJob scheduler, every (say) 15 mins. The job fetches records from a db table and processes them. Only READ permission is provided to access the table records. I am trying to see whether there is any way to configure in k8s that only one pod runs the job.
This way I want to prevent duplicate processing.
The alternative is to have a temporary lock file in persistent storage: the application in the pod puts a lock on it and releases it after processing.
If there is any out-of-the-box solution available within k8s, please let me know.
This is implemented using a traditional resource-lock mechanism. A lock file is created when the process starts, and a pod does not run the job if a lock file already exists.
This way only one pod runs the job at any point in time.
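A rough sketch of that lock-file idea, assuming a ReadWriteMany PVC (called job-lock-pvc here) shared by the pods; the names, image, and processing command are hypothetical, and mkdir is used for the lock because creating a directory is atomic:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: table-processor               # hypothetical name
spec:
  schedule: "*/15 * * * *"            # every 15 minutes, as in the question
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: processor
              image: example/processor:latest    # hypothetical image
              command: ["/bin/sh", "-c"]
              args:
                - |
                  # take the lock atomically; exit quietly if another pod holds it
                  if ! mkdir /locks/table-job.lock 2>/dev/null; then
                    echo "lock exists, another pod is already processing"
                    exit 0
                  fi
                  trap 'rmdir /locks/table-job.lock' EXIT   # release the lock after processing
                  /app/process-records                      # hypothetical processing step
              volumeMounts:
                - name: locks
                  mountPath: /locks
          volumes:
            - name: locks
              persistentVolumeClaim:
                claimName: job-lock-pvc           # assumed shared ReadWriteMany PVC
```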

How does one get config into volumes for ECS tasks?

I tend to use a bootstrap task which basically puts config in a volume (it gets it from S3, etc.) and then the main task mounts this volume.
Is there a better way to handle this scenario?
One other way to achieve this would be to map an EFS volume in your tasks. It's a bit more work to add the volume to the task definition (plus creating the volume out of band), but it may be worth doing to get rid of the init task. This blog series talks about the why and how of ECS/EFS.
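As a rough sketch, the EFS part of a CloudFormation-style task definition might look something like the following; the file-system ID, names, and image are placeholders, and the property names should be double-checked against the current ECS/CloudFormation docs:

```yaml
AppTaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: app-with-config
    RequiresCompatibilities: [FARGATE]
    NetworkMode: awsvpc
    Cpu: "256"
    Memory: "512"
    Volumes:
      - Name: config
        EFSVolumeConfiguration:
          FilesystemId: fs-0123456789abcdef0   # placeholder EFS file system ID
          RootDirectory: /config               # config is written here once, out of band
          TransitEncryption: ENABLED
    ContainerDefinitions:
      - Name: app
        Image: example/app:latest              # placeholder image
        Essential: true
        MountPoints:
          - SourceVolume: config
            ContainerPath: /etc/app/config     # the app reads its config straight from EFS
            ReadOnly: true
```

The config then lives on EFS and is populated once out of band, so no bootstrap/init task is needed per run.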

Spring boot scheduler running cron job for each pod

Current Setup
We have a Kubernetes cluster set up with 3 pods which run a Spring Boot application. We run a job every 12 hrs using the Spring Boot scheduler to get some data and cache it. (There is a queue set up, but I will not go into those details, as my query is about the setup before we get to the queue.)
Problem
Because we have 3 pods and the scheduler is at the application level, we make 3 calls for the data set; each pod gets the response, the pod which processes and caches it first becomes the master, and the other 2 pods replicate the data from that instance.
I see this as a problem because we will increase the number of jobs to get more datasets, so this will multiply the number of calls made.
I am not from the DevOps side and have limited Azure knowledge, hence I need some help from the community.
Need
What are the options available to improve this? I want to separate out the cron schedule so it runs only once and not for each pod.
1 - Can I keep the CronJob at the cluster level? I have read about it here: https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
Will this solve the problem?
2 - I googled and found that another option is to run a CronJob which will schedule a Job to completion. Will that help? I am not sure what it really means.
Thanks in advance for taking the time to read this.
Based on my understanding of your problem, it looks like you have (at least) the following two choices:
If you continue to have scheduling logic within your Spring Boot main app, then you may want to explore something like ShedLock, which helps make sure your scheduled job in app code executes only once, via an external lock provider like MySQL, Redis, etc., when the app code is running on multiple nodes (or Kubernetes pods in your case).
If you can separate out the scheduler-specific app code into its own executable process (i.e. that code can run in a separate set of pods from your main application code pods), then you can leverage a Kubernetes CronJob to schedule a Kubernetes Job that internally creates pods and runs your application logic. The benefit of this approach is that you can use native Kubernetes CronJob parameters like concurrency and a few others to ensure the job runs only once, in a single pod, at the scheduled time.
With approach (1), you get to couple your scheduler code with your main app and run them together in the same pods.
With approach (2), you'd have to separate your code (the part that runs on the schedule) from the overall application code, containerize it into its own image, and then configure a Kubernetes CronJob schedule with this new image, referring to the official guide example and Kubernetes CronJob best practices (authored by me, but you can find other examples).
Both approaches have their own merits and demerits, so you can evaluate which suits your needs best.
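For approach (2), a minimal CronJob sketch might look like the following (names and image are hypothetical); concurrencyPolicy: Forbid together with a single completion is what keeps each scheduled run down to one pod:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dataset-refresh                 # hypothetical name
spec:
  schedule: "0 */12 * * *"              # every 12 hours, as in the question
  concurrencyPolicy: Forbid             # never start a run while the previous one is still active
  jobTemplate:
    spec:
      completions: 1                    # the Job is done after one successful pod
      parallelism: 1                    # only one pod runs at a time
      backoffLimit: 2
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: refresher
              image: example/dataset-refresher:latest   # scheduler code split into its own image
```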

Does each task in an ECS cluster get its own disk space?

I have an ECS cluster set up. I can launch several tasks that all point to the same task definition, and I see them running with different container runtime ids.
I understand that each Fargate task has its own isolation boundary and does not share the underlying kernel, CPU resources, memory resources, or elastic network interface with another task.
What I want to understand is: does each task get its own disk space as well? Suppose I append logs to a static file (logs/application_logs.txt). Will each running task only have its own logs in that file?
If 3 tasks are running together, will the logs of all 3 tasks end up in logs/application_logs.txt?
What I want to understand is: does each task get its own disk space as well? Suppose I append logs to a static file (logs/application_logs.txt). Will each running task only have its own logs in that file?
Each Fargate task gets its own storage space allocated. This is unique to the task. Each running task will only have its own logs in that particular location.
If 3 tasks are running together, will the logs of all 3 tasks end up in logs/application_logs.txt?
No, they will not. You can add data volumes to tasks that can be shared between multiple tasks if you want to. See https://docs.aws.amazon.com/AmazonECS/latest/developerguide/fargate-task-storage.html for more info.
Quoting the doc:
When provisioned, each Fargate task receives the following storage. Task storage is ephemeral. After a Fargate task stops, the storage is deleted.
10 GB of Docker layer storage
An additional 4 GB for volume mounts. This can be mounted and shared among containers using the volumes, mountPoints and volumesFrom parameters in the task definition.
Yes, if you have provisioned storage inside the volumes section of the task definition, then your task gets non-persistent storage.
If you're appending logs to a file and there are 3 tasks running, then I guess each one of them will have its own log file.

How to run kubernetes pod for a set period of time each day?

I'm looking for a way to deploy a pod on Kubernetes that runs for a few hours each day. Essentially, I want it to start every morning at 8 AM and continue running until about 5:30 PM.
I've been researching a lot and haven't found a way to deploy the pod with a specific timeframe in mind. I've found cron jobs, but those seem to be for pods that terminate themselves, whereas mine should be running constantly.
Is there any way to deploy my pod on kubernetes this way? Or should I just set up the pod itself to run its intended application based on its internal clock?
According to the Kubernetes architecture, a Job creates one or more pods and ensures that a specified number of them successfully terminate. As pods successfully complete, the job tracks the successful completions. When a specified number of successful completions is reached, the job itself is complete.
In simple words, Jobs run until completion or failure. That's why there is no option to schedule a Cron Job termination in Kubernetes.
In your case, you can start a Cron Job regularly and terminate it using one of the following options:
A better way is for the container to terminate itself, so you can add such functionality to your application or use cron. You can find more information about how to add cron to a Docker container here.
You can use another Cron Job to terminate your Cron Job. You need to run a command inside a Pod to find and delete a Pod related to your Job. For more information, you can look through this link. But it is not a good way, because your Cron Job will always end with a failed status.
In both cases, you need to check the status with which your Cron Job finished and use the correct restartPolicy accordingly.
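A rough sketch of the second option: a "stop" CronJob at 5:30 PM that finds and deletes the workload's pods. It assumes the long-running pods carry a label such as app=day-worker and that a service account (pod-reaper here) with RBAC permission to delete pods already exists; the image and all names are placeholders. A matching "start" CronJob scheduled at "0 8 * * *" would create the workload in the morning.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: stop-day-worker                # hypothetical name
spec:
  schedule: "30 17 * * *"              # 5:30 PM, in the cluster's time zone
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: pod-reaper      # assumed to have RBAC rights to delete pods
          restartPolicy: Never
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest   # any image containing kubectl works
              command:
                - /bin/sh
                - -c
                - kubectl delete pod -l app=day-worker   # find and delete the workload's pods
```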
It seems you can implement this using a CronJob object:
https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#creating-a-cron-job