Architecture a service on Kubernetes - kubernetes

I have a UI where I can start machine learning jobs. When a job is requested, a message is added to a PubSub (kafka) and pulled by the service that will run the job.
I have a problem with this service design. I was thinking about creating the main service on Kubernetes that will pull messages from PubSub then this main service would create pods (or rather jobs) to run the actual ML work.
However, I don't know how to make the main service monitor the "worker" jobs it creates. Do I have to do it manually by persisting the ID of the job somewhere and monitoring it? Also how to deal with the "main" service potential failure?
I feel like this is a "classic" use case but I can't find much about how to solve this.
Thanks for your help

Related

How to create a cron job in a Kubernetes deployed app without duplicates?

I am trying to find a solution to run a cron job in a Kubernetes-deployed app without unwanted duplicates. Let me describe my scenario, to give you a little bit of context.
I want to schedule jobs that execute once at a specified date. More precisely: creating such a job can happen anytime and its execution date will be known only at that time. The job that needs to be done is always the same, but it needs parametrization.
My application is running inside a Kubernetes cluster, and I cannot assume that there always will be only one instance of it running at the any moment in time. Therefore, creating the said job will lead to multiple executions of it due to the fact that all of my application instances will spawn it. However, I want to guarantee that a job runs exactly once in the whole cluster.
I tried to find solutions for this problem and came up with the following ideas.
Create a local file and check if it is already there when starting a new job. If it is there, cancel the job.
Not possible in my case, since the duplicate jobs might run on other machines!
Utilize the Kubernetes CronJob API.
I cannot use this feature because I have to create cron jobs dynamically from inside my application. I cannot change the cluster configuration from a pod running inside that cluster. Maybe there is a way, but it seems to me there have to be a better solution than giving the application access to the cluster it is running in.
Would you please be as kind as to give me any directions at which I might find a solution?
I am using a managed Kubernetes Cluster on Digital Ocean (Client Version: v1.22.4, Server Version: v1.21.5).
After thinking about a solution for a rather long time I found it.
The solution is to take the scheduling of the jobs to a central place. It is as easy as building a job web service that exposes endpoints to create jobs. An instance of a backend creating a job at this service will also provide a callback endpoint in the request which the job web service will call at the execution date and time.
The endpoint in my case links back to the calling backend server which carries the logic to be executed. It would be rather tedious to make the job service execute the logic directly since there are a lot of dependencies involved in the job. I keep a separate database in my job service just to store information about whom to call and how. Addressing the startup after crash problem becomes trivial since there is only one instance of the job web service and it can just re-create the jobs normally after retrieving them from the database in case the service crashed.
Do not forget to take care of failing jobs. If your backends are not reachable for some reason to take the callback, there must be some reconciliation mechanism in place that will prevent this failure from staying unnoticed.
A little note I want to add: In case you also want to scale the job service horizontally you run into very similar problems again. However, if you think about what is the actual work to be done in that service, you realize that it is very lightweight. I am not sure if horizontal scaling is ever a requirement, since it is only doing requests at specified times and is not executing heavy work.

Best practice when deplyoying a Flink Job Cluster on Kubernetes regarding savepointing and updating the job

I am looking into a deploying a Flink job on Kubernetes. When looking through the documentations I'm having a hard time coming up with what the best practices are regarding how to deploy the job specifically when the job has to maintain state.
There are two main points regarding this job:
It is a streaming job dealing with unbounded data (never ending stream)
Keeps and uses state that needs to be maintained over different job versions
Currently, we are running on Hadoop. There it is quite easy when you want to deploy a new version of the job and keep state. The steps are: cancel the job with savepoint, then deploy a new job and point to that savepoint.
Kubernetes:
Based on the definitions, it seems that for our use case a Job Cluster is the best fit for the requirements. There will only be one job running on this cluster.
The issue with the Kubernetes setup is that the savepoint location needs to be added as an argument to the Deployment. In the case that a pod is taken offline, it will restart the application with the original savepoint in the Deployment. Specifically this will reset the Kafka offset to whenever the job was deployed and reprocess a lot of data.
In addition to that, how would i go about canceling a job with savepoint when running on a Job cluster from something like ci/cd? Would i need to create another deployer pod and use the rest api?
What is the best practice regarding deploying a stateful Flink job on kubernetes and upgrading it without losing the state?

Spring boot scheduler running cron job for each pod

Current Setup
We have kubernetes cluster setup with 3 kubernetes pods which run spring boot application. We run a job every 12 hrs using spring boot scheduler to get some data and cache it.(there is queue setup but I will not go on those details as my query is for the setup before we get to queue)
Problem
Because we have 3 pods and scheduler is at application level , we make 3 calls for data set and each pod gets the response and pod which processes at caches it first becomes the master and other 2 pods replicate the data from that instance.
I see this as a problem because we will increase number of jobs for get more datasets , so this will multiply the number of calls made.
I am not from Devops side and have limited azure knowledge hence I need some help from community
Need
What are the options available to improve this? I want to separate out Cron schedule to run only once and not for each pod
1 - Can I keep cronjob at cluster level , i have read about it here https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
Will this solve a problem?
2 - I googled and found other option is to run a Cronjob which will schedule a job to completion, will that help and not sure what it really means.
Thanks in Advance to taking out time to read it.
Based on my understanding of your problem, it looks like you have following two choices (at least) -
If you continue to have scheduling logic within your springboot main app, then you may want to explore something like shedlock that helps make sure your scheduled job through app code executes only once via an external lock provider like MySQL, Redis, etc. when the app code is running on multiple nodes (or kubernetes pods in your case).
If you can separate out the scheduler specific app code into its own executable process (i.e. that code can run in separate set of pods than your main application code pods), then you can levarage kubernetes cronjob to schedule kubernetes job that internally creates pods and runs your application logic. Benefit of this approach is that you can use native kubernetes cronjob parameters like concurrency and few others to ensure the job runs only once during scheduled time through single pod.
With approach (1), you get to couple your scheduler code with your main app and run them together in same pods.
With approach (2), you'd have to separate your code (that runs in scheduler) from overall application code, containerize it into its own image, and then configure kubernetes cronjob schedule with this new image referring official guide example and kubernetes cronjob best practices (authored by me but can find other examples).
Both approaches have their own merits and de-merits, so you can evaluate them to suit your needs best.

How do you create a message queue service for the scope of a specific Kubernetes job

I have a parallel Kubernetes job with 1 pod per work item (I set parallelism to a fixed number in the job YAML).
All I really need is an ID per pod to know which work item to do, but Kubernetes doesn't support this yet (if there's a workaround I want to know).
Therefore I need a message queue to coordinate between pods. I've successfully followed the example in the Kubernetes documentation here: https://kubernetes.io/docs/tasks/job/coarse-parallel-processing-work-queue/
However, the example there creates a rabbit-mq service. I typically deploy my tasks as a job. I don't know how the lifecycle of a job compares with the lifecycle of a service.
It seems like that example is creating a permanent message queue service. But I only need the message queue to be in existence for the lifecycle of the job.
It's not clear to me if I need to use a service, or if I should be creating the rabbit-mq container as part of my job (and if so how that works with parallelism).

Kubernetes dynamic Job scaling

I’m finally dipping my toes in the kubernetes pool and wanted to get some advice on the best way to approach a problem I have:
Tech we are using:
GCP
GKE
GCP Pub/Sub
We need to do bursts of batch processing spread out across a fleet and have decided on the following approach:
New raw data flows in
A node analyses this and breaks the data up into manageable portions which are pushed onto a queue
We have a cluster with Autoscaling On and Min Size ‘0’
A Kubernetes job spins up a pod for each new message on this cluster
When pods can’t pull anymore messages they terminate successfully
The question is:
What is the standard approach for triggering jobs such as this?
Do you create a new job each time or are jobs meant to be long lived and re-run?
I have only seen examples of using a yaml file however we would probably want the node which did the portioning of work to create the job as it knows how many parallel pods should be run. Would it be recommended to use the python sdk to create the job spec programatically? Or if jobs are long lived would you simply hit the k8 api and modify the parallel pods required then re-run job?
Jobs in Kubernetes are meant to be short-lived and are not designed to be reused. Jobs are designed for run-once, run-to-completion workloads. Typically they are be assigned a specific task, i.e. to process a single queue item.
However, if you want to process multiple items in a work queue with a single instance then it is generally advisable to instead use a Deployment to scale a pool of workers that continue to process items in the queue, scaling the number of pool workers dependent on the number of items in the queue. If there are no work items remaining then you can scale the deployment to 0 replicas, scaling back up when there is work to be done.
To create and control your workloads in Kubernetes the best-practice would be to use the Kubernetes SDK. While you can generate YAML files and shell out to another tool like kubectl using the SDK simplifies configuration and error handling, as well as allowing for simplified introspection of resources in the cluster as well.