Job/Task in Kubernetes and Spring Cloud Task

I created a Pod that has @EnableTaskLauncher with spring-cloud-deployer-kubernetes. It receives task requests through spring-cloud-stream and launches the tasks.
Everything works perfectly, except that I want the task to be launched as Kind: Job instead of Kind: Deployment.
I could not find any configuration or property in spring-cloud-deployer-kubernetes that does this, or any indication of whether it is available.

We moved away from Jobs to a bare-pods model for Spring Cloud Task (in SCDF) to better control the task lifecycle, such as the clean shutdown of the container when the Spring Cloud Task operation is complete.
However, there's spring-cloud/spring-cloud-deployer-kubernetes#163 that adds an option to choose between Jobs vs. Pods for Tasks. Please try it out and give us feedback on the PR.

Related

Architecture a service on Kubernetes

I have a UI where I can start machine learning jobs. When a job is requested, a message is added to a pub/sub system (Kafka) and pulled by the service that will run the job.
I have a problem with this service design. I was thinking about creating a main service on Kubernetes that pulls messages from the pub/sub system; this main service would then create pods (or rather jobs) to run the actual ML work.
However, I don't know how to make the main service monitor the "worker" jobs it creates. Do I have to do it manually by persisting the ID of each job somewhere and monitoring it? Also, how do I deal with a potential failure of the "main" service?
I feel like this is a "classic" use case but I can't find much about how to solve this.
Thanks for your help
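One way to picture the monitoring part is a minimal sketch that classifies a worker Job from the `status` block the Kubernetes API returns for it. The `active` count and the `Complete` / `Failed` condition types are standard Job status fields; the function names and the `ml-job-` naming scheme are assumptions made up for illustration, and the message ID is assumed to already be a valid Kubernetes name fragment.

```python
# Sketch only: classify a Kubernetes Job from its `status` dict.
# job_state and worker_job_name are hypothetical helper names.

def job_state(status):
    """Return 'succeeded', 'failed', 'running', or 'pending'."""
    # A finished Job carries a condition of type Complete or Failed
    # whose status is the string "True".
    for cond in status.get("conditions") or []:
        if cond.get("status") == "True":
            if cond.get("type") == "Complete":
                return "succeeded"
            if cond.get("type") == "Failed":
                return "failed"
    # `active` is the number of pods currently running for the Job.
    if status.get("active", 0) > 0:
        return "running"
    return "pending"


def worker_job_name(message_id):
    """Derive the Job name from the message ID (assumed naming scheme).

    A deterministic name lets a restarted main service look up the Job
    for a message instead of persisting the mapping elsewhere.
    """
    return "ml-job-" + message_id.lower()


if __name__ == "__main__":
    print(worker_job_name("ABC123"), job_state({"active": 1}))
```

With a deterministic name per message, a main service that crashes and restarts could re-read the message, look up the Job by name, and resume monitoring rather than creating a duplicate.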

Best practice when deploying a Flink Job Cluster on Kubernetes regarding savepointing and updating the job

I am looking into deploying a Flink job on Kubernetes. When looking through the documentation, I'm having a hard time working out what the best practices are for deploying the job, specifically when the job has to maintain state.
There are two main points regarding this job:
It is a streaming job dealing with unbounded data (never ending stream)
Keeps and uses state that needs to be maintained over different job versions
Currently, we are running on Hadoop. There it is quite easy when you want to deploy a new version of the job and keep state. The steps are: cancel the job with savepoint, then deploy a new job and point to that savepoint.
Kubernetes:
Based on the definitions, it seems that for our use case a Job Cluster is the best fit for the requirements. There will only be one job running on this cluster.
The issue with the Kubernetes setup is that the savepoint location needs to be added as an argument to the Deployment. In the case that a pod is taken offline, it will restart the application with the original savepoint in the Deployment. Specifically this will reset the Kafka offset to whenever the job was deployed and reprocess a lot of data.
In addition to that, how would I go about canceling a job with a savepoint when running on a Job Cluster from something like CI/CD? Would I need to create another deployer pod and use the REST API?
What is the best practice regarding deploying a stateful Flink job on Kubernetes and upgrading it without losing the state?
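For reference, the "cancel with savepoint" step mentioned in the question can be sketched as a call to Flink's REST endpoint `POST /jobs/:jobid/savepoints` with `cancel-job` set to true. This is only a sketch: the JobManager URL, the job ID, and the savepoint directory below are placeholders, and the request is built but not sent.

```python
import json
import urllib.request


def savepoint_request(rest_url, job_id, target_dir, cancel=True):
    """Build (but do not send) a Flink 'trigger savepoint' request."""
    body = json.dumps({
        "target-directory": target_dir,  # where the savepoint is written
        "cancel-job": cancel,            # cancel the job once it is taken
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{rest_url.rstrip('/')}/jobs/{job_id}/savepoints",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values, not taken from the question.
req = savepoint_request(
    "http://flink-jobmanager:8081",
    "0123456789abcdef0123456789abcdef",
    "s3://my-bucket/savepoints",
)
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
# The response carries a request-id that can be polled at
# /jobs/<jobid>/savepoints/<request-id> to learn the savepoint location.
```

A CI/CD step could send this request, poll for the resulting savepoint path, and then pass that path to the new deployment.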

Spring boot scheduler running cron job for each pod

Current Setup
We have a Kubernetes cluster with 3 pods that run a Spring Boot application. We run a job every 12 hours using the Spring Boot scheduler to fetch some data and cache it. (There is also a queue setup, but I will not go into those details, as my question concerns the setup before we get to the queue.)
Problem
Because we have 3 pods and the scheduler is at the application level, we make 3 calls for the data set and each pod gets the response. The pod that processes and caches it first becomes the master, and the other 2 pods replicate the data from that instance.
I see this as a problem because we will increase the number of jobs to fetch more datasets, so this will multiply the number of calls made.
I am not from the DevOps side and have limited Azure knowledge, hence I need some help from the community.
Need
What are the options available to improve this? I want to separate out the cron schedule so that it runs only once, not once per pod.
1 - Can I keep the cron job at the cluster level? I have read about it here: https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
Will this solve the problem?
2 - I googled and found that another option is to run a CronJob that schedules a job to completion. Will that help? I am not sure what it really means.
Thanks in advance for taking the time to read this.
Based on my understanding of your problem, it looks like you have (at least) the following two choices:
1 - If you continue to keep the scheduling logic within your Spring Boot main app, you may want to explore something like ShedLock, which makes sure a job scheduled through app code executes only once, via an external lock provider like MySQL, Redis, etc., when the app code is running on multiple nodes (or Kubernetes pods in your case).
2 - If you can separate the scheduler-specific app code into its own executable process (i.e. that code can run in a separate set of pods from your main application pods), then you can leverage a Kubernetes CronJob to schedule a Kubernetes Job that internally creates pods and runs your application logic. The benefit of this approach is that you can use native Kubernetes CronJob parameters, like the concurrency policy and a few others, to ensure the job runs only once per scheduled time in a single pod.
With approach (1), you couple your scheduler code with your main app and run them together in the same pods.
With approach (2), you'd have to separate the code that runs in the scheduler from the overall application code, containerize it into its own image, and then configure a Kubernetes CronJob schedule with this new image, referring to the official guide example and Kubernetes CronJob best practices (authored by me, but you can find other examples).
Both approaches have their own merits and de-merits, so you can evaluate them to suit your needs best.
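To make approach (2) more concrete, here is a minimal sketch that generates a CronJob manifest as JSON (which `kubectl apply -f` accepts as well as YAML). The fields `schedule`, `concurrencyPolicy`, `completions`, and `parallelism` are standard `batch/v1` CronJob and Job fields; the name `cache-refresh`, the image reference, and the 12-hour schedule are assumptions chosen to match the question, not values from it.

```python
import json


def cronjob_manifest(name, image, schedule):
    """Build a batch/v1 CronJob that runs one pod per scheduled time."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,
            # Do not start a new run while the previous one is still going.
            "concurrencyPolicy": "Forbid",
            "jobTemplate": {
                "spec": {
                    # One successful pod completes the Job; never run two at once.
                    "completions": 1,
                    "parallelism": 1,
                    "template": {
                        "spec": {
                            "restartPolicy": "OnFailure",
                            "containers": [{"name": name, "image": image}],
                        }
                    },
                }
            },
        },
    }


# Hypothetical name and image; "0 */12 * * *" fires at 00:00 and 12:00.
manifest = cronjob_manifest(
    "cache-refresh",
    "registry.example.com/cache-refresh:1.0",
    "0 */12 * * *",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied to the cluster. The three application pods would then only read the cached data rather than each fetching it.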

Conditionally launch Spring Cloud Task on a specific node of Kubernetes cluster

I am building a data pipeline for batch processing, and I find Spring Cloud Data Flow quite an attractive framework to use. Without much knowledge of SCDF and Kubernetes, I am not sure whether it is possible to conditionally launch a Spring Cloud Task on a specific machine.
Suppose I have two physical servers for running the batch process (Server A and Server B). By default, I would like my Spring Cloud Task to be launched on Server A. If Server A is shut down, the task should be deployed on Server B. Can Kubernetes / SCDF handle this kind of mechanism? I am wondering whether nodeSelector is the thing I should look into.
Yes, you can pass deployment.nodeSelector as a deployment property when launching the task.
The deployment.nodeSelector is a Kubernetes deployment property and hence, you need to pass something like this:
task launch mytask --properties "deployer.<taskAppName>.kubernetes.deployment.nodeSelector=foo1:bar1,foo2:bar2"
You can check the list of supported Kubernetes deployer properties here
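As an illustration of what the property value means, the sketch below turns a comma-separated `key:value` string like the one above into the `nodeSelector` map of a pod spec. It is a hypothetical helper written for this example, not the deployer's actual parsing code.

```python
def parse_node_selector(value):
    """Parse 'foo1:bar1,foo2:bar2' into {'foo1': 'bar1', 'foo2': 'bar2'}."""
    selector = {}
    for pair in value.split(","):
        key, sep, label = pair.strip().partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected key:value, got {pair!r}")
        selector[key.strip()] = label.strip()
    return selector


# The resulting map is what ends up under `nodeSelector` in the pod spec.
pod_spec = {"nodeSelector": parse_node_selector("foo1:bar1,foo2:bar2")}
print(pod_spec)
```

Kubernetes only schedules the pod onto nodes that carry every label in this map, so the labels have to be present on each server that is allowed to run the task.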

Confusion about how to update kubernetes jobs

I am eagerly awaiting the release of Kubernetes v1.3 in mid to late June, so that I can access cron scheduling for jobs. In the meantime, what I plan to do is the following:
Deploy a job on my Kubernetes cluster
Use jenkins as a cron tool to trigger the job in defined intervals (e.g. 1 hour).
I have two questions:
How do I update a job? For replication controllers, I would simply do a rolling update, but in the jobs API spec (http://kubernetes.io/docs/user-guide/jobs/) there are no details about how to do this. For example, let's say that I want to use my Jenkins deploy system to update the job whenever I do a git commit.
Is it possible to use the Kubernetes API to trigger jobs? For example, I have a job that runs, and then the pod is terminated on completion. Then, 1 hour later, I want to use Jenkins to trigger the job again.
Thanks so much!
I am not sure if there is any fancy way to trigger a completed job, but one way to do it can be to delete and recreate the job.
Re: rolling-update: that is required for long running pods, which is what RCs control.
For jobs: You can update the podTemplateSpec in jobSpec and that will ensure that any new pod created by the job after the update will have the updated podTemplateSpec (note: already running pods will not be affected).
Hope this helps!
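The delete-and-recreate approach could be scripted from a Jenkins step roughly as follows. This is a sketch under assumptions: the job name `my-job`, the manifest path `job.yaml`, and the helper names are hypothetical, and it presumes `kubectl` is installed and configured on the build agent.

```python
import subprocess


def retrigger_commands(job_name, manifest_path, namespace="default"):
    """Return the kubectl commands that delete and recreate a Job."""
    return [
        # Remove the completed Job (and its pods); no error if it is absent.
        ["kubectl", "delete", "job", job_name, "-n", namespace,
         "--ignore-not-found"],
        # Recreate it from the manifest, which may carry an updated image.
        ["kubectl", "create", "-f", manifest_path, "-n", namespace],
    ]


def retrigger(job_name, manifest_path, namespace="default"):
    """Run the commands in order, stopping at the first failure."""
    for cmd in retrigger_commands(job_name, manifest_path, namespace):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print instead of executing, so the sketch is safe to run anywhere.
    for cmd in retrigger_commands("my-job", "job.yaml"):
        print(" ".join(cmd))
```

Because the Job is recreated from the manifest on every trigger, the same step covers both questions: the hourly re-run and picking up a new image after a commit.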