I am using spring-boot-starter-quartz 2.2.1.RELEASE to schedule Quartz jobs, and I've deployed my code on two nodes.
The quartz.properties for each node looks like this:
For node one:
org.quartz.scheduler.instanceName: machine1
org.quartz.scheduler.instanceId = AUTO
For node two:
org.quartz.scheduler.instanceName: machine2
org.quartz.scheduler.instanceId = AUTO
So in this situation, each node can run the same scanning job separately.
Now in my database, the "qrtz_job_details" table has two job records, namely scanJobbyMachine1 and scanJobbyMachine2.
I also deployed a frontend UI on node1 that calls a RESTful API to schedule jobs, and I use nginx to randomly send each request to one of my nodes.
If I make a request to query all jobs, the request may be sent to node1 and only node1's job will be shown. But I want to show both node1's and node2's jobs.
If I make a request to update scanJobbyMachine1, it may be sent to node2, and the update can't be made because node2's properties file has instanceName machine2.
Here are my plans:
Plan A: use cluster mode. But Quartz doesn't support "Allow Job Excution to be pin to a cluster node" yet, so in cluster mode my job will only be executed by one node, while I want both nodes to run the scanning job.
Here is the issue link on GitHub.
Plan B: use non-cluster mode. Then I have to write duplicate APIs in the controller, like this:
localhost:8090/machine0/updateJob
localhost:8090/machine1/updateJob
Then configure nginx so that a request to /machine0/updateJob is sent to 10.110.200.60 (machine1's IP), and a request to /machine1/updateJob is sent to 10.110.200.62 (machine2's IP).
For queryAllJobs, my backend would have to send requests to both 10.110.200.60 and 10.110.200.62 first, combine the response lists in the backend, and then show the result in the frontend.
Plan C: write another backend with the two properties files, just to schedule the jobs without executing them (I don't know if this can work), and deploy it on both nodes.
I really don't want to write and deploy another backend like Plan C, or write duplicate APIs like Plan B.
Any good ideas?
Your problem is a cluster problem :)
Perhaps you could dynamically configure your job registration: when node1 starts, it will be alone in the cluster and jobs will run only on it; once node2 starts, both nodes can be used.
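A minimal sketch of that idea, assuming both nodes point at the same clustered JDBC job store (org.quartz.jobStore.isClustered = true) and that ScanJob, the "scans" group, and the node.id property are placeholders for your own names: each node registers its own job key at startup, so both scan jobs live in the shared Quartz tables and any node can list or update them through the REST API. Note that in a plain cluster either node may still pick up either trigger; pinning execution to a specific node would need extra logic, as the question already points out.

```java
import org.quartz.*;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

@Component
public class ScanJobRegistrar {

    private final Scheduler scheduler;   // auto-configured by spring-boot-starter-quartz

    @Value("${node.id}")                 // e.g. machine1 on node1, machine2 on node2 (assumed property)
    private String nodeId;

    public ScanJobRegistrar(Scheduler scheduler) {
        this.scheduler = scheduler;
    }

    @EventListener(ApplicationReadyEvent.class)
    public void registerScanJobForThisNode() throws SchedulerException {
        JobKey key = new JobKey("scanJobby" + nodeId, "scans");
        if (scheduler.checkExists(key)) {
            return;                      // already present in the shared qrtz_* tables
        }
        JobDetail job = JobBuilder.newJob(ScanJob.class)   // ScanJob = your existing job class
                .withIdentity(key)
                .storeDurably()
                .build();
        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("scanTriggerBy" + nodeId, "scans")
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0/30 * * * ?"))  // illustrative schedule
                .build();
        scheduler.scheduleJob(job, trigger);
    }
}
```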
I deployed Rundeck (rundeck/rundeck:4.2.0), importing and discovering my inventory using the Ansible Resource Model Source. I have 300 nodes, of which statistically ~150 are accessible/online; the rest are offline (IoT devices). All working fine.
My challenge is that when creating jobs I can only assign the nodes which are online, while I want to assign ALL nodes (including the offline ones) and keep retrying the job for the failed ones only. Only this way can I track the completeness of my deployment. Ideally, I would love Rundeck to be intelligent enough to automatically run the job as soon as a node comes back online.
Any ideas/hints on how to achieve that?
Thanks,
The easiest way is to use the health checks feature (only available in PagerDuty Process Automation On-Prem, formerly "Rundeck Enterprise"); that way you can use a node filter that targets only "healthy" (up) nodes.
Using this approach (e.g. configuring a command health check against all nodes), you can dispatch your jobs only to "up" nodes out of the global set of nodes; this is possible using .* as the node filter and !healthcheck:status: HEALTHY as the exclude node filter. If any "offline" node turns on, the filter/exclude filter should pick it up automatically.
For the Ansible/Rundeck integration, it works with the environment variable ANSIBLE_HOST_KEY_CHECKING=False, or with host_key_checking=false in the ansible.cfg file (in the [defaults] section).
That way, you can see all Ansible hosts among your Rundeck nodes, and your commands/jobs should be dispatched only to online nodes; if any "offline" node changes its status, the filter should pick it up.
Current Setup
We have a Kubernetes cluster set up with 3 pods which run a Spring Boot application. We run a job every 12 hours using the Spring Boot scheduler to fetch some data and cache it. (There is a queue set up, but I will not go into those details, as my query is about the setup before we get to the queue.)
Problem
Because we have 3 pods and the scheduler is at the application level, we make 3 calls for the data set; each pod gets the response, the pod which processes and caches it first becomes the master, and the other 2 pods replicate the data from that instance.
I see this as a problem because we will increase the number of jobs to get more datasets, so this will multiply the number of calls made.
I am not from the DevOps side and have limited Azure knowledge, hence I need some help from the community.
Need
What are the options available to improve this? I want to separate out the cron schedule so it runs only once, not once per pod.
1 - Can I keep the cron job at the cluster level? I have read about it here: https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
Will this solve the problem?
2 - I googled and found that another option is to run a CronJob which will schedule a Job to completion. Will that help? I'm not sure what it really means.
Thanks in advance for taking the time to read this.
Based on my understanding of your problem, it looks like you have the following two choices (at least):
If you continue to keep the scheduling logic within your main Spring Boot app, then you may want to explore something like ShedLock, which helps make sure a scheduled job in app code executes only once, via an external lock provider like MySQL, Redis, etc., when the app code is running on multiple nodes (or Kubernetes pods in your case); a minimal sketch is included at the end of this answer.
If you can separate the scheduler-specific app code into its own executable process (i.e. that code can run in a separate set of pods from your main application pods), then you can leverage a Kubernetes CronJob to schedule a Kubernetes Job that internally creates pods and runs your application logic. The benefit of this approach is that you can use native CronJob parameters like the concurrency policy and a few others to ensure the job runs only once at the scheduled time, through a single pod.
With approach (1), you get to couple your scheduler code with your main app and run them together in the same pods.
With approach (2), you'd have to separate your code (the part that runs on the schedule) from the overall application code, containerize it into its own image, and then configure the Kubernetes CronJob schedule with this new image, referring to the official guide example and Kubernetes CronJob best practices (authored by me, but you can find other examples).
Both approaches have their own merits and demerits, so you can evaluate which suits your needs best.
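As mentioned above, here is a minimal sketch of option (1) with ShedLock, assuming a JDBC-backed lock provider and the usual shedlock table; the class names, cron expression, lock durations, and fetchAndCacheData() body are illustrative placeholders, not a drop-in implementation:

```java
import javax.sql.DataSource;
import net.javacrumbs.shedlock.core.LockProvider;
import net.javacrumbs.shedlock.provider.jdbctemplate.JdbcTemplateLockProvider;
import net.javacrumbs.shedlock.spring.annotation.EnableSchedulerLock;
import net.javacrumbs.shedlock.spring.annotation.SchedulerLock;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Configuration
@EnableScheduling
@EnableSchedulerLock(defaultLockAtMostFor = "PT30M")
class SchedulingConfig {

    // Lock rows live in a shared "shedlock" table, so only one pod acquires the lock per run.
    @Bean
    LockProvider lockProvider(DataSource dataSource) {
        return new JdbcTemplateLockProvider(dataSource);
    }
}

@Component
class DataCacheJob {

    // The schedule fires in every pod, but the method body runs in at most one pod at a time.
    @Scheduled(cron = "0 0 */12 * * *")   // every 12 hours
    @SchedulerLock(name = "dataCacheJob", lockAtMostFor = "PT30M", lockAtLeastFor = "PT5M")
    public void fetchAndCacheData() {
        // ... call the upstream API once and populate the cache (placeholder) ...
    }
}
```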
I have a parallel Kubernetes job with 1 pod per work item (I set parallelism to a fixed number in the job YAML).
All I really need is an ID per pod to know which work item to do, but Kubernetes doesn't support this yet (if there's a workaround, I want to know about it).
Therefore I need a message queue to coordinate between pods. I've successfully followed the example in the Kubernetes documentation here: https://kubernetes.io/docs/tasks/job/coarse-parallel-processing-work-queue/
However, the example there creates a RabbitMQ Service. I typically deploy my tasks as a Job, and I don't know how the lifecycle of a Job compares with the lifecycle of a Service.
It seems like that example creates a permanent message queue service, but I only need the message queue to exist for the lifetime of the Job.
It's not clear to me whether I need to use a Service, or whether I should create the RabbitMQ container as part of my Job (and if so, how that works with parallelism).
I am trying to see if Rundeck is capable of deciding that a node is currently busy running another job, and switching to another node to run the job there instead.
For example, I am currently running a job on NODE1; then another person logs into Rundeck and decides to run their job on NODE1, but NODE1 is busy running my job, so Rundeck automatically runs their job on NODE2 instead.
Thanks
This could be possible assuming the following design:
1) Create/use a resource model source plugin that sets attributes describing each node's busyness. This can be a metric like load status or something else that you use to gauge Rundeck utilization.
2) Write the job with a node filter that matches on that utilization attribute.
3) Define the job to use the RandomSubset orchestration strategy, specifying that it use 1 node.
Is there any way to do the following:
I have 2 jobs, and one job on an offline node has to trigger the second one. Are there any plugins in Jenkins that can do this? I know that TeamCity has a way of achieving this, but I think that Jenkins is more restrictive.
When you configure your node, you can set Availability to "Take this slave on-line when in demand and off-line when idle".
Set Usage to "Leave this machine for tied jobs only".
Finally, configure the job to be executed only on that node.
This way, when the job goes into the queue and cannot execute (because the node is offline), Jenkins will try to bring the node online. After the job is finished, the node will go back offline.
This of course relies on the fact that Jenkins is configured to be able to start this node.
One instance will always be turned on, and the main job can run on it. I have also created a job which checks the DB, and if there are no running instances in the DB, it prepares one node. And a third job, after running the tests, cleans up my environment.