We are planning to create a scheduler module. An external service provides all the necessary tasks for the scheduler, and the scheduler must invoke these tasks at intervals. The scheduler has various use cases, and the scheduling would change drastically and dynamically based on data from the data store. We do not want to include this functionality as part of the system we are currently building (in Play Framework), but rather have a standalone process execute the scheduling.
Considering the above use case, would Akka Microkernel serve the purpose, or should we think of deploying the scheduler in some other container or application server (and if so, which one would you prefer)?
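Whatever container ends up hosting it, the core loop being described is small. Here is a minimal sketch in Python, assuming a `load_intervals` callable that reads the current task intervals from the data store and an `invoke` callable that calls the external task service. Both names are hypothetical placeholders, not part of Akka or Play.

```python
import heapq
import itertools
from typing import Callable, Dict, List, Tuple


class DynamicScheduler:
    """Runs tasks at intervals that are re-read from a data store on every pass."""

    def __init__(self, load_intervals: Callable[[], Dict[str, float]],
                 invoke: Callable[[str], None]) -> None:
        self._load_intervals = load_intervals  # hypothetical data-store lookup
        self._invoke = invoke                  # hypothetical call to the task service
        self._queue: List[Tuple[float, int, str]] = []  # (due_time, tie_breaker, name)
        self._counter = itertools.count()

    def _push(self, due: float, name: str) -> None:
        heapq.heappush(self._queue, (due, next(self._counter), name))

    def refresh(self, now: float) -> None:
        """Queue any task found in the data store that is not scheduled yet."""
        queued = {name for _, _, name in self._queue}
        for name, interval in self._load_intervals().items():
            if name not in queued and interval > 0:
                self._push(now + interval, name)

    def run_due(self, now: float) -> List[str]:
        """Invoke every task that is due, then reschedule it with its current interval."""
        self.refresh(now)
        ran = []
        while self._queue and self._queue[0][0] <= now:
            _, _, name = heapq.heappop(self._queue)
            interval = self._load_intervals().get(name)
            if interval is None or interval <= 0:
                continue  # task was removed or disabled in the data store
            self._invoke(name)
            ran.append(name)
            self._push(now + interval, name)
        return ran
```

A standalone process would call `run_due(time.monotonic())` in a loop with a short sleep. In this sketch a changed interval only takes effect after the task's next run, which may or may not be acceptable for your use case.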
I’m setting up new Spring Batch jobs and want to deploy them using SCDF. However, I have found that SCDF does not support the scheduler feature on the local platform.
I have 3 questions to ask you:
Can someone explain how the SCDF scheduler works?
Is there any way to schedule a single job using SCDF?
Can I use my local server as a Cloud Foundry instance, and if so, how?
Yes, Spring Cloud Data Flow does not support scheduling on the local platform. Please note that the local SCDF server is for development purposes only and that, by design, scheduling support relies on the platform. Hence, the SCDF scheduling feature is supported on Cloud Foundry and Kubernetes, using the CF and K8s schedulers.
1) Can someone explain how the SCDF scheduler works?
Sure. Similar to how the deployer is used for launching tasks and deploying streams, there is an SPI for scheduling tasks in the spring-cloud-deployer project, which the underlying scheduler implementations can implement. Currently, we have CF and K8s scheduler implementations in spring-cloud-deployer-cloudfoundry and spring-cloud-deployer-kubernetes.
As a user, you can configure a schedule for a task (batch) application via the SCDF Dashboard, the shell, etc., and specify a cron expression for it. Once configured, SCDF delegates the schedule request to the platform scheduler using the above-mentioned scheduler implementations. Once scheduled, it is the platform (PCF Scheduler on CF, the K8s scheduler on K8s) that takes care of running the task on that schedule.
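To make the delegation concrete, here is a rough sketch of the pattern in Python. Every class and method name in it is hypothetical and chosen for illustration only; this is not the actual spring-cloud-deployer SPI, just the shape of "the server forwards the request, the platform owns the schedule".

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ScheduleRequest:
    schedule_name: str
    task_name: str
    cron_expression: str


class PlatformScheduler(ABC):
    """Hypothetical stand-in for the scheduling SPI (not the real interface)."""

    @abstractmethod
    def schedule(self, request: ScheduleRequest) -> None:
        ...


class RecordingScheduler(PlatformScheduler):
    """Pretend platform scheduler that only records what it was asked to schedule."""

    def __init__(self) -> None:
        self.scheduled: List[ScheduleRequest] = []

    def schedule(self, request: ScheduleRequest) -> None:
        self.scheduled.append(request)


class DataFlowServer:
    """Forwards schedule requests to the platform; it never runs the task itself."""

    def __init__(self, schedulers: Dict[str, PlatformScheduler]) -> None:
        self._schedulers = schedulers

    def create_schedule(self, platform: str, request: ScheduleRequest) -> None:
        scheduler = self._schedulers.get(platform)
        if scheduler is None:
            # Mirrors the local platform case: no scheduler implementation to delegate to.
            raise ValueError(f"no scheduler available for platform '{platform}'")
        scheduler.schedule(request)
```

With only "cloudfoundry" and "kubernetes" registered, a request for "local" fails because there is nothing to delegate to, which is the situation described in the question.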
2) Is there any way to schedule a single job using SCDF?
Yes, as described in the answer to question 1.
3) Can I use my local server as a Cloud Foundry instance, and if so, how?
To run SCDF locally while pointing to a CF instance, set the necessary CF deployer properties and start the SCDF server instance. This is similar to how you configure multiple platforms in the SCDF server. You can find more documentation on this here.
I would like to run a sequence of Kubernetes jobs one after another. It's okay if they are run on different nodes, but it's important that each one run to completion before the next one starts. Is there anything built into Kubernetes to facilitate this? Other architecture recommendations also welcome!
This requirement to add control flow, even if it's a simple sequential flow, is outside the scope of Kubernetes native entities as far as I know.
There are many workflow engine implementations for Kubernetes. Most of them focus on solving CI/CD but are generic enough for you to use however you want.
Argo: https://applatix.com/open-source/argo/
Adds a custom resource definition (CRD) to Kubernetes for a Workflow entity
Brigade: https://brigade.sh/
Takes a more serverless-like approach and is built on JavaScript, which is very flexible
Codefresh: https://codefresh.io
Has a unique approach where you can use the SaaS to easily get started without complicated installation and maintenance, and you can point Codefresh at your Kubernetes nodes to run the workflow on.
Feel free to Google for "Kubernetes Workflow", and discover the right platform for yourself.
Disclaimer: I work at Codefresh
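If a full workflow engine is more than you need, a simple sequential flow can also be driven from outside the cluster. Below is a minimal sketch in Python that shells out to `kubectl apply -f` and `kubectl wait --for=condition=complete` for each job in turn. The job names, manifest paths and the `run` hook are assumptions for illustration.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

Runner = Callable[[List[str]], int]  # takes a command, returns its exit code


def _run(cmd: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(cmd).returncode


def run_jobs_in_order(jobs: Sequence[Tuple[str, str]], timeout: str = "600s",
                      run: Runner = _run) -> List[str]:
    """Apply each (job_name, manifest_path) and wait for it before starting the next."""
    completed = []
    for name, manifest in jobs:
        steps = [
            ["kubectl", "apply", "-f", manifest],
            ["kubectl", "wait", "--for=condition=complete",
             f"job/{name}", f"--timeout={timeout}"],
        ]
        for cmd in steps:
            if run(cmd) != 0:
                # Stop the whole sequence; later jobs are never submitted.
                raise RuntimeError(f"step failed for job {name}: {' '.join(cmd)}")
        completed.append(name)
    return completed
```

Note that `kubectl wait --for=condition=complete` does not return early when a job fails; it simply runs into the timeout, so pick a timeout that fits your jobs.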
I would try to use CronJobs and set the concurrency policy to Forbid so that jobs do not run concurrently.
I have worked on IBM TWS (Workload Automation), which is a scheduler similar to cron where you can specify the dependencies of the jobs.
You can specify that a job runs only after its dependencies have run, using the follows keyword.
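The idea behind a "follows" dependency can be sketched in a few lines of Python using the standard library's `graphlib` (Python 3.9+). This is only an illustration of dependency ordering, not TWS syntax, and the job names in the usage note are made up.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def execution_order(follows: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every job comes after all the jobs it follows.

    `follows` maps each job name to the set of jobs that must finish first.
    A circular dependency raises graphlib.CycleError (a ValueError subclass).
    """
    return list(TopologicalSorter(follows).static_order())
```

For example, `execution_order({"extract": set(), "load": {"extract"}, "report": {"load"}})` puts extract before load and load before report.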
Is there an open source job scheduler with a REST API, suitable for commercial use, that supports features like:
Tree-like job dependency
Hold & Release
Rerun failed steps
Parallelism
Help would be appreciated :)
NOTE: we are looking for an open source alternative to TWS, Control-M and AutoSys.
JobScheduler would seem to meet your requirements:
Open Source see: Open Source and Commercial Licenses
REST API see: Web Service Integration
Parallelism see: Organisation of Jobs and Job Chains
I think that these areas are also covered (I downloaded and trialled the application): See here
Tree-like job dependency
Hold & Release
Rerun failed steps
I'm not affiliated with SOS GmbH
ProActive Scheduler is an open source job scheduler.
It is part of the OW2 organization
It is written in Java so it comes with a Java and a REST API
It provides workflows, which are sets of tasks with dependencies and more (loop, replicate, branch). Upon failure, you can control whether the task should be cancelled or restarted
Parallelism and distribution are at the heart of it, with features like for instance
Commercial Support is provided by Activeeon, the company behind ProActive (full disclosure: I work for Activeeon).
You might be interested in Dkron
Dkron is a system service that runs scheduled jobs at given intervals or times, just like the cron unix service but distributed in several machines in a cluster. If a machine fails (the leader), a follower will take over and keep running the scheduled jobs without human intervention. Dkron is Open Source and freely available.
http://dkron.io/
While not open source, a more cost effective job scheduler solution with REST API and support for the features listed is ActiveBatch workload automation and job scheduling. I do work for the company (being up-front) but our customers love how they can easily extend their automated processes to connect to any application, any service, any server with our REST API adapter. You can get more information here: https://www.advsyscon.com/en-us/activebatch/rest-api-adapter
Can anyone please say whether Quartz will allow you to add additional job types once the scheduler is up and running?
I reckon that our implementation of Quartz will be an ASP.NET service using the RAM store. It is likely that new jobs will be written over time and that we will want to add these jobs to the scheduler without having to shut the service down.
Yes, so long as the classes related to the new job types make it into the classpath, and your class loader will discover them. Quartz does nothing to "preload" or "prediscover" job classes - it just loads them as they're referenced.
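The "load it when it is referenced" behaviour can be illustrated with a small Python analogy. This is not Quartz code, and `load_job_class` is a hypothetical helper; it only shows that nothing needs to be registered up front as long as the name can be resolved when the job is first used.

```python
import importlib


def load_job_class(dotted_name: str) -> type:
    """Resolve 'package.module.ClassName' at the moment a job type is first referenced."""
    module_name, _, class_name = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got '{dotted_name}'")
    # The module is imported here, on demand, not when the scheduler starts.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

A job type added after startup works the same way, provided its module is importable by the time it is first scheduled.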
I have multiple quartz cron jobs in a load balanced environment. Currently these jobs are running on each node, which is not desirable. I want a node to run only a particular scheduler and if the node crashes, another node should run the scheduler intended for the node that crashed.
How can this be done with Spring 2.5.6 and a Tomcat load balancer?
I think there are a few aspects to this question.
Firstly, Quartz has API methods for pausing and resuming the Scheduler, or even individual triggers and jobs
e.g.
http://www.jarvana.com/jarvana/view/opensymphony/quartz/1.6.1/quartz-1.6.1-javadoc.jar!/org/quartz/Scheduler.html#standby()
I would create a Spring bean with a reference to the Quartz scheduler or trigger, and a simple isMasterNode boolean member for storing state. I'd then expose 2 [restricted-access] web service calls, makeMaster and makeSlave, which start or resume the scheduler (Scheduler.start() or resumeAll()) or put it into standby/pause (Scheduler.standby() or pauseAll()), respectively.
Finally, the big question is how, and with what, you determine that another node has 'crashed'.
If you're using a hardware load balancer to manage this, you could configure it to call the 'makeMaster' WS on the new 'primary' node, which in turn starts the scheduler on that node (Scheduler.start() or similar).
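The bean described above boils down to a small state holder. Here is a rough sketch of it in Python, with `SchedulerStub` as a hypothetical stand-in for the real scheduler handle. How the two calls are exposed and secured, and who decides which node is master, is left out.

```python
import threading


class SchedulerStub:
    """Hypothetical stand-in for a scheduler that can be started or put on standby."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> None:
        self.running = True

    def standby(self) -> None:
        self.running = False


class NodeRole:
    """Tracks whether this node is the master and toggles its scheduler accordingly."""

    def __init__(self, scheduler: SchedulerStub) -> None:
        self._scheduler = scheduler
        self._lock = threading.Lock()  # the two calls may arrive concurrently
        self.is_master = False
        self._scheduler.standby()      # every node starts out passive

    def make_master(self) -> None:
        with self._lock:
            if not self.is_master:
                self._scheduler.start()
                self.is_master = True

    def make_slave(self) -> None:
        with self._lock:
            if self.is_master:
                self._scheduler.standby()
                self.is_master = False
```

Both calls are idempotent, so a load balancer or health checker can safely repeat them.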
hth