Quartz scheduler - external Trigger configuration through AdoJobStore and Clustering

While exploring AdoJobStore (and database job stores in general) I came across topics like clustering, load balancing, and sharing jobs' work data state across multiple applications.
But I don't think I found a JobStore topic that covers my scenario.
I need to run Quartz jobs in a Windows Service, and I need to be able to change the configuration of triggers in another application (an admin panel in a web application) and have Quartz in my Windows Service apply those triggers automatically (Quartz tracks the changes and applies them).
Is it possible to do this using the AdoJobStore/clustering mechanism? I mean in terms of the JobStore's features, i.e. through the Quartz scheduler API, not by using SQL to change data in the Quartz tables directly or any other workaround (as per Quartz's Best Practices doc).

The Quartz.NET scheduler can be accessed remotely, independently of the job store used. Since you already have a web app, you can add a reference to the remote scheduler and use the API to administer jobs, triggers, etc.
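To make that concrete, here is a minimal sketch (in Java Quartz 2.x syntax; the Quartz.NET API mirrors it closely) of an admin application replacing a trigger through the scheduler API. It assumes both applications are configured against the same clustered AdoJobStore/JDBC job store; the trigger, job, and group names as well as the cron expression are made up for illustration:

import org.quartz.CronScheduleBuilder;
import org.quartz.Scheduler;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.TriggerKey;
import org.quartz.impl.StdSchedulerFactory;

public class AdminReschedule {
    public static void main(String[] args) throws Exception {
        // Assumes quartz.properties configures a clustered JDBC job store
        // shared with the service that actually runs the jobs.
        Scheduler scheduler = new StdSchedulerFactory().getScheduler();

        // Replace the existing trigger with one carrying a new schedule.
        TriggerKey key = TriggerKey.triggerKey("nightlyTrigger", "reports"); // hypothetical names
        Trigger newTrigger = TriggerBuilder.newTrigger()
                .withIdentity(key)
                .forJob("nightlyReport", "reports")                          // hypothetical job
                .withSchedule(CronScheduleBuilder.cronSchedule("0 30 2 * * ?"))
                .build();

        // Writes the new trigger into the shared store; the node that
        // actually runs the jobs picks up the change without a restart.
        scheduler.rescheduleJob(key, newTrigger);
    }
}

Because the change goes through the scheduler API into the shared store, the service node applies it automatically, which is the "Quartz tracks changes and applies them" behaviour asked about.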


Workflow system for both ETL and Queries by Users

I am looking for a workflow system that supports the following needs:
1. dealing with a complex ETL pipeline with various kinds of APIs (file-based, REST, console, databases, ...)
2. offering automated scheduling/orchestration on different execution environments (AWS, Azure, on-premise clusters, local machine, ...)
3. having an option for "reactive" workflows, i.e. workflows that can be triggered and executed instantaneously without unnecessary delay, are executed with the highest priority, and where the same workflow can be started several times simultaneously
Especially the third requirement seems to be tricky to satisfy. Its purpose is that a user should be able to send a query that activates a (computationally non-heavy) workflow and get back a result immediately, instead of waiting for seconds or even minutes, and multiple users might want to use the same workflow simultaneously. This is important because the ETL workflows and the user ("reactive") workflows overlap substantially, and I intend to reuse parts of these workflows instead of maintaining two sets of workflows executed by different tools.
Apache Airflow appears to be the natural choice for requirements 1 and 2, but it does not seem to support the third requirement, since it starts execution in (lengthy) fixed time slots and does not allow for the simultaneous execution of several instances of the same DAG (workflow).
Are there any tools out there that support all these requirements, or do I have to use two different workflow management tools, or even stick to a (Python) script for the user workflows?
You can trigger a DAG manually by using the CLI or the API. Have a look at this post: https://medium.com/@ntruong/airflow-externally-trigger-a-dag-when-a-condition-match-26cae67ecb1a
You'll have to test whether you can execute multiple DAG runs at the same time.
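For example, here is a sketch against Airflow's stable REST API (available since Airflow 2.0, assuming the basic-auth backend is enabled; the host, credentials, and DAG id are placeholders, and on Airflow 1.x the CLI equivalent would be "airflow trigger_dag <dag_id>"):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class TriggerDagRun {
    public static void main(String[] args) throws Exception {
        String airflowBase = "http://localhost:8080";      // placeholder host
        String dagId = "user_query_workflow";              // placeholder DAG id
        String auth = Base64.getEncoder()
                .encodeToString("admin:admin".getBytes()); // placeholder credentials

        // POST /api/v1/dags/{dag_id}/dagRuns starts a new run immediately.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(airflowBase + "/api/v1/dags/" + dagId + "/dagRuns"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Basic " + auth)
                .POST(HttpRequest.BodyPublishers.ofString("{\"conf\": {}}"))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}

As for running several instances of the same DAG at once: the number of simultaneously active runs is governed by the DAG's max_active_runs parameter, so that part is configurable rather than something you can only discover by testing.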

Quartz Scheduler implementation

What is the internal mechanism for persisting data using Quartz Scheduler?
I went through the internet but didn't find a clear description.
It would also be great if you could suggest how this works alongside the Hibernate platform.
When you use Quartz Scheduler in your project you should have a file for its properties, called quartz.properties. In this file you determine your persistence mechanism with the parameter org.quartz.jobStore.class.
The value for this parameter can be one of the following:
org.quartz.impl.jdbcjobstore.JobStoreCMT: persist in a database, with transactions managed by a container (like WebLogic, JBoss, ...).
org.quartz.impl.jdbcjobstore.JobStoreTX: persist in a database, with transactions NOT managed by a container; this option is mostly used when you run Quartz Scheduler as a standalone application.
org.quartz.simpl.RAMJobStore: this option is not recommended in a production environment, because with it Quartz keeps jobs and triggers only in RAM, so everything is lost on restart!
org.terracotta.quartz.TerracottaJobStore: the last option is to use a Terracotta server as your persistence unit; Quartz says it is the fastest way.
I myself prefer the first option; it is straightforward and, I think, more reliable.
You can read more about this configuration here.
And as for Hibernate: Quartz will manage the persistence tasks itself, like persisting and rolling back, so you won't be involved in that process.
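For reference, here is a minimal quartz.properties sketch for the standalone JobStoreTX case (the scheduler name, data source name, driver, and credentials are placeholders to adapt):

# standalone scheduler persisting to a database, managing its own transactions
org.quartz.scheduler.instanceName = MyScheduler
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 5
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.tablePrefix = QRTZ_
org.quartz.jobStore.dataSource = myDS
# the connection pool Quartz itself will use
org.quartz.dataSource.myDS.driver = com.mysql.jdbc.Driver
org.quartz.dataSource.myDS.URL = jdbc:mysql://localhost:3306/quartz
org.quartz.dataSource.myDS.user = quartz
org.quartz.dataSource.myDS.password = secret
org.quartz.dataSource.myDS.maxConnections = 10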

How to modify the scheduler of Pegasus WMS

I'm interested in scientific workflow scheduling. I'm trying to find and modify the existing scheduling algorithm inside the Pegasus workflow management system from http://pegasus.isi.edu/, but I don't know where it is located or how to do so. Thanks!
Pegasus has a notion of site selection during its mapping phase, where it maps the jobs to the various sites defined in the site catalog. Site selection is explained in the documentation here:
https://pegasus.isi.edu/wms/docs/latest/running_workflows.php#mapping_refinement_steps
Internally, there is a site selector interface that you can implement to incorporate your own scheduling algorithms.
You can access the javadoc at
https://pegasus.isi.edu/wms/docs/latest/javadoc/edu/isi/pegasus/planner/selector/SiteSelector.html
Some implementations are included in that package.
A version of HEFT is also implemented there; the algorithm lives in the following class:
edu.isi.pegasus.planner.selector.site.heft.Algorithm
Looking at the HEFT site selector implementation will give you a good template for incorporating other site selection algorithms.
However, you need to keep in mind that Pegasus maps the workflow to various sites and then hands the workflow over to Condor DAGMan for execution. Condor DAGMan looks at which jobs are ready to run and then releases them to the local Condor queue (managed by the Condor schedd). The jobs are then submitted to the remote sites by the schedd. The actual node on which a job gets executed is determined by the local resource scheduler on the site. For example, if you submit the jobs in a workflow to a site that is running PBS, then PBS decides the actual nodes on which the jobs run.
In the case of Condor, you can associate requirements with your jobs that help you steer them to specific nodes, etc.
With a workflow, you can also associate job priorities that determine a job's priority in the local Condor queue on the submit host. You can use that to control which job the schedd submits first when there are multiple jobs in the queue.
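As a concrete illustration of those last two points, here is a generic HTCondor submit-description sketch (not Pegasus-specific; with Pegasus you would normally attach such keys through Condor profiles, and the values below are placeholders):

universe = vanilla
executable = my_job.sh
log = my_job.log
# steer the job towards particular machines
requirements = (OpSys == "LINUX" && Arch == "X86_64")
# raise this job's priority relative to your other jobs in the local queue
priority = 10
queue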

Is it possible to configure Quartz instances to only handle certain jobgroups?

I have two web applications living in the same Tomcat installation, in which I would like to use Quartz 1.x such that each web app only serves a single job group in a shared Quartz data store.
Is it possible to configure a Quartz instance to serve (or ignore) a specific set of groups?
No, but you can create multiple Quartz instances and put only certain job groups in each (each instance will then, of course, only fire its own job groups).
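If you go that route, here is a sketch of how the two web apps' quartz.properties could differ (the names and prefixes are placeholders; note that Quartz 1.x tables are not partitioned by scheduler name, so each instance needs its own set of tables, created from the DDL script with the prefix adjusted):

# web app A - schedules and fires only its own jobs
org.quartz.scheduler.instanceName = ReportsScheduler
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.tablePrefix = QRTZ_REPORTS_

# web app B - a completely separate scheduler over its own tables
org.quartz.scheduler.instanceName = BillingScheduler
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.tablePrefix = QRTZ_BILLING_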

quartz jdbcjobstore sharing

Quartz can store jobs in a database, so the store is not volatile.
But if I have two applications (a web application and a web service),
how can I share this store between the applications?
That is, if one application selects a job to run, the other application should be informed, and when one application fails the jobs should continue to run on the other.
I realise this is a late reply, but for anyone else who might find this useful...
Quartz is designed with clustered environments in mind, specifically for what you're asking. You can point both of your applications (web service and web application) to the same Quartz job database, and Quartz itself will manage locking the jobs so that they still only run according to their schedule.
In your Quartz config make sure you're using:
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
... And then duplicate the Quartz setup across both your applications, ensuring they both point to the same database.
I think it should take care of itself! Search for "Quartz clustering" if you need more info.
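To flesh that out, here is a sketch of the clustering-relevant properties (identical on both applications apart from the instance id, which AUTO generates; the data source quartzDS is assumed to be defined elsewhere in the same file):

# must be the same on every node of the cluster
org.quartz.scheduler.instanceName = SharedScheduler
# must be unique per node; AUTO generates one
org.quartz.scheduler.instanceId = AUTO
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource = quartzDS
# turn on cluster mode; nodes coordinate via row locks in the database
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 20000

With isClustered enabled, a trigger fires on only one node at a time, and if a node dies, another node recovers its in-progress jobs (for jobs that are marked as requesting recovery).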