Failure reported in Quartz 2.2.2 scheduler

I am getting the error below:
java.sql.SQLIntegrityConstraintViolationException: ORA-00001: unique constraint (QRTZ_FIRED_TRIGGER_PK) violated
I am not sure why this would fail: of the two primary-key columns (SCHED_NAME, ENTRY_ID), SCHED_NAME is always the same for all jobs and ENTRY_ID is generated by Quartz itself.
Can anyone please help me understand how this ENTRY_ID is created and in which scenario it can cause a PK violation?
There are multiple Quartz jobs scheduled at the same interval in my application.

The ENTRY_ID is determined by the instanceId of your node with an appended counter, e.g. "mynode1557833519480".
This collision can happen when you have two nodes with the same instanceId in your cluster.
Make sure the instanceIds are unique. From Quartz-documentation:
org.quartz.scheduler.instanceId
Can be any string, but must be unique for all schedulers working as if they are the same ‘logical’ Scheduler within a cluster. You may use the value “AUTO” as the instanceId if you wish the Id to be generated for you
(The counter is initialized with the current time, so if the system clock was turned back at some point, this could also happen)
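For reference, the relevant cluster settings in quartz.properties look something like this (the instance name here is illustrative; the important part is that instanceId is unique per node, which AUTO takes care of):
org.quartz.scheduler.instanceName = MyClusteredScheduler
org.quartz.scheduler.instanceId = AUTO
org.quartz.jobStore.isClustered = true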

Kogito - wait until data from multiple endpoints is received

I am using Kogito with Quarkus. I have set up one DRL rule and am using a BPMN configuration. As can be seen below, currently one endpoint is exposed, which starts the process. All needed data is received from the initial request; it is then evaluated and the process goes on.
I would like to extend the workflow to have two separate endpoints. One to provide the age of the person and another to provide the name. The process must wait until all needed data is gathered before it proceeds with evaluation.
Has anybody come across a similar solution?
Technically you could use a signal or message to add more data into a process instance before you execute the rules over the entire data, see https://docs.kogito.kie.org/latest/html_single/#ref-bpmn-intermediate-events_kogito-developing-process-services.
In order to do that you need some sort of correlation between these events; otherwise, how would you know that name event 1 should be matched with age event 1? If you can keep the process instance id, then the second event can either trigger a REST endpoint on the specific process instance or send it a message via a message broker.
You could also have your own custom logic to aggregate the events and only fire a new process instance once your criteria for complete data are met. There are also plans in Kogito to extend the capabilities of how correlation is done, allowing for instance the use of process variables as the identifier. For example, if you have person.id as the correlation, name and age events for the same id would signal the same process instance. Hope this info helps.
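To make this concrete, here is a rough sketch of what the generated REST interaction could look like, assuming a process with id "persons" and two intermediate catch events named "name" and "age" (all identifiers here are hypothetical, and the exact endpoint shape depends on your BPMN definition):
POST /persons                              -> starts the instance; the response contains its id
POST /persons/{processInstanceId}/name     -> body: {"name": "John"}
POST /persons/{processInstanceId}/age      -> body: {"age": 42}
Once both events have been delivered to the same instance, the process can continue to the rule evaluation.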

How to add a new Job to a Quartz cluster that needs to start running during the rolling update of the cluster?

We have a clustered Quartz scheduler running on a couple of application nodes. The application nodes need to be updated, and for high-availability reasons the update is done as a rolling update.
Together with the update, we need to add a new job, and that job needs to start running immediately, i.e. it can't wait until all nodes have been updated. The problem is that I can't control which node will run the new job, and if one of the old nodes runs it, the job instantiation will fail (with a ClassNotFoundException), the trigger will be set to the ERROR state, and the job won't run again.
One solution for this problem would be to do two updates: one to add the class in all nodes, and one to add the trigger. The main reason against this approach is that our ops procedures don't support this.
So is there also a way to schedule the new job and make it run reliably with a single update?
I just tried it, and it turned out that Quartz gets a ClassCastException while trying to acquire the trigger. The exception is wrapped in a JobPersistenceException and the trigger is left in the WAITING state.
So, although this could cause an error log entry in one of the old nodes, Quartz doesn't leave the trigger in a non-working state.
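For reference, a minimal sketch of how such a job and its trigger would be added (class and identity names are made up; this is plain Quartz 2.x API):
import org.quartz.*;

// Register the new job and an immediately-starting trigger in the shared job store.
JobDetail job = JobBuilder.newJob(NewJob.class)
        .withIdentity("newJob", "rollout")
        .build();
Trigger trigger = TriggerBuilder.newTrigger()
        .withIdentity("newJobTrigger", "rollout")
        .startNow()
        .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                .withIntervalInMinutes(5)
                .repeatForever())
        .build();
scheduler.scheduleJob(job, trigger);
As described above, if an old node acquires the trigger before it has the NewJob class on its classpath, the acquisition fails but the trigger stays in WAITING, so an updated node can pick it up later.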

Group Priority on a Subset of Nodes

I am using a recent build of Torque/Maui (with PBS) to schedule jobs on a cluster with heterogeneous hardware. The hardware consists of two sets of 10 nodes, and I would like two groups to each have elevated priority on one of the sets. For example:
Node set A of 10 nodes has elevated priority for User Group 1
Node set B of 10 nodes has elevated priority for User Group 2
I am familiar with how this is accomplished for all nodes, which is documented here:
http://docs.adaptivecomputing.com/maui/5.1.3priorityusage.php
However, I am unsure of the best strategy for setting this type of priority on a subset of the cluster. From what I can ascertain from the Maui docs, it may be done using node sets or partitions, but I am unsure whether either of these is correct or whether there is another strategy altogether.
Edit: I would prefer to have a single queue, as it simplifies usability and would enable a user to potentially use the entire cluster, albeit with differing priority on node sets A and B.
Thanks in advance for the help.
The way I understand the question, you've confused node allocation with job priority. Job priority determines how much more quickly Maui will run a job, as it accrues priority in the priority reservation queue. This will determine how soon a job can run, within the constraints placed on the job, relative to all other jobs in the eligible/idle queue.
That's separate from where Maui decides to place (schedule) jobs. The most natural way to handle this type of use case is with standing reservations. You can create reservations over each set of nodes (via host list, feature, or partition), and then give both groups (or everyone) access to both reservations, but apply negative affinity to everyone outside the group with preferential access.
Example:
SRCFG[rsvA] NODEFEATURES=setA
SRCFG[rsvA] GROUPLIST=group1,ALL-
SRCFG[rsvA] HOSTLIST=ALL
SRCFG[rsvB] NODEFEATURES=setB
SRCFG[rsvB] GROUPLIST=group2,ALL-
SRCFG[rsvB] HOSTLIST=ALL
With this configuration, Maui will create reservation rsvA to include only the nodes with the "setA" property/feature, and jobs from group1 will gravitate (i.e., have positive affinity) to the nodes in that reservation. Likewise, jobs from users in group2 will flow to the nodes in rsvB, with the "setB" property (as defined in the nodes file, or on NODECFG lines in the maui.cfg). This configuration works fine with a single queue, and is essentially user-transparent.
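For completeness, the node features referenced above could be assigned on NODECFG lines in maui.cfg roughly like this (hostnames are made up):
NODECFG[node01] FEATURES=setA
NODECFG[node02] FEATURES=setA
NODECFG[node11] FEATURES=setB
NODECFG[node12] FEATURES=setB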

Spring Batch FlowStep in Partitioner restart issue

Here is what I'm trying to achieve in a Spring Batch job:
A partitioner launches a FlowStep
The FlowStep consists of n step(s)
In case of failure, I want a consistent restart of the inner steps
I encounter the following issue during a restart:
Suppose I have 2 partitions; for the sake of simplicity, I use a SyncTaskExecutor. The first partition (partition0) runs fine; we now run the second partition (partition1).
The first problem is that the sub-steps of the FlowStep are detected as duplicates, because their names are not suffixed with the partition index. The steps ultimately run, though.
The consequence shows up if one sub-step fails. In that case, during a restart, since all sub-steps of the partition0 execution exited successfully, the remaining steps of partition1 won't be executed.
The main problem here is that the sub-steps of a partitioner are not indexed and are therefore detected as equivalent, even though they are not.
Additionally, I don't want to mark the sub-steps as restartable, because I just want the missing steps to be executed, not all of them.
Am I missing something at this point? Do you have an alternative for what I want to do?
I know I could also launch a real job from the partitioner (using a JobStep), but this is not as powerful as FlowStep because we are really limited in the parameters we can provide to a job (no existing ExecutionContext). The person here had the same issue, I guess: Spring batch Partitioning with multiple steps in parallel?
Thank you for your help
After digging into the Spring Batch arcana, I think I can answer my own question and maybe help some other people.
The key here is to provide our own StepHandler instead of the default SimpleStepHandler. In this handler, we can use the provided ExecutionContext to look up a predefined key that contains the current partition id. We just need to use this id to build a unique step name of the form step.getName() + ":" + id.
In order to insert this custom StepHandler, we override the default FlowStep implementation.
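A rough sketch of the StepHandler part of the idea, assuming the Partitioner stored the partition id in the ExecutionContext under a custom key named "partitionId" (the key name and wiring are illustrative; see the repository linked below for the complete version):
import org.springframework.batch.core.*;
import org.springframework.batch.core.job.SimpleStepHandler;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.repository.JobRestartException;

public class PartitionAwareStepHandler extends SimpleStepHandler {

    public PartitionAwareStepHandler(JobRepository jobRepository) {
        super(jobRepository);
    }

    @Override
    public StepExecution handleStep(Step step, JobExecution execution)
            throws JobInterruptedException, JobRestartException, StartLimitExceededException {
        // Look up the partition id that the Partitioner put into the context.
        String partitionId = execution.getExecutionContext().getString("partitionId", null);
        // Decorate the step so its name is unique per partition, giving each
        // partition its own StepExecution history in the job repository.
        Step effectiveStep = (partitionId == null)
                ? step
                : new RenamedStep(step, step.getName() + ":" + partitionId);
        return super.handleStep(effectiveStep, execution);
    }

    // Thin decorator that only changes the reported step name.
    private static class RenamedStep implements Step {
        private final Step delegate;
        private final String name;

        RenamedStep(Step delegate, String name) {
            this.delegate = delegate;
            this.name = name;
        }

        public String getName() { return name; }
        public boolean isAllowStartIfComplete() { return delegate.isAllowStartIfComplete(); }
        public int getStartLimit() { return delegate.getStartLimit(); }
        public void execute(StepExecution stepExecution) throws JobInterruptedException {
            delegate.execute(stepExecution);
        }
    }
}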
A complete example can be found here: https://github.com/miremond/spring-boot-sample-batch.

How to make multiple instances execute the same job at the same time, not concurrently

I have 4 instances of Quartz Server. All of the instances point to one ADO JobStore. All I want to do is to make each Quartz instance execute the same job at the same time.
I hope it's clear enough.
This isn't supported out of the box. Whenever a trigger fires, it can only be consumed by one instance. You could fire 4 triggers, but it is not guaranteed that the job will not run twice on one instance.
If you want each instance to fire the job once, then you will have to set up 4 separate job stores.
What I do (in Quartz.NET 2.4.1) is have multiple identical scheduler instances, which differ only in scheduler instance name (quartz.scheduler.instanceName). They register identical jobs and triggers. Because of the different scheduler instance names, the jobs and triggers are duplicated in the job store (the scheduler name is part of the primary key in every table of JobStoreTX). This causes logically identical triggers to fire on all scheduler instances at the same time. They are actually separate triggers, though, so each will handle misfires etc. separately.
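For illustration, the per-node configuration could look roughly like this (names are made up; everything except the instance name is identical on all four nodes):
# node 1
quartz.scheduler.instanceName = SchedulerNode1
# node 2
quartz.scheduler.instanceName = SchedulerNode2
# ... and so on for nodes 3 and 4; shared on all nodes:
quartz.jobStore.type = Quartz.Impl.AdoJobStore.JobStoreTX, Quartz
Each scheduler then registers the same jobs and triggers at startup; because the scheduler name is part of the primary key, the rows don't collide.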