How to clear triggers in a Quartz Scheduler

If I have a Quartz scheduler running with a bunch of triggers and I want to clear out all the triggers, what is the best way to do that?
I've considered iterating over the groups and names, calling unschedule as I go, but that seems very slow when there are thousands of triggers in place (around 2s to unschedule 10 triggers).
A rudimentary test case (schedule 1000 triggers, then delete them in batches of 100) shows that the cost of each unschedule operation grows with the number of triggers still scheduled, making the total cost of clearing them all far worse than linear:
    Deleted 100 triggers in 3594ms,35.94 triggers/ms
    Deleted 100 triggers in 2734ms,13.67 triggers/ms
    Deleted 100 triggers in 2453ms,8.176666666666666 triggers/ms
    Deleted 100 triggers in 1985ms,4.9625 triggers/ms
    Deleted 100 triggers in 1547ms,3.094 triggers/ms
    Deleted 100 triggers in 1281ms,2.135 triggers/ms
    Deleted 100 triggers in 1047ms,1.4957142857142858 triggers/ms
    Deleted 100 triggers in 765ms,0.95625 triggers/ms
    Deleted 100 triggers in 485ms,0.5388888888888889 triggers/ms
    Deleted 100 triggers in 156ms,0.156 triggers/ms
I can't find any kind of bulk methods to clear things out.
I finally considered stopping the scheduler and cutting it loose for garbage collection, but I'm not sure if there's anything else I might need to tidy up to make sure it is not referenced anywhere.
Anyone got a view on the best approach here?

How many triggers and jobs are you going to have?
When triggers are stored in memory in map-like structures, large numbers of them impose a heavy memory requirement, and the more triggers there are, the slower the operations become.
Have you considered a database store? You could then take advantage of SQL's ability to operate on sets of data and delete a whole group of triggers, or all triggers matching a pattern, with a single command.
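If upgrading is an option, it's also worth noting that the Quartz 2.x Scheduler interface added bulk operations (this is an assumption about your version; the question reads as pre-2.x). A minimal sketch:

    import java.util.ArrayList;
    import org.quartz.Scheduler;
    import org.quartz.SchedulerException;
    import org.quartz.TriggerKey;
    import org.quartz.impl.matchers.GroupMatcher;

    public class BulkUnschedule {
        // Fetch every trigger key in one query, then remove them in one bulk call.
        static void clearAllTriggers(Scheduler scheduler) throws SchedulerException {
            ArrayList<TriggerKey> keys = new ArrayList<TriggerKey>(
                    scheduler.getTriggerKeys(GroupMatcher.anyTriggerGroup()));
            scheduler.unscheduleJobs(keys);
        }
    }

There is also Scheduler.clear(), which deletes all jobs, triggers and calendars in a single call, if wiping everything rather than just the triggers is acceptable.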

I think your best bet is to just use the shutdown method of the scheduler. You can call scheduler.shutdown(true) if you want all currently executing jobs to finish before the scheduler is shut down. According to the Quartz API documentation, calling shutdown cleans up all resources associated with the Scheduler.
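A minimal sketch of that approach, assuming the default StdSchedulerFactory (note that with a RAMJobStore the triggers die with the scheduler, while a JDBC store would keep them):

    import org.quartz.Scheduler;
    import org.quartz.SchedulerException;
    import org.quartz.impl.StdSchedulerFactory;

    public class RecycleScheduler {
        // Shut the old scheduler down (waiting for running jobs), then build a fresh one.
        static Scheduler recycle(Scheduler old) throws SchedulerException {
            old.shutdown(true); // a shut-down scheduler cannot be restarted
            Scheduler fresh = new StdSchedulerFactory().getScheduler();
            fresh.start();
            return fresh; // drop remaining references to 'old' so it can be garbage collected
        }
    }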

Related

Is using "pg_sleep" in plpgsql Procedures/Functions while using multiple worker background processes concurrently bad practice?

I am running multiple background worker processes in my Postgres database, using the pg_cron extension. Unfortunately I cannot use pg_timetable, as suggested by another user here.
Thus, I have 5 dependent "Jobs" that need 1 other independent Procedure/Function to execute and complete before they can start. I originally had my cron jobs simply check some "job_log" table I created every 30 minutes or so to see if the independent job had completed (i.e. if yes, execute the procedure; if not, return out of the procedure and check at the next cron interval).
However, I believe I could greatly simplify the way I am triggering/orchestrating all these jobs/procedures if I utilize pg_sleep and start all the jobs at one time (so no more checking every 30 minutes). I would be running these jobs at night, concurrently, so I believe it shouldn't affect my actual traffic that much.
i.e.
    WHILE some_variable != some_condition LOOP
        PERFORM pg_sleep(1);
        some_variable := some_value; -- re-check/update the variable here
    END LOOP;
My question is:
Would starting all these jobs at one time (i.e. setting a concrete time in the cron expression, e.g. 15 18 * * *) and utilizing pg_sleep be bad practice/inefficient, since I would be idling 5 background workers while the 1 job completes? The 1 job these are dependent on could take any amount of time to finish, i.e. 15 min, 30 min, 1 hr (it should be < 1 hr, though).
Or is it better to simply use a cron expression to check every 5 minutes or so whether the main/independent job is done, so that my other, dependent jobs can then run?
Running two schedulers, one of them home-built, seems more complex than just running one scheduler that does 2 (or 6, 1+5, however you count it) different things. If your goal is to make things simpler, does your proposal really achieve that?
I wouldn't worry about 5 backends sleeping on pg_sleep at the same time. You might worry about them holding back the xid horizon while they do so, which would make vacuuming and HOT pruning less effective; but if you already have one long-running task holding one snapshot (the thing they are waiting for), then more of them aren't going to make matters worse.

Queues: How to process dependent jobs

I am working on an application where multiple clients will be writing to a queue (or queues), and multiple workers will be processing jobs off the queue. The problem is that in some cases, jobs are dependent on each other. By 'dependent', I mean they need to be processed in order.
This typically happens when an entity is created by the user, then deleted shortly after. Obviously I want the first job (i.e. the creation) to take place before the deletion. The problem is that creation can take a lot longer than deletion, so I can't guarantee that it will be complete before the deletion job commences.
I imagine that this type of problem is reasonably common with asynchronous processing. What strategies are there to deal with it? I know that I can assign priorities to queues to have some control over the processing order, but this is not good enough in this case. I need concrete guarantees.
This may not fit your model, but the model I have used involves not providing the deletion functionality until the creation functionality is complete.
When the Create_XXX command is completed, it is responsible for raising an XXX_Created event, which also gets put on the queue. This event can then be handled to enable the deletion functionality, allowing the deletion of the newly created item.
A Command completing and then raising an event, which is handled and creates another Command, is a common method of ensuring Commands get processed in the desired order. A sketch of this chain follows.
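Here is a minimal sketch of that chain, using a single in-process queue and hypothetical message names (CreateCommand, CreatedEvent and DeleteCommand are illustrations, not any particular framework's API); it needs Java 16+ for records:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class CommandChainDemo {
        interface Message {}
        record CreateCommand(String id) implements Message {}
        record CreatedEvent(String id) implements Message {}
        record DeleteCommand(String id) implements Message {}

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<Message> queue = new LinkedBlockingQueue<>();
            // The client only ever enqueues the creation; deletion is not
            // available until the CreatedEvent has been observed.
            queue.put(new CreateCommand("42"));
            while (!queue.isEmpty()) {
                Message m = queue.take();
                if (m instanceof CreateCommand c) {
                    System.out.println("created " + c.id());
                    queue.put(new CreatedEvent(c.id())); // completion raises the event
                } else if (m instanceof CreatedEvent e) {
                    queue.put(new DeleteCommand(e.id())); // deletion is now safe to issue
                } else if (m instanceof DeleteCommand d) {
                    System.out.println("deleted " + d.id());
                }
            }
        }
    }

The ordering guarantee comes from the fact that a DeleteCommand simply cannot exist on the queue before the corresponding CreatedEvent has been handled.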
I think a handy feature for your use case is job chaining:
https://laravel.com/docs/5.5/queues#job-chaining

Work around celerybeat being a single point of failure

I'm looking for a recommended solution to work around celerybeat being a single point of failure for a celery/rabbitmq deployment. I haven't found anything that makes sense so far by searching the web.
In my case, a timed scheduler kicks off a series of jobs once a day that could run for half a day or longer. Since there can only be one celerybeat instance, if something happens to it or to the server it's running on, critical jobs will not be run.
I'm hoping there is already a working solution for this, as I can't be the only one who needs a reliable (clustered or the like) scheduler. I don't want to resort to some sort of database-backed scheduler if I don't have to.
There is an open issue about this in the celery GitHub repo. I don't know if they are working on it, though.
As a workaround you could add a lock for tasks so that only 1 instance of a specific PeriodicTask will run at a time.
Something like:
    if not cache.add('My-unique-lock-name', True, timeout=lock_timeout):
        return  # lock is already held: another celerybeat instance queued this run
Figuring out the lock timeout is, well, tricky. We're using 0.9 * the task's run_every seconds, in case different celerybeats try to run them at different times.
The 0.9 is just to leave some margin (e.g. when celery is a little behind schedule once, then it is back on schedule, which would cause the lock to still be active).
Then you can use a celerybeat instance on all machines. Each task will be queued by every celerybeat instance, but only one of them will complete the run.
Tasks will still respect run_every this way; worst case scenario, tasks will run at 0.9 * run_every speed.
One issue with this approach: if tasks were queued but not processed at the scheduled time (for example because the queue processors were unavailable), then the lock may be placed at the wrong time, possibly causing the next task to simply not run. To get around this you would need some kind of detection mechanism for whether a task is more or less on time.
Still, this shouldn't be a common situation in production use.
Another solution is to subclass the celerybeat Scheduler and override its tick method. Then, for every tick, add a lock before processing tasks. This ensures that celerybeats with the same periodic tasks won't queue the same tasks multiple times. Only one celerybeat per tick (the one that wins the race condition) will queue the tasks. If one celerybeat goes down, another one will win the race on the next tick.
This of course can be used in combination with the first solution.
Of course, for this to work the cache backend needs to be replicated and/or shared across all of the servers.
It's an old question but I hope it helps someone.

Quartz.net scheduler and IStatefulJob

I am wondering if I am understanding this right.
http://quartznet.sourceforge.net/apidoc/
IStatefulJob instances follow slightly different rules from regular IJob instances. The key difference is that their associated JobDataMap is re-persisted after every execution of the job, thus preserving state for the next execution. The other difference is that stateful jobs are not allowed to Execute concurrently, which means new triggers that occur before the completion of the IJob.Execute method will be delayed.
Does this mean all triggers will be delayed until another trigger is done? If so, how can I make it so that only the same trigger will not fire until its previous run is done?
Say I have trigger A that fires every minute, but for some reason it is slow and takes a minute and a half to execute. If I just use a plain IJob, the next one would fire before it finishes, and I don't want this. I want to stop trigger A from firing again until it is done.
However, at the same time I have trigger B that also fires every minute. It is going at normal speed and finishes every minute on time. I don't want trigger B to be held up because of trigger A.
From my understanding this is what would happen if I use IStatefulJob.
In short: this behavior comes from the job's side. So regardless of how many triggers you may have, only a single instance of a given IStatefulJob (the job name and job group dictate its identity) runs at a time. There may be two instances of the same job type, but no two same-named jobs (name, group) running concurrently if the job implements IStatefulJob.
If a trigger misses its fire time because of this, the misfire instructions come into play. A trigger that misses its next fire because the earlier invocation is still running decides what to do based on its misfire instruction (see the API documentation and tutorial).
With a plain IJob you have no guarantees about how many instances will be running at the same time if you have multiple triggers for it and/or misfires are happening. IJob is just the contract interface for invoking the job. Quartz.NET 2.0 will split IStatefulJob's combined behavior into two separate attributes: DisallowConcurrentExecution and PersistJobDataAfterExecution.
So you could use the same job type (an IStatefulJob) with two definitions (different job names) and triggers with applicable misfire instructions.
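For comparison, the Java version of Quartz 2.x exposes the non-concurrency half of that split as an annotation; a minimal sketch (SlowJob is an illustrative name):

    import org.quartz.DisallowConcurrentExecution;
    import org.quartz.Job;
    import org.quartz.JobExecutionContext;
    import org.quartz.JobExecutionException;

    // Blocks concurrent runs of the same job definition (name, group) only;
    // jobs with other keys, such as trigger B's job, are unaffected.
    @DisallowConcurrentExecution
    public class SlowJob implements Job {
        public void execute(JobExecutionContext context) throws JobExecutionException {
            // long-running work; a trigger that fires for this JobKey while it
            // is still executing will be delayed per its misfire instruction
        }
    }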

how to make multiple instances execute the same job at the same time not concurrently

I have 4 instances of Quartz Server. All of the instances point to one ADO JobStore. All I want to do is make each Quartz instance execute the same job at the same time.
I hope it's clear enough.
This isn't supported out of the box. Whenever a trigger fires, it can only be consumed by one instance. You could fire 4 triggers, but it is not guaranteed that the job will not run twice on one instance.
If you want each instance to fire the job once, then you will have to set up 4 separate job stores.
What I do (in Quartz.NET 2.4.1) is run multiple identical scheduler instances which differ only in scheduler instance name (quartz.scheduler.instanceName). They register identical jobs and triggers. Because of the different scheduler instance names, the jobs and triggers are duplicated in the job store (the scheduler name is part of the primary key in every table of JobStoreTX). This causes logically identical triggers to fire on all scheduler instances at the same time. They are actually separate triggers, though, so each instance will handle misfires etc. separately.
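For the Java version of Quartz, the same trick would look roughly like this (org.quartz.scheduler.instanceName is the Java counterpart of quartz.scheduler.instanceName; the thread pool settings are just the minimum StdSchedulerFactory needs when given explicit properties, and the names here are illustrative):

    import java.util.Properties;
    import org.quartz.Scheduler;
    import org.quartz.SchedulerException;
    import org.quartz.impl.StdSchedulerFactory;

    public class NamedScheduler {
        // Start a scheduler whose instance name distinguishes its rows in the job store.
        static Scheduler start(String instanceName) throws SchedulerException {
            Properties props = new Properties();
            props.setProperty("org.quartz.scheduler.instanceName", instanceName);
            props.setProperty("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
            props.setProperty("org.quartz.threadPool.threadCount", "4");
            Scheduler s = new StdSchedulerFactory(props).getScheduler();
            s.start();
            return s;
        }
    }

Registering identical jobs and triggers on schedulers started with different instance names then yields the duplicated-trigger behavior described above.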