Delay Celery task based on condition

Is there any way to delay a Celery task from running based on a condition? Before it moves from scheduled to active, I would like to perform a quick check to see whether my machine can run the task, based on the arguments provided and my machine's state at the time. If it can't, the scheduled queue should halt and wait until the condition is satisfied.
I've looked at the following options, but they didn't seem to cut it:
Celery's signals: the closest thing I could find is task_prerun(), but regardless of what I put in there, the task still gets run, and it doesn't stop the other scheduled tasks from running. There's also worker_ready(), but that doesn't see the upcoming task's arguments, so it can't do the check.
Database lock (also mentioned here as well): I can have each task start running normally and then do the check at the beginning of the task's run, but if I poll on a periodic interval until the condition is met, I lose the order of the active queue, because the condition can become true at any point and any one of the many active tasks could continue. This is where the database lock comes in, and it is so far the most feasible solution: I acquire a lock every time I do the check, and if the condition isn't met, it stays locked. When the condition is finally met, I release the lock for the next item in the queue, preserving the queue's original order (see the sketch below).
I find it surprising that Celery doesn't have this functionality built in, to specify if/when the next item in the scheduled queue is ready to be run.
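In case it helps, here is a minimal sketch of that check-at-the-start-of-the-task idea combined with a shared lock, using Redis as the lock backend (condition_is_met(), the lock name, and the timeouts are all placeholders, and with more than one worker process the ordering is only as strict as your prefetch/concurrency settings allow):

import time
import redis
from celery import Celery

app = Celery('tasks', broker='amqp://localhost')
redis_client = redis.Redis()

def condition_is_met(args):
    # Placeholder: check the machine's state against the task's arguments.
    ...

@app.task(bind=True)
def guarded_task(self, *args):
    # Every task funnels through the same lock in the order the worker
    # picks them up, so the queue's original order is preserved.
    with redis_client.lock('run-gate', timeout=600):
        while not condition_is_met(args):
            time.sleep(5)  # poll until the machine can run this task
    # Condition satisfied: do the actual work here.
    ...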

Related

Detecting outstanding celery tasks

This is purely for a non-eager pytest mode of operation. I want to know when Celery has caught up with all the outstanding work. Is there any way to find that information? My testing config has a celery_session_app and a single celery_session_worker in its own thread.
Check the number of entries in the Rabbit queue. This has problems because of prefetch. I can set prefetch to 1 and maybe solve it that way, but I worry about race conditions. (I'm testing chords, and some Celery tasks queue other Celery tasks.)
Add a task to the "end" of the list and then .wait() on it to finish. This has problems for tasks that queue other tasks, because the queue is being extended in the other thread: I can be at the end of the list when queued, but that position quickly moves forward as tasks are queued behind it. I can work around this using .apply_async(countdown=3), but that is pretty much the definition of a race condition; I might need countdown=4, or I might need nothing, and either way some number of seconds is wasted on every test.
Use signals (somehow). But what I really need is a worker_is_bored signal, which does not exist and would suffer from the same kind of race conditions mentioned above: tasks queueing tasks could make it flash "bored" and then go right back to "busy".
time.sleep(N), but what should N be? (I'm running pytest -n 10, so how busy the machine is during tests is non-trivial.) And this wastes time, like countdown= above.
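One more option in the spirit of the missing worker_is_bored: poll the worker's active and reserved task lists through the inspect API until they stay empty for a few consecutive polls. A rough sketch (the timeout, poll interval, and "settle" count are arbitrary, and this still can't see messages sitting in the broker that haven't been prefetched yet):

import time

def wait_for_idle(app, timeout=30.0, poll=0.2, settle=3):
    # Wait until no active or reserved tasks remain for `settle` polls in a
    # row, to ride out tasks that queue other tasks.
    inspect = app.control.inspect()
    quiet = 0
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        active = inspect.active() or {}
        reserved = inspect.reserved() or {}
        busy = any(active.values()) or any(reserved.values())
        quiet = 0 if busy else quiet + 1
        if quiet >= settle:
            return
        time.sleep(poll)
    raise TimeoutError('worker still busy after %s seconds' % timeout)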

Stop recurring job during specific times

We have a job on our SQL database that runs periodically forever.
During predefined maintenance periods, we would like to have this job stop for a set time (say 12 hours) and then restart the regular periodic schedule.
We've tried using a separate job that disables it at the predefined time and a second one that re-enables it. This works, but it is not very neat.
Is there a better way to do this that only involves the job itself?
Make a "maintenance schedule" table in some service database or MSDB (StartDate, EndDate, Description, etc.). Let the first step of your job check if current datetime within maintenance period. If so, just do nothing.
If a session or transaction is associated with the maintenance process, then you could use an application lock to have the regular job wait, or terminate, if it attempts to run while the maintenance is in progress.
Using a locking mechanism allows finer control over the processes, e.g. the regular job can release and reacquire the lock between steps and wait (or terminate) if the maintenance process has started. Alternatively, the maintenance process could wait for the regular job to terminate (or reach a suitable checkpoint) before proceeding.
See sp_getapplock for additional information.

FreeRTOS task never gets swapped

According to the FreeRTOS task scheduling documentation, the kernel can swap a task out even if the task is currently executing and hasn't called any blocking function. So once the kernel gets the tick interrupt and is executing its ISR, it can schedule another task to execute after that.
On my system with FreeRTOS, I launch 5 tasks, each of which is programmed to delay itself at some point, and therefore I can see all tasks being swapped in and out, and each task executes at some point. But if I enter an infinite loop inside a task, that task NEVER gets swapped out.
How is that possible?
Firstly you need to ensure that configUSE_TIME_SLICING is set. This enables the round robin scheduler, which allows the scheduler to do what you are expecting.
Also it will only switch to another task if it is of equal or higher priority.

Zookeeper priority queue

My problem description is as follows:
I have n state-based database crawlers that run forever.
How it currently works:
We are using a single machine for crawling.
We have three levels of priority queue: HIGH, MEDIUM, and LOW.
At the start, all database jobs are put into the lowest-level queue.
A worker reads a job from the queue and performs the operation.
After finishing a job, it reschedules the job with a delay of 5 minutes.
Solution I found
For the priority queue I can use:
http://zookeeper.apache.org/doc/r3.2.2/recipes.html#sc_recipes_priorityQueues
Problems I am still searching for solutions to:
1. How to reschedule a job in the queue with a future schedule time. Is there a way to do that in ZooKeeper?
2. Cancelling an already started job. Suppose the user changes their database authentication details; I want to stop the already running job for that database and restart it with the new details. What I thought is that when a job starts, the worker will subscribe to that job's znode changes, and if something changes, it will stop the job and reschedule it.
3. Infinite queue. What I thought is that after finishing a job, the worker will remove it from the queue and re-add it with a future schedule time. (Its implementation depends on point 1.)
Is this the correct way of implementing such an infinite task?
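Not a complete answer, but a rough sketch of how points 1 and 3 could look with kazoo and plain znodes: each job is a child znode whose data carries a "not before" timestamp, the worker only picks up jobs that are due, and after finishing it re-creates the job with a future timestamp (all paths and field names here are made up):

import json
import time
from kazoo.client import KazooClient

JOBS_PATH = '/crawler/jobs'        # made-up path
RESCHEDULE_DELAY = 5 * 60          # 5 minutes, as in the description

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()
zk.ensure_path(JOBS_PATH)

def crawl(payload):
    # Placeholder for the actual crawling work.
    ...

def schedule(job_id, payload, not_before):
    # (Re)create the job znode with the time it becomes runnable (points 1 and 3).
    data = json.dumps({'not_before': not_before, 'payload': payload}).encode()
    zk.create('%s/%s' % (JOBS_PATH, job_id), data)

def run_due_jobs():
    for job_id in zk.get_children(JOBS_PATH):
        raw, _stat = zk.get('%s/%s' % (JOBS_PATH, job_id))
        job = json.loads(raw)
        if job['not_before'] > time.time():
            continue                               # not due yet
        zk.delete('%s/%s' % (JOBS_PATH, job_id))
        crawl(job['payload'])
        schedule(job_id, job['payload'], time.time() + RESCHEDULE_DELAY)

For point 2, kazoo's DataWatch on the job's znode is one way for a running worker to get notified when the credentials change, so it can stop the job and reschedule it.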

Work around celerybeat being a single point of failure

I'm looking for a recommended solution to work around celerybeat being a single point of failure for a celery/rabbitmq deployment. Searching the web, I haven't found anything that makes sense so far.
In my case, a once-a-day timed scheduler kicks off a series of jobs that could run for half a day or longer. Since there can only be one celerybeat instance, if something happens to it or to the server it's running on, critical jobs will not be run.
I'm hoping there is already a working solution for this, as I can't be the only one who needs a reliable (clustered or the like) scheduler. I don't want to resort to some sort of database-backed scheduler if I don't have to.
There is an open issue in the Celery GitHub repo about this. I don't know if they are working on it, though.
As a workaround, you could add a lock for tasks so that only one instance of a specific PeriodicTask runs at a time.
Something like:
if not cache.add('My-unique-lock-name', True, timeout=lock_timeout):
    return
Figuring out the lock timeout is, well, tricky. We're using 0.9 * the task's run_every seconds, in case different celerybeats try to run the tasks at slightly different times.
The factor 0.9 is just to leave some margin (e.g. when Celery is a little behind schedule once, then back on schedule, which would otherwise cause the lock to still be active).
Then you can run a celerybeat instance on every machine. Each task will be queued by every celerybeat instance, but only one of them will acquire the lock and finish the run.
Tasks will still respect run_every this way; worst-case scenario, tasks will run at 0.9 * run_every speed.
One issue with this approach: if tasks were queued but not processed at the scheduled time (for example because the queue processors were unavailable), then the lock may be placed at the wrong time, possibly causing the next task to simply not run. To get around this you would need some kind of detection mechanism for whether the task is more or less on time.
Still, this shouldn't be a common situation in production use.
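Put together, the pattern looks roughly like this (a sketch assuming Django's cache with a shared backend such as memcached or Redis; the task and lock names are illustrative):

from celery import shared_task
from django.core.cache import cache   # must be a backend shared by all servers

RUN_EVERY = 60 * 60                    # the task's run_every, in seconds
LOCK_TIMEOUT = int(0.9 * RUN_EVERY)    # 0.9 * run_every, as described above

def do_the_actual_work():
    # Placeholder for the real job.
    ...

@shared_task
def my_periodic_task():
    # cache.add is atomic: only the first queued copy gets the lock; the
    # copies queued by the other celerybeat instances return immediately.
    if not cache.add('my-periodic-task-lock', True, timeout=LOCK_TIMEOUT):
        return
    do_the_actual_work()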
Another solution is to subclass the celerybeat Scheduler and override its tick method. Then, for every tick, acquire a lock before processing tasks. This makes sure that celerybeats with the same periodic tasks won't queue the same tasks multiple times. Only one celerybeat per tick (the one that wins the race) will queue tasks. If one celerybeat goes down, another one will win the race on the next tick.
This of course can be used in combination with the first solution.
Of course, for this to work, the cache backend needs to be replicated and/or shared across all of the servers.
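A rough sketch of that tick-override idea, again assuming a shared Django cache (the lock key, timeout, and retry interval are made up):

from celery.beat import Scheduler
from django.core.cache import cache

class LockedScheduler(Scheduler):
    def tick(self, *args, **kwargs):
        # Only the beat instance that wins this add() queues tasks for this
        # tick; the others just wake up again a few seconds later.
        if not cache.add('celerybeat-tick-lock', True, timeout=50):
            return 5.0                       # seconds until the next tick
        return super().tick(*args, **kwargs)

Each beat instance would then be started with something like celery beat -S myapp.schedulers:LockedScheduler (the module path here is hypothetical).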
It's an old question, but I hope this helps someone.