Change timeout for builtin celery tasks (i.e. celery.backend_cleanup) - celery

We're using Celery 4.2.1 and Redis, with global soft and hard time limits set for our tasks. All of our custom tasks are designed to stay under the limits, but every day the builtin backend_cleanup task ends up forcibly killed by the timeouts.
I'd rather not have to raise our global timeout just to accommodate builtin Celery tasks. Is there a way to set the timeout of these builtin tasks directly?
I've had trouble finding any documentation on this or even anyone hitting the same problem.
Relevant source from celery/app/builtins.py:
@connect_on_app_finalize
def add_backend_cleanup_task(app):
    """Task used to clean up expired results.
    If the configured backend requires periodic cleanup this task is also
    automatically configured to run every day at 4am (requires
    :program:`celery beat` to be running).
    """
    @app.task(name='celery.backend_cleanup', shared=False, lazy=False)
    def backend_cleanup():
        app.backend.cleanup()
    return backend_cleanup

You can set the backend cleanup schedule directly in celery.py:
app.conf.beat_schedule = {
    'backend_cleanup': {
        'task': 'celery.backend_cleanup',
        'schedule': 600,  # 10 minutes
    },
}
Then run the celery beat process:
celery -A YOUR_APP_NAME beat -l info --detach
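On the timeout itself: the 'options' dict of a beat entry is passed on to apply_async, which accepts per-call soft_time_limit and time_limit. The following is only a sketch. It assumes those options are honoured for the builtin task in your Celery version, and that using the key celery.backend_cleanup replaces beat's default entry instead of adding a second one. The limit values are placeholders, not recommendations.

```python
# Sketch only: the limit values below are illustrative assumptions.
CLEANUP_SOFT_LIMIT = 30 * 60  # seconds
CLEANUP_HARD_LIMIT = 35 * 60  # seconds

beat_schedule = {
    # Assumption: this key overrides beat's default cleanup entry.
    "celery.backend_cleanup": {
        "task": "celery.backend_cleanup",
        "schedule": 600,  # seconds, same interval as in the answer above
        "options": {
            "soft_time_limit": CLEANUP_SOFT_LIMIT,
            "time_limit": CLEANUP_HARD_LIMIT,
        },
    },
}

# In celery.py this dict would be assigned with:
#   app.conf.beat_schedule = beat_schedule
```

This keeps the global limits untouched for every other task and only relaxes them for the cleanup run.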

Related

Dash Celery setup

I have a docker-compose setup for my Dash application, and I need a suggestion on the preferred way to set up my Celery image.
I am using Celery for the following use cases, all of which are cancellable/abortable/revocable tasks:
Upload file
Model training
Create train, test set
Case-1. Create one service named celery:
command: ["celery", "-A", "tasks", "worker", "--loglevel=INFO", "--pool=prefork", "--concurrency=3", "--statedb=/celery/worker.state"]
Here we use the default queue, a single worker (main) and 3 child worker processes (i.e. it can execute 3 tasks simultaneously).
If I revoke a task, will it kill the main worker, or just the child worker process executing that task?
Case-2. Create three services named celery-{task_name}, e.g. celery-upload:
command: ["celery", "-A", "tasks", "worker", "--loglevel=INFO", "--pool=prefork", "--concurrency=1", "--statedb=/celery/worker.state", "--queues=upload_queue", "--hostname=celery_worker_upload_queue"]
Here we use a custom queue, a single worker (main) and 1 child worker process (i.e. it can execute 1 task at a time) in its own container, so there is one service per task.
If I revoke a task, will it kill the main worker, or only the child worker process executing that task in the respective container, leaving the other Celery containers alive?
I tried the signals below with task.revoke(terminate=True):
SIGKILL and SIGTERM
With these, I observed that both @worker_process_shutdown.connect and @task_revoked.connect get fired.
Does this mean that the main worker and the child worker process for which the revoke was issued (or all child processes, since the main worker is down) are down?
SIGUSR1
With this, I observed that only @task_revoked.connect gets fired.
Does this mean that the main worker is still alive and only the child worker process for which the revoke was issued is down?
Which case is preferred?
Is it possible to combine both cases, i.e. have a single Celery service with individual workers (main), individual child worker processes and individual queues? Or
have a single Celery service with a single worker (main), dedicated child worker processes and individual queues for the respective tasks?
One more question: I think Celery is required for the tasks listed above, but if I have a button for cleaning a dataframe, does that require Celery too?
In other words, do I need to use Celery wherever I am dealing with dataframes?
Please suggest.
UPDATE-2
worker processes = child-worker-process
This is how I am using it:
# Start button
result = background_task_job_one.apply_async(args=(n_clicks,), queue="upload_queue")
# Cancel button
result = result_from_tuple(data, app=celery_app)
result.revoke(terminate=True, signal=signal.SIGUSR1)
# Task
@celery_app.task(bind=True, name="job_one", base=AbortableTask)
def background_task_job_one(self, n_clicks):
    msg = "Aborted"
    status = False
    try:
        msg = job(n_clicks)  # Long running task
        status = True
    except SoftTimeLimitExceeded as e:
        self.update_state(task_id=self.request.id, state=states.REVOKED)
        msg = "Aborted"
        status = True
        raise Ignore()
    finally:
        print("FINaLLY")
    return status, msg
Is this an acceptable way to handle cancellation of a running task? Can you elaborate on this line: [In practice you should not send signals directly to worker processes.]
Just to clarify this line: [In prefork concurrency (the default) you will always have at least two processes running - Celery worker (coordinator) and one or more Celery worker-processes (workers)]
Does this mean that
celery -A app worker -P prefork -> 1 main worker and 1 child-worker-process
is the same as
celery -A app worker -P prefork -c 1 -> 1 main worker and 1 child-worker-process?
Earlier, I tried using the AbortableTask class and calling abort(). It successfully updated the state and status to ABORTED, but the task was still alive/running.
I read that to terminate a currently executing task, terminate=True must be passed.
This works: the task stops executing, and I need to update the task state and status manually to REVOKED, otherwise they stay at the default PENDING. The only hard decision is whether to use SIGKILL, SIGTERM or SIGUSR1. I found that with SIGUSR1 the main worker process stays alive and only the child worker process executing that task is revoked.
Also, luckily I found this link, which suggests I can set up a single Celery service with multiple dedicated child worker processes, each with its own dedicated queue.
Case-3: Celery multi
command: ["celery", "multi", "show", "start", "default", "model", "upload", "-c", "1", "-l", "INFO", "-Q:default", "default_queue", "-Q:model", "model_queue", "-Q:upload", "upload_queue", "-A", "tasks", "-P", "prefork", "-p", "/proj/external/celery/%n.pid", "-f", "/proj/external/celery/%n%I.log", "-S", "/proj/external/celery/worker.state"]
But I get this error:
celery service exited code 0
command: bash -c "celery multi start default model upload -c 1 -l INFO -Q:default default_queue -Q:model model_queue -Q:upload upload_queue -A tasks -P prefork -p /proj/external/celery/%n.pid -f /proj/external/celery/%n%I.log -S /proj/external/celery/worker.state"
Here I also get an error:
celery | Usage: python -m celery worker [OPTIONS]
celery | Try 'python -m celery worker --help' for help.
celery | Error: No such option: -p
celery | * Child terminated with exit code 2
celery | FAILED
Some questions: which is preferred, 1 worker or multiple workers?
With multiple workers and dedicated queues, creating a docker service for each task makes the docker-compose file and the number of services grow. So I am trying a single Celery service with multiple dedicated child worker processes, each with its own dedicated queue, which makes it easy to abort/revoke/cancel a task.
But I get the error above with case-3, i.e. celery multi.
Please suggest.
If you revoke a task, it may terminate the worker process that was executing the task. The Celery worker itself will continue working, as it needs to coordinate the other worker processes. If the life of the container is tied to the Celery worker, then the container will continue running.
In practice you should not send signals directly to worker processes.
In prefork concurrency (the default) you will always have at least two processes running - Celery worker (coordinator) and one or more Celery worker-processes (workers).
To answer the last question we may need more details. It would be easier if you could run a single Celery task once all dataframes are available. If that is not the case, then perhaps run individual tasks to process the dataframes. It is worth having a look at Celery workflows to see whether you can build a chunked workflow. Keep it simple: start with the assumption that you have all dataframes available at once, and build from there.
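The coordinator/worker-process split described above can be illustrated with plain Python. This is only an analogy under stated assumptions, not Celery's own code: a parent process (standing in for the coordinator) starts a child (standing in for a pool process running a long task), terminates it with a signal, and keeps running afterwards. It assumes a POSIX system; terminate_child is a hypothetical helper name.

```python
import signal
import subprocess
import sys


def terminate_child(sig=signal.SIGTERM):
    """Start a sleeping child process, signal it, and return its exit code."""
    # Stand-in for a pool process busy with a long-running task.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(sig)
    child.wait(timeout=10)
    # On POSIX a negative return code means "killed by that signal number".
    return child.returncode


returncode = terminate_child()
# The parent reaches this line, i.e. it survived the child's termination.
print("child exit code:", returncode)
```

In a real deployment the signal is delivered by the worker in response to revoke(terminate=True), which is why sending signals to pool processes by hand is discouraged.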

How to prevent celery.backend_cleanup from executing in default queue

I am using python + flask + SQS and I'm also using celery beat to execute some scheduled tasks.
Recently I went from having one single default "celery" queue to execute all my tasks to having dedicated queues/workers for each task. This includes tasks scheduled by celery beat which now all go to a queue named "scheduler".
Before dropping the "celery" queue, I monitored it to see if any tasks would wind up in that queue. To my surprise, they did.
Since I had no worker consuming from that queue, I could easily inspect the messages that piled up using the AWS console. What I saw was that all of the tasks were celery.backend_cleanup!
I cannot find in the Celery docs how to prevent celery.backend_cleanup from being tossed into this default "celery" queue, which I want to get rid of. The docs on beat do not show an option to pass a queue name either. So how do I do this?
This is how I am starting celery beat:
/venv/bin/celery -A backend.app.celery beat -l info --pidfile=
And this is how I am starting the worker:
/venv/bin/celery -A backend.app.celery worker -l info -c 2 -Ofair -Q scheduler
Keep in mind, I don't want to stop backend_cleanup from executing, I just want it to go in whatever queue I specify.
Thanks ahead for the assistance!
You can override this in the beat task setup. You could also change the scheduled time to run here if you wanted to.
app.conf.beat_schedule = {
    'backend_cleanup': {
        'task': 'celery.backend_cleanup',
        'options': {'queue': <name>,
                    'exchange': <name>,
                    'routing_key': <name>}
    }
}
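A filled-in version of the snippet above, as a sketch under several assumptions: the target queue is the "scheduler" queue from the question, the queue name alone is enough for routing in your setup (so exchange and routing_key are left out), and an explicit schedule is given as a number of seconds because beat entries normally carry one. The key celery.backend_cleanup is used on the assumption that it replaces beat's builtin entry rather than adding a second one. cleanup_entry is a hypothetical helper, not part of Celery.

```python
ONE_DAY = 24 * 60 * 60  # seconds


def cleanup_entry(queue, schedule_seconds=ONE_DAY):
    """Build a beat_schedule entry that sends celery.backend_cleanup to `queue`."""
    return {
        "task": "celery.backend_cleanup",
        "schedule": schedule_seconds,
        "options": {"queue": queue},
    }


beat_schedule = {"celery.backend_cleanup": cleanup_entry("scheduler")}

# In the app module this would be assigned with:
#   app.conf.beat_schedule = beat_schedule
```

After restarting beat, watching the old "celery" queue for a day or two should show whether any cleanup messages still land there.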

Is a crontab enough to schedule a celery task to run periodically?

I have a periodic task that uses a crontab to run every day at 1:01 AM using
run_every = crontab(hour=1, minute=1)
Once I get my server up and running, is that enough to trigger the task to run once a day? Or do I also need to use a database scheduler?
Yes, that should be enough, provided a celery beat process is running. By default Celery beat keeps its own state file (celerybeat-schedule), so you do not need a database scheduler for this.

How to have celery expire results when using a database backend

I'm not sure I understand how result_expires works.
I read,
result_expires
Default: Expire after 1 day.
Time (in seconds, or a timedelta object) for when after stored task tombstones will be deleted.
A built-in periodic task will delete the results after this time (celery.backend_cleanup), assuming that celery beat is enabled. The task runs daily at 4am.
...
When using the database backend, celery beat must be running for the results to be expired.
(from here: http://docs.celeryproject.org/en/latest/userguide/configuration.html#std:setting-result_expires)
So, in order for this to work, I have to actually do something like this:
python -m celery -A myapp beat -l info --detach
?
Is that what the documentation is referring to by "celery beat is enabled"? Or, rather than executing this manually, is there some configuration that needs to be set to have celery beat started automatically?
Re: celery beat, you are correct. If you use a database backend, you have to run celery beat as in the command you posted. By default celery beat sets up a daily task that deletes older results from the results database. If you are using a Redis results backend, you do not have to run celery beat. How you choose to run celery beat is up to you; personally, we do it via systemd.
If you want to configure the default expiration time to be something other than the default 1 day, you can use the result_expires setting in celery to set the number of seconds after a result is recorded that it should be deleted. e.g., 1800 for 30 minutes.
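Since the documentation quoted in the question allows either seconds or a timedelta, the 30-minute example can be written both ways. A minimal sketch of a config fragment; how the setting is loaded into the app (for example via a config module) is left as an assumption about your project.

```python
from datetime import timedelta

# Expire stored results 30 minutes after they are recorded.
result_expires = timedelta(minutes=30)

# The same value expressed as plain seconds.
result_expires_seconds = int(result_expires.total_seconds())
```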

Gracefully update running celery pod in Kubernetes

I have a Kubernetes cluster running Django, Celery, RabbitMQ and Celery Beat. I have several periodic tasks spaced out throughout the day (so as to keep server load down). There are only a few hours when no tasks are running, and I want to limit my rolling updates to those times without having to track it manually. So I'm looking for a solution that lets me fire off a script or task of some sort that monitors the Celery server and triggers a rolling update once there is a window in which no tasks are actively running. I thought of two possible ways of doing this, but I'm not sure which is best, nor how to implement either one.
Run a script (bash or otherwise) that checks up on the Celery server every few minutes, and initiates the rolling-update if the server is inactive
Increment the celery app name before each update (in the Beat run command, the Celery run command, and in the celery.py config file), create a new Celery pod, rolling-update the Beat pod, and then delete the old Celery pod 12 hours later (a reasonable time span for all running tasks to finish)
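For the first option, the idle check could be sketched roughly as below. It assumes the reports come from app.control.inspect().active() and .reserved(), each a mapping of worker name to a list of task dicts, or None when no worker replies. cluster_is_idle is a hypothetical helper, and treating "no reply" as "not idle" is a deliberately cautious choice.

```python
def cluster_is_idle(active, reserved):
    """Return True only if every responding worker reports no tasks."""
    for report in (active, reserved):
        # None means no worker answered the inspect call: do not assume idle.
        if report is None:
            return False
        # Any non-empty task list means work is running or queued locally.
        if any(tasks for tasks in report.values()):
            return False
    return True


# Hypothetical usage against a live app (not executed here):
#   insp = app.control.inspect()
#   if cluster_is_idle(insp.active(), insp.reserved()):
#       ...trigger the rolling update...
```

A script run every few minutes could call this and start the rollout only on a True result.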
Any thoughts would be greatly appreciated.