How does Flower detect Celery workers? - celery

Celery workers are run like this:
celery -A backend worker --broker=$REDIS_URL
Flower:
celery -A backend flower --broker=$REDIS_URL
When another worker is started, Flower detects it. But how? Is information about workers stored somewhere, in Redis for example?

When Flower starts, it subscribes to most (if not all) task and worker events ( https://docs.celeryproject.org/en/stable/userguide/monitoring.html#event-reference ). When you run a new Celery worker, the moment it connects to the broker Flower receives a worker-online event. That is how it finds out there is a "new worker in town".
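A minimal sketch of the same mechanism, not Flower's actual code: Celery exposes the event stream through `app.events.Receiver`, and a handler registered for `worker-online` is called whenever a worker announces itself. The broker URL is a placeholder, and `workers`, `track_worker` and `monitor` are hypothetical names chosen for illustration.

```python
# Hypothetical in-memory registry of workers seen so far (hostname -> status).
workers = {}


def track_worker(event):
    """Record a worker's status from a worker-online/heartbeat/offline event."""
    # Worker events carry a "type" and the worker's "hostname".
    status = "offline" if event["type"] == "worker-offline" else "online"
    workers[event["hostname"]] = status
    return status


def monitor(broker_url="redis://localhost:6379/0"):  # placeholder URL
    """Block and listen for worker events arriving through the broker."""
    # Imported here so that track_worker can be exercised without a broker.
    from celery import Celery

    app = Celery(broker=broker_url)
    with app.connection() as connection:
        receiver = app.events.Receiver(
            connection,
            handlers={
                "worker-online": track_worker,
                "worker-heartbeat": track_worker,
                "worker-offline": track_worker,
            },
        )
        # wakeup=True asks running workers to send a heartbeat right away,
        # so workers started before the monitor are discovered too.
        receiver.capture(limit=None, timeout=None, wakeup=True)
```

In this sketch the worker list lives only in the monitor's memory; the events reach it as messages passing through the broker rather than being read from a stored list.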

Related

Dash Celery setup

I have a docker-compose setup for my Dash application. I need a suggestion on the preferred way to set up my Celery image.
I am using Celery for the following use cases, all of which are cancellable/abortable/revocable tasks:
Upload file
Model training
Create train, test set
Case-1. Create one service as celery,
command: ["celery", "-A", "tasks", "worker", "--loglevel=INFO", "--pool=prefork", "--concurrency=3", "--statedb=/celery/worker.state"]
So, here we are using the default queue, a single worker (main) and 3 child/worker processes (i.e. it can execute 3 tasks simultaneously).
Now, if I revoke a task, will it kill the main worker or just the child worker process executing that task?
Case-2. Create three services as celery-{task_name}, i.e. celery-upload etc.,
command: ["celery", "-A", "tasks", "worker", "--loglevel=INFO", "--pool=prefork", "--concurrency=1", "--statedb=/celery/worker.state", "--queues=upload_queue", "--hostname=celery_worker_upload_queue"]
So, here we are using a custom queue, a single worker (main) and 1 child/worker process (i.e. it can execute 1 task) in its container. This way there is one service for each task.
Now, if I revoke a task, will it kill the main worker or just the only child worker process executing that task in the respective container, and will the rest of the Celery containers stay alive?
I tried the signals below with the command task.revoke(terminate=True):
SIGKILL and SIGTERM
With these, I observed that both @worker_process_shutdown.connect and @task_revoked.connect get fired.
Does this mean that the main worker and the child worker process for which the revoke command was issued (or all child processes, as the main worker is down) are down?
SIGUSR1
With this, I observed that only @task_revoked.connect gets fired.
Does this mean that the main worker is still running/alive and only the child worker process for which the revoke command was issued is down?
Which case is preferred?
Is it possible to combine both cases, i.e. having a single Celery service with individual workers (main), individual child worker processes and individual queues, or
having a single Celery service with a single worker (main), individual/dedicated child worker processes and individual queues for the respective tasks?
One more doubt: I think Celery is required for the tasks listed above, but say I have a button for cleaning a dataframe, will this require Celery too?
I.e. wherever I am dealing with dataframes, do I need to use Celery?
Please suggest.
UPDATE-2
worker processes = child-worker-process
This is how I am using it:
import signal
from celery import states
from celery.contrib.abortable import AbortableTask
from celery.exceptions import Ignore, SoftTimeLimitExceeded
from celery.result import result_from_tuple

# Start button
result = background_task_job_one.apply_async(args=(n_clicks,), queue="upload_queue")
# Cancel button
result = result_from_tuple(data, app=celery_app)
result.revoke(terminate=True, signal=signal.SIGUSR1)
# Task
@celery_app.task(bind=True, name="job_one", base=AbortableTask)
def background_task_job_one(self, n_clicks):
    msg = "Aborted"
    status = False
    try:
        msg = job(n_clicks)  # Long running task
        status = True
    except SoftTimeLimitExceeded as e:
        # Mark the task as revoked, then tell Celery not to overwrite that state
        self.update_state(task_id=self.request.id, state=states.REVOKED)
        msg = "Aborted"
        status = True
        raise Ignore()
    finally:
        print("FINaLLY")
    return status, msg
Is this an acceptable way to handle cancellation of a running task? Can you elaborate on/explain this line: [In practice you should not send signals directly to worker processes.]
Just for clarification of the line [In prefork concurrency (the default) you will always have at least two processes running - Celery worker (coordinator) and one or more Celery worker-processes (workers)]
This means
celery -A app worker -P prefork -> 1 main worker and 1 child-worker-process. Is it the same as below?
celery -A app worker -P prefork -c 1 -> 1 main worker and 1 child-worker-process
Earlier, I tried using the class AbortableTask and calling abort(). It successfully updated the state and status to ABORTED, but the task was still alive/running.
I read that to terminate a currently executing task, it is necessary to pass terminate=True.
This works: the task stops executing, and I need to update the task state and status manually to REVOKED, otherwise it stays at the default PENDING. The only hard decision is whether to use SIGKILL, SIGTERM or SIGUSR1. I found that with SIGUSR1 the main worker process stays alive and only the child worker process executing that task is revoked.
Also, luckily I found this link: I can set up a single Celery service with multiple dedicated child-worker-processes, each with its own dedicated queue.
Case-3: Celery multi
command: ["celery", "multi", "show", "start", "default", "model", "upload", "-c", "1", "-l", "INFO", "-Q:default", "default_queue", "-Q:model", "model_queue", "-Q:upload", "upload_queue", "-A", "tasks", "-P", "prefork", "-p", "/proj/external/celery/%n.pid", "-f", "/proj/external/celery/%n%I.log", "-S", "/proj/external/celery/worker.state"]
But I am getting an error:
celery service exited code 0
command: bash -c "celery multi start default model upload -c 1 -l INFO -Q:default default_queue -Q:model model_queue -Q:upload upload_queue -A tasks -P prefork -p /proj/external/celery/%n.pid -f /proj/external/celery/%n%I.log -S /proj/external/celery/worker.state"
Here I am also getting an error:
celery | Usage: python -m celery worker [OPTIONS]
celery | Try 'python -m celery worker --help' for help.
celery | Error: No such option: -p
celery | * Child terminated with exit code 2
celery | FAILED
Some doubts: which is preferred, 1 worker or multiple workers?
With multiple workers and dedicated queues, creating a docker service for each task increases the size of the docker-file and the number of services too. So I am trying a single Celery service with multiple dedicated child-worker-processes, each with its own dedicated queue, which makes it easy to abort/revoke/cancel a task.
But I am getting an error with case-3, i.e. celery multi.
Please suggest.
If you revoke a task, it may terminate the worker process that was executing the task. The Celery worker will continue working, as it needs to coordinate the other worker processes. If the life of the container is tied to the Celery worker, then the container will continue running.
In practice you should not send signals directly to worker processes.
In prefork concurrency (the default) you will always have at least two processes running - Celery worker (coordinator) and one or more Celery worker-processes (workers).
To answer the last question we may need more details. It would be easier if you could run the Celery task once all dataframes are available. If that is not the case, then perhaps run individual tasks to process the dataframes. It is worth having a look at Celery workflows to see if you can build a chunked workflow. Keep it simple: start with the assumption that you have all dataframes available at once, and build from there.
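One reading of the advice not to signal worker processes directly: instead of calling `os.kill` on a child PID yourself, send the revoke request through Celery's remote-control API and let the worker signal its own pool process. The sketch below assumes a Celery app object is passed in; `cancel_task` is a hypothetical helper name, and the default signal simply mirrors the SIGUSR1 choice from the question.

```python
def cancel_task(app, task_id, sig="SIGUSR1"):
    """Ask the workers to revoke task_id and terminate the process running it."""
    # The request travels over the broker to the worker (coordinator),
    # which then signals the child process executing the task.
    app.control.revoke(task_id, terminate=True, signal=sig)
    return task_id
```

As far as I understand, in the prefork pool SIGUSR1 is the signal used for soft time limits, which would explain why the task in the question sees `SoftTimeLimitExceeded` and the coordinator stays up. Treat that as an assumption to verify against your Celery version.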

Making the right worker execute a task via send_task

How do I make Celery send a task to the right worker when using send_task?
For instance, given the following services:
service_add.py
from celery import Celery
celery = Celery('service_add', backend='redis://localhost', broker='pyamqp://')
@celery.task
def add(x, y):
    return x + y
service_sub.py
from celery import Celery
celery = Celery('service_sub', backend='redis://localhost', broker='pyamqp://')  # Redis backend, RabbitMQ for messaging
@celery.task
def sub(x, y):
    return x - y
the following code is guaranteed to fail:
main.py
from celery.execute import send_task
result1 = send_task('service_sub.sub',(1,1)).get()
result2 = send_task('service_sub.sub',(1,1)).get()
With the exception celery.exceptions.NotRegistered: 'service_sub.sub' because Celery sends each process the tasks in a round-robin fashion, even though service_sub belongs to just one process.
For the question to be complete, here's how I ran the processes and the config file:
celery -A service_sub worker --loglevel=INFO --pool=solo -n worker1
celery -A service_add worker --loglevel=INFO --pool=solo -n worker2
celeryconfig.py
## Broker settings.
broker_url = 'pyamqp://'
# List of modules to import when the Celery worker starts.
imports = ('service_add.py','service_sub.py')
If you're using two different apps (service_add / service_sub) only to route tasks to a dedicated worker, I would like to suggest another solution. If that's not the case and you still need two (or more) apps, I would suggest isolating them better, with a broker such as amqp://localhost:5672/add_vhost and a backend such as redis://localhost/1. Having a dedicated vhost in RabbitMQ and a dedicated database id (1) in Redis might do the trick.
Having said that, I think that the right solution in such a case is to use the same Celery application (not splitting it into two applications) and use a router:
task_routes = {'tasks.service_add': {'queue': 'add'}, 'tasks.service_sub': {'queue': 'sub'}}
add it to the configuration:
app.conf.task_routes = task_routes
and run your workers with -Q (which queue to read messages from):
celery -A shared_app worker --loglevel=INFO --pool=solo -n worker1 -Q add
celery -A shared_app worker --loglevel=INFO --pool=solo -n worker2 -Q sub
Note that this approach has further benefits, for example if you want dependencies between tasks (canvas).
There are more ways to define routers; you can read more about it here.
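As one of those other ways, a router can also be a plain function. The sketch below is illustrative only: the task names `tasks.add` and `tasks.sub`, the `QUEUE_BY_PREFIX` mapping and the app object `app` are assumptions, while the function signature follows the one Celery documents for router functions.

```python
# Hypothetical mapping from task-name prefix to queue name.
QUEUE_BY_PREFIX = {
    "tasks.add": "add",
    "tasks.sub": "sub",
}


def route_task(name, args, kwargs, options, task=None, **kw):
    """Router function: return the route for a task, or None for the default."""
    for prefix, queue in QUEUE_BY_PREFIX.items():
        if name.startswith(prefix):
            return {"queue": queue}
    return None  # no match: fall through to the default queue


# With a Celery app object named `app` (assumed), it would be registered as:
# app.conf.task_routes = (route_task,)
```

A caller can also bypass the router and name the queue explicitly, for example `app.send_task('tasks.sub', args=(1, 1), queue='sub')`, so that only the worker started with `-Q sub` receives the message.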

Why do Celery queues randomly get stuck on starting when using Supervisord?

There are over 20 workers managed by supervisord.
My celery worker command:
celery worker myproject.server.celery -l INFO --pool=gevent --concurrency=10 --config=myproject.celeryconfig -n default_worker.%%h -Q default
The problem is: each time I deploy new code and then restart each supervisor task, a few workers randomly get stuck on starting, which is confusing. You can check the Flower dashboard and find the stuck worker:
Image: flower dashboard worker status
Then you can see something even stranger in htop: ldconfig.real has started instead of the failed Celery worker:
Image: htop monitor celery worker
I appreciate any suggestion!
If you are restarting gracefully, then this is normal behaviour, because at the time of deployment you may have some active tasks. Celery will not kill the worker processes while in the graceful shutdown state. It will wait for all of them to finish, and then stop.

Should I restart celery beat after restarting celery workers?

I have two systemd services:
one handles my Celery workers (10 queues for different tasks) and one handles celery beat.
After deploying new code I restart the Celery worker service to pick up new tasks and update Celery jobs.
Should I restart celery beat along with the Celery worker service too?
Or does it pick up new tasks automatically?
It depends on what type of scheduler you're using.
If it's the default PersistentScheduler then yes, you need to restart the beat daemon so that it picks up the new configuration from the beat_schedule setting.
But if you're using something like django-celery-beat, which allows managing periodic tasks at runtime, then you don't have to restart celery beat.

What if I schedule a task for Celery to perform every minute and it is not able to complete in time?

If I schedule a task for every minute and it cannot complete within that time (one minute), will the next run wait in the queue, and so on? If so, the queue will be overloaded after a few hours. Is there any solution for this kind of problem?
I am using a beat and worker combination for this. It works fine when there are few records to process, but for a large database I think this could cause problems.
Each task is assigned to a queue (in RabbitMQ, for example).
Workers are queue consumers: the more workers (or the higher a worker's concurrency), the more tasks can be handled in parallel.
Your periodic task produces messages of the same type (I guess), and your Celery router routes them to the same queue.
Just set your workers to consume messages from that queue and that's all.
celery worker -A celeryapp:app -l info -Q default -c 4 -n default_worker@%h -Ofair
In the example above I used -c 4 for a concurrency of four (equivalent to 4 consumers/workers). You can also start more workers and let them consume from the same queue with -Q <queue_name> (in my example it's the default queue).
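To pick a concurrency value, a rough back-of-the-envelope estimate can help. This sketch assumes one task is enqueued per schedule tick and that each run takes about the same time; `required_concurrency` is a hypothetical helper, not part of Celery.

```python
import math


def required_concurrency(task_seconds, interval_seconds=60):
    """Smallest number of parallel consumers that keeps the queue from growing."""
    if task_seconds <= 0 or interval_seconds <= 0:
        raise ValueError("durations must be positive")
    # One task arrives every interval_seconds; each consumer finishes one task
    # every task_seconds, so arrivals are matched once
    # consumers >= task_seconds / interval_seconds.
    return max(1, math.ceil(task_seconds / interval_seconds))
```

For a task that takes 150 seconds and is scheduled every 60 seconds this gives 3, so -c 4 as in the command above would leave some headroom. With fewer consumers than that, the backlog keeps growing as described in the question.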
EDIT:
When using Celery (in the worker code) you instantiate a Celery object. In the Celery constructor you set your broker and backend (Celery uses them as part of the system).
For more info: http://docs.celeryproject.org/en/latest/getting-started/first-steps-with-celery.html#application