How to test the tasks of a Celery instance using pytest?

How to test the tasks of a Celery instance using pytest? I am not talking about testing Celery tasks created with the @shared_task decorator using pytest; for that there is already a good solution. I am talking about testing tasks created with the @app.task decorator, where app is a Celery app, i.e. a Celery instance.

Related

How to make celery worker find the tasks to include?

I have a FastAPI app and I use Celery for some async tasks. I also use Docker, so FastAPI runs in one container and Celery in another. Now I am breaking the workers into different queues, and they will run in different containers. Right now I am using almost the same image for FastAPI and Celery, but for this new worker I would end up with an image far bigger than it should be, since it would contain code and packages that the worker doesn't need. To get around that, I now have two different Dockerfiles, one for each worker, but both of them contain the exact same file for setting up the Celery app.
This is the celery I was setting up:
from celery import Celery

celery_app = Celery(
    broker=config.celery_settings.broker_url,
    backend=config.celery_settings.result_backend,
    include=[
        "src.iam.service_layer.tasks",
        "src.receipt_tracking.service_layer.tasks",
        "src.cfe_scraper.tasks",
    ],
)
The idea is to split off src.cfe_scraper.tasks, so that Docker image will contain neither src.iam.service_layer.tasks nor src.receipt_tracking.service_layer.tasks. But when I try to build the image I get an error saying that those paths don't exist, which is correct in that case; yet if I simply delete the include argument, the worker won't have any tasks registered. Is there an easy way to solve this without maintaining two modules that set up different Celery apps?

What does celery daemonization mean?

Can someone explain what Celery daemonization means? Also, I would like to start both the celery worker and celery beat with a single command. Is there any way to do it? (One way I can think of is using the supervisor module for the worker and beat, writing the start-up scripts for those in a separate .sh file, and running that script. Any other way?)
In other words, I can start the worker and beat processes as background processes manually, right? So does daemonization in Celery just mean running the processes as background processes, or is there something more to it?
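On the single-command point: the worker's -B flag (celery -A proj worker -B) runs an embedded beat inside the worker process, which covers simple setups. For the supervisor route mentioned above, a minimal supervisord sketch could look like this (the app name proj, paths, and program names are assumptions):

```ini
; minimal supervisord sketch; "proj" and paths are placeholders
[program:celery-worker]
command=celery -A proj worker --loglevel=info
directory=/srv/app
autostart=true
autorestart=true

[program:celery-beat]
command=celery -A proj beat --loglevel=info
directory=/srv/app
autostart=true
autorestart=true
```

With this in place, supervisord itself is the daemon and keeps both processes running in the background, restarting them on failure, which is essentially what "daemonization" buys you over starting them manually.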

Should I restart celery beat after restarting celery workers?

I have two systemd services:
one handles my celery workers (10 queues for different tasks) and one handles celery beat.
After deploying new code I restart the celery worker service to pick up new tasks and updated celery jobs.
Should I restart celery beat together with the celery worker service,
or does it pick up new tasks automatically?
It depends on what type of scheduler you're using.
If it's the default PersistentScheduler, then yes, you need to restart the beat daemon so it can pick up the new configuration from the beat_schedule setting.
But if you're using something like django-celery-beat which allows managing periodic tasks at runtime then you don't have to restart celery beat.

Is there a way to have Celery use MongoDB for scheduling tasks?

I know that it's currently possible to use Django Celery to schedule tasks using Django's built-in ORM, but is there a way to use MongoDB in this regard?
(I'm not asking about brokers or result backends, as I know that Celery supports those, I'm specifically asking about scheduling.)
I think what you're looking for is celerybeat-mongo.
Yes, it's possible. I just use the -B argument for my workers (celery worker -B), which runs an embedded beat process.
I am not sure that the Celery scheduler puts these tasks in the queue (MongoDB in this case), because it is already running in the backend. But you can always trigger (delay) a task from inside a scheduled task.

cannot run celery subtasks if queues are specified

I am running Celery 3.0.19 with MongoDB as backend and broker. I would like to use queues in subtasks, but it does not work. Here is how to reproduce the problem using the example add task.
Start a celery worker with the command
celery -A tasks worker --loglevel=info --queue=foo
Then create a task that never gets executed, like this:
from tasks import add
sub_task = add.s(queue="foo")
sub_task_async_result = sub_task.apply_async((2,2))
Note that the following task will get executed normally:
async_result = add.apply_async((2,2), queue="foo")
What am I doing wrong?
Thanks!