Do I need to enable celery beat to activate result_expires? - celery

I have django celery up and running.
Do I have something else in order to activate https://docs.celeryproject.org/en/master/userguide/configuration.html#std:setting-result_expires or it works already?
I don't have celery beat installed.

No. You do not need Celery beat for that. Result expiry is handled internally by Celery and/or the backend you use in your project.
However, keep in mind this:
Note
For the moment this only works with the AMQP, database, cache, Couchbase, and Redis backends.
When using the database backend, celery beat must be running for the results to be expired.

Related

Django celery delete specific tasks

we know we have some badly formed tasks in the celery queue that are crashing our workers, is there an easy way to manually delete them?
We don't want to flush the whole thing as there might be some important emails to be sent...
Use celery flower to manage and monitor the celery task. Refer to https://flower.readthedocs.io/en/latest/

How to have celery expire results when using a database backend

I'm not sure I understand how result_expires works.
I read,
result_expires
Default: Expire after 1 day.
Time (in seconds, or a timedelta object) for when after stored task tombstones will be deleted.
A built-in periodic task will delete the results after this time (celery.backend_cleanup), assuming that celery beat is enabled. The task runs daily at 4am.
...
When using the database backend, celery beat must be running for the results to be expired.
(from here: http://docs.celeryproject.org/en/latest/userguide/configuration.html#std:setting-result_expires)
So, in order for this to work, I have to actually do something like this:
python -m celery -A myapp beat -l info --detach
?
Is that what the documentation is referring to by "celery beat is enabled"? Or, rather than executing this manually, there is some configuration that needs to be set which would cause celery beat to be called automatically?
Re: celery beat--you are correct. If you use a database backend, you have to run celery beat as you posted in your original post. By default celery beat sets up a daily task that will delete older results from the results database. If you are using a redis results backend, you do not have to run celery beat. How you choose to run celery beat is up to you, personally, we do it via systemd.
If you want to configure the default expiration time to be something other than the default 1 day, you can use the result_expires setting in celery to set the number of seconds after a result is recorded that it should be deleted. e.g., 1800 for 30 minutes.

How to configure Celery to send email alerts when tasks fail?

How is it possible to configure celery to send email alerts when tasks are failing?
For example I want Celery to notify me when more than 3 tasks fail or more than 10 tasks are being retried.
Is it possible using celery or a utility (e.g. flower) or I have to write my own plugin?
Yes, all you need to do is set CELERY_SEND_TASK_ERROR_EMAILS = True and if Celery process fails django will send message with traceback to all emails set in ADMINS settings.
As far as I know, it's not possible out of the box.
You could write custom client on top of celery or flower or directly accessing RabbitMQ.
What I would do (and I am doing) is simply logging failed tasks and then use something like Graylog2 to monitor the log files, this works for all your infrastructure, not just Celery.
You can also use something like NewRelic which monitors your processes directly and offers many other features. Although email reporting on exceptions is somewhat limited in NewRelic.
A simple client/monitor probably is the quickest solution.

Is there a way to have Celery use MongoDB for scheduling tasks?

I know that it's currently possible to use Django Celery to schedule tasks using Django's built-in ORM, but is there a way to use MongoDB in this regard?
(I'm not asking about brokers or result backends, as I know that Celery supports those, I'm specifically asking about scheduling.)
I think what you're looking for is celerybeat-mongo
Yes it posible. I just use worker -B arguments for my workers.
I do not sure that celery scheduler put this tasks in queue (mongodb in this case), because it already runing on backend. But you always can trigger (delay) task inside schedule task.

How do I coordinate a cluster of celery beat daemons?

I have a cluster of three machines. I want to run celery beat on those. I have a few related questions.
Celery has this notion of a persistent scheduler. As long as my schedule consists only of crontab entries and is statically defined by CELERYBEAT_SCHEDULE, do I need to persist it at all?
If I do, then do I have to ensure this storage is synchronized between all machines of the cluster?
Does djcelery.schedulers.DatabaseScheduler automatically take care of concurrent beat daemons? That is, if I just run three beat daemons with DatabaseScheduler, am I safe from duplicate tasks?
Is there something like DatabaseScheduler but based on MongoDB, without Django ORM? Like Celery’s own MongoDB broker and result backend.
Currently Celery doesn't support multiple concurrent celerybeat instances.
You have to ensure only a single scheduler is running for a schedule
at a time, otherwise you would end up with duplicate tasks. Using a
centralized approach means the schedule does not have to be
synchronized, and the service can operate without using locks.
http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html