Django celery delete specific tasks - celery

We know we have some badly formed tasks in the Celery queue that are crashing our workers. Is there an easy way to delete them manually?
We don't want to flush the whole thing as there might be some important emails to be sent...

Use Flower to monitor and manage Celery tasks, including revoking individual tasks. See https://flower.readthedocs.io/en/latest/
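If the ids of the bad tasks are known, another option is Celery's remote control call `app.control.revoke`, which tells workers to discard a task when they receive it instead of executing it. The following is a minimal sketch, not a definitive implementation: `app` is assumed to be your project's Celery application, and the helper name `revoke_tasks` is hypothetical.

```python
def revoke_tasks(app, task_ids):
    """Ask all workers to discard the given task ids instead of running them.

    `app` is assumed to be the project's Celery application instance.
    """
    revoked = []
    for task_id in task_ids:
        # revoke() broadcasts the id to the workers; the message itself stays
        # in the queue and is dropped when a worker picks it up.
        app.control.revoke(task_id)
        revoked.append(task_id)
    return revoked
```

Called as `revoke_tasks(app, ["<task-id-1>", "<task-id-2>"])`, this leaves every other queued task (such as the emails) untouched. Workers keep the list of revoked ids in memory by default, so it is lost when all workers restart unless they are started with the `--statedb` option.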

Related

Do I need to enable celery beat to activate result_expires?

I have django celery up and running.
Do I need to do anything else to activate https://docs.celeryproject.org/en/master/userguide/configuration.html#std:setting-result_expires, or does it work already?
I don't have celery beat installed.
No. You do not need Celery beat for that. Result expiry is handled internally by Celery and/or the backend you use in your project.
However, keep in mind this note from the documentation:
For the moment this only works with the AMQP, database, cache, Couchbase, and Redis backends.
When using the database backend, celery beat must be running for the results to be expired.
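For reference, a sketch of how the setting might look in a Celery configuration module. The one-hour value is an arbitrary illustration (the documented default is one day). In a Django settings file loaded with `namespace='CELERY'`, the key would be `CELERY_RESULT_EXPIRES`; that setup is an assumption about your project.

```python
from datetime import timedelta

# Celery configuration module (for example celeryconfig.py).
# Stored task results older than this become eligible for removal.
# The one-hour value is only an illustration.
result_expires = timedelta(hours=1)
```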

Celery Beat runs duplicate tasks

I have one celery beat task that starts other scraping tasks.
When those tasks are not processed, the queue starts to grow with duplicates.
I know Celery uses a backend database, but it only stores: id, task_id, status, result, date_done, traceback.
My first idea is to switch from celery beat to having the tasks reschedule themselves, but some tasks are unconnected or can get lost, so celery beat is useful in those cases.
My second idea is to add my own log, such as a table where I save the task id and task context, so that I can check whether a task already exists.
Maybe you have a better approach? Thanks
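Celery tasks can be given an expiry time with the expires argument, so that tasks still waiting in the queue past that time are discarded by the worker instead of being executed: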
http://docs.celeryproject.org/en/latest/userguide/calling.html#expiration
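A minimal sketch of passing `expires` when queuing a task. The helper name `enqueue_with_expiry` and the 60-second value are hypothetical; `task` is assumed to be any Celery task object.

```python
def enqueue_with_expiry(task, args=(), expires_seconds=60):
    """Queue `task` so that workers skip it if it is still waiting
    after `expires_seconds` seconds."""
    # `expires` is a standard apply_async() option; it accepts a number
    # of seconds or a datetime.
    return task.apply_async(args=args, expires=expires_seconds)
```

For tasks started by celery beat, the same option can be set per schedule entry through its `options` dict, for example `'options': {'expires': 60}`. Choosing an expiry no longer than the schedule interval keeps a backlog from accumulating when workers fall behind.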

How to configure Celery to send email alerts when tasks fail?

How is it possible to configure celery to send email alerts when tasks are failing?
For example I want Celery to notify me when more than 3 tasks fail or more than 10 tasks are being retried.
Is it possible using celery or a utility (e.g. flower), or do I have to write my own plugin?
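Yes, all you need to do is set CELERY_SEND_TASK_ERROR_EMAILS = True; when a task fails, Django will then send a message with the traceback to all addresses listed in the ADMINS setting. Note that this applies to Celery 3.x: the task error email feature was removed in Celery 4.0.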
As far as I know, it's not possible out of the box.
You could write a custom client on top of celery or flower, or one that accesses RabbitMQ directly.
What I would do (and am doing) is simply log failed tasks and then use something like Graylog2 to monitor the log files. This works for your whole infrastructure, not just Celery.
You can also use something like NewRelic, which monitors your processes directly and offers many other features, although its email reporting on exceptions is somewhat limited.
A simple client/monitor probably is the quickest solution.
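As an illustration of such a monitor, here is a minimal sketch of the counting logic with the thresholds from the question (more than 3 failures, more than 10 retries). The class name `FailureMonitor` and the `notify` callable are hypothetical; nothing here is a built-in Celery feature.

```python
from collections import Counter


class FailureMonitor:
    """Count task failures and retries; call `notify` once a limit is exceeded."""

    def __init__(self, notify, max_failures=3, max_retries=10):
        self.notify = notify
        self.limits = {"failure": max_failures, "retry": max_retries}
        self.counts = Counter()

    def record(self, kind, task_name):
        """Record one event; `kind` is "failure" or "retry"."""
        self.counts[kind] += 1
        if self.counts[kind] > self.limits[kind]:
            self.notify(f"{self.counts[kind]} task {kind} events, latest: {task_name}")
            self.counts[kind] = 0  # start a new counting window
```

One way to feed it is from handlers connected to Celery's `task_failure` and `task_retry` signals (in `celery.signals`), calling `monitor.record("failure", sender.name)`, with `notify` wrapping something like Django's `mail_admins`. Those signals fire inside the worker processes, so with a prefork pool each process would keep its own counts; a shared store would be needed for cluster-wide totals.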

Is there a way to have Celery use MongoDB for scheduling tasks?

I know that it's currently possible to use Django Celery to schedule tasks using Django's built-in ORM, but is there a way to use MongoDB in this regard?
(I'm not asking about brokers or result backends, as I know that Celery supports those, I'm specifically asking about scheduling.)
I think what you're looking for is celerybeat-mongo.
Yes, it is possible. I just pass the -B argument to my workers.
I am not sure whether the Celery scheduler puts these tasks in the queue (MongoDB in this case), because it is already running on the backend. But you can always trigger (delay) a task from inside a scheduled task.

How do I coordinate a cluster of celery beat daemons?

I have a cluster of three machines. I want to run celery beat on those. I have a few related questions.
Celery has this notion of a persistent scheduler. As long as my schedule consists only of crontab entries and is statically defined by CELERYBEAT_SCHEDULE, do I need to persist it at all?
If I do, then do I have to ensure this storage is synchronized between all machines of the cluster?
Does djcelery.schedulers.DatabaseScheduler automatically take care of concurrent beat daemons? That is, if I just run three beat daemons with DatabaseScheduler, am I safe from duplicate tasks?
Is there something like DatabaseScheduler but based on MongoDB, without Django ORM? Like Celery’s own MongoDB broker and result backend.
Currently Celery doesn't support multiple concurrent celerybeat instances.
You have to ensure only a single scheduler is running for a schedule
at a time, otherwise you would end up with duplicate tasks. Using a
centralized approach means the schedule does not have to be
synchronized, and the service can operate without using locks.
http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html