Is there a way to specify expiry when creating a celery task? - celery

Specifically, I use the shared_task decorator to create a Celery task.
I tried the obvious @shared_task(expires=3), but it doesn't seem to work.
Is there a way to declare that this task should expire a given number of seconds after it is received, as you can do with apply_async at call time?

I am not sure about shared_task, but in Celery 5.2.7 at least, you can pass an expires parameter to @app.task.
from myapp.celery import app as celery_app
@celery_app.task(expires=60)
def my_task(...

This should probably help:
from tasks import add
result = add.apply_async(args=[10, 10], expires=6000)
Also make sure that the clocks on your producer and consumer machines are in sync. Otherwise, use:
CELERY_ENABLE_UTC = True
CELERY_TIMEZONE = 'Etc/UTC'
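If the decorator option does not take effect in your Celery version, one fallback is to centralize the expiry at call time. The sketch below is only an illustration: the helper name apply_with_expiry and the 60-second default are assumptions, and the only Celery API it relies on is apply_async(args=..., kwargs=..., expires=...), which the answer above already uses.

```python
DEFAULT_EXPIRES = 60  # seconds; hypothetical default, pick what suits the task


def apply_with_expiry(task, *args, expires=DEFAULT_EXPIRES, **kwargs):
    """Send `task` with an expiry so callers cannot forget to set one.

    `task` is any Celery task object (anything exposing apply_async).
    `expires` may be a number of seconds or an absolute datetime.
    """
    return task.apply_async(args=args, kwargs=kwargs, expires=expires)
```

With the add task from the answer above, the call would look like apply_with_expiry(add, 10, 10), or apply_with_expiry(add, 10, 10, expires=6000) to override the default.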

Related

Dash - Run callback in server side

Good morning,
I have created a callback in Dash that does the job of a scheduler.
Every 10 minutes (with the help of an interval component), my callback runs to fetch data from a server and update the CSV file that I use in my app.
The problem is that my callback is called only while I have the web page open. As soon as I close the page, the scheduler stops, and it only runs again when I reopen the page.
As updating the data can sometimes take a long time, I want the scheduler to always run and fetch the data every 10 minutes.
I assume that a callback is a client-side process, right? So how can I make it run on the server side?
Thank you,
Dash is probably not the right solution for this. I think it would make more sense to set up the Python code you need for this job in a simple .py script file, and set a cron job to run that script every 10 min.
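A minimal sketch of such a script, using only the standard library. The function fetch_rows, the column names, and the script name are placeholders (assumptions), not part of the original app. The file is written to a temporary path and then moved into place, so the Dash app should never read a half-written CSV.

```python
import csv
import os
import sys
import tempfile


def fetch_rows():
    """Placeholder: replace with the real request to the data server."""
    return [{"timestamp": "2024-01-01T00:00:00", "value": 1.0}]


def write_csv_atomically(path, rows, fieldnames):
    """Write rows to a temp file in the target directory, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def main(path):
    write_csv_atomically(path, fetch_rows(), ["timestamp", "value"])


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: update_data.py OUTPUT_CSV")
```

A crontab entry along the lines of */10 * * * * /usr/bin/python3 /path/to/update_data.py /path/to/data.csv would then run it every 10 minutes (both paths are placeholders).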
Thank you @coralvanda for the help.
I ended up writing a Python script in container A that restarts container B every 10 minutes. Container B fetches the data.
It does the job.
import schedule
import time
import docker

def worker_restart():
    client = docker.from_env()
    container = client.containers.get('container_worker')
    container.restart()

schedule.every(10).minutes.do(worker_restart)
while True:
    schedule.run_pending()
    time.sleep(1)

How do I disable Celery's default timeout for a task, and/or prevent it from retrying?

I'm having some trouble with Celery. Unfortunately, the person who set it up isn't working here any more. Until now we never had problems and thought we understood well enough how it works. Now it has become clear that we don't, and after hours of searching through documentation and other posts on here, I have to admit defeat. Hopefully, someone here can shed some light on what I am missing.
We're using several tasks, all of which are defined in a CELERYBEAT_SCHEDULE like this:
CELERYBEAT_SCHEDULE = {
    'runs-every-5-minutes': {
        'task': 'tasks.webhook',
        'schedule': crontab(minute='*/5'),
        'args': (WEBHOOK_BASE + '/task/refillordernumberbuffer', {'refill_count': 1000})
    },
    'send-sameday-delivery-confirmation': {
        'task': 'tasks.webhook',
        'schedule': crontab(minute='*/2'),
        'args': (WEBHOOK_BASE + '/task/sendsamedaydeliveryconfirmation', {})
    },
    'send-customer-hotspot-notifications': {
        'task': 'tasks.webhook',
        'schedule': crontab(hour=9, minute=0),
        'args': (WEBHOOK_BASE + '/task/sendcustomerhotspotnotifications', {})
    },
}
That's not all of them, but they all work like this. All of those are actually PHP scripts that have no knowledge of the whole celery concept. They are just scripts that execute certain things, and send notifications if necessary. When they are done, they just spit out a JSON response that says success=true.
As far as I know, Celery is only used to execute them periodically. We don't have problems with any of them except the last one in my code snippet. That task/script sends out emails, usually 5 to 10, but sometimes many more. And that's where the problems start: when the success JSON response from the PHP script doesn't arrive within 3 minutes, Celery retries the task, and the script sends a lot of emails again. (I observed this by watching celery events; I honestly could not find confirmation of it anywhere in the docs.) And then it happens again, because only a small number of emails were saved as "done" during the task's initial run. This often leads to 4 or 5 retries, until enough emails have been marked as "successfully sent" by the prior runs that the last retry finally finishes under this mystical 3-minute limit.
My questions:
Is there a default time limit? Where is it set? How do I override it? I've read about time_limit and soft_time_limit, but nothing I tried in the config seemed to help. If this is the solution, I would be in need of assistance as to how the settings are properly applied.
Can't I "just" disable the whole retry concept (for one task or for all, doesn't really matter) altogether? It seems to me that we don't need it, as we're running our tasks periodically and missing one due to a temporary error would not matter. I guess that means we shouldn't have used celery in the first place as we're misusing it, but for now I'd just like to understand it better.
Thanks for any help, and sorry if I left anything unclear – happy to answer any follow-up questions and provide more details if necessary.
The rest of the config file goes like this:
## Broker settings.
databases = parse_databases_xml()
settings = parse_custom_settings_xml()
BROKER_URL = 'redis://' + databases['taskqueue']['host'] + '/' + databases['taskqueue']['dbname']
# List of modules to import when celery starts.
CELERY_IMPORTS = ("tasks", )
## Using the database to store task state and results.
CELERY_RESULT_BACKEND = BROKER_URL
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ANNOTATIONS = {
    "*": {"rate_limit": "100/m"},
    "ping": {"rate_limit": "100/m"},
}
There is no time_limit to be found anywhere, so I don't think we're setting it ourselves. I left out the python imports and the functions that read from our config xml files, as that stuff is all working fine and just concerns some database auth data.
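For reference, here is a sketch of where time limits would go in an old-style (uppercase) config like the one above. By default Celery sets no task time limit. CELERYD_TASK_TIME_LIMIT and CELERYD_TASK_SOFT_TIME_LIMIT are the old-style global settings, and CELERY_ANNOTATIONS can set task attributes such as time_limit per task. All numbers are placeholders, and nothing in the question establishes that a time limit is what triggers the 3-minute retry.

```python
# Global limits in seconds (old-style setting names); values are placeholders.
CELERYD_TASK_SOFT_TIME_LIMIT = 540  # raises SoftTimeLimitExceeded inside the task
CELERYD_TASK_TIME_LIMIT = 600       # hard limit: the process running the task is killed

# Per-task override: annotations set attributes on the matching task.
CELERY_ANNOTATIONS = {
    "*": {"rate_limit": "100/m"},
    "tasks.webhook": {"soft_time_limit": 840, "time_limit": 900},
}
```

Keeping the soft limit below the hard limit gives the task a window to clean up before it is terminated.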

django-celery PeriodicTask and eta field

I have a Django project combined with Celery, and I need to be able to schedule tasks dynamically, at some point in the future, with or without recurrence. I also need the ability to delete or edit already scheduled tasks.
To achieve this, I started by using django-celery with DatabaseScheduler to store some PeriodicTasks (with expiration) in the database, more or less as described here.
This way, if I close my app and start it again, my schedules are still there.
My problem still remains, though, since I cannot use eta to schedule a task at some point in the future. Is it possible to dynamically schedule a task with eta?
My second question is whether I can schedule a one-off task, e.g. to run at 2015-05-15 15:50:00 (that is why I'm trying to use eta).
Finally, I will be scheduling some thousands of notifications. Is celery beat capable of handling this number of scheduled tasks, some of them one-off and others periodic? Or do I have to go with a more advanced solution such as APScheduler?
Thank you
I faced the same problem yesterday. My ugly temporary solution is:
# tasks.py
from djcelery.models import PeriodicTask, IntervalSchedule
from datetime import timedelta, datetime
from django.utils.timezone import now
...
@app.task
def schedule_periodic_task(task='app.tasks.task', task_args=[], task_kwargs={},
                           interval=(1, 'minute'), expires=now()+timedelta(days=365*100)):
    PeriodicTask.objects.filter(name=task+str(task_args)+str(task_kwargs)).delete()
    task = PeriodicTask.objects.create(
        name=task+str(task_args)+str(task_kwargs), task=task,
        args=str(task_args),
        kwargs=str(task_kwargs),
        interval=IntervalSchedule.objects.get_or_create(
            every=interval[0],
            period=interval[1])[0],
        expires=expires,
    )
    task.save()
So, if you want to schedule a periodic task with eta, you should
# anywhere.py
schedule_periodic_task.apply_async(
    kwargs={'task': 'grabber.tasks.grab_events',
            'task_args': [instance.xbet_id], 'task_kwargs': {},
            'interval': (10, 'seconds'),
            'expires': instance.start + timedelta(hours=3)},
    eta=instance.start,
)
schedule a task with eta, which in turn creates the periodic task. What makes it ugly:
having to deal with raw.task.name
the strange period format (n, 'interval')
Please let me know if you have designed a prettier solution.
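For the one-off case in the question, a periodic task is not strictly needed, since apply_async accepts an eta argument with an absolute datetime. A minimal sketch follows, assuming a hypothetical helper named schedule_once and assuming the timestamp string is in UTC; it does not address editing or deleting the task once it has been queued.

```python
from datetime import datetime, timezone


def schedule_once(task, run_at, *args, **kwargs):
    """Queue `task` to run once, no earlier than `run_at`.

    `run_at` is a 'YYYY-MM-DD HH:MM:SS' string, assumed here to be in UTC.
    """
    eta = datetime.strptime(run_at, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return task.apply_async(args=args, kwargs=kwargs, eta=eta)
```

For example, schedule_once(some_task, "2015-05-15 15:50:00", notification_id), where some_task and notification_id stand in for your own task and argument.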

Setting Time Limit on specific task with celery

I have a task in Celery that could potentially run for 10,000 seconds while operating normally. However all the rest of my tasks should be done in less than one second. How can I set a time limit for the intentionally long running task without changing the time limit on the short running tasks?
You can set task time limits (hard and/or soft) either when defining a task or when calling it.
from celery.exceptions import SoftTimeLimitExceeded
# SoftTimeLimitExceeded is only raised when a soft limit is set
@celery.task(time_limit=20, soft_time_limit=10)
def mytask():
    try:
        return do_work()
    except SoftTimeLimitExceeded:
        cleanup_in_a_hurry()
or
mytask.apply_async(args=[], kwargs={}, time_limit=30, soft_time_limit=10)
This is an example of the decorator for a specific task, on Celery 3.1.23, using soft_time_limit=10000:
@task(bind=True, default_retry_delay=30, max_retries=3, soft_time_limit=10000)
def process_task(self, task_instance):
    """Task processing."""
    pass

Necessity of cloning classes for background processes running through rake?

I have a resque worker class which works with ActionMailer and another that works with Mail directly. Here's a short example:
class NotificationWorker
  def self.perform(id)
    Mailer.delivery_method.settings = {
      # custom settings here
    }
    # Working with Mailer to deliver mails
  end
end
Assuming that there may be two workers running NotificationWorker, I am not sure whether they interfere with each other. From my understanding, working directly on the Mail class would break functionality, because both mailers would end up using the same settings instead of their assigned ones. A solution would be to create a clone of such a class (which works with ActionMailer, but not with Mail AFAIK).
According to the Resque docs:
Resque workers are rake tasks that run forever. They basically do this:
start
loop do
  if job = reserve
    job.process
  else
    sleep 5 # Polling frequency = 5
  end
end
shutdown
I am not familiar with rake besides the basic usage for rails apps. So can anyone enlighten me?
I'm not quite sure what you're trying to achieve here. I have a Resque system which queues and delivers automated emails. I have it set up like this:
1) env.rb
config.action_mailer.delivery_method = :smtp
config.action_mailer.smtp_settings = {...}
2) notification_job.rb # it's the job, not the worker, that needs creating.
class NotificationWorker
  def self.perform(id)
    # Working with Mailer to deliver mails
  end
end
If you really need to work with the mailer directly and each worker needs different settings, then you may need to create a YAML file whose entries are selected by a variable that you give the worker on startup.