I'm running a task every second, and it seems Celery doesn't actually perform the task every second.
I guess Celery might be a good scheduler for a task that runs every minute, but it might not be adequate for a task that runs every second.
Here's a picture which illustrates what I mean.
I'm using the following options:
'schedule': 1.0,
'args': [],
'options': {
    'expires': 3
}
And I'm using Celery 4.0.0.
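For reference, that fragment presumably sits inside a beat_schedule entry roughly like the following (the task name is just a placeholder):
app.conf.beat_schedule = {
    'my-task-every-second': {
        'task': 'tasks.my_task',  # placeholder task name
        'schedule': 1.0,          # interval in seconds; a timedelta also works
        'args': [],
        'options': {
            'expires': 3,
        },
    },
}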
Yes, Celery actually handles intervals as low as 1 second, and possibly lower since it takes a float. See this entry on periodic tasks in the docs http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html:
from celery import Celery
from celery.schedules import crontab

app = Celery()

@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    # Calls test('hello') every 10 seconds.
    sender.add_periodic_task(10.0, test.s('hello'), name='add every 10')

    # Calls test('world') every 30 seconds
    sender.add_periodic_task(30.0, test.s('world'), expires=10)

    # Executes every Monday morning at 7:30 a.m.
    sender.add_periodic_task(
        crontab(hour=7, minute=30, day_of_week=1),
        test.s('Happy Mondays!'),
    )

@app.task
def test(arg):
    print(arg)
A better-written example can be found about a third of the way down https://github.com/celery/celery/issues/3589:
# file: tasks.py
from celery import Celery

celery = Celery('tasks', broker='pyamqp://guest@localhost//')

@celery.task
def add(x, y):
    return x + y

@celery.on_after_configure.connect
def add_periodic(**kwargs):
    celery.add_periodic_task(10.0, add.s(2, 3), name='add every 10')
So sender is the Celery app itself, i.e. app = Celery().
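Applied to the one-second schedule from the question, the same hook would look roughly like this (my_task is a placeholder for the actual task):
@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    # Run my_task every second; let runs that are not picked up expire after 3 seconds.
    sender.add_periodic_task(1.0, my_task.s(), name='every second', expires=3)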
Related
I'm trying to figure out how to set up my "celery workers" so that they restart after living for a day.
Indeed, I configured my worker children/processes to restart after executing a task.
But in some cases there are no tasks to execute for 3-4 days, so I still need to restart the long-living children.
Do you know how to do this?
This is my current Celery app setup:
app = Celery(
    "celery",
    broker=f"amqp://bla@blabla/blablabla",
    backend="rpc://",
)
app.conf.task_serializer = "pickle"
app.conf.result_serializer = "pickle"
app.conf.accept_content = ["pickle", "application/json"]
app.conf.broker_connection_max_retries = 5
app.conf.broker_pool_limit = 1
app.conf.worker_max_tasks_per_child = 1  # Ensure 1 task is executed before restarting child
app.conf.worker_max_living_time_before_restart = 60 * 60 * 24  # The conf I want
Thank you :)
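As far as I know there is no built-in worker_max_living_time_before_restart setting, but one possible workaround is to have a daily cron job (or a beat task) ask the workers to recycle their pool processes via the pool_restart remote-control command. A rough sketch, assuming worker_pool_restarts is enabled on the workers and that the app above is importable as celery_app.app:
from celery_app import app  # hypothetical module holding the Celery app above

# Ask every worker to restart its pool child processes, even if no task has
# arrived for days. Requires app.conf.worker_pool_restarts = True.
app.control.broadcast("pool_restart", arguments={"reload": False})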
I have to use the same kind of function as ManyChat's smart delay. But I write the chatbot code with fbmq, which uses the Python language. So in this code, I want to delay 10 seconds and then use page_send(response.id, "link") to send to the users. How should I code that?
@page.callback(['test4'])
def callback_clicked_button(payload, event):
    recipient_id = event.sender_id
    page.send(recipient_id, "Before 10 secs")
    time_sleep(10)
    page.send(recipient_id, "link")
there's no response!
Python's time module has a sleep function which will help with the delay.
import time
time.sleep(5)  # 5 stands for seconds
Your code will go like this:
import time

@page.callback(['test4'])
def callback_clicked_button(payload, event):
    recipient_id = event.sender_id
    page.send(recipient_id, "Before 10 secs")
    time.sleep(10)  # not time_sleep(10)
    page.send(recipient_id, "link")
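If blocking the callback for 10 seconds is a problem (webhook handlers generally need to return quickly), a non-blocking alternative is to schedule the second message on a timer; a rough sketch using only the standard library:
import threading

@page.callback(['test4'])
def callback_clicked_button(payload, event):
    recipient_id = event.sender_id
    page.send(recipient_id, "Before 10 secs")
    # Send the follow-up 10 seconds later without blocking this handler.
    threading.Timer(10, page.send, args=(recipient_id, "link")).start()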
Coming from a node.js background, I am new to Scala and I tried using Twitter's Future.collect to perform some simple concurrent operations. But my code shows sequential behavior rather than concurrent behavior. What am I doing wrong?
Here's my code,
import com.twitter.util.Future

def waitForSeconds(seconds: Int, container: String): Future[String] = Future[String] {
  Thread.sleep(seconds * 1000)
  println(container + ": done waiting for " + seconds + " seconds")
  container + " :done waiting for " + seconds + " seconds"
}

def mainFunction: String = {
  val allTasks = Future.collect(Seq(waitForSeconds(1, "All"), waitForSeconds(3, "All"), waitForSeconds(2, "All")))
  val singleTask = waitForSeconds(1, "Single")

  allTasks onSuccess { res =>
    println("All tasks succeeded with result " + res)
  }

  singleTask onSuccess { res =>
    println("Single task succeeded with result " + res)
  }

  "Function Complete"
}

println(mainFunction)
and this is the output I get,
All: done waiting for 1 seconds
All: done waiting for 3 seconds
All: done waiting for 2 seconds
Single: done waiting for 1 seconds
All tasks succeeded with result ArraySeq(All :done waiting for 1 seconds, All :done waiting for 3 seconds, All :done waiting for 2 seconds)
Single task succeeded with result Single :done waiting for 1 seconds
Function Complete
The output I expect is,
All: done waiting for 1 seconds
Single: done waiting for 1 seconds
All: done waiting for 2 seconds
All: done waiting for 3 seconds
All tasks succeeded with result ArraySeq(All :done waiting for 1 seconds, All :done waiting for 3 seconds, All :done waiting for 2 seconds)
Single task succeeded with result Single :done waiting for 1 seconds
Function Complete
Twitter's futures are more explicit about where computations are executed than the Scala standard library futures. In particular, Future.apply will capture exceptions safely (like s.c.Future), but it doesn't say anything about which thread the computation will run in. In your case the computations are running in the main thread, which is why you're seeing the results you're seeing.
This approach has several advantages over the standard library's future API. For one thing it keeps method signatures simpler, since there's not an implicit ExecutionContext that has to be passed around everywhere. More importantly it makes it easier to avoid context switches (here's a classic explanation by Brian Degenhardt). In this respect Twitter's Future is more like Scalaz's Task, and has essentially the same performance benefits (described for example in this blog post).
The downside of being more explicit about where computations run is that you have to be more explicit about where computations run. In your case you could write something like this:
import com.twitter.util.{ Future, FuturePool }

val pool = FuturePool.unboundedPool

def waitForSeconds(seconds: Int, container: String): Future[String] = pool {
  Thread.sleep(seconds * 1000)
  println(container + ": done waiting for " + seconds + " seconds")
  container + " :done waiting for " + seconds + " seconds"
}
This won't produce exactly the output you're asking for ("Function complete" will be printed first, and allTasks and singleTask aren't sequenced with respect to each other), but it will run the tasks in parallel on separate threads.
(As a footnote: the FuturePool.unboundedPool in my example above is an easy way to create a future pool for a demo, and is often just fine, but it isn't appropriate for CPU-intensive computations—see the FuturePool API docs for other ways to create a future pool that will use an ExecutorService that you provide and can manage yourself.)
Celery tasks have the limitation that they cannot synchronously launch subtasks from within a task. Callbacks and other canvas functionality are handled by how the tasks are called and related through the various canvas functions.
But when you schedule tasks through celery beat, it appears that the only option is to call a single task without any of the canvas functions.
I need to schedule a task that has callbacks. Is there any way to do that in Celery?
It's possible I could accomplish what I need in a single task, but that would involve significant memory overhead and take a long time to execute with many DB hits. Is that a good idea?
You can just generate the signature for whatever you need and pass it in the options dictionary. Here is a working tasks.py for a simple chain:
from datetime import timedelta
from celery import Celery, signature

app = Celery('tasks', broker='redis://localhost')

app.conf.update(
    beat_schedule={
        'add_divide': {
            'task': 'tasks.add',
            'args': (5, 7),
            'schedule': timedelta(seconds=10),
            'options': {
                'queue': 'testq',
                'link': signature('tasks.divide', args=(4,), queue='testq'),
            },
        },
    },
)

@app.task
def add(x, y):
    z = x + y
    print('sum: {0}'.format(z))
    return z

@app.task
def divide(x, y):
    z = x / y
    print('divide: {0}'.format(z))
    return z
Running celery worker -A tasks.celery -Q testq --beat -c1 will output something like:
[2017-04-07 00:00:00,000: WARNING/PoolWorker-2] sum: 12
[2017-04-07 00:00:00,050: WARNING/PoolWorker-2] divide: 3
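If the workflow is more involved than a single link, another option is to let beat schedule a small dispatcher task that builds and launches the canvas itself; a rough sketch reusing the tasks above (the dispatcher name is illustrative):
from celery import chain

@app.task
def launch_workflow():
    # Beat only needs to schedule this one task; the chain (or chord, group, ...)
    # is assembled and started here.
    chain(add.s(5, 7), divide.s(4)).apply_async(queue='testq')
Beat would then schedule tasks.launch_workflow instead of tasks.add.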
I have a celery chain that runs some tasks. Each of the tasks can fail and be retried. Please see below for a quick example:
from celery import task

@task(ignore_result=True)
def add(x, y, fail=True):
    try:
        if fail:
            raise Exception('Ugly exception.')
        print '%d + %d = %d' % (x, y, x + y)
    except Exception as e:
        raise add.retry(args=(x, y, False), exc=e, countdown=10)

@task(ignore_result=True)
def mul(x, y):
    print '%d * %d = %d' % (x, y, x * y)
and the chain:
from celery.canvas import chain
chain(add.si(1, 2), mul.si(3, 4)).apply_async()
Running the two tasks (and assuming that nothing fails), you would see printed:
1 + 2 = 3
3 * 4 = 12
However, when the add task fails the first time and only succeeds on a subsequent retry, the rest of the tasks in the chain do not run: the add task fails, the remaining tasks in the chain are skipped, and when add runs again a few seconds later and succeeds, the rest of the chain (in this case mul.si(3, 4)) still does not run.
Does Celery provide a way to continue failed chains from the task that failed onwards? If not, what would be the best approach to accomplish this and to make sure that a chain's tasks run in the specified order, and only after the previous task has executed successfully, even if that task is retried a few times?
Note 1: The issue can be solved by doing
add.delay(1, 2).get()
mul.delay(3, 4).get()
but I am interested in understanding why chains do not work with failed tasks.
You've found a bug :)
Fixed in https://github.com/celery/celery/commit/b2b9d922fdaed5571cf685249bdc46f28acacde3
It will be part of 3.0.4.
I'm also interested in understanding why chains do not work with failed tasks.
I dug into some Celery code and what I've found so far is:
The implementation happens in app.builtins.py:
@shared_task
def add_chain_task(app):
    from celery.canvas import chord, group, maybe_subtask
    _app = app

    class Chain(app.Task):
        app = _app
        name = 'celery.chain'
        accept_magic_kwargs = False

        def prepare_steps(self, args, tasks):
            steps = deque(tasks)
            next_step = prev_task = prev_res = None
            tasks, results = [], []
            i = 0
            while steps:
                # First task get partial args from chain.
                task = maybe_subtask(steps.popleft())
                task = task.clone() if i else task.clone(args)
                i += 1
                tid = task.options.get('task_id')
                if tid is None:
                    tid = task.options['task_id'] = uuid()
                res = task.type.AsyncResult(tid)
                # automatically upgrade group(..) | s to chord(group, s)
                if isinstance(task, group):
                    try:
                        next_step = steps.popleft()
                    except IndexError:
                        next_step = None
                if next_step is not None:
                    task = chord(task, body=next_step, task_id=tid)
                if prev_task:
                    # link previous task to this task.
                    prev_task.link(task)
                    # set the results parent attribute.
                    res.parent = prev_res
                results.append(res)
                tasks.append(task)
                prev_task, prev_res = task, res
            return tasks, results

        def apply_async(self, args=(), kwargs={}, group_id=None, chord=None,
                        task_id=None, **options):
            if self.app.conf.CELERY_ALWAYS_EAGER:
                return self.apply(args, kwargs, **options)
            options.pop('publisher', None)
            tasks, results = self.prepare_steps(args, kwargs['tasks'])
            result = results[-1]
            if group_id:
                tasks[-1].set(group_id=group_id)
            if chord:
                tasks[-1].set(chord=chord)
            if task_id:
                tasks[-1].set(task_id=task_id)
                result = tasks[-1].type.AsyncResult(task_id)
            tasks[0].apply_async()
            return result

        def apply(self, args=(), kwargs={}, **options):
            tasks = [maybe_subtask(task).clone() for task in kwargs['tasks']]
            res = prev = None
            for task in tasks:
                res = task.apply((prev.get(), ) if prev else ())
                res.parent, prev = prev, res
            return res

    return Chain
You can see that at the end of prepare_steps, prev_task is linked to the next task.
When prev_task fails, the next task is not called.
I'm experimenting with adding a link_error from the previous task to the next:
if prev_task:
    # link and link_error previous task to this task.
    prev_task.link(task)
    prev_task.link_error(task)
    # set the results parent attribute.
    res.parent = prev_res
But then the next task must take care of both cases (except, maybe, when it's configured to be immutable, i.e. to not accept more arguments).
I think chain could support that by allowing some syntax like this:
c = chain(t1, (t2, t1e), (t3, t2e))
which means:
t1 links to t2, with t1e as its link_error
t2 links to t3, with t2e as its link_error
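For what it's worth, something close to that intent can already be expressed by attaching an error callback to each signature before chaining them; a rough sketch, assuming ordinary tasks t1, t2, t3 and error-handler tasks t1e, t2e:
from celery import chain

s1 = t1.si(1, 2)
s1.link_error(t1e.s())  # called if t1 fails

s2 = t2.si(3, 4)
s2.link_error(t2e.s())  # called if t2 fails

chain(s1, s2, t3.si(5, 6)).apply_async()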