I am using celery.contrib.batches to execute a batch of Celery tasks. I know it's experimental, but I still wanted to give it a try, and I am pretty close. While executing the individual tasks in the batch, I deliberately send signals such as backend.mark_as_started(request.id) and backend.mark_as_done(request.id, True). But the signals are not being received at the worker. Note that everything works if I get rid of batches and execute the tasks one at a time; that is, my signal handler functions do get executed.
celery.contrib.Batches indeed does not send these signals. The solution is to send those signals yourself from inside the Batches task.
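A minimal sketch of what that can look like, assuming a result backend is configured (the flush settings and the per-request work are placeholders):

from celery import Celery
from celery.contrib.batches import Batches

app = Celery('tasks')  # broker/backend configuration omitted

@app.task(base=Batches, flush_every=100, flush_interval=10)
def process(requests):
    for request in requests:
        # Batches bypasses the normal per-task lifecycle, so update the
        # state of each SimpleRequest in the batch yourself.
        app.backend.mark_as_started(request.id)
        try:
            result = do_work(request.args, request.kwargs)  # hypothetical per-request work
        except Exception as exc:
            app.backend.mark_as_failure(request.id, exc)
        else:
            app.backend.mark_as_done(request.id, result)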
Related
Say you have a message queue that needs to be polled every x seconds. What are the usual ways to poll it and execute HTTP/REST-based jobs? Do you simply create a cron service and call the worker script every x seconds?
Note: This is for a web application
I would write a Windows service which constantly polls/waits for new messages.
Scheduling a program to run every x minutes has a number of problems:
If your interval is too small, the program may still be running when the next startup is triggered.
If your interval is too big, the queue will fill up between runs.
Generally you expect a constant stream of messages, so there is no problem just keeping the program running 24/7
One common feature of the message queue systems I've worked with is that you don't poll but use a blocking read. If you have more than one waiting worker, the queue system will pick which one gets to process the message.
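As an illustration, here is a blocking read against a Redis-backed queue (just one possible queue system; handle_job is a hypothetical handler that would run the HTTP/REST job):

import redis

r = redis.Redis()

while True:
    # BLPOP blocks until a message arrives (timeout=0 waits forever), so the
    # worker sleeps inside the queue system instead of waking up on a schedule.
    queue, payload = r.blpop('jobs', timeout=0)
    handle_job(payload)  # hypothetical: run the HTTP/REST job for this message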
I want to run some jobs in a cluster, but I want to be able to kill the job if it is taking too long. Can I do this gracefully from the client, and still have the worker available to do more jobs?
My scenario is that I want to investigate how different machine learning classifiers and hyperparameters affect the time it takes to run .fit(). If a run takes too long, I just want to abandon the task and move on to the next one.
I can find the PIDs of the workers, and I can use kill() to send a signal from the client, but sending SIGINT, SIGHUP and SIGABRT all seem to ruthlessly kill the worker, not just interrupt it. I can't put any logic in the worker code because it's the atomic call to .fit() that I want to time and interrupt.
I want to launch a chain of Celery tasks, and have them all execute before any newly arriving tasks do. I'll have a single worker process handling all tasks.
I guess the easiest thing to do would be to not make them a chain at all, but instead launch a single task that synchronously calls a sequence of functions. But I'd like to take advantage of Celery retries, allowing each task to be retried a different number of times.
What's the best way to do this?
If you have a single worker running a single process, then as far as I can tell from working with Celery (this is not explicitly documented), you should get the behavior you want.
If you want to use multiple worker processes then you may need to set CELERYD_PREFETCH_MULTIPLIER to 1.
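A minimal sketch of that configuration (the setting name is from the older, pre-4.0 configuration scheme):

# celeryconfig.py
# Reserve only one message per worker process, so a prefetched task
# cannot jump ahead of the remaining tasks in a running chain.
CELERYD_PREFETCH_MULTIPLIER = 1

Then start the worker with a single process, e.g. celery worker -A tasks --concurrency=1.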
I am currently using Redis for a workflow with a couple of steps in it. For each step, a worker snatches the payload from a queue and, when it's done, pushes it onto the next queue, where the next worker can take it further. If an exception occurs, the task is put into a special queue by the worker.
The application logic for the flow through the system hence lies in the workers themselves. I now want to switch to Celery.
I understand that in Celery you can use subtasks, but I fail to see how you express your specific error handling there for different conditions such as exceptions and time-outs. Are you supposed to use different queues or use subtasks, and what would that look like in code?
I have now read the docs even more thoroughly and additionally made some tests, and this works:
The problem is to string together tasks so that they happen one after the other, but at the same time be able to handle error conditions and "break out" of the flow and do something else, not just abort.
You can string tasks together with link, and if the extra parameter link_error is supplied, it will be used on failure. From reading:
http://docs.celeryproject.org/en/latest/userguide/calling.html#linking-callbacks-errbacks
I made this:
res = add.apply_async((2, 2), link=mul.s(16), link_error=onerror.s())
The three tasks are add, mul and onerror. add adds two numbers and mul multiplies two numbers. So this will add 2 and 2 together, and then the sum will be carried over to the next step (mul) and be multiplied by 16.
However, if the add code is buggy, gets bad data, or something else bad but detectable occurs, add throws an exception and the onerror task is run instead of mul. The onerror task gets the uuid of the failed job and can look the job up in the database backend, if one is configured. The onerror task can then archive the job, send an e-mail, or whatever.
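A minimal sketch of what the three tasks can look like (the broker choice and the task bodies are illustrative):

from celery import Celery

app = Celery('tasks', broker='redis://localhost')  # broker choice is arbitrary

@app.task
def add(x, y):
    return x + y

@app.task
def mul(x, y):
    return x * y

@app.task
def onerror(uuid):
    # link_error callbacks are applied with the uuid of the failed task,
    # so the job can be looked up in the result backend, archived,
    # reported by e-mail, and so on.
    print('Task %s raised an exception' % uuid)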
I'm new to Celery and I would appreciate a little help with a design pattern (or example code) for a worker I have yet to write.
Below is a description of the desired characteristics of the worker.
The worker will run a task that collects data from an endless source, a generator.
The worker task will run forever feeding from the generator unless it is directed to stop.
The worker task should stop gracefully on the occurrence of any one of the following triggers.
It exceeds an execution time limit in seconds.
It exceeds a number of iterations of the endless generator loop.
The client sends a message instructing the worker task to finish immediately.
Below is some pseudo code for how I believe I need to handle trigger scenarios 1 and 2.
What I don't know is how I send the 'finish immediately' signal from the client and how it is received and executed in the worker task.
Any advice or sample code would be appreciated.
from celery.task import task
from celery.exceptions import SoftTimeLimitExceeded

COUNTLIMIT = ...  # some value sent to the worker task by the client

@task()
def getData():
    try:
        for count, data in enumerate(endlessGeneratorThing()):
            # process data here
            if count > COUNTLIMIT:  # Handle trigger scenario 2
                clean_up_task_nicely()
                break
    except SoftTimeLimitExceeded:  # Handle trigger scenario 1
        clean_up_task_nicely()
My understanding of revoke is that it only revokes a task prior to its execution. For (3), I think what you want to do is use an AbortableTask, which provides a cooperative way to end a task:
http://docs.celeryproject.org/en/latest/reference/celery.contrib.abortable.html
On the client end you can call task.abort(); on the task end you can poll task.is_aborted().
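A minimal sketch, reusing the hypothetical helpers from the question's pseudo code (an abortable task needs a result backend to communicate the abort flag):

from celery import Celery
from celery.contrib.abortable import AbortableTask

app = Celery('tasks')  # broker/backend configuration omitted

@app.task(bind=True, base=AbortableTask)
def getData(self):
    for count, data in enumerate(endlessGeneratorThing()):
        if self.is_aborted():  # trigger scenario 3: the client asked us to stop
            clean_up_task_nicely()
            return
        # process data here

# Client side: delay() on an abortable task returns an AbortableAsyncResult.
result = getData.delay()
result.abort()  # the task exits at its next is_aborted() check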