I am new to Celery. I would like to chain two groups of tasks, where all tasks in a group run async and the second group is processed only after all tasks in the first group are done. I do not need to return results for any of the tasks.
I have tried
g1 = group([task1.si(1), task1.si(2)])
g2 = group([task2.si(3), task2.si(4)])
chain(g1, g2).delay()
and it appears that the second group starts to process (task2.si(3)) after the first task in the first group (task1.si(1)) is done. I would expect task2.si(3) to start only after task1.si(2) is also done.
How can I chain together two groups so that the second group starts to process only after the first group has completed?
Thanks!
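One common way to get this behaviour is a chord: the first group becomes the chord header, and its callback fires only after every task in the header has finished. A minimal sketch, assuming an app instance named app and the task1/task2 tasks from above; launch_second_group is a hypothetical helper task added here for illustration, and a chord does require a result backend to be configured:
from celery import chord, group

@app.task
def launch_second_group(_results=None):
    # runs only once both task1 invocations have finished
    group(task2.si(3), task2.si(4)).delay()

chord(group(task1.si(1), task1.si(2)))(launch_second_group.s())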
1) Celery chain.
In the docs I read this:
Here’s a simple chain, the first task executes passing its return value to the next task in the chain, and so on.
>>> from celery import chain
>>> # 2 + 2 + 4 + 8
>>> res = chain(add.s(2, 2), add.s(4), add.s(8))()
>>> res.get()
16
But where exactly is a chain item's result passed to the next chain item? On the Celery server side, or is it passed back to my app, which then passes it to the next chain item?
This is important to me because my results are quite big to be passing to the app, and I want all of this messaging to happen on the Celery server side.
2) Celery group.
>>> g = group(add.s(i) for i in xrange(10))
>>> g(10).get()
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
Can I be sure that these tasks will be executed as close together as possible? Will Celery give priority to a certain group once the first task of that group has started executing?
For example, I have 100 requests and each request runs a group of tasks, and I don't want to mix tasks from different groups with each other. The first request to start being processed could be the last one completed, while its last tasks wait for free workers that are busy with tasks from other requests. It seems better if a group of tasks is executed as close together as possible.
I would really appreciate it if you can help me.
1. Celery Chain
Results are passed on the Celery side using a message-passing broker such as RabbitMQ. Results are stored using the result backend (explicitly required for chord execution). You can verify this by running your Celery worker with log level 'INFO' and observing how the tasks are invoked.
Celery maintains a dependency graph once you invoke tasks, so it knows exactly how to chain your tasks.
Consider callbacks, where you link two different tasks:
http://docs.celeryproject.org/en/latest/userguide/canvas.html#callbacks
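As a rough illustration of that linked-callback style (other_task here is a placeholder, not from the thread):
add.apply_async((2, 2), link=other_task.s())
Once add completes, the worker invokes other_task with add's result (4) appended as an argument, so the handover happens on the worker side rather than in your application.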
2. Celery Group
When you call tasks in a group, Celery executes (invokes) them in parallel. The Celery worker will try to pick them up depending on the workload it can handle. If you invoke more tasks than your worker can handle, it is quite possible that your first few tasks will be executed first and the worker will pick up the rest gradually.
If you have a very large number of tasks to invoke in parallel, it is better to invoke them in chunks of a certain size, for example as sketched below.
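For instance, Celery's chunks primitive splits a long list of invocations into a smaller number of tasks (the add task and the numbers here are just placeholders):
# split 100 add() calls into 10 tasks of 10 calls each
res = add.chunks(zip(range(100), range(100)), 10)()
Each chunk is delivered to the workers as a single message, so a large batch does not flood the queue with tiny tasks.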
You can also set the priority of tasks, as mentioned in this answer.
Completion of the tasks in a group depends on how much time each task takes. Celery tries to schedule tasks as fairly as possible.
Can I use a Celery Group primitive as the umbrella task in a map/reduce workflow?
Or more specific: Can the subtasks in a Group be run on multiple workers on multiple servers?
From the docs:
However, if you call apply_async on the group it will send a special grouping task, so that the action of calling the tasks happens in a worker instead of the current process
That seems to imply the tasks are all sent to one worker...
Before 3.0 (and still), one could fire off the subtasks in a TaskSet, which would run on multiple servers. The problem is determining whether all tasks have finished executing. That is normally done by polling all subtasks, which is not really elegant.
I am wondering if the Group primitive can be used to mitigate this problem.
I found out it is possible to use chords for such a map/reduce-like problem.
@celery.task(name='ic.mapper')
def mapper():
    # split your problem into embarrassingly parallel maps
    maps = [map.s(), map.s(), map.s(), map.s(), map.s(), map.s(), map.s(), map.s()]
    # and put them in a chord that executes them in parallel and calls 'reduce' after they finish
    mapreduce = celery.chord(maps)(reduce.s())
    return "{0} mapper ran on {1}".format(celery.current_task.request.id, celery.current_task.request.hostname)

@celery.task(name='ic.map')
def map():
    # do something useful here
    import time
    time.sleep(10.0)
    return "{0} map ran on {1}".format(celery.current_task.request.id, celery.current_task.request.hostname)

@celery.task(name='ic.reduce')
def reduce(results):
    # put the maps together and do something with the results
    return "{0} reduce ran on {1}".format(celery.current_task.request.id, celery.current_task.request.hostname)
When the mapper is executed on a cluster of three workers/servers, it first splits the problem and then creates new subtasks that are again submitted to the broker. These run in parallel because the queue is consumed by all the workers. A chord task is also created that polls all the maps to see whether they have finished. When they are done, the reduce task is executed, and there you can glue your results back together.
All in all: yes, it is possible. Thanks for the vegetable, guys!
I have a number of asynchronous tasks to run in parallel. All the tasks can be divided into two types; let's call one type A (the time-consuming ones) and everything else type B (the faster, quick-to-execute ones).
With a single ScheduledThreadPoolExecutor with a pool size of x, eventually at some point all threads are busy executing type A tasks; as a result, type B tasks get blocked and delayed.
What I'm trying to accomplish is to run the type A tasks in parallel with the type B tasks, and I want the tasks within each type to run in parallel as well, for performance.
Would you think it's prudent to have two instances of ScheduledThreadPoolExecutor, one for type A and one for type B, each with its own thread pool? Do you see any issues with this approach?
No, that seems reasonable.
I am doing something similar, i.e. I need to execute tasks serially depending on some id, e.g. all tasks for the component with id="1" need to be executed serially with respect to each other, and in parallel with all other tasks for components with different ids.
So basically I need a separate queue of tasks for each component, and the tasks are pulled one after another from each specific queue.
In order to achieve that I use
Executors.newSingleThreadExecutor(new JobThreadFactory(componentId));
for each component.
Additionally, I need an ExecutorService for a different type of tasks which are not bound to componentIds; for that I create an additional ExecutorService instance:
Executors.newFixedThreadPool(DEFAULT_THREAD_POOL_SIZE, new JobThreadFactory());
This works fine for my case, at least.
The only problem I can think of is if there is a need for ordered execution of the tasks, i.e.
task2 NEEDS to be executed after task1 and so on... But I doubt that's the case here.
I'm using the Perl client of beanstalkd. I need a simple way to not enqueue the same work twice.
I need something that basically waits until there are K elements and then groups them together. To accomplish this, I have the producer:
insert item(s) into DB
insert a queue item into beanstalkd
And the consumer:
while ( 1 ) {
    beanstalkd.retrieve
    if ( DB items >= K )
        func_to_process_all_items
    kill job
}
This is linear in the number of requests/processing, but in the case of:
insert 1 item
... repeat many times ...
insert 1 item
Assuming all these insertions happened before a job was retrieved, this would add N queue items, and the consumer would do something like this:
check DB, process N items
check DB, no items
... many times ...
check DB, no items
Is there a smarter way to do this so that it does not insert/process the later job requests unnecessarily?
I had a related requirement. I only wanted to process a specific job once within a few minutes, but the producer could queue several instances of the same job. I used memcache to store the job identifier and set the expiry of the key to just a few minutes.
When a worker tried to add the job identifier to memcache, only the first would succeed - on failure to add the job id, the worker would delete the job. After a few minutes, the key expires from memcache and the job can be processed again.
Not particularly elegant, but it works.
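A minimal sketch of that memcache trick, here in Python with the pymemcache client (the original uses the Perl beanstalkd client, so treat the names and TTL as placeholders):
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

def first_to_claim(job_id, ttl=300):
    # memcached's add is atomic: it only succeeds if the key does not exist yet.
    # With noreply=False it returns True on success and False if the key is already present.
    return cache.add("job:%s" % job_id, b"1", expire=ttl, noreply=False)

# in the worker: process the job only if first_to_claim() returned True,
# otherwise just delete the duplicate job from the queue.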
Will this work for you?
Create two tubes, "buffer" and "live". Your producer only ever adds to the "buffer" tube.
Create two workers, one watching "buffer" and the other watching "live", both making the blocking reserve() call.
Whenever the "buffer" worker returns from reserve, it buries the job if there are fewer than K items. If there are exactly K, it "kicks" all K jobs and transfers them to the "live" tube.
The "live" watcher will now return on its own reserve()
You just need to take care that a job never goes back to the buffer queue from the buried state. A failsafe way to do this might be to delete it and then add it to "live".
The two separate queues are only for cleaner separation. You could do the same with a single queue by burying every job until there are K-1 of them, and then, on the arrival of the K-th job, kicking all of them live.
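A rough sketch of the failsafe variant (delete, then re-post to "live"), written with the Python beanstalkc client since I can't speak for the Perl API; the tube names and K are placeholders:
import beanstalkc

K = 5  # placeholder batch size

conn = beanstalkc.Connection(host='localhost', port=11300)
conn.watch('buffer')
conn.ignore('default')
conn.use('live')

batch = []
while True:
    job = conn.reserve()  # blocking reserve on the 'buffer' tube
    batch.append(job.body)
    job.delete()          # delete instead of burying (note: the batch is lost if this process dies here)
    if len(batch) >= K:
        for body in batch:
            conn.put(body)  # hand the whole batch to the 'live' tube in one go
        batch = []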