I am using Celery with SQS.
One of my Celery workers goes to the DB, checks for pending messages, and sends notifications.
I have 2 workers doing the same job.
The problem is that this job is scheduled and there are 2 workers for it, so some of the messages are getting sent twice.
How do I avoid 2 workers picking up the same message?
Should I stop using 2 workers for processing scheduled jobs?
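One common way to avoid double-sends, regardless of how many workers run the job, is to make the DB check an atomic claim so each row can only be picked up once. Below is a minimal sketch, assuming PostgreSQL and psycopg2, with a hypothetical pending_messages table that has a status column; send_notification and mark_sent are placeholders for your existing logic:

# Sketch only: claims a batch of pending rows atomically so two workers
# running the same scheduled job never pick up the same row.
import psycopg2

def claim_and_send(conn, batch_size=100):
    with conn.cursor() as cur:
        # Mark a batch of pending rows as 'sending' and return them in one statement;
        # FOR UPDATE SKIP LOCKED makes a concurrent worker skip rows already claimed.
        cur.execute(
            """
            UPDATE pending_messages
               SET status = 'sending'
             WHERE id IN (
                   SELECT id FROM pending_messages
                    WHERE status = 'pending'
                    ORDER BY id
                    LIMIT %s
                    FOR UPDATE SKIP LOCKED)
            RETURNING id, payload
            """,
            (batch_size,),
        )
        rows = cur.fetchall()
    conn.commit()
    for row_id, payload in rows:
        send_notification(payload)   # placeholder: your existing send logic
        mark_sent(conn, row_id)      # placeholder: e.g. UPDATE ... SET status = 'sent'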
We are using remote chunking with one master and 2 workers. While testing with a small job, all chunk messages were sent to the queue in one go, and then the master went down, but the workers continued working on chunks from the request queue.
When the master restarted, all the replies from the workers were queued in the reply queue, but because of a mismatch between the correlation IDs of the reply messages and the newly created message container, the messages are not consumed from the reply queue.
How do we deal with such scenarios while using Spring Batch?
<int-jms:outbound-gateway
        id="masterOutboundGateway"
        connection-factory="masterJMSConnectionFactory"
        correlation-key="JMSCorrelationID"
        request-channel="ChunkMasterRequestChannel"
        request-destination-name="ChunkRequestQueue"
        receive-timeout="50000"
        reply-channel="ChunkMasterReplyChannel"
        reply-destination-name="ChunkReplyQueue"
        async="true">
    <int-jms:reply-listener/>
</int-jms:outbound-gateway>
Using its statsd plugin, Airflow can report the metric executor.queued_tasks, among others.
I am using the CeleryExecutor and need to know how many tasks are waiting in the Celery broker, so that I know when new workers should be spawned. I have configured my workers so they cannot take many tasks concurrently. Is this metric what I need?
Nope. If you want to know how many TIs are waiting in the broker, you'll have to connect to it.
Task instances that are waiting to get picked up in the celery broker are queued according to the Airflow DB, but running according to the CeleryExecutor. This is because the CeleryExecutor considers that any task instance that was successfully sent to the broker is now running (unlike the DB, which waits for a worker to pick it up before marking it as running).
Metric executor.queued_tasks reports the number of tasks queued according to the executor, not the DB.
The number of queued task instances according to the DB is not exactly what you need either, because it reports the number of task instances that are waiting in the broker plus the number of task instances queued to the executor. But when would TIs be stuck in the executor's queue, you ask? When the parallelism setting of Airflow prevents the executor from sending them to the broker.
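If what you really need is the number of task instances waiting in the broker, query the broker itself. A minimal sketch, assuming a Redis broker and Airflow's default_queue name (both assumptions; for RabbitMQ you would ask the management API for the queue depth instead):

# Sketch only: with a Redis broker, Celery keeps each queue as a Redis list
# keyed by the queue name, so LLEN gives the number of waiting messages.
import redis

BROKER_URL = "redis://localhost:6379/0"  # assumption: replace with your broker URL
QUEUE_NAME = "default"                   # assumption: the queue your Airflow workers consume

def tasks_waiting_in_broker():
    client = redis.Redis.from_url(BROKER_URL)
    return client.llen(QUEUE_NAME)

if __name__ == "__main__":
    print(tasks_waiting_in_broker())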
I'm working on a cluster that uses SGE to manage jobs across the worker nodes. Is there a way to use the SGE queue as the broker such that it cooperates with other people submitting jobs through non-Celery means? I currently use python-gridmap to submit Python jobs to the SGE queue, but I'd like to use the feature set of Celery.
Would I need to write a new broker, a new consumer, or both?
I'm running multiple Celery worker processes on an AWS c3.xlarge (4-core machine). There is a "batch" worker process with its --concurrency parameter set to 2, and a "priority" process with its --concurrency parameter set to 1. Both worker processes draw from the same priority queue. I am using Mongo as my broker. When I submit multiple jobs to the priority queue, they are processed serially, one after the other, even though multiple workers are available. All items are processed by the "priority" process, but if I stop the "priority" process, the "batch" process will process everything (still serially). What could I have configured incorrectly that prevents Celery from processing jobs asynchronously?
EDIT: It turned out that the synchronous bottleneck is in the server submitting the jobs rather than in Celery.
By default the worker will prefetch 4 * concurrency tasks to execute, which means that your first running worker is prefetching 4 tasks; so if you queue 4 or fewer tasks, they will all be processed by that worker alone, and there won't be any other messages left in the queue for the second worker to consume.
You should set CELERYD_PREFETCH_MULTIPLIER to a number that works best for you. I had this problem before and set this option to 1; now all my tasks are consumed fairly by the workers.
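For example (illustrative only; in Celery 4+ the same setting is spelled worker_prefetch_multiplier):

# celeryconfig.py -- old-style (Celery 3.x) setting name
CELERYD_PREFETCH_MULTIPLIER = 1  # each worker process reserves only one message at a time

# Celery 4+ equivalent, set on the app object instead:
# app.conf.worker_prefetch_multiplier = 1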
We use Celery with a RabbitMQ backend, and some of our servers hang with the error "[Errno 113] No route to host" (which may be a result of half of our servers being in the US and half in Europe).
I need to be sure that every task is delivered. Unfortunately, I have no idea how to retry tasks sent using send_task with a string identifier (the server that sends the tasks has no access to the code of the remote worker), like this:
send_task("remote1.tasks.add_data", args=[...], kwargs={}, queue="remote1")
Is it possible to retry such a task?
send_task just sends the message to the broker. If the exception is raised on the server that calls send_task, the message probably never reaches the broker, so there is no task to retry, only an exception to be handled.
Otherwise, if your workers randomly raise this exception because they can't reach the broker for some reason, you can probably solve it by setting this Celery config variable to true:
CELERY_ACKS_LATE = True
"Late ack means the task messages will be acknowledged after the task has been executed, not just before, which is the default behavior."
This means that if something goes wrong during the execution of the task on the worker, the broker doesn't receive the ack and another worker will execute the task.
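For failures on the publishing side (the message never reaching the broker), send_task accepts the same calling options as apply_async, so you can also ask the producer to retry the publish itself. A sketch with illustrative values:

# Retry publishing the message if the broker connection fails
# (retry_policy values here are illustrative, not recommendations).
send_task(
    "remote1.tasks.add_data",
    args=[...],
    kwargs={},
    queue="remote1",
    retry=True,
    retry_policy={
        "max_retries": 3,      # give up after 3 attempts
        "interval_start": 0,   # retry immediately the first time
        "interval_step": 0.5,  # then add 0.5s per retry
        "interval_max": 3,     # but never wait more than 3s between retries
    },
)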