CeleryExecutor: Does the airflow metric "executor.queued_tasks" report the number of tasks in the celery broker? - celery

Using its statsd plugin, Airflow can report on metric executor.queued_tasks as well as some others.
I am using CeleryExecutor and need to know how many tasks are waiting in the Celery broker, so that I know when to spawn new workers. My workers are configured with low concurrency, so they cannot take many tasks at once. Is this metric what I need?

Nope. If you want to know how many TIs are waiting in the broker, you'll have to connect to it.
Task instances that are waiting to get picked up in the celery broker are queued according to the Airflow DB, but running according to the CeleryExecutor. This is because the CeleryExecutor considers that any task instance that was successfully sent to the broker is now running (unlike the DB, which waits for a worker to pick it up before marking it as running).
The executor.queued_tasks metric reports the number of tasks queued according to the executor, not the DB, so it does not include task instances already sent to the broker.
The number of queued task instances according to the DB is not exactly what you need either, because it reports the number of task instances that are waiting in the broker plus the number of task instances queued to the executor. But when would TIs be stuck in the executor's queue, you ask? When the parallelism setting of Airflow prevents the executor from sending them to the broker.
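To illustrate "connecting to the broker", here is a minimal sketch that assumes the broker is Redis, where Celery keeps each queue as a Redis list named after the queue, and that the Airflow queue is called "default". Both are assumptions about your setup, and queues that use message priorities are not covered. The helper name broker_backlog and the FakeRedis stand-in are hypothetical; with a real broker you would pass a redis-py client instead.

```python
def broker_backlog(client, queue_name="default"):
    """Return the number of messages waiting in one Celery queue.

    Assumes a Redis broker: the queue is a Redis list, so LLEN gives
    the number of messages no worker has picked up yet.
    """
    return client.llen(queue_name)


class FakeRedis:
    """Hypothetical in-memory stand-in so the sketch runs without a broker."""

    def __init__(self, lists):
        self._lists = lists

    def llen(self, key):
        return len(self._lists.get(key, []))


# With a real broker (assumed URL), the client would instead be:
#   import redis
#   client = redis.Redis.from_url("redis://localhost:6379/0")
client = FakeRedis({"default": ["ti-1", "ti-2", "ti-3"]})
print(broker_backlog(client))
```

Polling a number like this and comparing it to a threshold is one way to decide when to add workers. Other brokers, such as RabbitMQ, expose queue depth through their own management tools.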

Related

spark streaming - waiting for a dead executor

I have a spark streaming application running inside a k8s cluster (using spark-operator).
I have 1 executor, reading batches every 5s from a Kinesis stream.
The Kinesis stream has 12 shards, and the executor creates 1 receiver per shard. I gave it 24 cores, so it should be more than enough to handle it.
For some unknown reason, sometimes the executor crashes. I suspect it is due to memory going over the k8s pod memory limit, which would cause k8s to kill the pod. But I have not been able to prove this theory yet.
After the executor crashes, a new one is created.
However, the "work" stops. The new executor is not doing anything.
I investigated a bit:
Looking at the logs of the pod - I saw that it did execute a few tasks successfully after it was created, and then it stopped because it did not get any more tasks.
Looking in Spark Web UI - I see that there is 1 “running batch” that is not finishing.
I found some docs that say there can always be only 1 active batch at a time. So this is why the work stopped - it is waiting for this batch to finish.
Digging a bit deeper in the UI, I found this page that shows details about the tasks.
So executor 2 finished doing all the tasks it was assigned.
There are 12 tasks that were assigned to executor 1 which are still waiting to finish, but executor 1 is dead.
Why does this happen? Why does Spark not know that executor 1 is dead and is never going to finish its assigned tasks? I would expect Spark to reassign these tasks to executor 2.

Batch creation of Kafka Connector

I'm trying to create 1000 connectors, each with one task, its own consumer group, and a unique topic, in my Kafka Kubernetes cluster.
(My end goal, after creating the connectors, is to send a lot of requests to the connectors' topics and measure the performance of our sink connector implementation.)
Every creation triggers a rebalance across the cluster, which "blocks" the Connect REST API (it returns 409 for everything) and shuts down tasks.
Therefore I have three questions:
Is rebalancing effectively downtime for the connectors? (As I said, tasks shut down and restart during a rebalance, and each connector has only one task.)
Can I configure the rebalancing schedule?
Is there a way to create connectors in batches so that it is fast (say, 100 connectors in less than one second) and does not cause downtime (if the answer to the first question is yes)?
One way around the problem would be to start 1000 Connect Clusters (say, via Docker Orchestration), all with one or few Connectors.
There isn't a way around the rebalancing. You're adding consumers to the same consumer group, so that'll always rebalance.
Rather than running one task per connector, I would suggest grouping multiple topics/tasks that share similar configurations into one connector. That way you limit how much rebalancing is done.
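A rough sketch of that grouping, assuming a sink connector that accepts a comma-separated "topics" list and a "tasks.max" setting (both standard Kafka Connect sink options). The connector class, topic names, connector name prefix, and group size below are hypothetical placeholders. Each payload has the shape the Connect REST API expects for a POST to /connectors.

```python
import json


def grouped_connector_payloads(topics, group_size, connector_class,
                               name_prefix="sink-group"):
    """Build one connector definition per group of topics.

    Fewer connectors means fewer create requests, and therefore fewer
    rebalances, than one connector per topic.
    """
    payloads = []
    for start in range(0, len(topics), group_size):
        group = topics[start:start + group_size]
        payloads.append({
            "name": f"{name_prefix}-{start // group_size}",
            "config": {
                "connector.class": connector_class,  # hypothetical class name
                "topics": ",".join(group),
                "tasks.max": str(len(group)),  # upper bound: one task per topic
            },
        })
    return payloads


topics = [f"perf-topic-{i}" for i in range(1000)]  # hypothetical topic names
payloads = grouped_connector_payloads(topics, 50, "com.example.MySinkConnector")
print(len(payloads))
print(json.dumps(payloads[0]["config"]["tasks.max"]))
```

With 1000 topics and groups of 50, this yields 20 connector definitions instead of 1000. The trade-off is that topics in the same connector share one consumer group, which differs from the one-group-per-topic setup in the question.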

Submitting celery jobs to an SGE queue

I'm working on a cluster that uses SGE to manage jobs across the worker nodes. Is there a way to use the SGE queue as the broker in a way that will cooperate with other people submitting jobs through non-Celery means? I currently use python-gridmap to submit Python jobs to the SGE queue, but I'd like to use Celery's feature set.
Would I need to make a new Broker, a new Consumer, or both?

How to start a celery worker that only pushes tasks to the broker but does not consume them?

The main producer of tasks is a webserver. I do not want the webserver to consume any tasks, so it should only send tasks to the broker, where they are consumed by other nodes.
Right now I route tasks using the -Q option in the nodes by specifying the particular queues for each node. Is there a way to specify 0 queues for a worker?
Any help appreciated, thanks!
You do not need to use a worker to push tasks to the broker - you can do that from a regular python process.

Why are my celery worker processes processing everything in serial?

I'm running multiple celery worker processes on an AWS c3.xlarge (a 4-core machine). There is a "batch" worker process with its --concurrency parameter set to 2, and a "priority" process with its --concurrency parameter set to 1. Both worker processes draw from the same priority queue. I am using Mongo as my broker. When I submit multiple jobs to the priority queue they are processed serially, one after the other, even though multiple workers are available. All items are processed by the "priority" process, but if I stop the "priority" process, the "batch" process will process everything (still serially). What could I have configured incorrectly that prevents celery from processing jobs asynchronously?
EDIT: It turned out that the synchronous bottleneck is in the server submitting the jobs rather than in celery.
By default, a worker prefetches 4 * concurrency tasks. That means your first running worker prefetches 4 tasks, so if you queue 4 or fewer tasks, they will all be processed by this worker alone, and there won't be any messages left in the queue for the second worker to consume.
You should set CELERYD_PREFETCH_MULTIPLIER (named worker_prefetch_multiplier in Celery 4 and later) to a number that works best for you. I had this problem before and set this option to 1; now my tasks are shared fairly among the workers.
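The arithmetic can be sketched as follows. This is a simplified model, not Celery's actual delivery logic: it assumes workers reserve messages in the order listed, each up to multiplier * concurrency. The function names and the worker list are made up for illustration, using the concurrency values from the question.

```python
def prefetch_limit(concurrency, multiplier=4):
    """Messages one worker may reserve at once: multiplier * concurrency."""
    return multiplier * concurrency


def distribute(num_tasks, workers, multiplier=4):
    """Greedy model: each worker, in order, reserves up to its prefetch limit.

    Returns (tasks reserved per worker, tasks left in the queue).
    """
    remaining = num_tasks
    reserved = {}
    for name, concurrency in workers:
        take = min(remaining, prefetch_limit(concurrency, multiplier))
        reserved[name] = take
        remaining -= take
    return reserved, remaining


workers = [("priority", 1), ("batch", 2)]  # (name, --concurrency)
print(distribute(4, workers))                # default multiplier of 4
print(distribute(4, workers, multiplier=1))  # multiplier set to 1
```

Under this model, with the default multiplier the "priority" worker reserves all 4 tasks and "batch" gets none. With a multiplier of 1, "priority" reserves 1, "batch" reserves 2, and 1 task stays in the queue for whichever worker frees up first.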